While it may not be obvious from all my sysadmin-style tinkering discussed in this blog, I am also a software developer. Lately I've been working on a Java GUI program that has to save and load a complex tree of settings. While the project is nearing its completion, I've decided to occasionally spend some time trying to find and fix various issues and performance bottlenecks. This is the story of one of those bottlenecks...
A common way to save and load application preferences in Java, is to use the Preferences API. This API uses a different back-end implementation on each operating system, but essentially provides a tree view for storing preferences. Its implementation is also intended to make the backing-store somewhat transparent, so you just have to set your preferences. There is no need to perform a "save" operation afterwards.
Now first of all, I did want to make saving an explicit operation in my application. So while I maintained my own configuration tree, writing it to nodes in the Java Preferences tree was a user-triggered operation. Originally I also wrote out the Preferences tree to a file on disk, which was loaded in place of Java's Preferences store. However, upon realizing that was a needless waste of resources, I removed the external configuration file.
One major part of my application involves configuring a number of "thingies," where each "thingie" has a potentially large number of individual parameters directly under its configuration node. A problem I noticed was that as I increased the number of "thingies" in the configuration, save and load became dramatically slower. However, here's the really interesting part. It only really became slower on MacOSX! On my Linux test machine, the performance impact wasn't even noticeable!
Thanks to the wonderful profiler in NetBeans, I was able to track down this issue to functions under "java.util.prefs.AbstractPreferences", or more particularly "java.util.prefs.MacOSXPreferences". Apparently the MacOSX implementation of the node() and various put() methods can be quite slow. If you have enough of them, it really adds up.
So how did I fix it? Well, something a bit less elegant than how I was doing things before. You see, the save and load methods for "thingie" configurations really just involved converting items between a Map and Preferences nodes. Since the Map really just managed access to an object which contained a collection of simple types (String, Integer, etc.), I got an idea. Why not just serialize the Map directly, and store it as a byte array in a Preferences node? Sure, it may seem like a bit of an inelegant solution, but it worked! Not only did it work, but it resulted in a MASSIVE performance increase.
So how much of a performance improvement does this make? Reading the program's configuration used to take 2195ms, according to the profiler. It now only takes 956ms. However, writing used to take 7314ms. Now that number is down to 645ms!
So, what did we learn?
1. Java performance characteristics can vary wildly across operating systems.
2. Object serialization to a byte array node in the Java Preferences system can be significantly faster than creating an elegant structure within the Java Preferences API to store a lengthy configuration, especially on MacOSX.