Generally, the JavaOne technical sessions were very good, but I didn’t have any interesting interactions with other attendees. I posted on a few finance LinkedIn groups looking for other attendees who do high-frequency trading, but didn’t get any responses. Since I got nothing out of meeting other attendees, it would have been equally productive to skip the conference and watch the online videos of the interesting sessions.
Below are my lightly edited notes:
CTO of CBOE
– Intel’s new Nehalem processor has excellent performance; it does overclocking on the fly.
– 300k transactions per second (up from 5k in 2001, after starting the Java project in 1998)
– Infiniband support, direct NIC access
– async nio
– G1 garbage collector, much easier to tune than CMS
Real World Real-time
– Mike Fulton, IBM Canada
– Stop The World GC provides very good throughput
– IBM’s JVM has a GC called “Metronome” that does frequent GCs, each taking a consistent 1 ms
– standard JVMs basically ignore thread priorities
– IBM has a RTSJ compliant JVM
– higher priority threads tend to have less work to do, and therefore complete quickly
– set up SCHED_FIFO threads; these threads are less likely to be paused by the OS
– Ahead of time compilation cannot do some optimizations (mostly in-lining) that are allowed at runtime, so dynamic compilation performance is better than ahead-of-time
– can do ahead-of-time compilation; can then do additional dynamic compilation at runtime.
– Red Hat and Novell have real-time Linux distributions; Solaris and AIX also offer real-time support
– the real-time OSes work to eliminate variability, not to increase overall speed
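To make the point about thread priorities concrete, here is a minimal sketch using the standard java.lang.Thread API. The class and method names (PrioritySketch, runAtPriority) are my own, not from the talk. On a standard JVM the priority is recorded but is only a hint to the scheduler, so this shows the API rather than any real-time guarantee.

```java
public class PrioritySketch {

    // Starts a worker at the requested priority and returns the priority
    // the worker itself observed. The JVM records the value, but a standard
    // JVM/OS scheduler is free to treat it as a hint only.
    static int runAtPriority(int priority) {
        final int[] seen = new int[1];
        Thread t = new Thread(new Runnable() {
            public void run() {
                seen[0] = Thread.currentThread().getPriority();
            }
        });
        t.setPriority(priority);
        t.start();
        try {
            t.join(); // join() makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("worker ran at priority " + runAtPriority(Thread.MAX_PRIORITY));
    }
}
```

Getting real priority scheduling (e.g., SCHED_FIFO) needs an RTSJ JVM or OS-level configuration, which plain Java code like this cannot request.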
JVM Performance talk
– max performance setup is Intel 5570 running 6u14-p
– 64-bit performance now exceeds 32-bit performance, and tuning efforts are going to be concentrated on 64-bit
– Try the G1 GC
– Use all available hw threads for max throughput
– Use the -XX:+UseNUMA flag on NUMA machines
– Use -XX:+UseCompressedOops
– Experiment with Intel’s VTune, use the latest version
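A rough sketch of the "use all available hardware threads" advice: size a worker pool from Runtime.availableProcessors() and split the work into one chunk per thread. The class and method names (HwThreadsSketch, parallelSum, jvmFlags) are my own illustration, and the summing workload is only a stand-in for real work. The jvmFlags helper just lists the arguments the JVM was started with, which is a quick way to confirm that a flag such as -XX:+UseNUMA was actually passed.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class HwThreadsSketch {

    // Number of hardware threads the JVM reports as available.
    static int hwThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Sums 1..n by splitting the range into one chunk per hardware thread.
    static long parallelSum(long n) {
        int workers = hwThreads();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<Future<Long>>();
            long chunk = (n + workers - 1) / workers; // ceiling division
            for (int i = 0; i < workers; i++) {
                final long from = i * chunk + 1;
                final long to = Math.min(n, (i + 1) * chunk);
                parts.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        long s = 0;
                        for (long v = from; v <= to; v++) {
                            s += v;
                        }
                        return s;
                    }
                }));
            }
            long total = 0;
            for (Future<Long> f : parts) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Arguments the JVM was started with (e.g., -XX:+UseNUMA if it was passed).
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("hw threads: " + hwThreads());
        System.out.println("sum 1..1000000 = " + parallelSum(1000000L));
        System.out.println("JVM flags: " + jvmFlags());
    }
}
```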
Garbage Collection Tuning (not applicable to G1)
– Supervise the heap, don’t exceed available memory on the box (not total memory, but available memory)
– Disable virtual memory (swap), since we don’t want the heap to be paged out?
– The time to collect the young generation is proportional to the size of objects that survive, not to the size of the heap being collected.
– Experiment with -XX:+PrintTenuringDistribution
– Should always have some GC logging enabled in production; the cost is minimal:
– -XX:+PrintGCTimeStamps, -XX:+PrintGCDetails and -Xloggc:<filename> to start
– use a script to analyze the logs (e.g., the PrintGCDetails output)
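As a complement to GC logging, the heap can also be supervised from inside the process through the standard java.lang.management beans. This is only a sketch of that idea, not something shown in the session. The class and method names (GcStatsSketch, heapUsedFraction, totalGcMillis) are my own, and the collector names printed depend on which GC the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class GcStatsSketch {

    // Fraction of the maximum heap currently in use (0.0 to 1.0),
    // or -1.0 if the JVM does not report a maximum.
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    // Approximate total time, in milliseconds, that all collectors
    // report having spent collecting since JVM start.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used fraction: " + heapUsedFraction());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

This only gives running totals; the per-collection detail (pause times, tenuring ages) still has to come from the GC log.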