Archive for November 2008
For too long we’ve let the JRuby core contributors be the only voice for JRuby. I for one am guilty of taking and taking and taking from the tireless and thankless work the JRuby team has done. Charles, Ola, Tom, Nick, Vladimir and many others need to be thanked.
Almost all of the JRuby projects I’ve been aware of or a part of are nowhere to be found on blogs, twitter or any other techno-coder communication flavor of the month. These projects aren’t going to become popular or have the codista spotlight on them. Most JRuby work is done in the deep inner workings of the corporate bureaucratic sinkhole that is enterprise IT. JRuby work is hidden behind non-disclosure agreements and kept secret because of the technological edge secrecy provides. The great stories haven’t been told and Charles is only able to hint at them because they really aren’t his to tell.
This is one such story and I hope that this post encourages other JRubyists to speak up and at least share parts of their JRuby experience. You owe it to the JRuby team and the Ruby community in general.
I’ll start out by being blunt, and if the next sentence makes you want to dismiss the rest of the post, then go ahead and move along, because this post is not for you. JRuby is fantastic. The rest of this post will hopefully explain why that statement is true.
I joined a project that started out using MRI to wrap a C library, which was built into a gem. The C library is a financial analytics package used to price instruments and extract contract specifications. Working in C with MRI was easy. The Ruby C API methods are simple, and you almost get the sense you’re working in Ruby. Everything was lollipops and gumdrops, just as working with Ruby should be. A Rails application was built to display data provided by the gem. As the Rails community moved through new deployment strategies, so did we, moving from WEBrick to Mongrel behind Lighttpd, and so on.
Then some of the business specifications required pricing to be done on hundreds of thousands of instruments at a time. An order-of-magnitude change in usage made speed and memory consumption very important. Pricing that many instruments over three months is not helpful. It needed to be done in parallel.
With this many instruments to price, my team and I created a simple system for distributing the data and processing it in parallel using DRb, similar in many ways to Hadoop. Had we been using JRuby at the time, Hadoop would have been perfect to wrap, but MRI didn’t give us that option.
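For a sense of how little machinery DRb demands, here is a minimal sketch of that kind of master/worker setup. The class name, batch size, and counting stand-in for the pricing call are all illustrative, not our actual code:

```ruby
require 'drb/drb'

# Master exposes a queue of instrument batches over DRb; workers on
# other machines connect, drain batches, and price them in parallel.
class PricingQueue
  def initialize(instruments, batch_size)
    @queue = Queue.new
    instruments.each_slice(batch_size) { |batch| @queue << batch }
  end

  # Non-blocking pop; returns nil once the queue is drained.
  def take_batch
    @queue.pop(true)
  rescue ThreadError
    nil
  end
end

instruments = (1..1_000).to_a   # stand-ins for real contract objects
DRb.start_service('druby://localhost:0', PricingQueue.new(instruments, 100))

# A worker (normally a separate process on another machine) connects
# to the master's URI and pulls work until the queue is empty.
worker = DRbObject.new_with_uri(DRb.uri)
priced = 0
while (batch = worker.take_batch)
  priced += batch.size          # stand-in for the actual pricing call
end
DRb.stop_service
puts priced                     # => 1000
```

The nice part is that the worker code doesn’t care whether the master URI points at localhost or another box on the grid.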
Right around this point in time MRI became a huge bottleneck. MRI wasn’t going to handle the over 6 million objects we needed to push through DRb and even if it could get data to the workers on a grid of machines it couldn’t fully utilize all the cores on each machine. A combination of running out of memory and MRI failing to fully utilize all the cores of 64-core servers ground the project to a halt.
JRuby 1.0 had just been released and was starting to gain some traction. A 1.0 project? Surely that couldn’t handle these problems. With nowhere else to go, we took part of the C Ruby API and moved to a C, JNI, Java and JRuby stack. The new stack of tools wasn’t lollipops and gumdrops, but if it worked, then who cares? Not me. I enjoyed the polyglot work, passing Ruby objects into C callbacks and unit testing C from JRuby. Mind-stretching stuff.
Turns out JRuby had no problems managing 8GB of memory and 6 million plus objects being passed around over DRb. Having the JVM do memory management for your Ruby objects isn’t that bad. I didn’t have to care about it anymore and not caring about the JVM is light years ahead of caring about MRI memory management. Yes, there is real value to using the JVM.
Additionally, JRuby fit into the rest of our MRI system because we weren’t having any problems with MRI talking to JRuby over DRb. I ran into a few problems with IO and Socket, but Charles and Ola were available via IRC and the problems were fixed in a matter of days. The availability of the JRuby team is something I haven’t found in any other community. Charles always put my questions before his other tasks and if you know anything about the man, he is busy. I don’t know how many talks he’s done recently, but his twitter messages list so many cities I’m not sure where he lives anymore.
The initial pricing times came in at around an hour and fifteen minutes. Not bad considering the client was ok with 2 days. JRuby FTW!
Now the story could end here and I’d consider the transition to JRuby a success, but the story goes on.
Tweaking the JVM options allowed us to move the time to about 45 minutes once we upgraded to JRuby 1.1.1. I added some of my findings to JRuby’s performance tuning wiki page. When was the last time you heard of someone passing options to MRI’s garbage collector and seeing performance increases?
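For reference, this is the shape of the tuning involved: JRuby forwards anything prefixed with `-J` straight to the underlying JVM, so heap and compiler settings look like the following. The script name and the exact sizes are illustrative; the right values depend entirely on your workload:

```shell
# Give the JVM a large fixed heap and the server compiler; every
# -J flag is handed to the JVM unchanged.
jruby -J-server -J-Xms4g -J-Xmx8g price_instruments.rb

# The same flags can live in the environment instead:
export JRUBY_OPTS="-J-server -J-Xmx8g"
```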
Exporting this much data turned out to be a problem as well. Excel’s 256-column limit wasn’t too happy about my 9,000+ column files, and the standard Ruby spreadsheet gem had trouble handling anything more than 7MB. Fortunately, the Apache POI (Java) project could handle some of these problems, as well as offer features like auto-sizing columns and freeze panes, which no other MRI-compatible gem could provide (yes, there are Ruby POI bindings). I never thought I’d enjoy working with POI/Excel’s API, but JRuby plus the POI libraries had me smiling. Excel with a Ruby feel rocks. Using JRuby to wrap pre-existing Java solutions is a great way to sleep at night.
Next we moved our Rails apps over to JRuby by deploying them as wars in JBoss. The Mongrel-management problem was gone, and JBoss turned out to be much faster anyway, provided you give it enough memory. Nick Sieger has done some great work with Warbler, and the process was a breeze. Unfortunately, with the number of apps we moved over, the DBAs were starting to get upset about the 60+ database connections we used. Rails 2.2 wasn’t around yet, so connection pooling inside of Rails wasn’t an option, but using JNDI inside of JBoss worked perfectly. Using pre-existing Java tools with an adapter written by the JRuby team made my job a lot easier again.
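With the JDBC adapter, handing the connection pool to the container is just a database.yml change. A sketch, assuming a JNDI datasource named `jdbc/pricing` has been configured in JBoss (the datasource name is illustrative):

```yaml
# config/database.yml -- let JBoss own the connection pool via JNDI
production:
  adapter: jdbc
  jndi: java:comp/env/jdbc/pricing
```

The app no longer opens its own connections at all; the DBAs see one container-managed pool per server instead of one connection per mongrel-style process.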
Meanwhile, JRuby was still releasing new versions. Using 1.1.3 moved our pricing time to about 15 minutes. Yes, from 1 hour and 15 minutes to 15. There were some other tweaks we made along the way, but the most significant improvements came from JRuby itself. In its current state, the C/Java/JRuby API is now exposed through Merb (http/json), DRb (druby), Rails (http/xml, http/html) and an Excel plugin, and more opportunities are ahead.
We’re able to upgrade to a new version of JRuby within a day of release. Yes, it is that stable and easy to switch. Yes, 1.1.5 is currently in our production environment. Upgrading to new versions of MRI was usually a nightmare for me so I welcome the stability. JRuby being a jar has some wonderful benefits.
I won’t go into detail about the other libraries we’ve wrapped with JRuby, including JFreeChart and QuickFIX/J. I won’t go into detail about using JRuby with CORBA or RMI and the many other tools that become available to you with the use of JRuby.
Currently, MRI isn’t even installed on our production servers and I don’t see it being installed in the future. Most if not all the ways that the data is available or usable is due to JRuby. JRuby made my job much easier and many of the features I’ve implemented possible. Give it a try.