Faster FreeMarker


If Spring's MVC framework has a failing, it is its lack of opinion. Unlike the new breed of web application frameworks led by Ruby on Rails, Spring refuses to tell you what technologies you should use. You want JSP? That's fine. JSF will work too. Maybe Velocity? No problem. Excel, PDF, even old school print statements.

The default view technology still seems to be JSP, although these days you won't get far without also using the annoyingly verbose JSTL tag libraries. I must admit that the concept of JSP pages has always appealed to me. Flip the coding problem inside out. Focus on the large blocks of static text and throw in a dash of dynamic functionality when needed. Once you're done, the compiler will create the servlet that you would have written.

For some reason, however, I've never been impressed with the results of JSP. The code generated by the compiler seems a little too clunky. The class loader behaves in unexpected ways. And the performance just doesn't seem to be there.

Templating languages like Velocity and FreeMarker, on the other hand, seem like exactly the wrong approach. Too much introspection, too much wrapping, too much dynamism, too much magic. And yet, strangely, it doesn't seem to matter. After putting FreeMarker through its paces, I finally came to believe that I could be happy using it.

Why choose FreeMarker over Velocity? Performance must be key to Velocity, mustn't it? Speed is in its name, after all. I haven't used Velocity enough to really give it a fair chance. FreeMarker's syntax just seemed a little more robust, and a little more powerful. I'm a programmer, after all, and I don't need my view technology to shield me from programming. Separation of concerns is a nice enough ideal, but I'm not willing to sacrifice power or performance to get it.

And so, with a view technology finally chosen, I can start confidently down the path of building the next great web application. Spring's support of FreeMarker isn't quite up to its support for JSP, but it isn't a bad starting point. And the lack of structure means there's less framework to get in my way and keep me from doing what I want to do.

I very quickly, however, discovered what seemed to be a noticeable flaw in the default FreeMarker view. Spring chooses to render the view directly into the servlet's output stream. This is probably the safest thing to do, especially if you want to stream extremely large views to the client, but there are at least a few downsides to this approach. First, since the size of the final output is unknown until rendering finishes, the Content-Length header cannot be sent before the beginning of the output is flushed to the client. Second, since the output is being streamed directly to the servlet's output stream, each fragment of template output potentially must wait on an expensive network call. Most visibly, however, if an error interrupts the generation of content, there's no way to throw away what's already been rendered.

Fortunately, Spring's lack of guidance works in your favor here. A very simple subclass of the FreeMarker view can enable fully buffered rendering of the output. Here's the start of my custom FreeMarker view:

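import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.Map;
import javax.servlet.http.HttpServletResponse;
import freemarker.template.Template;
import freemarker.template.TemplateException;
import org.springframework.web.servlet.view.freemarker.FreeMarkerView;

public final class FreeMarkerViewImpl extends FreeMarkerView {
  private static final int INITIAL_BUFFER_SIZE = 8*1024;

  @Override
  @SuppressWarnings("unchecked")
  protected final void processTemplate(Template template, Map model,
      HttpServletResponse response) throws IOException, TemplateException {
    // Render into memory first so nothing reaches the client if rendering fails.
    ByteArrayOutputStream out = new ByteArrayOutputStream(INITIAL_BUFFER_SIZE);
    // Encode with the response's charset so the bytes match the declared encoding.
    Writer writer = new OutputStreamWriter(out, response.getCharacterEncoding());
    template.process(model, writer);
    writer.flush(); // push any characters still held by the encoder into the buffer
    response.setContentLength(out.size());
    out.writeTo(response.getOutputStream());
  }
}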

Hooking the custom view into the Spring servlet configuration is just as easy. A single line in the definition of the view resolver is all that it takes:

<bean id="freeMarkerViewResolver" class="org.springframework.web.servlet.view.freemarker.FreeMarkerViewResolver">
    <property name="viewClass" value="com.mobileduo.web.mvc.FreeMarkerViewImpl"/>
    ....
</bean>
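
To see why buffering helps with the error case described above, here is a minimal sketch of the same idea stripped of Spring and FreeMarker, using only java.io. The Renderer interface, the class name, and the renderBuffered method are hypothetical stand-ins invented for illustration; they are not part of either library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class BufferedRenderSketch {
    // Hypothetical stand-in for a template: anything that writes text to a Writer.
    interface Renderer {
        void render(Writer out) throws IOException;
    }

    // Renders into memory first; the destination is touched only if rendering succeeds.
    // Returns the number of bytes written, which is the value Content-Length would carry.
    static int renderBuffered(Renderer renderer, OutputStream destination) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream(8 * 1024);
            Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8);
            renderer.render(writer);
            writer.flush(); // push any characters still held by the encoder into the buffer
            buffer.writeTo(destination);
            return buffer.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Successful render: the whole page arrives at once and its length is known up front.
        ByteArrayOutputStream client = new ByteArrayOutputStream();
        int length = renderBuffered(w -> w.write("<p>hello</p>"), client);
        System.out.println(length + " bytes sent");

        // Failed render: the partial output stays in the discarded buffer.
        ByteArrayOutputStream failedClient = new ByteArrayOutputStream();
        try {
            renderBuffered(w -> {
                w.write("<p>partial");
                throw new IOException("template error");
            }, failedClient);
        } catch (UncheckedIOException e) {
            System.out.println("after failure, bytes sent: " + failedClient.size());
        }
    }
}
```

The trade-off is the one noted earlier: the whole page has to fit in memory, so this is a poor fit for extremely large views.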

SPAM!


I woke up this morning to find that about 150 spam comments, all variations of about half a dozen different comments, were posted to a single entry on this blog. Obviously my blog's spam filter needs a little bit of tuning, but this is ridiculous. What's the best way of dealing with unwanted spam these days?

Eventful Times


I spent some time this weekend working through some projects in Eric Meyer's "Eric Meyer on CSS" to brush up on some CSS techniques, or perhaps more accurately to actually learn how to use style sheets effectively in a Web page.

I found this line particularly amusing:

Although they aren't common, it's surprising just how useful events calendars can be on the Web. A personal site might use one to indicate when a web log was updated or to show important dates in history. Even more interesting, an organization or community could use a calendar to publicize upcoming events.

Although they aren't common? Ah, yes, times seem to have changed a bit since 2002.

FastCGI Is Dead, Long Live...?


Recently I've been experimenting with writing an Apache module, mostly to get a deeper understanding of how the server works. I've been intrigued by the concept of the Apache Portable Runtime (APR) for a while, and with the recent release of Apache 2.2 and some of its more interesting features (like the Event MPM and the mod_dbd database manager), now seemed as good a time as any for a little exploration.

Writing modules in C can be hard, not so much because C is an intrinsically more difficult language to use but rather because there aren't as many cohesive, readily accessible libraries and frameworks available for modern Web 2.0-ish development. That means if I'm ever to have any hope of getting real applications written, I'll need to fall back on my trusty Ruby on Rails.

Running Rails means FastCGI. And so I download the FastCGI developer kit and install it on my server. No problems so far. Then I download mod_fastcgi to let Apache communicate with my Rails apps and...it won't even compile. It seems that this module hasn't been updated in a number of years. It works well enough with Apache 1.3. You can get it to run with Apache 2.0, although at the office we downgraded back to the older server after experiencing occasional unexplained problems. The support for Apache 2.0, it seems, relied on some compatibility code to mask some of the rather substantial changes in the API. This compatibility layer, however, was removed in the Apache 2.2 release and so the module no longer works at all.

There's an alternative module: mod_fcgid. I grabbed the source and was able to compile the module! After fussing around with build tools for a while (it seems GNU libtool was installed without shared object support) I got the module built and installed. Unfortunately, bringing up the web server was rewarded with an immediate crash. After looking around on the web for a while, I discovered that the solution was to go back two releases of the module and then patch the code to supply a missing header file. Finally I was able to bring the server online, but by that time I was ready to fall asleep, too tired to see if I could actually invoke a FastCGI script.

Is this any kind of indication of the quality of open source software???

I suppose I could switch to using Lighttpd as my web server, which does come with a functional FastCGI interface. But then I wouldn't have access to Apache for my other experiments or for things like installing a Subversion source code repository.

The popular solution to this dilemma seems to be to configure Apache to run as a proxy in front of Lighttpd. But why would I want to do that? In addition to the extra performance overhead and the increased risk of failure that comes with a growing number of moving parts, doesn't this just subject me to all of the weaknesses of both products?

Remote Controls



[Photo: Remote Controls. Originally uploaded by rgcottrell.]



I now have three different remote controls to switch songs on iTunes. Oh, and the keyboard and mouse still work pretty well too.

New Toys


I am sitting rather uncomfortably in front of my 42 inch plasma TV. That's old news, I've had it for over three years now. What's new is the Intel Mac Mini connected to it, with iTunes playing on the home stereo, streaming music wirelessly from my main computer in the other room. I just got the Apple AirPort Express stereo connection kit, not because I have an AirPort Express (I don't) but because it was the most convenient way to acquire that funky mini-phono plug to TOSLINK digital optical cable needed to connect the computer to the stereo in glorious 5.1 surround sound.

I am really surprised that sitting about three feet away from the screen actually delivers a pretty good picture, good enough to write blog posts at least. I still haven't figured out what I'm going to do with this system, as I've got three other Macs around the house that are much more convenient to work with. I have some vague ideas about experimenting with home multimedia and theater software, but I'm not sure exactly how that will be realized.

My tentative plan for a weekend project is to see if I can get the FireWire SDK and samples installed and play with the demos to grab video from the cable set top box and save it to disk. That's phase one of my secret plan at least....

Deer Park


I've been relatively unhappy with the performance of Firefox on my new Intel iMac. The Rosetta technology is a great tool for moving PowerPC apps over to the new platform, but for something as big and bloated as a web browser, well there's really no choice but to go native.

I was surprised that Firefox has not yet come out with a universal binary version of the browser. At the Apple WWDC developer conference last year one of the sessions on porting apps to the new platform featured a live demonstration of the very minor changes needed to get Firefox to build.

I just discovered that there is a test version of the browser that is built as a universal binary. OK, so it's still alpha software but I need the extra speed. Besides, this isn't anything new for me. I first started using the Netscape Navigator browser at some sickeningly low version, 0.89 I believe, back in the day when you could get almost daily releases of the app.

So this post is now being written with the "Deer Park" browser. Wow, this thing is fast. I can't believe how fast pages are loading now. Even our top secret LOB site is rendering pages at breathtaking speed. I guess this shouldn't be too much of a surprise. After all, the old PowerBooks that we all have at work are aging machines.

So Tired, or Late Night Database Hacking


I was up until 2:00 am last night working on the database. I wanted to wait until the regular 1:00 am full backup completed so there would be something to go back to if I really screwed something up badly.

Using my training in the scientific method that I gained as a physics student at one of the top research universities in the country, I decided that my strategy would be to blindly poke at things and see if anything happened. And I would have to poke hard. If a little is a good thing then surely a lot must be even better. I made some adjustments to three or four settings and restarted the database. Things seemed to be working better than ever. Of course, it's hard to judge because the site always performs well under the light late-night traffic.

I went to sleep confident that I had accomplished something good. The sound of the pager at 4 am suggested otherwise. With a strange calmness I was able to form a bit of a theory as to why the database was freaking out and back out those changes. The clues were there earlier but at that time I wasn't aware of what they meant.

I was perhaps a little more conservative than necessary, but I was able to leave in a change that I think will actually make the biggest difference. And with that, I've consumed pretty much all of the collective knowledge I could glean off the internet for tuning MySQL databases.

The only thing left that I can think would make a difference is to sacrifice some up-to-the-second synchronization guarantees for reduced disk access. Dr. Brain may disagree, but I'm not convinced that an fsync() after each and every transaction is really necessary. But even this will become less of an issue when our new database server with its faster striped disks comes online.
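
The fsync() trade-off can be felt outside the database with a small sketch. This is not MySQL code; it is a hypothetical Java toy (the class name, the writeRecords method, and the "COMMIT" record are all invented here) that appends records to a temp file and forces them to disk either after every record or once per batch. Timings will vary a lot by disk and operating system, so treat the numbers as indicative at best.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SyncPolicySketch {
    // Appends `records` lines to a fresh temp file and returns the elapsed nanoseconds.
    // syncEvery = 1 forces the file to disk after every record; a larger value
    // forces it once per batch, plus once at the end.
    static long writeRecords(int records, int syncEvery) {
        try {
            Path file = Files.createTempFile("sync-sketch", ".log");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                byte[] line = "COMMIT\n".getBytes(StandardCharsets.US_ASCII);
                long start = System.nanoTime();
                for (int i = 1; i <= records; i++) {
                    ByteBuffer buf = ByteBuffer.wrap(line);
                    while (buf.hasRemaining()) {
                        channel.write(buf);
                    }
                    if (i % syncEvery == 0) {
                        channel.force(true); // roughly what fsync() does for this file
                    }
                }
                channel.force(true);
                return System.nanoTime() - start;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long perRecord = writeRecords(200, 1);
        long perBatch = writeRecords(200, 200);
        System.out.printf("sync per record: %d ms, sync per batch: %d ms%n",
                perRecord / 1_000_000, perBatch / 1_000_000);
    }
}
```

The catch is the one hinted at above: with batched syncs, a crash can lose the records written since the last sync. If the tables are InnoDB, the innodb_flush_log_at_trx_commit setting is the knob that controls this same trade-off.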

With things working once again, I headed back to bed, only to wake up too soon to the ringing of the alarm clock. I pressed snooze for as long as I could, but I needed to be awake and ready to meet Andrej for breakfast this morning. And so I'm not sure whether I will have enough energy to carry me through the new year or whether I will just crash sometime later this evening.

Memcache Mysteries


A little more digging into the memcache code revealed some interesting details. It looks like the root of the problem lies in the socket options used by the server. To get the maximum network performance the server tries to disable the default Nagle packet buffering algorithm. On systems that support the TCP_NOPUSH socket option, the server will bracket network writes within a no push section and then let the operating system send back the result as soon as all the data has been written. If the system doesn't support TCP_NOPUSH, memcache will instead fall back to TCP_NODELAY.

It looks like FreeBSD supports the TCP_NOPUSH option but it doesn't seem to work exactly the way you would want it to. Reading up on the newsgroups, it looks like there have been some proposed kernel patches to bring FreeBSD's handling more in line with what is found on Linux. I didn't really want to mess with the kernel, so I simply recompiled the memcache server to use TCP_NODELAY.
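
The memcache server is written in C, so the following is only an analogy: a minimal Java sketch of what turning on TCP_NODELAY looks like from application code. The class and method names are hypothetical. It opens a loopback connection, disables Nagle's algorithm on the client socket with the standard Socket.setTcpNoDelay call, and reads the option back.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class NoDelaySketch {
    // Opens a loopback connection, disables Nagle's algorithm on the client side,
    // and reports whether the option is now set on the socket.
    static boolean connectWithNoDelay() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setTcpNoDelay(true); // small writes are sent immediately instead of being coalesced
            return client.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY enabled: " + connectWithNoDelay());
    }
}
```

With TCP_NODELAY the application gives up the kernel's batching, so it pays to assemble each response in one buffer and write it in a single call rather than in many small pieces.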

Initial testing looks good. The 100 millisecond response is now processed by the Ruby client in just over one millisecond. This is definitely much better than 100 milliseconds. I'll let the new server run on our staging machines for a while before trying to push it out to the live site.

Ruby & Memcache


My natural distrust of the Ruby programming language might have caused me to miss something important. On 43 Things we rely heavily on memcache to offload database reads. After a bit of work simplifying and tuning the networking code, we were able to get very good response times for data lookups. Occasionally, however, the timings drifted from their usual submillisecond range to close to 100 milliseconds. I just assumed that either Ruby networking code occasionally freaked out or that we were caching complicated data structures that could take a while to parse.

Recently I've been working on some alternative algorithms to help solve some of the performance problems we've been seeing on a few of the more intensive pages. These solutions make extensive use of caching. While a single 100 ms lookup on a page might slip through without much notice, a handful of them will easily kill page serve times.

I dug into the code a little deeper and added some additional logging statements around cache access. Data marshalling costs of even the most complicated structures could account for no more than about 2 ms of the 100 ms. Something was wrong. Then I noticed that while most of the entries we read and write are fairly small--often only several hundred bytes--one entry that seemed to be performing consistently poorly was a larger 22K. Now 22K isn't that big, but it was a clue.
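
The real logging was added to the Ruby client, but the idea of timing each phase of a cache read separately can be sketched in a few lines of Java. Everything here is hypothetical: the Timed holder, the time helper, and the fake "fetch" and "unmarshal" steps stand in for the network read and the data marshalling.

```java
import java.util.function.Supplier;

class CacheTimingSketch {
    // Pairs a result with how long it took to produce.
    static final class Timed<T> {
        final T value;
        final long nanos;

        Timed(T value, long nanos) {
            this.value = value;
            this.nanos = nanos;
        }
    }

    // Runs one step and records its wall-clock duration.
    static <T> Timed<T> time(Supplier<T> step) {
        long start = System.nanoTime();
        T value = step.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Stand-ins: a "fetch" that returns raw bytes and an "unmarshal" that decodes them.
        Timed<byte[]> fetched = time(() -> new byte[22 * 1024]);
        Timed<Integer> decoded = time(() -> fetched.value.length);
        System.out.printf("fetch: %d us, unmarshal: %d us%n",
                fetched.nanos / 1000, decoded.nanos / 1000);
    }
}
```

Timing the phases separately is what makes it possible to say that unmarshalling explains only a small slice of a slow lookup and that the rest must come from somewhere else.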

Last night I had downloaded the C language libmemcache client to think about whether it might make sense to ditch Ruby for a few resource intensive computations. The unit tests in the package include a benchmark app that can repeatedly make requests of a certain size to the cache server. With some trial and error I found that reading entries sized 14,304 bytes completed in about 130 microseconds, while reading entries 14,305 bytes or larger required 100 milliseconds. This is pure, untainted, wonderful C code so there's no way I can blame Ruby for these strange results.
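
The trial and error above could also be done as a bisection. This is a swapped-in technique, not what the libmemcache benchmark does, and it assumes the behaviour flips exactly once between a known fast size and a known slow size. In the sketch below the class and method names are hypothetical, and the isFast predicate is a simulation standing in for "time a real read of this size and compare it against a cutoff".

```java
import java.util.function.IntPredicate;

class ThresholdSearchSketch {
    // Returns the largest size in [low, high] for which isFast holds, assuming
    // isFast(low) is true, isFast(high) is false, and the result flips exactly once.
    static int largestFastSize(int low, int high, IntPredicate isFast) {
        while (high - low > 1) {
            int mid = low + (high - low) / 2;
            if (isFast.test(mid)) {
                low = mid;  // still fast: the boundary is above mid
            } else {
                high = mid; // already slow: the boundary is at or below mid
            }
        }
        return low;
    }

    public static void main(String[] args) {
        // Simulated server that turns slow above the size observed in the benchmark.
        IntPredicate simulated = size -> size <= 14_304;
        System.out.println("largest fast size: " + largestFastSize(1, 32 * 1024, simulated));
    }
}
```

Against a real server each probe would need several repetitions, since a single noisy timing can send the search down the wrong half.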

Something strange is afoot at the memcache server....