Friday, April 27, 2007

Liveness Rule #2: Asynchronize Everything for Writes

And by everything, I mean at least as much as you possibly can. Writes are slow for all the same reasons brought up in the caching post, and then some: on top of the usual database overhead you can add table or row locking (depending on how the database handles it) and disk write speeds to the list.

Caching works wonders on the Read side, and because it takes the Read load off the database, it actually does a lot to improve Write performance too. But what else can we do to increase liveness for operations that might take more than a fraction of a second? The answer: go asynchronous.

Good examples of things that should be asynced:

  • Sending mail notifications. This usually takes a significant amount of time, so you should always async it.
  • Database updates that the user doesn't need to see a response for: logging, counters (how many times has the user viewed item X), and so on.

How Do I Implement Asynchronous Ops?

If you're using Java, the java.util.concurrent package is your new best friend. It has everything you need to asynchronize things in a simple and elegant way. The classes you will be interested in are Executor, ExecutorService, and Executors (static factory methods). An Executor is essentially just a queue of tasks with a thread pool dedicated to executing those tasks.

Here's a quick getting started guide to asynchronicity heaven:

  1. Create the ExecutorService:
    ExecutorService executor = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
    Keep this available as a global variable; you can use this same Executor throughout your entire application if you want.
  2. Now pull out any code you want to execute asynchronously into small tasks that implement Runnable. For example, an emailer task might look like this:
    public class EmailTask implements Runnable {
        private Email email; // the email we want to send.

        public EmailTask(Email email) {
            this.email = email;
        }

        public void run() {
            // SEND MAIL HERE
        }
    }
  3. Now when handling the event action, create a new instance of your Task class and pass it to the ExecutorService's execute method:
    EmailTask emailTask = new EmailTask(email);
    executor.execute(emailTask);
  4. Done! That task will execute at some point in the future and you can respond to your user immediately without making him/her wait.
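Pulling the four steps together, here's a minimal compilable sketch. The Email class and the "delivery" code are stand-ins for whatever your real mailer uses; the sent list is only there so you can observe that the task actually ran in the background:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncMailer {
    private static final int THREAD_POOL_SIZE = 4;

    // Step 1: one executor shared by the whole application.
    static final ExecutorService executor =
            Executors.newFixedThreadPool(THREAD_POOL_SIZE);

    // Records delivered addresses so we can see that tasks ran.
    static final List<String> sent =
            Collections.synchronizedList(new ArrayList<String>());

    // Stand-in for a real Email object.
    static class Email {
        final String to;
        Email(String to) { this.to = to; }
    }

    // Step 2: all the slow work lives in run().
    static class EmailTask implements Runnable {
        private final Email email;
        EmailTask(Email email) { this.email = email; }
        public void run() {
            // A real implementation would talk to an SMTP server here.
            sent.add(email.to);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Step 3: hand the task to the executor and return immediately.
        executor.execute(new EmailTask(new Email("user@example.com")));

        // Only needed at application shutdown, not per request.
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("delivered: " + sent);
    }
}
```

Note the shutdown() call at the end: in a web app you would hook that into your application's shutdown lifecycle rather than calling it after every task.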

Another huge benefit of asynchronizing things is that your application automatically becomes more scalable and doesn't degrade as traffic goes up. Let's say you did the above example synchronously and it took 2 seconds on average to send an email. That may be fine when you have a small number of users (although 2 seconds is way too long to make someone wait), but what happens when you have 1000 users? That's a total of 33.3 minutes (2000 seconds) of waiting time for just 1000 users. Now say your app server has a maximum of 200 processing threads (the Tomcat default). It won't take much of a traffic increase to run out of processing threads, and then people are waiting for the application to respond at all. Pretty soon your app is on its knees, dying a slow painful death. You can avoid all of these problems by taking a few extra seconds to implement it like I've shown above.

Learn it, live it, love it. Once you get into the habit of doing things like this, there's no turning back, and you can rest easy knowing that you've future-proofed your application.

Liveness Rule #1: Cache Everything for Reads

Caching is really the holy grail of liveness. Consider the difference between hitting your database for every request vs. pulling that data directly out of memory. Here is a small comparison of performance-related properties, with pros marked "+" and cons marked "-".

Memory:

  + Near-instantaneous access to data
  + Concurrent contention is virtually nil
  - Limited amount of memory

Database hit:

  - Disk seek speed
  - Concurrent contention as threads wait for disk access
  - IO speed
  - The database query speed itself
  - Connection setup/teardown
  - Marshalling and unmarshalling
  + Large amount of disk space available

These are all pretty obvious points, but they highlight how much more work gets done when you don't cache. All of that is taxing on the entire system, far more so than pulling data from memory, which also means a cached application is far more scalable.

Our experience:
Traffic was increasing rapidly and response times were getting worse and worse. Not to the point where it was the end of the world, but bad enough that we had to do something before it got even worse (over a second is bad). Now that we cache almost everything, response times are down to a few milliseconds, right up there with Google search responses.

Caching is actually pretty simple to implement. No matter what solution you use, the interface is always something like java.util.Map: you put, get, and remove elements.
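A minimal sketch of that idea, using a ConcurrentHashMap as the store (the loadFromDatabase method here is a hypothetical stand-in for your real, slow query):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UserCache {
    // The cache itself: just a thread-safe Map, as described above.
    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    // Stand-in for the real (slow) database lookup.
    private String loadFromDatabase(Long userId) {
        return "user-" + userId;
    }

    public String get(Long userId) {
        // Hit: return straight from memory. Miss: load once and remember.
        return cache.computeIfAbsent(userId, this::loadFromDatabase);
    }

    public void invalidate(Long userId) {
        cache.remove(userId); // call this whenever the user's data changes
    }
}
```

Every subsequent get for the same key is served from memory until you invalidate it; real caches add eviction on top of this, but the put/get/remove shape stays the same.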

Thursday, April 26, 2007

Liveness In Web Applications

Liveness: A concurrent application's ability to execute in a timely manner is known as its liveness.

In other words, liveness is making your application respond extremely fast to your users' requests. Having had to deal with this over the past few months on rel8r, I thought I'd share my experiences.

  1. Caching
  2. Asynchronizing