A couple of days ago, I moved another application to Amazon Web Services. This is the third application I've set up on their infrastructure and I have to say, I think I'm addicted to the throwaway nature of Amazon Elastic Compute Cloud. I've got the whole long term, persistent storage thing nailed and can do quick recoveries so I am no longer worried about servers crashing... and boy that's a good feeling.
On our previous dedicated servers, there was always a worry about having a server meltdown because there was no quick way to get a new one up and running. It could take a day or more. Now I just launch new servers for fun. ;)
The big lessons learned:
- Automate absolutely everything. So when something goes wrong, you run a script or two to get a new instance running. And practice, practice, practice doing emergency recoveries (I've had to recover from crashed instances several times now so I learnt this lesson the hard way).
- Scale early. As soon as you notice some performance or memory limitations, start adding more instances to spread out the load. Need performance? Put your database on a separate server from the app. Need redundancy? Replicate your database on two servers. Need memory? Fire up some more instances just for caching (memcached, ehcache, jbosscache).
- Keep your database as small as possible. Store as much data as possible on Amazon Simple Storage Service (S3), NOT in your database, and just store the S3 key to the data in your database. Consider putting any blob or large text fields in S3. This will make it much easier and faster to manage database backups, plus your database will perform better.

19 comments:
Not an EC2 user (yet!). Are there some resources you'd recommend to get inundated with the knowledge necessary to become effective with the AWS systems? Thanks.
Yes Yes and Yes.
Caching is a big deal. The key to scaling EC2 apps is figuring out how to cache as much as possible, and how to chop up and distribute anything CPU intensive.
The standard approach of depending on database query caching and master-slave database relationships doesn't scale well in a "horizontal" environment like EC2. The more you can do to avoid real-time database dependencies, the happier you'll be!
Similarly, you may have to stripe your data across different servers. Think of it like RAID 0, as applied to servers instead of disks.
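One way to picture striping data across servers is hash-based routing: each record key is hashed to pick the server that holds it. The following is only a minimal sketch under that assumption; the class name ShardRouter and the server names are hypothetical, and nothing here is taken from the commenter's actual setup.

```java
import java.util.List;

// Hypothetical helper: picks a server for a record key by hashing the key.
final class ShardRouter {
    private final List<String> servers;

    ShardRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("need at least one server");
        }
        this.servers = List.copyOf(servers);
    }

    // The same key always maps to the same server, so reads find what writes stored.
    String serverFor(String key) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        // Server names are made up for illustration.
        ShardRouter router = new ShardRouter(List.of("db-0", "db-1", "db-2"));
        System.out.println(router.serverFor("user:42"));
    }
}
```

Plain modulo routing like this remaps most keys whenever a server is added or removed, so it suits a fixed set of servers better than a fleet that changes size often.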
It's a different way of thinking about things that doesn't fit how most web application frameworks interact with databases. So, if you're going to deploy on EC2 and you're expecting a lot of traffic ... a little planning up front will save you a LOT of headaches down the road.
Good article!
James D Kirk, the Getting Started docs in the Resources section will get you up and running, but from then on, it's a lot of trial and error since everyone has different needs and requirements.
Have you had any issues with latency either between EC2 instances (particularly database and application assuming they're on different machines) or between EC2 and S3?
Do you always get the performance you hope for storing data in S3 and accessing it from an EC2 node?
Thanks.
James, I should have also mentioned the forums: http://developer.amazonwebservices.com/connect/forum.jspa?forumID=30 , they are your best friend. The Amazon staff are usually very quick to respond on the forums as well.
Matt,
The network between instances and S3 is very fast. I just pushed a ~2.2 GB file to S3 and it took 3 minutes, 27 seconds. So that's ~10 MB/sec.
That being said, if it's data that is accessed and modified a lot, it's kept in a database on an EC2 instance. And I do a lot of caching to reduce the hits on the db and S3.
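The caching mentioned above could look something like a small read-through cache in front of the database or S3. This is a sketch only: ReadThroughCache is a hypothetical class, the loader function stands in for whatever actually fetches the data, and a multi-instance setup would more likely use a shared cache such as memcached.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Small read-through LRU cache; the loader stands in for a database or S3 fetch.
final class ReadThroughCache<K, V> {
    private final int capacity;
    private final Function<K, V> loader;
    private final Map<K, V> entries;

    ReadThroughCache(int capacity, Function<K, V> loader) {
        this.capacity = capacity;
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the cache is over capacity.
                return size() > ReadThroughCache.this.capacity;
            }
        };
    }

    // Returns the cached value, calling the loader only on a miss.
    // A null result from the loader is not remembered and will be loaded again.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }
}
```

A caller would wrap the slow lookup once, for example `new ReadThroughCache<>(1000, key -> fetchFromStore(key))`, where fetchFromStore is a placeholder for the real database or S3 call.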
Thanks for the tips, Travis. Soooo much to learn, so much!
"Keep your database as small as possible. Store as much data as possible on Amazon Simple Storage System (S3), NOT in your database and just store the S3 key to the data in your database. Consider putting any blob or large text fields in S3. This will make it much easier and faster to manage database backups, plus your database will perform better."
How exactly does that work?
Rahsun,
As an example off the top of my head, let's use a webmail scenario with the following table:
MailMessage
- bigint id
- from varchar (255)
- to varchar (255)
- sentDate datetime
- subject varchar (255)
- body text
The body column will generally take up most of the space in this table, and probably a huge percentage of your entire database as well. If you have a lot of users, this may soon get out of control. Let's say the average message is 1kB and you have 100 million messages: you now have a 100GB+ database to deal with.
So instead of storing the "body" contents in the database, change the body field to a key to the contents in S3:
- body_key varchar (64)
Or better yet, just use the MailMessage id as the key for S3; then you can drop the body column altogether.
Now in your application when the MailMessage is requested, you get the "body" content from S3 on demand. A helper method can make this easy:
String body = MyS3Util.get(body_key);
So now we've reduced the size of the database big time, 100GB going by the numbers above. Not only that, but the growth rate (and therefore the stress rate) has also been substantially reduced.
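To make the idea concrete, here is a rough sketch of what sits behind a helper like MyS3Util. Everything in it is an assumption for illustration: BlobStore, InMemoryBlobStore, MailBodies and the "mail/" key prefix are hypothetical names, and the in-memory map only stands in for S3 so the example runs without an AWS account. A real version would replace InMemoryBlobStore with calls to an S3 client.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction; a real implementation would call S3.
interface BlobStore {
    void put(String key, byte[] data);
    byte[] get(String key);
}

// In-memory stand-in for S3 so the sketch runs without AWS credentials.
final class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    public void put(String key, byte[] data) {
        objects.put(key, data.clone());
    }

    public byte[] get(String key) {
        byte[] data = objects.get(key);
        if (data == null) {
            throw new IllegalArgumentException("no object for key " + key);
        }
        return data.clone();
    }
}

// The database row without the body column; the id doubles as the storage key.
final class MailMessage {
    final long id;
    final String subject;

    MailMessage(long id, String subject) {
        this.id = id;
        this.subject = subject;
    }

    // Key prefix is an arbitrary choice for this sketch.
    String bodyKey() {
        return "mail/" + id;
    }
}

// Saves and loads message bodies outside the database, on demand.
final class MailBodies {
    private final BlobStore store;

    MailBodies(BlobStore store) {
        this.store = store;
    }

    void save(MailMessage message, String body) {
        store.put(message.bodyKey(), body.getBytes(StandardCharsets.UTF_8));
    }

    String load(MailMessage message) {
        return new String(store.get(message.bodyKey()), StandardCharsets.UTF_8);
    }
}
```

Keeping the storage behind a small interface also means the body lookup can be tested locally and wrapped in a cache later without touching the database schema again.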
There are other ways of dealing with this such as partitioning, etc, but the S3 option is very simple and very scalable.
Try to find the 10% of your tables that take up 90% of your space and see if this approach can be applied. I'd love to hear your stories.
"I've got the whole long term, persistent storage thing nailed and can do quick recoveries so I am no longer worried about servers crashing... and boy that's a good feeling"....
Is it possible to share how you 'nailed' it...the specifics? Many of us would like to know.
travis,
That's very interesting. I'll investigate that approach further. It sounds really cool - IF - I can get that to work.
Are you using something like MySQL S3 Storage Engine to do this? Or are you creating your own custom method to accomplish this?
I am running a Rails/MySQL application and I wanted to see if I could run the DB on a separate server with everything else on EC2. In your point #2, you said "Need performance, put your database on a separate server from app." - did you mean on a separate server within EC2 or a separate server altogether?
raghus, I meant on a separate EC2 server. I would NOT recommend hosting the database with a different provider, the latency will kill your performance.
rahsun, I am not using MySQL S3 Storage Engine. I am doing regular backups of the database (check out MySQL binlogging to make this efficient, other db's should have it too).
Truthfully, I don't think S3 Storage Engine will ever fly; there is just too much performance loss using HTTP for every database hit.
Travis,
Not bandwidth, latency. The profile of web app access to databases is lots of connections with small payloads. Moving a large file around EC2 or from S3 to EC2 doesn't really simulate access speed.
What I'm especially interested in is whether the latency between EC2 nodes is such that database clustering (where there has to be some synchronization between nodes) won't work.
Thanks.
Have you had a chance to look at RightScale, www.rightscale.com? Their interface will help you address most of these items.
Great advice, I'm considering using Prevayler (Java in-memory persistence) mapped to S3... although I must confess, I'm still stuck on the task of just getting an instance with a JDK on it.
I tried mget to sun, but that's not workable... I'm guessing I'm going to have to upload the JDK rpm to my S3, and get it from there.
How did you do it?
taylor: If you get Prevayler working, I'd like to hear about it. An interesting idea, but I'd imagine that you'd have to be pretty sure that your database would never get big.
As for getting java on an instance, check out this step by step on how to get the JDK onto your server.
Hi - I'm an EC2 user and I have two Amazon EC2 instances to call Facebook's API. The problem I have is with the network latency between apps.facebook.com and EC2, it's about 78ms per ping. Have you experienced latency problems too? - Mark