Running a website on Amazon EC2

Written by Diginmotion
Thursday, 11 November 2010 05:44

About Amazon Web Services

I won't recount the history and philosophy of Amazon's Web Services (for that, I recommend
visiting their site, their blog, or the RightScale blog), but I'd like to at least introduce them
and describe why I think they are hugely important. Many web developers are familiar with
Amazon's Simple Storage Service (S3), which lets you store gigabytes of data reliably in a
globe-spanning system and serve that data to users on the cheap. Less well known is the
Elastic Compute Cloud (EC2).

EC2 is a service whereby you can create virtual machines and run them on one of Amazon's
data centers. These virtual machines take a slice of that center and simulate the hardware of a
physical server. You can choose from a variety of machine configurations that have different
processing powers, memory configurations, and virtual hard drive sizes. Once you have picked
the class of your machine, you start it by loading an Amazon Machine Image (AMI). You can
pick from one of the many public AMIs, most of them based on Linux, or any custom ones you
may have created.

The greatest advantage of EC2 is the power it provides you with. The classes of hardware you
can choose from start at a pretty powerful level and only increase from there. Your EC2 virtual
machine sits on Amazon's data pipe, and thus has tremendous bandwidth available to it. You
can create as many machines as you want and start or stop them on demand. You are only
billed for the computing time and bandwidth you use.

Until recently, EC2 was limited in what you could use it for, due to a lack of persistent
storage. You could create a custom virtual machine image that ran on EC2, but that machine
ran as if it were in RAM: all data or changes made since the image started were lost when it
terminated. Termination can be caused by a software shutdown or by a failure of some sort at
the data center. While these failures are infrequent, the possibility of data loss on EC2 made it
impractical to host websites on the service. More common applications included large-scale
image processing, searches of massive DNA sequence databases, and other computationally
intensive tasks that only needed to be fed initial data and to produce a result of some kind.

That changed on August 20, 2008, with the release of Amazon's Elastic Block Store (EBS). With
EBS, you can create persistent virtual drives that look like network-attached storage and that
will exist after your EC2 instances have terminated. This enables the operation of EC2-based
websites where the core configuration of the server is set up ahead of time and saved as a
custom AMI, and the data that changes (logs, databases, HTML files, uploads) is all stored on
the persistent drive.

EBS volumes are on RAID-equivalent hardware within a data center, but you will want to create
even more reliable backups for critical data. Another new feature introduced with EBS is
snapshots, where the EBS volumes can be set up to do automated incremental backups to an
S3 bucket. Not only does this provide very dependable storage for the site's data, but the
time-based snapshots also give you the ability to roll back your site to specific points in time.
This is very similar to Mac OS X Leopard's Time Machine backup feature and is a unique and
useful capability.

One thing I haven't mentioned is cost. I did say that you will only be billed for what you use.
For an entry-level virtual machine (also called an instance), you are billed $0.10 for every hour
it is operational. For a web server that is on all the time, that comes to $72 per 30-day month.
Bandwidth is billed at $0.17 per GB downloaded, with volume discounts available if you pass
10 TB per month. An EBS store will run you $0.10 per GB per month. Amazon has a site
where you can estimate the cost of running an EC2 instance. Basically, it's a flat rate of $72
per month, with a small amount on top that scales very nicely with load. This is a little more
costly than some virtual private server services, such as Media Temple's (dv) Base at $50 per
month, but the fact that you get access to something as powerful as Amazon's data centers
and network connections might make up the difference and more.
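
To make the arithmetic concrete, here is a rough sketch of a monthly estimate using only the rates quoted above. The function name, the 30-day month, and the sample figures (10 GB downloaded, a 1 GB volume) are my own assumptions for illustration; volume discounts and any other charges are ignored.

```shell
#!/bin/sh
# Rough monthly cost estimate, done in integer cents to avoid rounding surprises.
# Rates are the ones quoted in the text; the 30-day month is an assumption.
estimate_monthly_cost() {
    bandwidth_gb=$1   # GB downloaded per month
    ebs_gb=$2         # GB of EBS storage provisioned
    instance_cents=$((24 * 30 * 10))        # $0.10 per hour, running all month
    bandwidth_cents=$((bandwidth_gb * 17))  # $0.17 per GB downloaded
    ebs_cents=$((ebs_gb * 10))              # $0.10 per GB per month
    total=$((instance_cents + bandwidth_cents + ebs_cents))
    printf '%d.%02d\n' $((total / 100)) $((total % 100))
}

# Hypothetical small site: 10 GB of downloads and a 1 GB volume.
estimate_monthly_cost 10 1
```

For that hypothetical site the instance hours dominate, which is why I describe the cost as essentially a flat rate.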

Setting up your AWS account and tools

If you are still interested in the service, you'll first need to set up an account with Amazon. If
you already have a normal shopping account with Amazon, you can simply add the charges for
the web services to that. Once you have an Amazon account, go to the AWS site and sign up
for the web services. After you've signed up, go to the AWS Access Identifiers page and write
down your Access Key ID and Secret Access Key. While you're on that page, create a new
X.509 certificate by following their instructions. Download that certificate and your private key.
You'll need all these identifiers later.

After the initial signup, you'll need to go to the EC2 page and click on "Sign Up For This Web
Service". This will allow you to use EC2 and it will also sign you up for S3. Don't worry, you
won't be charged anything until you actually start using the services.

Next, you'll need the best currently available graphical interface to EC2, Elasticfox. Elasticfox
is a Firefox plugin that gives you control over all aspects of EC2. In Firefox, visit the link above
and install the plugin. To open Elasticfox within Firefox, select the Tools | Elasticfox menu
item.

Before you can use Elasticfox, you'll need to configure your Amazon Web Services credentials.
Click on the Credentials button in the upper left of the screen, which should bring up a dialog
where you can enter your Access Key and Secret Access Key and give your account a name.
Enter those, click Add and then click Close.

The next step is to configure a key pair for use when starting up instances. This public / private
key pair will allow you to log in as root to a new instance generated off of a public machine
image without the use of a password. Click on the KeyPairs tab and then on the green
"Create a new key pair" button. Give this new key pair a name and choose a location to save
your private key. Click on the Tools icon in the upper right to set the location of the key on your
local file system. Change the SSH Key Template field to be something like

${home}/Documents/Website/EC2/${keyname}.pem

changing the middle part of the path to reflect where you actually placed the key file.

That should take care of setting up your tools. Now it's time to configure an instance.

Setting up a security group

The first step is to set up a security group. A security group acts like a configuration file for a
firewall. It lets you set which ports are open to the world and which are closed.

Click on the Security Groups tab, then on the green plus button to add a new group. Give it a
name and description. Select that new group (as this is a browser plugin, you may need to hit
the blue refresh icon to see the new group in the list).

By default, all ports are closed, so we'll want to open up the ones we'll use. In the "Group
Permissions" section of the window, click on the green checkmark to bring up a dialog asking
which ports you'd like to open. Choose the TCP/IP protocol, set a port range of 22 to 22, and
click Add. This will open up port 22, necessary if you want to have SSH access to your
instances. Repeat the process for port 80 (HTTP) and, if desired, port 443 (HTTPS).

Creating a persistent storage volume

As I said earlier, the addition of persistent storage to EC2 is what makes hosting a website on
the service practical. To create a persistent store, first click on the Volumes and Snapshots
tab, then click on the green plus button. Choose a size for your volume (I went with 1 GB,
because I didn't have a lot of data to store) and an availability zone.

Availability zones are basically Amazon data centers. Currently there are three of them, all on
the east coast of the U.S. Remember what availability zone you picked here, because you'll
need to select that same zone when starting your instance. An instance can only mount a
persistent storage volume existing within the same data center.

Starting and configuring your instance

With all the preparatory work out of the way, it's time to start up a virtual machine and
customize it to meet our needs. Click on the AMIs and Instances tab to be brought to the main
control panel for starting and monitoring your compute instances. First, we'll need to pick a
public image to start from. You can create one from scratch if you want, but that's beyond the
scope of this walkthrough. If you do not see a list of AMIs under the Machine Images section,
click on the blue refresh button and wait a few seconds for the list to load. The image I chose
was one of those contributed by RightScale, a company that provides EC2 services and has
contributed quite a bit back to the community. The specific image is ami-d8a347b1, a 32-bit
image based on CentOS 5. This particular image is pretty stripped down and should be a good
starting point for a no-frills web server.

Find the image in the list using the search function, click to select it, and click on the green
Launch Instance(s) button. A dialog will appear to let you configure the virtual hardware
characteristics of this instance. Choose an instance type of m1.small, the least powerful of all
the hardware types. Select your keypair by name in the dropdown list, and select the same
availability zone as you had chosen for your persistent storage volume. Click on the default
security group on the right side of the dialog and move it to the left, while moving your custom
security group to the right to assign it to this instance. When everything is set, click on Launch.

Your new instance will now boot. This should take a minute or two. You'll need to keep
checking on the status of your instance by clicking on the blue refresh icon under the Your
Instances section of the screen. Once the status changes from Pending to Running, you're
ready to start using the new virtual machine.

The first thing you'll want to do with this machine is to SSH into it. Click to highlight your
instance and click on the Open SSH Connection button. If you properly set the location of your
private key as described above, a terminal window should open and you should be connected
to your virtual machine as the root user. You can safely ignore the warning message you get; it
is due to a slight bug in this particular public image.

Now it's time to start installing the packages you'll need to configure a full Drupal site. Execute
the following commands to install PHP and MySQL:

yum -y install php
yum -y install php-gd
yum -y install php-mbstring
yum -y install mysql
yum -y install mysql-server
yum -y install php-mysql

The XFS filesystem seems to be the preferred choice for the persistent storage, due to its
ability to freeze the filesystem during a snapshot operation, and you'll need the following to be
installed to take advantage of that:

yum -y install kmod-xfs.i686
yum -y install xfsdump.i386
yum -y install xfsprogs.i386
yum -y install dmapi

If you want SSL on your web server, you may want to run the following:

yum -y install mod_ssl

If you're like me, you'll want to create a non-root user to have on hand for day-to-day use. To
create that user and set its password, enter the following:

adduser [username]
passwd [username]

I wanted to do password-based SFTP transfers using that new user because I really like to use
Transmit. To enable password-based authentication for users, execute

nano /etc/ssh/sshd_config

to edit the SSH configuration and change the appropriate line to read

PasswordAuthentication yes

while leaving password authentication disabled for the root user. Restart the SSH daemon
using the following command:

/etc/rc.d/init.d/sshd restart

I also wanted to install eAccelerator to speed up PHP on the server, but the yum install of it
failed with dependency errors. Therefore, to install it, you'll need to download
php-eaccelerator-0.9.5.2-2.el5.remi.i386.rpm and SFTP it over to the server. Once on the
server, install it using

rpm -ivh php-eaccelerator-0.9.5.2-2.el5.remi.i386.rpm

Attaching the persistent storage volume

With the base configuration of the image now how we want it, it's time to attach the persistent
storage volume. Go back to Elasticfox and select the Volumes and Snapshots tab. Select the
volume you created and click on the green checkmark to attach it to your running instance. A
dialog will appear asking for you to select the instance to attach this to (there should only be
the one in the pull-down list) and a device path. Enter a path of /dev/sdh for this volume and
proceed. Your volume should shortly be attached to your instance.

Switch back to the instance to format and mount the volume. Run the following commands to
create an XFS filesystem on the device, and create a mount point for it at /persistent.

mkfs.xfs /dev/sdh
mkdir /persistent

Edit /etc/fstab and insert the following line at the end of that file:

/dev/sdh /persistent xfs defaults 0 0
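
If you would rather script that edit than make it by hand, the following is a minimal sketch of one way to append the line without adding it twice. The function name add_fstab_entry is my own invention, and the check only looks at the device field, so treat it as a starting point rather than a robust fstab editor.

```shell
#!/bin/sh
# Append an entry to an fstab-style file only if its device is not already listed.
add_fstab_entry() {
    fstab_file=$1
    entry=$2
    device=${entry%% *}   # first space-separated field, e.g. /dev/sdh
    if grep -q "^$device[[:space:]]" "$fstab_file"; then
        return 0          # already present, nothing to do
    fi
    printf '%s\n' "$entry" >> "$fstab_file"
}

# Example (as root):
#   add_fstab_entry /etc/fstab "/dev/sdh /persistent xfs defaults 0 0"
```

Running it a second time leaves the file unchanged, which is handy if you rerun a setup script on the same instance.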

Now you can mount the persistent store volume at the path /persistent using the following
command:

mount /persistent

Anything placed in the /persistent directory will survive sudden termination of the running
instance, so you'll want to place log files, databases, the files that define your site architecture,
and anything else you don't want to lose in that directory.

The first step is to move users' home directories. You can do this either by moving specific
users' directories or by moving the /home directory to /persistent, then using the ln -s command
to symbolically link the new location to its old place on the file system.
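
The move-then-link step can be sketched as a small helper. The name relocate_dir and its safety checks are my own additions; it assumes that nobody is logged in under the directory being moved and that the destination does not exist yet.

```shell
#!/bin/sh
# Move a directory onto the persistent volume and leave a symbolic link behind.
relocate_dir() {
    src=$1    # e.g. /home
    dest=$2   # e.g. /persistent/home
    # Refuse if the source is missing or has already been replaced by a link.
    [ -d "$src" ] && [ ! -L "$src" ] || return 1
    # Refuse to overwrite anything at the destination.
    [ ! -e "$dest" ] || return 1
    mv "$src" "$dest" && ln -s "$dest" "$src"
}

# Example (as root): relocate_dir /home /persistent/home
```

The same helper would work for a single user's directory, for example relocate_dir /home/[username] /persistent/[username].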

Next, you will want to move your log files. I chose to move my Apache server logs to the
persistent store. To do this, I created a /persistent/log/http directory to store these logs (and a
/persistent/log/https for the SSL logs). To point Apache's logging facilities to this directory,
you'll need to edit /etc/httpd/conf/httpd.conf and change the following lines:

ErrorLog /persistent/log/http/error_log
CustomLog /persistent/log/http/access_log combined

Finally, the MySQL database should be pointed to the persistent store. To do this, create a
/persistent/mysql directory, make sure it is owned by the mysql user (for example, with
chown mysql:mysql /persistent/mysql), and edit the following line in /etc/my.cnf:

[mysqld]
datadir=/persistent/mysql

Installing Drupal

With the persistent store volume in place, it is time to install Drupal. To do so, go to the
/persistent directory and download and extract the latest version of Drupal (6.4 as of this
writing) using the following commands:

wget http://ftp.drupal.org/files/projects/drupal-6.4.tar.gz
tar -xzf drupal-6.4.tar.gz

This should leave you with a drupal-6.4 directory. Rename this to html and point Apache to it
by editing /etc/httpd/conf/httpd.conf again and changing the following lines:

DocumentRoot "/persistent/html"
<Directory "/persistent/html">

You will need to set up a cron job to perform automated updates of your Drupal site's search
index and other maintenance tasks. To do that, copy the /persistent/html/scripts/cron-lynx.sh to
/etc/cron.daily. Edit that file to replace the sample domain name with your own, save it, and
make it executable using chmod +x.
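
Those three steps (copy, edit, make executable) can be sketched in one go. This assumes the sample script refers to its placeholder domain as example.com, which is worth checking in your copy; the function name and the domain in the example are hypothetical.

```shell
#!/bin/sh
# Copy Drupal's sample cron script into a cron directory, pointing it at your domain.
# Assumption: the sample script uses the placeholder domain "example.com".
install_drupal_cron() {
    sample=$1     # e.g. /persistent/html/scripts/cron-lynx.sh
    cron_dir=$2   # e.g. /etc/cron.daily
    domain=$3     # your site's domain name
    target="$cron_dir/$(basename "$sample")"
    sed "s/example\.com/$domain/g" "$sample" > "$target" && chmod +x "$target"
}

# Example (as root), with a made-up domain:
#   install_drupal_cron /persistent/html/scripts/cron-lynx.sh /etc/cron.daily www.mysite.test
```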

A database will need to be created and installed in MySQL for use with Drupal. Before you can
do that, make sure the MySQL database is running using the following command:

/etc/rc.d/init.d/mysqld restart

For security, it's a good idea to set a password for the root MySQL user using the command

mysqladmin -u root password [password]

Next, create a database for your Drupal installation and a user that can access that database.
From the mysql client (started with mysql -u root -p), you can use the following to create a
database named "drupal" (feel free to change the name) and give the user "drupal" running on
the local computer access:

create database drupal;
grant all privileges on drupal.* to 'drupal'@'localhost' identified by '[password]';

Finally, direct Drupal to use that database by editing /persistent/html/sites/default/settings.php
to change the following line (you may need to make it writeable first, save it, then remove the
writeable bit):

$db_url = 'mysql://drupal:[password]@localhost/drupal';

Once that's all set up, your Drupal installation should be good to go. If you had an existing
Drupal database from another site, you could import from an SQL dump using

mysql -u root -p drupal < drupal.sql

Otherwise, you can set up a fresh Drupal site by loading the installation page. Your site is
currently accessible to the outside world only via a special name that will resolve to its dynamic
IP address within Amazon's data center. To obtain that name, go to Elasticfox and
double-click on your running instance. Within the dialog that appears, copy the Public DNS
Name and paste it into your web browser. Add the path element /install.php and load the
resulting page. The remainder of the setup required for Drupal is beyond the scope of this
guide, but I direct you to my previous post about Drupal, as well as the main Drupal.org site,
for more information.

Tuning MySQL and Apache for Drupal

You can spend a long while tweaking your server to run Drupal in an optimal fashion, but I'd
like to share some of the settings that seem to work for me right now. Most of these were
arrived at through trial-and-error with my simple site, and may not scale to your particular
setup, so take them with a grain of salt.

First, let's look into Apache optimizations. A setting that I've found to help reduce transfer sizes
on your site is to enable Gzip compression of all pages that Apache serves. To do this, you
can add the following to the end of your /etc/httpd/conf/httpd.conf file:

SetOutputFilter DEFLATE
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
Header append Vary User-Agent env=!dont-vary

I also noticed a significant performance improvement caused by changing the following line:

KeepAlive On

For Drupal's Clean URLs to work, the following line needs to be changed under the <Directory
"/persistent/html"> section:

AllowOverride All

Related to Apache tuning are PHP settings. Edit the following lines in /etc/php.ini:

max_execution_time = 60
max_input_time = 120
memory_limit = 128M

To apply these settings, restart the Apache server using

/etc/rc.d/init.d/httpd restart

MySQL tuning is tricky and is site-dependent. I'll just list the current settings in my /etc/my.cnf
file and let you decide if they'll work for your case:

[mysqld_safe]
log-error = /var/log/mysqld.log
pid-file = /var/run/mysqld/mysqld.pid
 
[mysqld]
datadir = /persistent/mysql
socket = /var/lib/mysql/mysql.sock
user = mysql
old_passwords = 1
query_cache_limit = 12M
query_cache_size = 32M
query_cache_type = 1
max_connections = 60
key_buffer_size = 24M
bulk_insert_buffer_size = 24M
max_heap_table_size = 40M
read_buffer_size = 2M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 32M
sort_buffer_size = 2M
table_cache = 1024
thread_cache_size = 64
tmp_table_size = 40M
join_buffer_size = 1M
wait_timeout = 60
connect_timeout = 20
interactive_timeout = 120
thread_concurrency = 4
max_allowed_packet = 50M
thread_stack = 128K
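
As a rough sanity check on numbers like these, a common rule of thumb is to add the global buffers to max_connections times the per-connection buffers. The sketch below applies that rule to the values listed above. It is only an approximation: it ignores in-memory temporary tables and other overhead, and the function name is my own.

```shell
#!/bin/sh
# Rule-of-thumb worst case for MySQL memory use:
#   global buffers + max_connections * per-connection buffers
# All values are in KB and copied from the my.cnf listing in the text.
key_buffer=$((24 * 1024))
query_cache=$((32 * 1024))
read_buffer=$((2 * 1024))
read_rnd_buffer=$((16 * 1024))
sort_buffer=$((2 * 1024))
join_buffer=$((1 * 1024))
thread_stack=128
max_connections=60

mysql_worst_case_mb() {
    global_kb=$((key_buffer + query_cache))
    per_conn_kb=$((read_buffer + read_rnd_buffer + sort_buffer + join_buffer + thread_stack))
    # Integer division, so the result is rounded down to whole megabytes.
    echo $(( (global_kb + max_connections * per_conn_kb) / 1024 ))
}

mysql_worst_case_mb
```

With these settings the estimate comes to a little over 1.3 GB, which is worth comparing against the memory of your instance type (an m1.small has about 1.7 GB) before raising max_connections or the per-connection buffers.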

Again, to apply these settings, restart the MySQL server using

/etc/rc.d/init.d/mysqld restart

Setting up persistent store snapshotting

With all of your website's content now stored on the persistent volume, it's time to set up
automatic snapshotting of that volume. As mentioned before, one of the unique features of
Amazon's Elastic Block Store on EC2 is the capability to do incremental snapshots that are
stored on S3. This provides a secure offsite backup of the important data on your website and
lets you roll back to the state of your server at the time of any of the snapshots. For example, if
your site was hacked two days ago but you only found out about it now, you could restore your
site to the state it was in just before that.

The first step in setting this up is to install the latest binaries for Amazon's EC2 AMI and API
tools. The latest EC2 API tools can be downloaded from this page and the latest AMI tools
can be grabbed from here. I downloaded the Zip files, uploaded them to the running instance,
and unzipped them. For the particular AMI that I started from, there was a /home/ec2 directory
that contained older versions of the tools. I deleted the /home/ec2/bin and /home/ec2/lib
directories and replaced them with the contents of the bin and lib directories from those two
Zip files.

You will need to have the X.509 certificate and private key (you downloaded these when setting
up your Amazon Web Services account) on your instance, so upload those now. Create a
/home/ec2/certs directory and move these pk-*.pem and cert-*.pem files there. In your
/root/.bashrc file, add the following lines to make sure that the EC2 tools know where to find
your certificate and key:

export EC2_CERT=/home/ec2/certs/[certificate name].pem
export EC2_PRIVATE_KEY=/home/ec2/certs/[private key name].pem

The backup script that will run every hour will need to lock the MySQL database during the
snapshot process, so create a /root/.my.cnf file that has the following format:

[client]
user=root
password=[password]

I use two scripts, with cron calling one which in turn calls the other. The first is called
takesnapshot and should be downloaded and placed in /etc/cron.hourly. You will need to edit
this file to insert the volume ID of your persistent store. This ID can be found in Elasticfox
under the Volumes and Snapshots tab. Finally, make this script executable using chmod +x.

The second script is called ec2-snapshot-xfs-mysql.pl and is a modified version of the one Eric
Hammond created for his tutorial here.
This one does all the heavy lifting by locking the MySQL database (to ensure that the
snapshotted database will be in a workable state upon a restore) and by freezing the XFS
filesystem of the volume during the snapshot process. Move this script to /usr/bin, edit it to
point to the proper file names of your X.509 certificate and private key, and make it executable.

With all this in place, you should be able to test the snapshot process by manually running the
takesnapshot script. If it runs without errors, go to Elasticfox and refresh the Snapshots section
of the Volumes and Snapshots tab. Your new snapshot should appear in the list.
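
To illustrate just the filesystem half of that process, here is a minimal sketch of the freeze, snapshot, thaw ordering. It is not the script described above and does not lock MySQL. The with_frozen_fs function and the FREEZE_CMD variable are hypothetical names of mine; xfs_freeze and ec2-create-snapshot are the real tools, and the volume ID in the example is a placeholder.

```shell
#!/bin/sh
# Run a command while an XFS filesystem is frozen, and always thaw it afterwards.
# FREEZE_CMD is a hypothetical hook for substituting a stand-in when testing;
# it defaults to the real xfs_freeze.
FREEZE_CMD=${FREEZE_CMD:-xfs_freeze}

with_frozen_fs() {
    mount_point=$1
    shift
    "$FREEZE_CMD" -f "$mount_point" || return 1   # freeze: suspend writes
    status=0
    "$@" || status=$?                             # e.g. the snapshot command
    "$FREEZE_CMD" -u "$mount_point"               # thaw, even if the command failed
    return $status
}

# Example (as root):
#   with_frozen_fs /persistent ec2-create-snapshot vol-xxxxxxxx
```

The point of the wrapper is that the thaw step always runs, since a filesystem left frozen will block every write to /persistent.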

Creating and attaching an Elastic IP Address

This site is now fully operational, so it is time to give it a publicly accessible static IP address.
Amazon offers what are called Elastic IP Addresses. These are static IP addresses that you
can requisition on the fly and attach to any of your running instances. This means that you can
have a static IP address that your outside DNS records will point to, but be able to switch it
between different instances within EC2. This is extremely useful for development, where you
might want to clone your existing site off of a snapshot, try out a new design, and if that design
works you would simply switch over the Elastic IP Address to point to the development server
to make it live.

Creating and assigning an IP address is simple. Return to Elasticfox and click on the Elastic
IPs tab. Within this tab, click on the green plus button to allocate a new address.
Unfortunately, Elasticfox does not give you an easy drop-down menu for selecting the ID of
your running instance, so go to the AMIs and Instances tab and copy down that ID. Return to
the Elastic IPs, select the IP address, and click the green button to associate this IP with an
instance. Enter in the instance ID that you wrote down and proceed.

It takes a few minutes for the assignment to propagate through Amazon's routers, but once the
process is done you should be able to see your new web site at that static IP address. You can
then set up your DNS records to point to this new address.

Bundling and uploading your custom AMI

Before we are finished, you should wrap up your changes to the virtual machine and create a
new custom image. Even though your website's data is protected on the persistent store
volume, all the configuration changes you've made to the base machine will be reset upon
termination of the running instance. To preserve them, you'll need to save them in a new AMI
that can be started at any point in the future.

To do this, first shut down MySQL and Apache and unmount your persistent store using the
following commands:

/etc/rc.d/init.d/mysqld stop
/etc/rc.d/init.d/httpd stop
umount /persistent

Go to Elasticfox and retrieve your Owner ID from the running instance. Copy that and paste it
within the following command, which creates the new AMI:

ec2-bundle-vol --fstab /etc/fstab -c /home/ec2/certs/[certificate] -k /home/ec2/certs/[private key] -u [Owner ID]

This will create the image in the /tmp directory, but that image still needs to be uploaded to S3.
Upload it using the following command:

ec2-upload-bundle -b [S3 bucket name] -m /tmp/image.manifest.xml -a [Access Key ID] -s [Secret Access Key]

where the S3 bucket name is a globally unique identifier. It can be the name of an S3 bucket
you already use or a new one, in which case the bucket will be created (if the name is
available).

You will need to register this new AMI with Elasticfox by going to the AMIs and Instances tab
and clicking the green plus button under the Machine Images section. The AMI manifest path
that it will ask for is your S3 bucket's name followed by /image.manifest.xml. Elasticfox should
add your AMI to the list of public ones (it will be marked "private"). If you don't see it right
away, you can do a search for a substring within the name of your bucket.

As a final test, start a new instance based on this custom AMI while your original instance is
running. If this new image boots to a running state, SSH into it to make sure that everything is
operational. If so, shut down the old instance, detach the persistent store volume and
disassociate the Elastic IP from it, and assign them both to the new instance. Mount the /persistent
directory on the new instance and start up the MySQL and Apache servers. Your new website
should now be complete and running well on EC2.

Conclusion and additional resources

Thank you for reading this far. I'm sorry that this turned into a far longer post than I had
intended. I may have gone overboard on the detail, but I hope that this was of some use to you
in either starting out with EC2 or learning a bit more about what it offers you.

Unfortunately, you can see that these services are currently very intricate to set up and are
aimed at developers, not casual users. Elasticfox, as incredibly impressive a tool as it is, is still
limited by the fact that it's a Firefox browser extension and not a full desktop application. I'm
sure that the brilliant engineers at Amazon and / or members of the AWS community soon will
be designing tools to allow the average user to take advantage of EC2 and their other services.
Amazon is sitting on core technology that I believe will have a tremendous impact on the Web
over the next 5-10 years as it becomes accessible to more users.
