
AMAZON S3 AND EC2

PERFORMANCE REPORT

OVERVIEW
A frequently asked question about the Amazon Web Services (AWS) cloud computing platform
is how well its storage system (S3) performs with its computing platform (EC2). As a file
sharing solution that runs entirely within the AWS cloud, HostedFTP.com has created this report
from our internal performance data to discuss the performance you can expect when storing and
retrieving files between an EC2 instance and S3.

We will also report on how the AWS infrastructure performs over time. Each month we will
publish updates to the data to give you an insider's view of how well AWS scales as it continues to
add capacity and customers.

THE PERFORMANCE MODEL


When storing or retrieving a file with S3, we expect the transfer time to consist of two parts: a
fixed transaction cost that is not related to file size and a variable bandwidth cost that is related to
file size. In other words, we expect a linear performance model for storing and retrieving files
between S3 and EC2.
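
In symbols, the model described above can be written as shown below. The notation is introduced here for illustration only and is an assumption, not part of the original analysis: $s$ is the file size in MB, $B$ the bandwidth in MB/s, $t_{\text{fixed}}$ the fixed transaction cost in seconds, and $t(s)$ the total time in seconds.

```latex
t(s) = t_{\text{fixed}} + \frac{s}{B},
\qquad
\text{throughput}(s) = \frac{s}{t(s)} = \frac{s}{t_{\text{fixed}} + s/B}
\;\longrightarrow\; B \quad \text{as } s \to \infty
```

Under this model the measured throughput of small files is dominated by the fixed cost, while the throughput of large files approaches the bandwidth $B$.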

Determining the fixed transaction and variable bandwidth costs is the goal of our data analysis.

THE DATA
Here is the data, grouped by file size, for storing and retrieving files:

Data set 1: Storing a file

File size range
From      To        Sample size  Average file size  Average time (ms)  Performance (MB/s)
0 KB      100 KB    7658         44 KB              122                0.36
100 KB    200 KB    1922         157 KB             146                1.08
200 KB    300 KB    1574         250 KB             138                1.82
300 KB    400 KB    1562         347 KB             206                1.69
400 KB    500 KB    1014         451 KB             177                2.55
500 KB    600 KB    431          546 KB             173                3.15
600 KB    700 KB    545          655 KB             287                2.28
700 KB    800 KB    204          721 KB             192                3.75
800 KB    900 KB    113          845 KB             429                1.97
900 KB    1.0 MB    101          959 KB             327                2.93
1.0 MB    1.1 MB    118          1.05 MB            332                3.16
1.1 MB    1.2 MB    88           1.15 MB            243                4.74
1.2 MB    1.3 MB    78           1.25 MB            224                5.60
1.3 MB    1.4 MB    89           1.35 MB            315                4.28
1.4 MB    1.5 MB    88           1.45 MB            281                5.16
1.5 MB    1.6 MB    92           1.55 MB            261                5.95
1.6 MB    3.2 MB    1450         2.3 MB             554                4.16
3.2 MB    6.4 MB    862          4.8 MB             536                9.02
6.4 MB    12.8 MB   182          8.6 MB             1132               7.58
12.8 MB   25.6 MB   792          16.3 MB            1522               10.69
25.6 MB   51.2 MB   340          36.9 MB            3367               10.97
51.2 MB   102.4 MB  147          72.4 MB            6741               10.74
102.4 MB  5 GB      81           460.3 MB           37883              12.15

ANALYSIS
When storing a file, the variable bandwidth cost corresponds to a throughput of between 10 and
12 MB/s, the rate at which the largest files settle. To determine the fixed transaction cost we can
perform a linear regression of time against file size. The following chart illustrates:

This chart shows that the fixed transaction cost when storing a file is around 140 ms.
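
A regression of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the chart: the report does not state which rows or weights went into the fit, so an unweighted least-squares fit over the bucket averages need not reproduce the 140 ms figure exactly. The names `STORE_ROWS` and `fit_line` are hypothetical, and KB values are converted assuming 1 MB = 1000 KB.

```python
# Average file size (MB) and average store time (ms) from Data set 1.
# Assumption: KB values are converted using 1 MB = 1000 KB.
STORE_ROWS = [
    (0.044, 122), (0.157, 146), (0.250, 138), (0.347, 206),
    (0.451, 177), (0.546, 173), (0.655, 287), (0.721, 192),
    (0.845, 429), (0.959, 327), (1.05, 332), (1.15, 243),
    (1.25, 224), (1.35, 315), (1.45, 281), (1.55, 261),
    (2.3, 554), (4.8, 536), (8.6, 1132), (16.3, 1522),
    (36.9, 3367), (72.4, 6741), (460.3, 37883),
]


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Intercept estimates the fixed cost (ms); slope is ms per MB.
    intercept_ms, slope_ms_per_mb = fit_line(STORE_ROWS)
    print(f"estimated fixed cost: {intercept_ms:.0f} ms")
    print(f"estimated bandwidth:  {1000.0 / slope_ms_per_mb:.1f} MB/s")
```

The intercept of the fitted line is the estimate of the fixed transaction cost, and the reciprocal of the slope is the estimate of the bandwidth. Restricting the fit to the smaller buckets, or weighting each bucket by its sample size, are both reasonable variations.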

Data set 2: Retrieving a file


File size range
From      To        Sample size  Average file size  Average time (ms)  Performance (MB/s)
0 KB      100 KB    11363        53 KB              5                  10.43
100 KB    200 KB    3754         157 KB             16                 9.75
200 KB    300 KB    2420         245 KB             27                 8.93
300 KB    400 KB    2299         345 KB             36                 9.85
400 KB    500 KB    1282         452 KB             45                 10.03
500 KB    600 KB    603          552 KB             52                 10.53
600 KB    700 KB    1007         654 KB             60                 10.95
700 KB    800 KB    531          719 KB             66                 10.85
800 KB    900 KB    185          844 KB             71                 11.84
900 KB    1.0 MB    167          957 KB             71                 13.48
1.0 MB    1.1 MB    202          1.05 MB            93                 11.30
1.1 MB    1.2 MB    271          1.15 MB            106                10.83
1.2 MB    1.3 MB    168          1.25 MB            125                10.04
1.3 MB    1.4 MB    156          1.35 MB            117                11.53
1.4 MB    1.5 MB    125          1.45 MB            159                9.08
1.5 MB    1.6 MB    148          1.55 MB            140                11.10
1.6 MB    3.2 MB    2043         2.35 MB            186                12.67
3.2 MB    6.4 MB    1382         4.74 MB            435                10.89
6.4 MB    12.8 MB   485          8.63 MB            833                10.36
12.8 MB   25.6 MB   935          16.49 MB           1405               11.74
25.6 MB   51.2 MB   636          36.76 MB           3878               9.48
51.2 MB   102.4 MB  202          71.38 MB           7003               10.19
102.4 MB  5 GB      105          399.84 MB          35081              11.40

ANALYSIS
The variable bandwidth cost when retrieving a file is again between 10 and 12 MB/s. Unlike
storing, retrieving shows no discernible fixed transaction cost.
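
One way to see why the two data sets look so different is to compute the throughput that the linear model predicts for a small file. The sketch below is illustrative only: the function name is hypothetical, and 11 MB/s is an assumed midpoint of the 10 to 12 MB/s range rather than a measured value.

```python
def effective_throughput_mb_s(size_mb, fixed_ms, bandwidth_mb_s):
    """Observed MB/s for one transfer under the linear model."""
    total_seconds = fixed_ms / 1000.0 + size_mb / bandwidth_mb_s
    return size_mb / total_seconds


if __name__ == "__main__":
    ASSUMED_BANDWIDTH_MB_S = 11.0  # assumption: midpoint of 10 to 12 MB/s

    # 44 KB store with the 140 ms fixed cost reported for storing.
    print(effective_throughput_mb_s(0.044, 140.0, ASSUMED_BANDWIDTH_MB_S))

    # 53 KB retrieve with no fixed cost.
    print(effective_throughput_mb_s(0.053, 0.0, ASSUMED_BANDWIDTH_MB_S))
```

With a 140 ms fixed cost, a 44 KB store comes out at about 0.31 MB/s, in the same region as the 0.36 MB/s in the first row of data set 1. With no fixed cost, the predicted throughput equals the bandwidth at every file size, which matches the roughly flat 9 to 13 MB/s seen across data set 2.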

CONCLUSION
We can conclude the following from the above analysis:

1. The variable bandwidth cost when storing and retrieving files is between 10 and 12 MB/s.
2. The fixed transaction cost is roughly 140 ms when storing a file and negligible when
retrieving a file.
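
These two conclusions can be turned into a rough estimator for transfer times. The following is a sketch under the report's own figures, not a guarantee of what any particular transfer will take; the constant and function names are hypothetical.

```python
# Figures taken from the conclusions above.
STORE_FIXED_MS = 140.0
RETRIEVE_FIXED_MS = 0.0  # "negligible" is treated as zero here (assumption)
BANDWIDTH_RANGE_MB_S = (10.0, 12.0)


def estimate_ms(size_mb, fixed_ms):
    """Return (best_case_ms, worst_case_ms) for one transfer."""
    slowest, fastest = BANDWIDTH_RANGE_MB_S
    best_case = fixed_ms + size_mb / fastest * 1000.0
    worst_case = fixed_ms + size_mb / slowest * 1000.0
    return best_case, worst_case


if __name__ == "__main__":
    print("store 6 MB:   ", estimate_ms(6.0, STORE_FIXED_MS))
    print("retrieve 6 MB:", estimate_ms(6.0, RETRIEVE_FIXED_MS))
```

For a 6 MB file this gives roughly 640 to 740 ms to store and 500 to 600 ms to retrieve.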

METHODOLOGY
We tracked the number of milliseconds (ms) it takes to store and retrieve files from S3 using large
EC2 instances in the us-east-1a availability zone. We used the JetS3t Java Library to handle
the actual storing and retrieving of files, which in turn uses the Commons HttpClient Library.
We started the timer at the point when the file began to be stored to or retrieved from S3.
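
The shape of such a measurement can be sketched as follows. This is a Python analogue for illustration only: the measurements in this report were taken in Java around JetS3t calls, and the helper name `timed_ms` and the `operation` callable are hypothetical stand-ins for the real store or retrieve call.

```python
import time


def timed_ms(operation, samples):
    """Run operation() and record its elapsed time in milliseconds.

    If operation() raises, the exception propagates and nothing is
    recorded, so failed attempts never contribute a data point.
    """
    start = time.perf_counter()
    result = operation()
    samples.append((time.perf_counter() - start) * 1000.0)
    return result


if __name__ == "__main__":
    samples = []
    # Stand-in for a real store or retrieve call (assumption).
    timed_ms(lambda: sum(range(100_000)), samples)
    print(f"recorded {len(samples)} sample(s)")
```

Recording the sample only after the operation returns keeps failed attempts out of the data set.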

We included 50,000 total data points (stores and retrieves) in our analysis, drawn randomly from
the month of February. These data points cover all days of the week and hours of the day.

LIMITATIONS AND OTHER CONSIDERATIONS


The maximum throughput (around 50 MB/s) you can expect when using S3 from a large EC2
instance is discussed here. Since we actively load balance our instances, we don't anticipate that this
limit had any discernible impact on our results.

To track the time it takes to store and retrieve a file, we use the Java System.currentTimeMillis()
function. From the documentation on this function:

"Note that while the unit of time of the return value is a millisecond, the granularity of the value
depends on the underlying operating system and may be larger. For example, many operating
systems measure time in units of tens of milliseconds."

Since we have a large number of data points at small file sizes, where coarse timer granularity
matters most, the rounding errors tend to average out and this should have a very limited
impact on our results.
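
The averaging argument can be checked with a small simulation. The sketch below is a simplified model, not a measurement: it assumes a clock that only advances in 10 ms ticks and transfers that start at a random point relative to those ticks. The function names are hypothetical; the 122 ms duration and 7658 samples are borrowed from the first row of data set 1.

```python
import math
import random


def measured_ms(start_ms, duration_ms, tick_ms):
    """Elapsed time as seen by a clock that only advances every tick_ms."""
    end_reading = math.floor((start_ms + duration_ms) / tick_ms) * tick_ms
    start_reading = math.floor(start_ms / tick_ms) * tick_ms
    return end_reading - start_reading


def mean_measurement_error(duration_ms, tick_ms, samples, seed=0):
    """Average of (measured - true) over many randomly timed transfers."""
    rng = random.Random(seed)
    total_error = 0.0
    for _ in range(samples):
        # Random start time, so the phase relative to clock ticks varies.
        start_ms = rng.uniform(0.0, 1_000_000.0)
        total_error += measured_ms(start_ms, duration_ms, tick_ms) - duration_ms
    return total_error / samples


if __name__ == "__main__":
    print(mean_measurement_error(duration_ms=122.0, tick_ms=10.0, samples=7658))
```

Any single measurement in this model can be off by up to one tick, but the mean error over thousands of samples is a small fraction of a millisecond.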

Our times do not include failed attempts to store or retrieve a file from S3.

ABOUT HOSTEDFTP.COM
HostedFTP.com is a cloud file sharing solution that's secure, reliable and easy to use. Designed for
use with both web browsers and FTP clients, HostedFTP.com improves network security and saves
your business money.

For more information, please visit our website: www.hostedftp.com
