You are on page 1of 6

#117 Get More Refcardz! Visit refcardz.

com

Getting Started with


CONTENTS INCLUDE:
n
Introduction
Apache Hadoop

Apache Hadoop
n

n
Hadoop Quick Reference
n
Hadoop Quick How-To
n
Staying Current
n
Hot Tips and more...
By Eugene Ciurana and Masoud Kalali

INTRODUCTION

This Refcard presents a basic blueprint for applying


MapReduce to solving large-scale, unstructured data
processing problems by showing how to deploy and use an
Apache Hadoop computational cluster. It complements DZone
Refcardz #43 and #103, which provide introductions to high-
performance computational scalability and high-volume data
handling techniques, including MapReduce.

What Is MapReduce?
MapReduce refers to a framework that runs on a computational
cluster to mine large datasets. The name derives from the
application of map() and reduce() functions repurposed from
www.dzone.com

functional programming languages.


 
APACHE HADOOP
• “Map” applies to all the members of the dataset and
returns a list of results
Apache Hadoop is an open source, Java framework for
• “Reduce” collates and resolves the results from one or
implementing reliable and scalable computational networks.
more mapping operations executed in parallel
Hadoop includes several subprojects:
• Very large datasets are split into large subsets called splits
• MapReduce
• A parallelized operation performed on all splits yields • Pig
the same results as if it were executed against the larger • ZooKeeper
dataset before turning it into splits • HBase
• HDFS
• Implementations separate business logic from multi-
• Hive
processing logic
• Chukwa
Getting Started with Apache Hadoop

• MapReduce framework developers focus on process


dispatching, locking, and logic flow This Refcard presents how to deploy and use the common
tools, MapReduce, and HDFS for application development
• App developers focus on implementing the business logic after a brief overview of all of Hadoop’s components.
without worrying about infrastructure or scalability issues

Implementation patterns
The Map(k1, v1) -> list(k2, v2) function is applied to every
item in the split. It produces a list of (k2, v2) pairs for each call.
The framework groups all the results with the same key
Get over 90 DZone Refcardz
together in a new split. FREE from Refcardz.com!
The Reduce(k2, list(v2)) -> list(v3) function is applied
to each intermediate results split to produce a collection
of values v3 in the same domain. This collection may have
zero or more values. The desired result consists of all the v3
collections, often aggregated into one result file.

MapReduce frameworks produce lists of values.


Hot Users familiar with functional programming
Tip mistakenly expect a single result from the
mapping operations.

DZone, Inc. | www.dzone.com


2 Getting Started with Apache Hadoop

•Z
 ooKeeper - a distributed application management tool
Hot http://hadoop.apache.org is the authoritative for configuration, event synchronization, naming, and
Tip reference for all things Hadoop. group services used for managing the nodes in a Hadoop
computational network.

Hadoop comprises tools and utilities for data serialization, file


Sqoop is a product released by Cloudera, the most
system access, and interprocess communication pertaining
influential Hadoop commercial vendor, under the
to MapReduce implementations. Single and clustered Hot Apache 2.0 license. The source code and binary
configurations are possible. This configuration almost Tip packages are available at:
always includes HDFS because it’s better optimized for high
http://wiki.github.com/cloudera/sqoop
throughput MapReduce I/O than general-purpose file systems.

Components Hadoop Cluster Building Blocks


Figure 2 shows how the various Hadoop components relate to Hadoop clusters may be deployed in three basic configurations:
one another:
Mode Description Usage
Local Multi-threading components, single Development,
(default) JVM test, debug
Pseudo- Multiple JVMs, single node Development,
distributed test, debug
Distributed All components run in separate Staging,
nodes production

Figure 3 shows how the components are deployed for any of


these configurations:
 
Essentials
•H
 DFS - a scalable, high-performance distributed file
system. It stores its data blocks on top of the native file
system. HDFS is designed for consistency; commits aren’t
considered “complete” until data is written to at least two
different configurable volumes. HDFS presents a single
view of multiple physical disks or file systems.
•M
 apReduce - A Java-based job tracking, node
management, and application container for mappers and
reducers written in Java or in any scripting language that
supports STDIN and STDOUT for job interaction.

Hadoop also supports other file systems


likeAmazon Simple Storage Service (S3), Kosmix’s
Hot CloudStore, and IBM’s General Parallel File System.  
Tip These may be cheaper alternatives to hosting data in Each node in a Hadoop installation runs one or more daemons
the local data center. executing MapReduce code or HDFS commands. Each
daemon’s responsibilities in the cluster are:
Frameworks
•C
 hukwa - a data collection system for monitoring, displaying, •N
 ameNode: manages HDFS and communicates with every
and analyzing logs from large distributed systems. DataNode daemon in the cluster

•H
 ive - structured data warehousing infrastructure that • J obTracker: dispatches jobs and assigns splits (splits) to
provides a mechanisms for storage, data extraction, mappers or reducers as each stage completes
transformation, and loading (ETL), and a SQL-like •T
 askTracker: executes tasks sent by the JobTracker and
language for querying and analysis. reports status
•H
 Base - a column-oriented (NoSQL) database designed for •D
 ataNode: Manages HDFS content in the node and
real-time storage, retrieval, and search of very large tables updates status to the NameNode
(billions of rows/millions of columns) running atop HDFS.
These daemons execute in the three distinct processing
Utilities layers of a Hadoop cluster: master (Name Node), slaves (Data
•P  ig - a set of tools for programmatic flat-file data Nodes), and user applications.
analysis that provides a programming language, data
transformation, and parallelized processing. Name Node (Master)
• Manages the file system name space
•S
 qoop - a tool for importing and exporting data stored in
• Keeps track of job execution
relational databases into Hadoop or Hive, and vice versa
using MapReduce tools and standard JDBC drivers. • Manages the cluster

DZone, Inc. | www.dzone.com


3 Getting Started with Apache Hadoop

• Replicates data blocks and keeps them evenly distributed configuration from the Hadoop site. All the configuration
files are located in the directory $HADOOP_HOME/conf; the
•M
 anages lists of files, list of blocks in each file, list of
minimum configuration requirements for each file are:
blocks per node, and file attributes and other meta-data
•T
 racks HDFS file creation and deletion operations in an •h
 adoop-env.sh — environmental configuration,
activity log JVM configuration, logging, master and slave
configuration files
Depending on system load, the NameNode and JobTracker
• c ore-site.xml — site wide configuration, such as users,
daemons may run on separate computers.
groups, sockets
Although there can be two or more Name Nodes in •h
 dfs-site.xml — HDFS block size, Name and Data
a cluster, Hadoop supports only one Name Node. node directories
Hot Secondary nodes, at the time of writing, only log
Tip what happened in the primary. The Name Node is a
•m
 apred-site.xml — total MapReduce tasks,
JobTracker address
single point of failure that requires manual fail-over!
•m
 asters, slaves files — NameNode, JobTracker,
Data Nodes (Slaves) DataNodes, and TaskTrackers addresses, as appropriate
• Store blocks of data in their local file system
Test the Installation
• Store meta-data for each block Log on to each server without a passphrase:
• Serve data and meta-data to the job they execute ssh servername or ssh localhost

• Send periodic status reports to the Name Node Format a new distributed file system:
hadoop namenode -format
•S
 end data blocks to other nodes required by the
Name Node Start the Hadoop daemons:
start-all.sh
Data nodes execute the DataNode and TaskTracker daemons
described earlier in this section. Check the logs for errors at $HADOOP_HOME/logs!

User Applications Browse the NameNode and JobTracker interfaces at


•D ispatch mappers and reducers to the Name Node for (localhost is a valid name for local configurations):
execution in the Hadoop cluster
• http://namenode.server.name:50070/
•E
 xecute implementation contracts for Java and for • http://jobtracker.server.name:50070/
scripting languages mappers and reducers
• Provide application-specific execution parameters HADOOP QUICK REFERENCE

•S
 et Hadoop runtime configuration parameters with
semantics that apply to the Name or the Data nodes The official commands guide is available from:
http://hadoop.apache.org/common/docs/current/commands_
A user application may be a stand-alone executable, a script, a manual.html
web application, or any combination of these. The application
is required to implement either the Java or the streaming APIs. Usage
Hadoop Installation hadoop [--config confdir] [COMMAND]
[GENERIC_OPTIONS] [COMMAND_OPTIONS]

Cygwin is a requirement for any Windows systems


Hot running Hadoop — install it before continuing if
Hadoop can parse generic options and run classes from the
Tip command line. confdir can override the default $HADOOP_HOME/
you’re using this OS. conf directory.

Required detailed instructions for this section are available at: Generic Options
http://hadoop.apache.org/comon/docs/current -conf <config file> App configuration file
•E
 nsure that Java 6 and both ssh and sshd are running in -D <property=value> Set a property
all nodes
-fs <local|namenode:port> Specify a namenode
•G
 et the most recent, stable release from
-jg <local|jobtracker:port> Specify a job tracker; applies only
http://hadoop.apache.org/common/releases.html
to a job
• Decide on local, pseudo-distributed or distributed mode
-files <file1, file2, .., fileN> Files to copy to the cluster (job only)
• Install the Hadoop distribution on each server
-libjars <file1, file2, ..,fileN> .jar files to include in the classpath
•S
 et the HADOOP_HOME environment variable to the directory (job only)
where the distribution is installed -archives <file1, file2, .., fileN> Archives to unbundle on the
• Add $HADOOP_HOME/bin to PATH computational nodes (job only)

Follow the instructions for local, pseudo-clustered, or clustered $HADOOP_HOME/bin/hadoop precedes all commands.

DZone, Inc. | www.dzone.com


4 Getting Started with Apache Hadoop

User Commands Wildcard expansion happens in the host’s shell, not


archive -archiveName file.har Create an archive Hot in the HDFS shell! A command issued to a directory
/var/data1 /var/data2 Tip will affect the directory and all the files in it,
distcp Distributed copy from one or more inclusive. Remember this to prevent surprises.
hdfs://node1:8020/dir_a node/dirs to a target
To leverage this quick reference, review and understand all the
hdfs://node2:8020/dir_b
Hadoop configuration, deployment, and HDFS management
fsck -locations /var/data1 File system checks: list block/ concepts. The complete documentation is available from
fsck -move /var/data1 location, move corrupted files to / http://hadoop.apache.org.
fsck /var/data lost+found, and general check
job -list [all] Job list, dispatching, status check, HADOOP APPS QUICK HOW-TO
job -submit job_file and kill; submitting a job returns
job -status 42 its ID
A Hadoop application is made up of one or more jobs. A job
job -kill 42
consists of a configuration file and one or more Java classes or
pipes -conf file Use HDFS and MapReduce from a a set of scripts. Data must already exist in HDFS.
pipes -map File.class C++ program
pipes -map M.class -reduce Figure 4 shows the basic building blocks of a Hadoop
R.class -files application written in Java:
queue -list List job queues

Administrator Commands
balancer -threshold 50 Cluster balancing at percent of
disk capacity
daemonlog -getlevel host name Fetch http://host/
logLevel?log=name
datanode Run a new datanode
jobtracker Run a new job tracker
namenode -format Format, start a new instance,
namenode -regular upgrade from a previous version  
namenode -upgrade of Hadoop, or remove previous
namenode -finalize version's files and complete An application has one or more mappers and reducers and a
upgrade
configuration container that describes the job, its stages, and
HDFS shell commands apply to local or HDFS file systems and intermediate results. Classes are submitted and monitored
take the form: using the tools described in the previous section.

hadoop dfs -command dfs_command_options Input Formats and Types


•K
 eyValueTextInputFormat — Each line represents a key
HDFS Shell and value delimited by a separator; if the separator is
du /var/data1 hdfs://node/data2 Display cumulative of files and missing the key and value are empty
directories
• TextInputFormat — The key is the line number, the value
lsr Recursive directory list is the text itself for each line
cat hdfs://node/file Types a file to stdout •N
 LineInputFormat — N sequential lines represent the
value, the offset is the key
count hdfs://node/data Count the directories, files,
and bytes in a path •M
 ultiFileInputFormat — An abstraction that the user
chmod, chgrp, chown Permissions overrides to define the keys and values in terms of
multiple files
expunge Empty file system trash
•S
 equence Input Format — Raw format serialized
get hdfs://node/data2 /var/data2 Recursive copy files to the key/value pairs
local system
•D
 BInputFormat — JDBC driver fed data input
put /var/data2 hdfs://node/data2 Recursive copy files to the
target file system Output Formats
cp, mv, rm Copy, move, or delete files in The output formats have a 1:1 correspondence with the
HDFS only input formats and types. The complete list is available from:
http://hadoop.apache.org/common/docs/current/api
mkdir hdfs://node/path Recursively create a new direc-
tory in the target
Word Indexer Job Example
setrep -R -w 3 Recursively set a file or direc- Applications are often required to index massive amounts
tory replication factor (number of text. This sample application shows how to build a simple
of copies of the file)
indexer for text files. The input is free-form text such as:

DZone, Inc. | www.dzone.com


5 Getting Started with Apache Hadoop

hamlet@11141\tKING CLAUDIUS\tWe doubt it nothing: heartily Job Driver


farewell. public class Driver {
public static void main(String… argV) {
The map function output should be something like: Job job = new Job(new Configuration(), “test”);
job.setMapper(LineIndexMapper.class);
job.setCombiner(LineIndexReducer.class);
<KING, hamlet@11141>
job.setReducer(LineIndexReducer.class);
<CLAUDIUS, hamlet@11141>
<We, hamlet@11141>
job.waitForCompletion(true);
<doubt, hamlet@11141>
}
} // Driver
The number represents the line in which the text occurred. The
mapper and reducer/combiner implementations in this section
This driver is submitted to the Hadoop cluster for processing,
require the documentation from:
along with the rest of the code in a .jar file. One or more files
http://hadoop.apache.org/mapreduce/docs/current/api must be available in a reachable hdfs://node/path before
submitting the job using the command:
The Mapper
The basic Java code implementation for the mapper has the hadoop jar shakespeare_indexer.jar
form:
Using the Streaming API
public class LineIndexMapper
The streaming API is intended for users with very limited Java
extends MapReduceBase
implements Mapper<LongWritable, knowledge and interacts with any code that supports STDIN
Text, Text, Text> { and STDOUT streaming. Java is considered the best choice for
“heavy duty” jobs. Development speed could be a reason for
public void map(LongWritable k,
Text v, OutputCollector<Text, Text> o,
using the streaming API instead. Some scripted languages may
Reporter r) throws IOException { /* implementation here work as well or better than Java in specific problem domains.
*/ } This section shows how to implement the same mapper and
.
reducer using awk and compares its performance against
.
} Java’s.

The implementation itself uses standard Java text manipulation The Mapper
tools; you can use regular expressions, scanners, whatever is #!/usr/bin/gawk -f
necessary. {
for (n = 2;n <= NF;n++) {
gsub(“[,:;)(|!\\[\\]\\.\\?]|--”,””);
There were significant changes to the method if (length($n) > 0) printf(“%s\t%s\n”, $n, $1);
Hot signatures in Hadoop 0.18, 0.20, and 0.21 - check
}
}

Tip the documentation to get the exact signature for the


version you use. The output is mapped with the key, a tab separator, then the
index occurrence.
The Reducer/Combiner
The combiner is an output handler for the mapper to reduce The Reducer
the total data transferred over the network. It can be thought #!/usr/bin/gawk -f
of as a reducer on the local node. { wordsList[$1] = ($1 in wordsList) ?
sprintf(“%s,%s”,wordsList[$1], $2) : $2; }
public class LineIndexReducer
extends MapReduceBase END {
implements Reducer<Text, for (key in wordsList)
Text, Text, Text> { printf(“%s\t%s\n”, key,wordsList[key]);
}
public void reduce(Text k,
Iterator<Text> v, The output is a list of all entries for a given word, like in the
OutputCollector<Text, Text> o, previous section:
Reporter r) throws IOException {
/* implementation */ }
doubt\thamlet@111141,romeoandjuliet@23445,henryv@426917
.
.
} Awk’s main advantage is conciseness and raw text processing
power over other scripting languages and Java. Other
The reducer iterates over keys and values generated in the languages, like Python and Perl, are supported if they are
previous step adding a line number to each word’s occurrence installed in the Data Nodes. It’s all about balancing speed of
index. The reduction results have the form: development and deployment vs. speed of execution.

<KING, hamlet@11141; hamlet@42691; lear@31337> Job Driver


hadoop jar hadoop-streaming.jar -mapper shakemapper.awk
A complete index shows the line where each word occurs, and -reducer shakereducer.awk -input hdfs://node/shakespeare-
the file/work where it occurred. works

DZone, Inc. | www.dzone.com


6 Getting Started with Apache Hadoop

Performance Tradeoff
The streamed awk invocation vs. Java are
functionally equivalent and the awk version is only
Hot about 5% slower. This may be a good tradeoff if the
Tip
scripted version is significantly faster to develop and
is continuously maintained.

STAYING CURRENT

Do you want to know about specific projects and use cases


where NoSQL and data scalability are the hot topics? Join the
scalability newsletter:
  http://eugeneciurana.com/scalablesystems

ABOUT THE AUTHOR RECOMMENDED BOOKS


Eugene Ciurana (http://eugeneciurana.com) is an open-source Hadoop: The Definitive Guide helps you harness the
evangelist who specializes in the design and implementation of power of your data. Ideal for processing large datasets,
mission-critical, high-availability large scale systems. Over the the Apache Hadoop framework is an open source
last two years, Eugene designed and built hybrid cloud scalable implementation of the MapReduce algorithm on which
systems and computational networks for leading financial, Google built its empire. This comprehensive resource
software, insurance, and healthcare companies in the US, Japan, demonstrates how to use Hadoop to build reliable,
Mexico, and Europe. scalable, distributed systems: programmers will find
details for analyzing large datasets, and administrators will
Publications
learn how to set up and run Hadoop clusters.
• Developing with Google App Engine, Apress
• DZone Refcard #105: NoSQL and Data Scalability
• DZone Refcard #43: Scalability and High Availability
BUY NOW
• DZone Refcard #38: SOA Patterns books.dzone.com/books/hadoop-definitive-guide
• The Tesla Testament: A Thriller, CIMEntertainment

Masoud Kalali (http://kalali.me) is a software engineer and


author. He has been working on software development
projects since 1998. He is experienced in a variety of
technologies and platforms.

Masoud is the author of several DZone Refcardz, including:


Using XML in Java, Berkeley DB Java Edition, Java EE Security,
and GlassFish v3. Masoud is also the author of a book on
GlassFish Security published by Packt. He is one of the founding
members of the NetBeans Dream Team and is a GlassFish
community spotlighted developer.

#82

Browse our collection of over 100 Free Cheat Sheets


Get More Refcardz! Visit refcardz.com

CONTENTS INCLUDE:


About Cloud Computing
Usage Scenarios Getting Started with

Aldon Cloud#64Computing

Underlying Concepts
Cost
by...

Upcoming Refcardz
youTechnologies ®

Data
t toTier
brough Comply.
borate.
Platform Management and more...

Chan
ge. Colla By Daniel Rubio

tion:
dz. com

also minimizes the need to make design changes to support


CON

tegra ternvasll
ABOUT CLOUD COMPUTING one time events. TEN TS
INC ■
HTML LUD E:

us Ind Anti-PPaat
Basics
Automated growthHTM
ref car

Web applications have always been deployed on servers & scalable


L vs XHT technologies

nuorn

Valid
ation one time events, cloud ML
connected to what is now deemed the ‘cloud’. Having the capability to support

Network Security
Usef
Du
ti

ul M.
computing platforms alsoulfacilitate
Open the gradual growth curves

n an
Page Source

o

s
Vis it

However, the demands and technology used on such servers Structure

C
faced by web applications. Tools

Core
By Key ■
Elem

atte
E: has changed substantially in recent years, especially with ents Structur
LUD al Elem
TS INC gration
P the entrance of service providers like Amazon, Google and Large scale growth scenarios involvingents
specialized
and mor equipment
rdz !

ON TEN tinuous Inte Change

HTML
ry Microsoft. e chang
es e... away by
(e.g. load balancers and clusters) are all but abstracted
Con at Eve
About ns to isolat
relying on a cloud computing platform’s technology.
Software i-patter
space

ALM
n
Re fca

e Work
Build
riptio
and Ant
Desc
These companies have a Privat
are in long deployedtrol repos
itory
webmana applications
ge HTM
L BAS
Patterns Control lop softw n-con to

that adapt and scale
Deve
les toto large user
a versio ing and making them
bases, In addition, several cloud computing ICSplatforms support data
ize merg
ment
rn
Version e... Patte it all fi minim le HTM
Manage s and mor e Work knowledgeable in amany
ine to
mainl aspects related tos multip
cloud computing. tier technologies that Lexceed the precedent set by Relational
space Comm and XHT

Build
re

Privat lop on that utilize HTML MLReduce,


Practice Database Systems (RDBMS): is usedMap are web service APIs,

Deve code lines a system
d of work
as thescalethe By An
sitory
prog foundati
Ge t Mo

Buil Repo
This Refcard active
will introduce are to you to clouddcomputing,within units with an
RATION etc. Some platforms ram support large grapRDBMS deployments.

The src
dy Ha
softw
and Java s written in on of
riente
e ine loping hical
INTEG
task-o it attribute
softwar
Mainl emphasis onDeve es by all
S these
ines providers, so youComm can better understand
also rece JavaScri user interfac web develop and the rris

Solr
chang
codel desc
INUOU Task Level
ding e code as the
www.dzone.com

NT of buil trol what it is a cloudnize


line Policy sourc es as aplatform
computing can offer your ut web output ive data pt. Server-s e in clien ment. the ima alt attribute ribes whe
T CO cess ion con Code Orga it chang e witho likew ide lang t-side ge is describe re the ima
ise use mec hanism. fromAND
e name
the pro ject’s vers CLOUD COMPUTING PLATFORMS
and subm sourc
applications.
ABOU (CI) is evel Comm
it
the build
with uniqu softw
are from
was HTM
web
The eme pages and uages like Nested
unavaila
ble. s alte ge file
a pro um
UNDERLYING oncCONCEPTS rnate can be
gration
build L and
ed to ies to use HTM PHP tags
Task-L Label minim e text that
blem activit the bare standard a very loos XHTML rging Tags found,
ous Inte committ to a pro USAGE SCENARIOS ate all
Autom configurat
ion cies to t
ely-defi as thei Ajax
tech L can is disp
Continu ry change ization, cannot be (and freq
nden ymen
tion layed
need ned lang r visual eng nologies
Build al depe
a solu ineffective
deplo t
Label manu d tool the same for stan but as if
(i.e., ) nmen overlap
eve , Build stalle
t, use target enviro Amazon EC2: Industry standard it has
software and uag
virtualization ine. HTM uen
with erns ns (i.e. problem ated ce pre-in ymen whether dards e with b></ , so <a>< tly are) nest
ory. via patt icular
Autom Redu d deploEAR) in each has bec become very little L
a> is
-patter

Subversion
reposit s that Pay only cieswhat you consume
tagge or Amazon’s cloud you cho platform
computing
the curr isome
heavily basedmoron fine. b></ ed insid
lained ) and anti the part solution duce nden For each (e.g. WAR es t
ent stan ose to writ more e impo a></ e
not lega each othe
exp x” al Depe ge nmen b> is
be text to “fi ns are Web application deployment until
nden a few years
t librari ago was similar
t enviro industry standard
that will softwaredard and virtualization app
e HTM technology. rtant,
to pro Minim packa
Mo re

CI can ticular con used i-patter they tend but can


rity all depe all targe
s will L or XHT arent. Reg the HTM l, but r. Tags
etimes s. Ant tices,
to most phone services:
y Integ alizeplans with le that
late fialloted resources, ts with an and XHT simplify all help ML, und ardless
L VS
XHTM <a><
in a par hes som , Binar Centr
temp you prov b></
proces in the end bad prac lementing
nt nmen your
incurred costgeme whether e a single on ML of L
Creatsuchare resources were consumed or
thenot.
t enviro Virtualization allows a physical
are actu pieceothe of hardware to be erst HTML
based much web cod ide a solid
approac ed with the cial, but,
cy Mana nt targe es to anding
essarily ed to imp nden rties
into differe chang of the ally simp r This has
efi Depe prope utilized by multiple operating
function systems. ing.resourcesfoundati
ler thanallows job adm been arou
not nec compar
itting
associat to be ben
er
Ge t

late Verifi te builds e comm commo Fortuna


y are n Cloud computing asRun it’sremo
known etoday has changed this.
befor etc. alitytohas they on irably, nd for
Temp Build (e.g. bandwidth, nmemory, CPU) be allocated exclusively totely exp som
appear effects. The results whe that
ually,
rm a
Privat y, contin Every elem mov used
to be, HTML
ected. job has e time
ed The various s resourcesPerfo consumed by webperio applications
dicall (e.g. nt team
pag
individual operating entsinstances. ed to CSS
system Brow Early . Whi
gration expand
opme
adverse unintend
d Build
common e (HTML because HTM
ed far le it has don
sitory . ser
Build r to devel
web dev manufacture L had very
Repo
ous Inte
Stage
e bandwidth, memory, CPU) areIntegtallied
ration on a per-unit CI serve basis or XHT
produc Continu Refcard
e Build rm an ack from extensio .) All are limited more than e its
om

ML shar
tern. term
Privat
(starting from zero) by Perfo all majorated cloud
feedb computing platforms. on As a user of Amazon’s
n HT EC2 essecloud
nti l computing es c platform, you are result elopers rs add
ed layo anyb
he pat tion f he le this h s d d utom they occur ld based i c d

DZone, Inc.
ISBN-13: 978-1-934238-75-2
140 Preston Executive Dr. ISBN-10: 1-934238-75-9
Suite 100
50795
Cary, NC 27513

DZone communities deliver over 6 million pages each month to 888.678.0399


919.678.0300
more than 3.3 million software developers, architects and decision
Refcardz Feedback Welcome
$7.95

makers. DZone offers something for everyone, including news,


refcardz@dzone.com
tutorials, cheatsheets, blogs, feature articles, source code and more. 9 781934 238752
Sponsorship Opportunities
“DZone is a developer’s dream,” says PC Magazine.
sales@dzone.com
Copyright © 2010 DZone, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, Version 1.0
photocopying, or otherwise, without prior written permission of the publisher.