BATCH PROCESSING

AWS Reference Architectures

Batch processing on AWS allows for the on-demand provisioning of a multi-part job processing architecture that can be used for instantaneous or delayed deployment of a heterogeneous, scalable “grid” of worker nodes that can quickly crunch through large batch processing tasks in parallel. There are numerous batch oriented applications in place today that can leverage this style of on-demand processing, including claims processing, large scale transformation, media transcoding and multi-part data processing work.

Batch processing architectures are often synonymous with highly variable usage patterns that have significant usage peaks (e.g., month-end processing) followed by significant periods of underutilization. There are numerous approaches to building a batch processing architecture. This document outlines a basic batch processing architecture that supports job scheduling, job status inspection, uploading raw data, outputting job results, grid management, and reporting job performance data.

[Figure: batch processing architecture diagram. Recoverable labels: End User, Elastic IP, Job Manager (Amazon EC2), Job Data Store (Amazon S3), Input Queue (Amazon SQS), Worker Nodes (Amazon EC2, Auto Scaling), interim store (Amazon S3), Job Analytics Store (Amazon SimpleDB, or Amazon RDS with Master DB and Slave DB), Output Queue (Amazon SQS, optional), Chaining Option. Numbered callouts 1 to 7 mark these components.]

System Overview

1. Users interact with the Job Manager application, which is deployed on an Amazon Elastic Compute Cloud (EC2) instance. This component controls the process of accepting, scheduling, starting, managing, and completing batch jobs. It also provides access to the final results, job and worker statistics, and job progress information.

2. Raw job data is uploaded to Amazon Simple Storage Service (S3), a highly-available and persistent data store.

3. Individual job tasks are inserted by the Job Manager in an Amazon Simple Queue Service (SQS) input queue on the user’s behalf.

4. Worker nodes are Amazon EC2 instances deployed in an Auto Scaling group. This group is a container that ensures health and scalability of worker nodes. Worker nodes pick up job parts from the input queue automatically and perform single tasks that are part of the list of batch processing steps.

5. Interim results from worker nodes are stored in Amazon S3.

6. Progress information and statistics are stored on the analytics store. This component can be either an Amazon SimpleDB domain or a relational database such as an Amazon Relational Database Service (RDS) instance.

7. Optionally, completed tasks can be inserted in an Amazon SQS queue for chaining to a second processing stage.
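
The flow through the numbered components can be illustrated with a minimal local sketch. It is not an AWS implementation: it assumes in-memory stand-ins from Python's standard library in place of the managed services (a `queue.Queue` for each Amazon SQS queue, plain dictionaries for Amazon S3 and the analytics store, and threads for the worker nodes). The names `submit_job`, `process`, `worker`, and `run_grid` are hypothetical and chosen only for illustration.

```python
import queue
import threading
from collections import defaultdict

# Local stand-ins (assumptions, not AWS APIs). In the architecture these
# would be Amazon SQS queues, Amazon S3, and SimpleDB or RDS respectively.
input_queue = queue.Queue()          # step 3: input queue
output_queue = queue.Queue()         # step 7: optional chaining queue
job_data_store = {}                  # step 2: raw job data
interim_store = {}                   # step 5: interim results
analytics_store = defaultdict(int)   # step 6: completed-task counters per job
stats_lock = threading.Lock()


def submit_job(job_id, parts):
    """Job Manager: store raw data, then enqueue one task per job part."""
    for index, payload in enumerate(parts):
        key = f"{job_id}/part-{index}"
        job_data_store[key] = payload
        input_queue.put(key)


def process(payload):
    """Hypothetical task body; here it only upper-cases the text."""
    return payload.upper()


def worker():
    """Worker node: take tasks from the input queue until it is empty."""
    while True:
        try:
            key = input_queue.get_nowait()
        except queue.Empty:
            return
        interim_store[key] = process(job_data_store[key])
        with stats_lock:
            analytics_store[key.split("/")[0]] += 1
        output_queue.put(key)  # hand off to an optional second stage
        input_queue.task_done()


def run_grid(worker_count):
    """Start a fixed-size grid of workers and wait for all of them."""
    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == "__main__":
    submit_job("job-1", ["alpha", "beta", "gamma"])
    run_grid(worker_count=2)
    print(dict(analytics_store))
    print(sorted(interim_store.items()))
```

The fixed `worker_count` is a simplification. In the architecture described above, the Auto Scaling group would adjust the number of worker instances, and a real worker would keep polling the queue rather than exit when it is empty.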
