
Intelligent People. Uncommon Ideas.

Building a Scalable Architecture for Web Apps – Part I
(Lessons Learned @ Directi)

By Bhavin Turakhia
CEO, Directi

(http://www.directi.com | http://wiki.directi.com | http://careers.directi.com)

Licensed under Creative Commons Attribution Sharealike Noncommercial

Agenda

• Why is Scalability Important
• Introduction to the Variables and Factors
• Building our own Scalable Architecture (in incremental steps)
 Vertical Scaling
 Vertical Partitioning
 Horizontal Scaling
 Horizontal Partitioning
 … etc
• Platform Selection Considerations
• Tips
Why is Scalability Important in a Web 2.0 world

• Viral marketing can result in instant successes
• RSS / Ajax / SOA
 pull-based / polling-type interactions
 XML protocols – meta-data often exceeds the actual data
 Number of requests grows exponentially with the user base
• RoR / Grails – dynamic language frameworks gaining popularity
• In the end you want to build a Web 2.0 app that can serve millions of users with ZERO downtime
The Variables

• Scalability – Number of users / sessions / transactions / operations the entire system can perform
• Performance – Optimal utilization of resources
• Responsiveness – Time taken per operation
• Availability – Probability of the application, or a portion of it, being available at any given point in time
• Downtime Impact – The impact of a downtime of a server/service/resource – number of users affected, type of impact, etc.
• Cost
• Maintenance Effort

We want: High scalability, availability, performance & responsiveness; Low downtime impact, cost & maintenance effort
The Factors

• Platform selection
• Hardware
• Application Design
• Database/Datastore Structure and Architecture
• Deployment Architecture
• Storage Architecture
• Abuse prevention
• Monitoring mechanisms
• … and more

Let's Start …

• We will now build an example architecture for an example app using the following iterative incremental steps –
 Inspect current Architecture
 Identify Scalability Bottlenecks
 Identify SPOFs and Availability Issues
 Identify Downtime Impact Risk Zones
 Apply one of –
• Vertical Scaling
• Vertical Partitioning
• Horizontal Scaling
• Horizontal Partitioning
 Repeat process
Step 1 – Let's Start …

[Diagram: a single node running both the AppServer and the DBServer]
Step 2 – Vertical Scaling

[Diagram: the single AppServer/DBServer node upgraded with additional CPU and RAM]
Step 2 – Vertical Scaling

• Introduction
 Increasing the hardware resources without changing the number of nodes
 Referred to as “Scaling up” the server
• Advantages
 Simple to implement
• Disadvantages
 Finite limit
 Hardware does not scale linearly (diminishing returns for each incremental unit)
 Requires downtime
 Increases Downtime Impact
 Incremental costs increase exponentially

[Diagram: the combined AppServer/DBServer node with more CPUs and more RAM]
Step 3 – Vertical Partitioning (Services)

• Introduction
 Deploying each service on a separate node
• Positives
 Increases per-application Availability
 Task-based specialization, optimization and tuning possible
 Reduces context switching
 Simple to implement for out-of-band processes
 No changes to App required
 Flexibility increases
• Negatives
 Sub-optimal resource utilization
 May not increase overall availability
 Finite Scalability

[Diagram: AppServer and DBServer now deployed on separate nodes]
Understanding Vertical Partitioning

• The term Vertical Partitioning denotes –
 Increase in the number of nodes by distributing the tasks/functions
 Each node (or cluster) performs separate Tasks
 Each node (or cluster) is different from the others
• Vertical Partitioning can be performed at various layers (App / Server / Data / Hardware etc.)
Step 4 – Horizontal Scaling (App Server)

• Introduction
 Increasing the number of nodes of the App Server through Load Balancing
 Referred to as “Scaling out” the App Server

[Diagram: Load Balancer in front of three AppServers, all sharing one DBServer]
Understanding Horizontal Scaling

• The term Horizontal Scaling denotes –
 Increase in the number of nodes by replicating the nodes
 Each node performs the same Tasks
 Each node is identical
 Typically the collection of nodes may be known as a cluster (though the term cluster is often misused)
 Also referred to as “Scaling Out”
• Horizontal Scaling can be performed for any particular type of node (AppServer / DBServer etc.)
Load Balancer – Hardware vs Software

• Hardware load balancers are faster
• Software load balancers are more customizable
• With HTTP servers, load balancing is typically combined with HTTP accelerators
Load Balancer – Session Management

• Sticky Sessions
 Requests for a given user are sent to a fixed App Server
 Observations
• Asymmetrical load distribution (especially during downtimes)
• Downtime Impact – loss of session data

[Diagram: Users 1 and 2 each pinned to a specific AppServer behind the Load Balancer]
Load Balancer – Session Management

• Central Session Store
 Introduces an SPOF
 An additional variable
 Session reads and writes generate Disk + Network I/O
 Also known as a Shared Session Store Cluster

[Diagram: Load Balancer → AppServers → shared Session Store]
Load Balancer – Session Management

• Clustered Session Management
 Easier to set up
 No SPOF
 Session reads are instantaneous
 Session writes generate Network I/O
 Network I/O increases exponentially with the number of nodes
 In very rare circumstances a request may get stale session data
• User request reaches the subsequent node faster than the intra-node message
• Intra-node communication fails
 AKA Shared-nothing Cluster

[Diagram: Load Balancer → AppServers replicating session state among themselves]
Load Balancer – Session Management

• Sticky Sessions with a Central Session Store
 Downtime does not cause loss of data
 Session reads need not generate network I/O
• Sticky Sessions with Clustered Session Management
 No specific advantages

[Diagram: Users pinned to AppServers behind the Load Balancer, with sessions also persisted centrally]
Load Balancer – Session Management

• Recommendation
 Use Clustered Session Management if you have –
• a smaller number of App Servers
• fewer session writes
 Use a Central Session Store elsewhere
 Use sticky sessions only if you have to
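The central-store trade-off above can be made concrete with a minimal in-process sketch (all class and method names here are illustrative, not from the talk): every read and write goes through one shared store, which is what makes it an SPOF but lets any App Server serve any user.

```python
import time

class CentralSessionStore:
    """Minimal in-memory stand-in for a shared session store.
    In production this would be a networked store, so every read
    and write shown here would cross the network boundary."""
    def __init__(self, ttl_seconds=1800):
        self.ttl = ttl_seconds
        self._data = {}  # session_id -> (expires_at, session_dict)

    def read(self, session_id):
        entry = self._data.get(session_id)
        if entry is None or entry[0] < time.time():
            return None  # missing or expired session
        return entry[1]

    def write(self, session_id, session):
        self._data[session_id] = (time.time() + self.ttl, session)

# Any App Server node can handle any request, because state lives centrally
store = CentralSessionStore()
store.write("abc123", {"user": "u1", "cart": ["item-1"]})
assert store.read("abc123")["user"] == "u1"
assert store.read("missing") is None
```

With sticky sessions added on top, the App Server could serve reads from a local copy and fall back to this store only after a failover.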
Load Balancer – Removing SPOF

• In a Load Balanced App Server Cluster the LB is an SPOF
• Set up the LB in Active-Active or Active-Passive mode
 Note: Active-Active nevertheless assumes that each LB is independently able to take up the load of the other
 If one wants ZERO downtime, then Active-Active becomes truly cost-beneficial only if multiple LBs (more than 3 to 4) are daisy-chained as Active-Active, forming an LB Cluster

[Diagrams: an Active-Passive LB pair and an Active-Active LB pair, each in front of the AppServer cluster]
Step 4 – Horizontal Scaling (App Server)

• Our deployment at the end of Step 4
• Positives
 Increases Availability and Scalability
 No changes to App required
 Easy setup
• Negatives
 Finite Scalability

[Diagram: Load Balanced App Servers sharing a single DBServer]
Step 5 – Vertical Partitioning (Hardware)

• Introduction
 Partitioning out the Storage function using a SAN
• SAN config options
 Refer to “Demystifying Storage” at http://wiki.directi.com -> Dev University -> Presentations
• Positives
 Allows “Scaling Up” the DB Server
 Boosts Performance of the DB Server
• Negatives
 Increases Cost

[Diagram: Load Balanced App Servers → DBServer → SAN]
Step 6 – Horizontal Scaling (DB)

• Introduction
 Increasing the number of DB nodes
 Referred to as “Scaling out” the DB Server
• Options
 Shared Nothing Cluster
 Real Application Cluster (or Shared Storage Cluster)

[Diagram: Load Balanced App Servers → three DBServers → SAN]
Shared Nothing Cluster

• Each DB Server node has its own complete copy of the database
• Nothing is shared between the DB Server nodes
• This is achieved through DB Replication at the DB / Driver / App level or through a proxy
• Supported by most RDBMSes natively or through 3rd-party software

[Diagram: three DBServers, each with its own Database copy]
Note: the actual DB files may be stored on a central SAN
Replication Considerations

• Master-Slave
 Writes are sent to a single master, which replicates the data to multiple slave nodes
 Replication may be cascaded
 Simple setup
 No conflict management required
• Multi-Master
 Writes can be sent to any of the multiple masters, which replicate them to other masters and slaves
 Conflict Management required
 Deadlocks possible if the same data is simultaneously modified in multiple places
Replication Considerations
• Asynchronous
 Guaranteed, but out-of-band replication from Master to Slave
 Master updates its own db and returns a response to the client
 Replication from Master to Slave takes place asynchronously
 Faster response to the client
 Slave data is marginally behind the Master
 Requires modification to the App to send critical reads and writes to the master, and load balance all other reads
• Synchronous
 Guaranteed, in-band replication from Master to Slave
 Master updates its own db, and confirms all slaves have updated their db before returning a response to the client
 Slower response to the client
 Slaves have the same data as the Master at all times
 Requires modification to the App to send writes to the master and load balance all reads
Replication Considerations

• Replication at RDBMS level
 Support may exist in the RDBMS or through a 3rd-party tool
 Faster and more reliable
 App must send writes to the Master, reads to any db and critical reads to the Master
• Replication at Driver / DAO level
 Driver / DAO layer ensures
• writes are performed on all connected DBs
• reads are load balanced
• critical reads are sent to a Master
 In most cases RDBMS-agnostic
 Slower and in some cases less reliable
Real Application Cluster

• All DB Servers in the cluster share a common storage area on a SAN
• All DB servers mount the same block device
• The filesystem must be a clustered file system (eg GFS / OFS)
• Currently only supported by Oracle Real Application Cluster
• Can be very expensive (licensing fees)

[Diagram: three DBServers mounting a single shared Database on a SAN]
Recommendation

• Try and choose a DB which natively supports Master-Slave replication
• Use Master-Slave Async replication
• Write your DAO layer to ensure
 writes are sent to a single DB
 reads are load balanced
 critical reads are sent to the master

[Diagram: App Servers send Writes & Critical Reads to the master DBServer, and all other Reads to the slave DBServers]
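The DAO routing rule above can be sketched in a few lines — writes and critical reads go to the master, other reads are round-robined across slaves. Class and method names are illustrative; a real DAO would hold actual DB connections instead of strings.

```python
import itertools

class RoutingDAO:
    """Routes queries per the master-slave recommendation:
    writes and critical reads -> master; other reads -> slaves."""
    def __init__(self, master, slaves):
        self.master = master
        self._slave_cycle = itertools.cycle(slaves)  # simple round-robin

    def execute_write(self, sql):
        # All writes go to the single master, which replicates to slaves
        return (self.master, sql)

    def execute_read(self, sql, critical=False):
        # Critical reads must see the latest data, so they hit the master
        # (async replication means slaves may be marginally behind)
        if critical:
            return (self.master, sql)
        return (next(self._slave_cycle), sql)

dao = RoutingDAO("master", ["slave-1", "slave-2"])
assert dao.execute_write("INSERT ...")[0] == "master"
assert dao.execute_read("SELECT ...", critical=True)[0] == "master"
assert dao.execute_read("SELECT ...")[0] == "slave-1"
assert dao.execute_read("SELECT ...")[0] == "slave-2"
```

Keeping this logic inside the DAO layer is what lets the rest of the App stay unaware of how many DB nodes exist.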
Step 6 – Horizontal Scaling (DB)

• Our architecture now looks like this
• Positives
 As Web servers grow, Database nodes can be added
 The DB Server is no longer an SPOF
• Negatives
 Finite limit

[Diagram: Load Balanced App Servers → DB Cluster of three DBs → SAN]
Step 6 – Horizontal Scaling (DB)

• Shared Nothing clusters have a finite scaling limit
 Assume a Reads-to-Writes ratio of 2:1
 So 8 reads => 4 writes
 2 DBs
• Per db – 4 reads and 4 writes
 4 DBs
• Per db – 2 reads and 4 writes
 8 DBs
• Per db – 1 read and 4 writes
• At some point adding another node brings negligible incremental benefit

[Diagram: Reads are spread across the cluster, while every Write hits each DB]
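The diminishing returns above can be checked with a few lines of arithmetic — reads are divided across the nodes, but every write must be replicated to every node, so write load per node never drops:

```python
def per_node_load(total_reads, total_writes, nodes):
    """In a shared-nothing replicated cluster, reads are load balanced
    across nodes but each write must be applied on every node."""
    reads_per_node = total_reads / nodes
    writes_per_node = total_writes  # replication: every node sees every write
    return reads_per_node, writes_per_node

# The 2:1 read/write ratio from the slide: 8 reads, 4 writes
assert per_node_load(8, 4, 2) == (4.0, 4)
assert per_node_load(8, 4, 4) == (2.0, 4)
assert per_node_load(8, 4, 8) == (1.0, 4)  # reads shrink, writes don't
```

Once writes dominate each node's capacity, adding nodes stops helping — which is what motivates partitioning the data in Step 7.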
Step 7 – Vertical / Horizontal Partitioning (DB)

• Introduction
 Increasing the number of DB Clusters by dividing the data
• Options
 Vertical Partitioning – dividing tables / columns
 Horizontal Partitioning – dividing by rows (value)

[Diagram: Load Balanced App Servers → DB Cluster → SAN]
Vertical Partitioning (DB)

• Take a set of tables and move them onto another DB
 Eg in a social network – the users table and the friends table can be on separate DB clusters
• Each DB Cluster has different tables
• Application code, DAO / Driver code or a proxy knows where a given table is and directs queries to the appropriate DB
• Can also be done at a column level by moving a set of columns into a separate table

[Diagram: App Cluster → DB Cluster 1 (Tables 1, 2) and DB Cluster 2 (Tables 3, 4)]
Vertical Partitioning (DB)

• Negatives
 One cannot perform SQL joins or maintain referential integrity across clusters (referential integrity is, as such, over-rated)
 Finite Limit

[Diagram: App Cluster → DB Cluster 1 (Tables 1, 2) and DB Cluster 2 (Tables 3, 4)]
Horizontal Partitioning (DB)

• Take a set of rows and move them onto another DB
 Eg in a social network – each DB Cluster can contain all data for 1 million users
• Each DB Cluster has identical tables
• Application code, DAO / Driver code or a proxy knows where a given row is and directs queries to the appropriate DB
• Negatives
 SQL unions for search-type queries must be performed within code

[Diagram: App Cluster → DB Cluster 1 and DB Cluster 2, identical tables, 1 million users each]
Horizontal Partitioning (DB)

• Techniques
 FCFS
• 1st million users are stored on cluster 1 and the next on cluster 2
 Round Robin
 Least Used (Balanced)
• Each time a new user is added, the DB cluster with the least users is chosen
 Hash based
• A hashing function is used to determine the DB Cluster in which the user data should be inserted
 Value based
• User ids 1 to 1 million stored on cluster 1, OR
• all users with names starting with A-M on cluster 1
 Except for Hash and Value based, all other techniques also require an independent lookup map – mapping user to Database Cluster
 This map itself will be stored on a separate DB (which may further need to be replicated)
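A hash-based scheme needs no lookup map because the target cluster can be recomputed from the key itself. A minimal sketch (cluster names and the modulo scheme are illustrative; note that naive modulo hashing makes it painful to add clusters later, which is why consistent hashing is often preferred):

```python
import hashlib

CLUSTERS = ["db-cluster-1", "db-cluster-2", "db-cluster-3"]

def cluster_for_user(user_id):
    """Deterministically map a user id to a DB cluster.
    md5 keeps the mapping stable across processes and restarts
    (unlike Python's built-in hash(), which is salted per process)."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return CLUSTERS[int(digest, 16) % len(CLUSTERS)]

# The same user always routes to the same cluster -- no lookup map needed
assert cluster_for_user(42) == cluster_for_user(42)
assert all(cluster_for_user(uid) in CLUSTERS for uid in range(100))
```

FCFS, Round Robin and Least Used would instead record each assignment in the lookup map the slide describes, since the mapping cannot be recomputed.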
Step 7 – Vertical / Horizontal Partitioning (DB)

• Our architecture now looks like this
• Positives
 As App servers grow, Database Clusters can be added
• Note: this is not the same as the table partitioning provided by the db (eg MSSQL)
• We may actually want to further segregate these into Sets, each serving a collection of users (refer next slide)

[Diagram: Load Balanced App Servers + Lookup Map → two DB Clusters → SAN]
Step 8 – Separating Sets

• Now we consider each deployment as a single Set serving a collection of users

[Diagram: a Global Redirector with a Global Lookup Map routing users to SET 1 and SET 2 (10 million users each); each Set contains Load Balanced App Servers, its own Lookup Map, two DB Clusters and a SAN]
Creating Sets

• The goal behind creating Sets is easier manageability
• Each Set is independent and handles transactions for a set of users
• Each Set is architecturally identical to the others
• Each Set contains the entire application with all its data structures
• Sets can even be deployed in separate datacenters
• Users may even be added to the Set that is closest to them in terms of network latency
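The Global Redirector can be as simple as a lookup from user to Set, consulted before a request enters any Set. A minimal sketch (all names are illustrative; a real redirector would issue an HTTP redirect or rewrite to the Set's hostname, and the map itself would live in a replicated DB):

```python
class GlobalRedirector:
    """Maps each user to the Set that owns all of their data."""
    def __init__(self, sets):
        self.sets = sets          # Set name -> base URL of that deployment
        self.user_to_set = {}     # the Global Lookup Map

    def assign(self, user_id, set_name):
        # Called once when the user is created, e.g. picking the
        # Set closest to them in network latency
        self.user_to_set[user_id] = set_name

    def route(self, user_id):
        set_name = self.user_to_set[user_id]
        return self.sets[set_name]  # all further requests go to this Set

redirector = GlobalRedirector({"set-1": "https://set1.example.com",
                               "set-2": "https://set2.example.com"})
redirector.assign("user-7", "set-2")
assert redirector.route("user-7") == "https://set2.example.com"
```

Because every Set is architecturally identical, nothing below the redirector needs to know that other Sets exist.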
Step 8 – Horizontal Partitioning (Sets)

• Our architecture now looks like this
• Positives
 Infinite Scalability
• Negatives
 Aggregation of data across Sets is complex
 Users may need to be moved across Sets if sizing is improper
 Global App settings and preferences need to be replicated across Sets

[Diagram: Global Redirector → SET 1 and SET 2, each containing an App Server Cluster, DB Clusters and a SAN]
Step 9 – Caching

• Add caches within the App Server
 Object Cache
 Session Cache (especially if you are using a Central Session Store)
 API Cache
 Page Cache
• Software
 memcached
 Terracotta (Java only)
 Coherence (a commercial, expensive data grid by Oracle)
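An object cache typically sits in front of the DAO as a read-through layer — check the cache, fall back to the DB only on a miss, and populate the cache on the way out. A minimal in-process sketch (the loader function and key scheme are illustrative; memcached would replace the dict in a real deployment):

```python
class ObjectCache:
    """Read-through object cache: serve from cache, load on miss."""
    def __init__(self, loader):
        self.loader = loader      # e.g. a DAO call hitting the database
        self._store = {}
        self.hits = self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self.loader(key)  # the expensive DB round-trip happens only here
        self._store[key] = value
        return value

db_calls = []
cache = ObjectCache(loader=lambda key: db_calls.append(key) or {"id": key})
cache.get("user:1")
cache.get("user:1")               # the second read never touches the DB
assert db_calls == ["user:1"]
assert (cache.hits, cache.misses) == (1, 1)
```

The same shape works for the session, API and page caches listed above — only the key scheme and the loader change.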
Step 10 – HTTP Accelerator

• If your app is a web app you should add an HTTP Accelerator or a Reverse Proxy
• A good HTTP Accelerator / Reverse Proxy performs the following –
 Redirects static content requests to a lighter HTTP server (lighttpd)
 Caches content based on rules (with granular invalidation support)
 Uses Async NIO on the user side
 Maintains a limited pool of keep-alive connections to the App Server
 Intelligent load balancing
• Solutions
 Nginx (HTTP / IMAP)
 Perlbal
 Hardware accelerators plus Load Balancers
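These responsibilities map directly onto an nginx configuration. A hedged sketch — hostnames, ports, paths and cache sizes are all placeholders, not from the talk:

```nginx
# Illustrative fragment only -- hostnames, paths and sizes are placeholders.
proxy_cache_path /var/cache/nginx keys_zone=appcache:10m inactive=10m;

upstream app_servers {
    server app1.internal:8080;
    server app2.internal:8080;
    keepalive 32;                 # limited pool of keep-alive connections

}

server {
    listen 80;

    location /static/ {
        proxy_pass http://static.internal:8081;   # lighter HTTP server
    }

    location / {
        proxy_pass http://app_servers;            # load balanced App Servers
        proxy_http_version 1.1;
        proxy_set_header Connection "";           # needed for upstream keepalive
        proxy_cache appcache;                     # rule-based content caching
        proxy_cache_valid 200 60s;
    }
}
```

Async event-driven I/O on the user side comes for free with nginx's architecture; granular invalidation needs additional modules or cache-key design.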
Step 11 – Other Cool Stuff

• CDNs
• IP Anycasting
• Async Non-blocking IO (for all Network Servers)
• If possible – Async Non-blocking IO for disk
• Incorporate a multi-layer caching strategy where required
 L1 cache – in-process with the App Server
 L2 cache – across the network boundary
 L3 cache – on disk
• Grid computing
 Java – GridGain
 Erlang – natively built in
Creative Commons Sharealike Attributions Noncommercial
Platform Selection Considerations

• Programming Languages and Frameworks
 Dynamic languages are slower than static languages
 Compiled code runs faster than interpreted code – use accelerators or pre-compilers
 Frameworks that provide Dependency Injection, Reflection or Annotations have a marginal performance impact
 ORMs hide DB querying, which can in some cases result in poor query performance due to non-optimized querying
• RDBMS
 MySQL, MSSQL and Oracle support native replication
 Postgres supports replication through 3rd-party software (Slony)
 Oracle supports Real Application Clustering
 MySQL uses locking and arbitration, while Postgres/Oracle use MVCC (MSSQL just recently introduced MVCC)
• Cache
 Terracotta vs memcached vs Coherence
Tips

• All the techniques we learnt today can be applied in any order
• Try and incorporate Horizontal DB partitioning by value into your design from the beginning
• Loosely couple all modules
• Implement a RESTful framework for easier caching
• Perform application sizing on an ongoing basis to ensure optimal utilization of hardware
Intelligent People. Uncommon Ideas.

Questions??

bhavin.t@directi.com

http://directi.com
http://careers.directi.com

Download slides: http://wiki.directi.com
