1) Try to break the problem into simpler modules (top-down approach).
2) Talk about the trade-offs (no solution is perfect).
   ↳ Calculate the impact on the system based on the problem.
# Focus on the interviewer's intentions.
# Ask abstract questions (constraints & functionality requirements).
# Focus on finding bottlenecks.
System Design Basics (contd.)
- Consistent Hashing
- CAP Theorem ✓
- Load Balancing ✓
- Queues
- Caching
- Replication
- SQL vs. NoSQL
- Indexes ✓
- Proxies
- Data Partitioning ✓
Load Balancing
(distributed systems)
# Types of distribution:
1) Random
2) Round-robin
3) Random with weights (weighted for memory, CPU cycles, etc.)
# To utilize full scalability & redundancy, add LBs at 3 places:
1) Between the client and the web server
2) Between the web server and the internal platform layer
3) Between the internal platform layer and the DB
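The distribution strategies above can be sketched as follows; the server addresses and weight values are made-up examples for illustration, not part of the notes:

```python
import itertools
import random

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend pool

# Round-robin: cycle through the pool in a fixed order.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# Random: pick any server uniformly.
def pick_random():
    return random.choice(servers)

# Weighted random: bias the pick by capacity (e.g. memory, CPU cycles).
weights = {"10.0.0.1": 4, "10.0.0.2": 2, "10.0.0.3": 1}
def weighted_random():
    return random.choices(servers, weights=[weights[s] for s in servers])[0]
```

Round-robin gives an even spread when all backends are identical; the weighted variant is useful when they are not.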
Smart clients
# Take a pool of service hosts & balance load across them.
# Attractive solution for developers (small-scale systems).
# As the system grows → LBs as standalone servers (e.g. Citrix NetScaler).
# HAProxy: OSS load balancer, a hybrid approach:
1) Running on the client machine (via a locally bound port, e.g. localhost:9000)
   ↳ the port is managed by HAProxy (with efficient management of requests on the port).
2) Running on an intermediate server: proxies running between the clients and the backends
   ↳ manages health checks, removal & addition of machines.
SQL vs. NoSQL
1) Structured vs. unstructured data.
# SQL: relational stores — data in rows & columns.
   e.g. MySQL, Oracle, Postgres, MariaDB
# NoSQL: key-value stores, document DBs, and other non-relational stores of separate data points.
[Handwritten SQL vs. NoSQL comparison table — illegible in this scan.]
Reasons to use SQL
1) You need to ensure ACID compliance:
   ↳ ACID compliance reduces anomalies
   ↳ protects the integrity of the database.
   → For such systems, an ACID-compliant DB is the first choice.
CAP Theorem
# Consistency: all nodes see the same data at the same time
   ↳ writes must propagate before allowing reads.
# Availability: every request receives a response
   ↳ e.g. systems favouring availability: [Cassandra, CouchDB].
# Partition tolerance: the system keeps working despite network partitions.
We cannot build a data store which is simultaneously:
1) continually available,
2) sequentially consistent, and
3) tolerant to network partitions.
Because, during a partition, writes may not reach the other partitions:
↳ a client may read data from an out-of-date partition after having read from an up-to-date partition (consistency lost), or
↳ the out-of-date partition stops serving reads → the service is no longer 100% available.
Redundancy & Replication
# Duplication of critical data & services
   ↳ increases the reliability of the system (servers / databases).
# Data: primary server → (replication) → secondary server
   (active data on the primary, mirrored data on the secondary).
# Service redundancy: shared-nothing architecture.
Caching
# Requests first hit the cache; on a miss, the data is fetched from the origin and the response is cached.
# Problem: the same request landing on different nodes → repeated cache misses.
   ↳ Solved by: 1) Global caches  2) Distributed caches
Distributed cache
# The cache space is divided among the nodes using a consistent hashing function.
# Advantage: easy to increase cache space by adding more nodes.
# Disadvantage: a missing node causes misses for its share of keys.
   ↳ Even if a node disappears, the request can still pull the data from the origin.
Global cache
# A single cache space for all the nodes.
   ↳ Adding a cache server / file store (faster than the original store).
# Difficult to manage as the no. of clients / requests increases.
# Effective if:
1) there is a fixed dataset that needs to be cached, or
2) special H/W with fast I/O is available.
# Architecture: app servers → global cache → database.
CDN
# Used for serving large amounts of static media.
# Request → CDN: if the file is not available locally, the CDN requests it from the back-end server, stores it locally, and serves it.
# If available: the CDN serves the static media from local storage.
# For smaller systems: serve static media from a separate subdomain of yourservice.com using a lightweight Nginx server, and transition the DNS to a CDN later.
Cache Invalidation
# Cached data needs to be coherent with the database.
# 3 schemes:
1) Write-through cache: data is written to both the cache & the DB at the same time.
   + Complete data consistency (cache = DB)
   + Fault tolerance in case of failure (no data loss)
   - Higher write latency (every write hits two systems)
2) Write-around cache: data is written directly to the DB, bypassing the cache.
   + No cache flooding for writes
   - A read request for newly written data → cache miss → higher latency
3) Write-back cache: data is written to the cache only; after some interval or under some specified conditions, the data is written to the DB from the cache.
   + Low latency & high throughput for write-intensive apps
   - Risk of data loss on failure, since the only copy is in the cache
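A minimal sketch of the write-through scheme above, using a plain dict to stand in for the database; the class and method names are illustrative, not from the notes:

```python
class WriteThroughCache:
    """Writes go to the cache and the backing store in the same operation,
    so the two can never disagree (at the cost of write latency)."""

    def __init__(self, store):
        self.store = store   # dict standing in for the database
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value   # update the cache ...
        self.store[key] = value   # ... and the DB before acknowledging

    def read(self, key):
        if key in self.cache:     # cache hit
            return self.cache[key]
        value = self.store[key]   # miss: fall back to the DB
        self.cache[key] = value
        return value
```

Write-around would skip the first line of `write`; write-back would skip the second and flush later.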
Cache eviction policies:
1) FIFO
2) LIFO / FILO
3) LRU
4) MRU
5) LFU
6) Random Replacement
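As one example, the LRU policy from the list above can be sketched with an ordered dict; the class name and capacity are illustrative assumptions:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least-recently-used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used
```

MRU would evict with `popitem(last=True)` instead; LFU needs an extra frequency counter per key.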
Sharding / Data Partitioning
# Data partitioning: splitting up a DB / table across multiple machines.
# After a certain scale point, it is cheaper and more feasible to scale horizontally by adding more machines than to scale vertically by adding beefier servers.
# Methods of partitioning:
1) Horizontal partitioning: different rows into different tables.
   e.g. Table 1: ZIPs up to 100000; Table 2: ZIPs > 100000; and so on.
2) Vertical partitioning: divide the data by feature.
   e.g. Instagram — DB server 1: user info; DB server 2: followers; DB server 3: photos.
   + Straightforward to implement
   + Low impact on the app
   - With additional growth, the app may need to partition a feature-specific DB across various servers.
3) Directory-based partitioning: a lookup service that knows the partitioning scheme & abstracts it away from the DB access code
   (mapping: tuple key → DB server).
   + Easy to add DB servers or change the partitioning scheme.
# Partitioning criteria:
1) Key or hash-based partitioning: a hash function applied to a key attribute of the entity → the partition number.
   - Effectively fixes the total number of servers / partitions:
     ↳ adding a new server / partition changes the hash mapping → redistribution of data & downtime.
2) List partitioning: each partition is assigned a list of key values.
3) Round-robin partitioning: with n partitions, the i-th tuple is assigned to partition (i mod n).
4) Composite partitioning: a combination of the above schemes.
   e.g. hashing + list = consistent hashing (the hash reduces the key space to a manageable size).
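The key-based and round-robin criteria above can be sketched as follows, assuming md5 as the hash function (any stable hash would do; the key strings are made up):

```python
import hashlib

def partition(key, n):
    """Key-based partitioning: hash the key attribute, then take it
    modulo the (fixed) number of servers/partitions."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n

def round_robin_partition(i, n):
    """Round-robin partitioning: tuple i goes to partition (i mod n)."""
    return i % n
```

Note the drawback from the notes: changing `n` changes `partition(key, n)` for most keys, which is exactly what consistent hashing (below) avoids.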
# Common problems of sharding:
# A sharded DB introduces extra constraints on the different operations
   ↳ operations across multiple tables / rows are no longer running on a single server.
1) Joins & denormalization: joins on a single server are straightforward; across multiple servers they are not.
   ↳ Workaround: denormalize the DB so that queries that previously required joins can be performed from a single table.
2) Referential integrity: hard to enforce on a sharded DB.
   ↳ Often has to be enforced in application code (with SQL jobs to clean up dangling references).
3) Rebalancing: needed when
   a) the data distribution is non-uniform, or
   b) a shard gets too much load.
   ↳ Rebalancing means a change in the partitioning scheme → data movement → downtime.
   ↳ We can use directory-based partitioning instead:
      ↳ highly complex
      ↳ the lookup service / table is a single point of failure.
Indexes
# Well known because of databases.
# Indexes slow down writes:
   ↳ write the data
   ↳ update the index.
# Can be created using one or more columns.
# Enable rapid random lookups & efficient access of ordered records.
# Data structure mapping an indexed column → pointer to the whole row
   → creates different views of the same data.
# Used for large datasets (TBs in size) with small payloads (KBs), spread over several physical devices.
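The lookup/write trade-off above can be sketched with SQLite; the table, column, and index names are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER, path TEXT)"
)
conn.executemany(
    "INSERT INTO photos (user_id, path) VALUES (?, ?)",
    [(i % 10, f"/img/{i}.jpg") for i in range(1000)],
)

# Index on user_id: speeds up lookups on that column, but every
# INSERT from now on must also update the index (the write penalty).
conn.execute("CREATE INDEX idx_photos_user ON photos (user_id)")

# The query planner now searches via the index instead of scanning the table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM photos WHERE user_id = 3"
).fetchone()
```

`EXPLAIN QUERY PLAN` is a quick way to confirm an index is actually used.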
Proxies
# Sit between the client and the back-end server:
   client → proxy → back-end server
# Uses:
   - filter requests
   - log requests
   - transform requests (add / remove headers, encryption / decryption, compression)
   - request coordination:
      ↳ collapsing identical requests (collapsed forwarding)
      ↳ collapsing requests for data that is spatially close together
      ↳ helps minimize reads from the origin.
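Collapsed forwarding can be sketched in a simplified, single-threaded form: duplicate requests in a batch share one origin read. The names (`collapsed_forward`, `fetch_origin`) are illustrative assumptions, not a real proxy API:

```python
from collections import defaultdict

def collapsed_forward(requests, fetch_origin):
    """Group duplicate requests by key, read each distinct key from the
    origin only once, then fan the result back out to every requester."""
    waiting = defaultdict(list)
    for req_id, key in requests:          # coalesce duplicates per key
        waiting[key].append(req_id)

    responses = {}
    for key, req_ids in waiting.items():
        value = fetch_origin(key)         # one origin read per distinct key
        for req_id in req_ids:
            responses[req_id] = value
    return responses
```

A real proxy does this concurrently (parking later requests while the first is in flight), but the dedup-then-fan-out idea is the same.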
Queues
# In small systems → writes are fast.
# In complex systems → high incoming load, and individual writes take more time.
# To achieve high performance & availability, the system needs to be asynchronous
   ↳ queue.
# Synchronous behaviour degrades performance
   ↳ and makes fair & balanced load distribution difficult.
# Clients C1…Cn put tasks T1, T2, … onto the queue; servers consume them.
# Queues enable an asynchronous communication protocol.
# Limit on the size of a request & the number of requests in the queue.
# Queues provide fault tolerance:
   ↳ protection from service outages / failures
   ↳ highly robust: failed service requests can be retried.
# Queues enforce a Quality of Service guarantee (do NOT expose clients to outages).
# Queues enable distributed communication.
   ↳ Open-source implementations: BeanstalkD, RabbitMQ, ZeroMQ, ActiveMQ.
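The asynchronous producer/consumer idea above can be sketched with the standard library; the doubling "work" and the sentinel shutdown are illustrative choices:

```python
import queue
import threading

task_queue = queue.Queue(maxsize=100)  # bound on the number of queued requests
results = []

def worker():
    while True:
        item = task_queue.get()
        if item is None:            # sentinel: shut the worker down
            break
        results.append(item * 2)    # stand-in for the slow write
        task_queue.task_done()

t = threading.Thread(target=worker)
t.start()
for i in range(5):                  # the producer enqueues and moves on
    task_queue.put(i)
task_queue.put(None)
t.join()
```

The producer never waits for a write to finish, which is exactly the decoupling the notes call "asynchronous behaviour".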
Consistent Hashing
# Distributed Hash Table: index = hash_function(key)
# Suppose we're designing a distributed caching system with n cache servers:
   ↳ server = hash_function(key) % n
# Drawbacks:
1) NOT horizontally scalable: adding a server changes all mappings (downtime of the system).
2) NOT load balanced (because of the non-uniform distribution of data):
   ↳ some caches hot & saturated, other caches idle & empty.
# Consistent hashing minimizes reorganization when scaling up / down:
   → only k/n keys need to be remapped (k keys, n servers).
How it works?
# In consistent hashing, imagine all of the integers 0–255 placed on a ring (255 wraps to 0).
1) Map each server to a point on the ring.
2) Map a key to a server:
   a) hash the key to a single integer
   b) move clockwise until you find a server
   c) map the key to that server.
# e.g. on the ring: h(key-1) → server A, h(key-2) → server B.
# Removing server A will result in key-1 being remapped to the next server clockwise (D); only A's keys move.
# Real-world scenario: data is distributed randomly
   ↳ unbalanced caches.
# Virtual replicas: instead of mapping each server to a single point, map it to multiple points on the ring.
   ↳ more replicas
   ↳ more equal distribution
   ↳ good load balancing.
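The ring with virtual replicas can be sketched as follows; the replica count, md5, and the class name are assumptions for illustration:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hash ring with virtual replicas: each server is placed at many
    points on the ring so keys spread more evenly."""

    def __init__(self, servers, replicas=100):
        self.replicas = replicas
        self.ring = []                     # sorted list of (point, server)
        for server in servers:
            self.add(server)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, server):
        for i in range(self.replicas):     # one point per virtual replica
            bisect.insort(self.ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server):
        self.ring = [(p, s) for p, s in self.ring if s != server]

    def lookup(self, key):
        point = self._hash(key)
        i = bisect.bisect(self.ring, (point, ""))  # first point clockwise
        return self.ring[i % len(self.ring)][1]    # wrap around the ring
```

Removing a server only remaps the keys that pointed at it; everything else keeps its assignment, which is the k/n property from the notes.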
Long Polling vs. WebSockets vs. Server-Sent Events
↳ Client–server communication protocols.
# Plain HTTP:
   client → request → server; the server prepares the response → response → client.
# AJAX polling: the client repeats the request on a schedule.
   ↳ lots of 'empty' responses
   ↳ HTTP overhead per poll.
# HTTP Long Polling:
   ↳ The server holds the request open until new data is available, then sends a full response.
   ↳ The client re-polls:
      a) immediately after receiving a response, or
      b) after a pause, to allow an acceptable latency period.
   ↳ Each request has a timeout.
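The server side of long polling can be sketched with a blocking queue standing in for the pending-events store; the timeout values and event names are illustrative:

```python
import queue
import threading
import time

events = queue.Queue()

def long_poll(timeout=2.0):
    """Hold the 'request' open until data arrives or the timeout
    expires, then answer (possibly with 'no data yet')."""
    try:
        return events.get(timeout=timeout)  # block while nothing to send
    except queue.Empty:
        return None                         # timed out: client re-polls

# A producer publishes an event shortly after the poll starts.
threading.Thread(
    target=lambda: (time.sleep(0.1), events.put("update-1"))
).start()

first = long_poll()       # held open, answered as soon as the event arrives
second = long_poll(0.1)   # nothing pending: times out with no data
```

The timeout is what keeps a long poll from hanging forever, matching the "each request has a timeout" point above.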
WebSockets
→ Full-duplex communication channel over a single TCP connection.
→ Provides persistent communication:
   ↳ the client sends a WebSocket handshake request; on success, the server sends a handshake response; the connection then stays open as a two-way communication channel.
→ Lower overheads.
→ Real-time data transfer.
Server-Sent Events
# The client makes one data request; the server keeps an always-open, unidirectional communication channel to the client.
   ↳ The server pushes responses whenever new data is available.
→ Best when we need real-time data from server to client.
↳ Sending data from client to server requires another technology / protocol.