Load Balancing for Hypertable:
A Rich Challenge for AI

Cloud Computing and Data Centers
AAAI 2011 Workshop

Gordon Rios
Zvents, Inc.
Cork Constraint Computation Centre

Doug Judd
Hypertable, Inc.


Introduction
 •  A distributed, column-oriented database
 •  Designed for large-scale cluster operations
 •  Writing to a cluster of servers
 •  Redundant partitions of table space
 •  Open source software; primary goals are performance and extensibility
 •  Extensibility: flexible load balancing interface

Load Balancing
 •  Objective is to balance resource utilization across the cluster
 •  Utilization is CPU load average (loadavg) measured over 1, 5, & 15 minute intervals
 •  Simple balancing policy: regular intervals, range splits, or adding new servers
 •  Optimize balancing policy? Adaptive?


Basic Balancer: Overview
 •  Naive allocation is round robin
 •  Baseline load balancing via the Basic Balancer
 •  Objective is to reduce the sum of deviations
 •  Dynamic balancing simply moves ranges off servers with high loadavg, based on estimated effect
 •  Operational heuristics such as skipping recently moved ranges, etc.
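
The naive round-robin allocation mentioned above can be illustrated with a minimal sketch. The class name `RoundRobinAllocator` and the server names are hypothetical and are not part of Hypertable; the sketch only shows the cycling behaviour, assuming a fixed, non-empty server list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not Hypertable code): hands out servers in a fixed cycle.
class RoundRobinAllocator {
 public:
  explicit RoundRobinAllocator(std::vector<std::string> servers)
      : servers_(std::move(servers)) {
    assert(!servers_.empty());  // round robin needs at least one server
  }

  // Returns the next server in the cycle, wrapping around at the end.
  const std::string &next() {
    const std::string &chosen = servers_[next_];
    next_ = (next_ + 1) % servers_.size();
    return chosen;
  }

 private:
  std::vector<std::string> servers_;
  std::size_t next_ = 0;
};
```

Each new range would go to `next()`, regardless of how loaded that server is, which is why the deck treats it only as the naive starting point.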

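Basic Balancer: Algorithm
 •  Servers [s], each with load average $LA_s$ and per-range load estimates $LE_{s,r}$
 •  $LALE_s = LA_s / \sum_r LE_{s,r}$
 •  $\Delta_{s,r} = LE_{s,r} \times LALE_s$
 •  While $\Delta_{s,r} \ge \Delta_{THRESH}$ do: move range $r$ between servers and update the load averages, e.g. $LA_1 - \Delta_{1,r}$ and $LA_4 + \Delta_{1,r}$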
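
One way to read the loop on this slide is sketched below. It is not the Hypertable implementation. The types `Range`, `Server`, and `Move`, the function `basic_balance`, the `max_moves` cap, and the rule of picking the largest move that is still smaller than the gap between the most and least loaded servers are all assumptions made for illustration. Only the quantities LALE_s = LA_s / Σ LE_{s,r}, Δ_{s,r} = LE_{s,r} × LALE_s, and the Δ ≥ ΔTHRESH condition come from the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
struct Range { int id; double le; };                       // le = LE_{s,r}
struct Server { double la; std::vector<Range> ranges; };   // la = LA_s
struct Move { int range_id; std::size_t from; std::size_t to; double delta; };

// Repeatedly moves one range from the most loaded to the least loaded server
// while the estimated effect delta stays at or above delta_thresh.
std::vector<Move> basic_balance(std::vector<Server> &servers,
                                double delta_thresh, std::size_t max_moves) {
  assert(delta_thresh >= 0.0);
  std::vector<Move> plan;
  if (servers.size() < 2) return plan;
  auto by_la = [](const Server &a, const Server &b) { return a.la < b.la; };
  while (plan.size() < max_moves) {
    auto hi = std::max_element(servers.begin(), servers.end(), by_la);
    auto lo = std::min_element(servers.begin(), servers.end(), by_la);
    double le_sum = 0.0;
    for (const Range &r : hi->ranges) le_sum += r.le;
    if (hi == lo || le_sum <= 0.0) break;
    const double lale = hi->la / le_sum;   // LALE_s
    const double gap = hi->la - lo->la;
    // Assumption: take the largest delta in [delta_thresh, gap) so a move
    // never makes the destination hotter than the source was.
    auto best = hi->ranges.end();
    for (auto it = hi->ranges.begin(); it != hi->ranges.end(); ++it) {
      const double delta = it->le * lale;  // Delta_{s,r}
      if (delta >= delta_thresh && delta < gap &&
          (best == hi->ranges.end() || it->le > best->le)) {
        best = it;
      }
    }
    if (best == hi->ranges.end()) break;   // nothing worth moving
    const double delta = best->le * lale;
    plan.push_back(Move{best->id,
                        static_cast<std::size_t>(hi - servers.begin()),
                        static_cast<std::size_t>(lo - servers.begin()),
                        delta});
    hi->la -= delta;                       // LA_src - Delta
    lo->la += delta;                       // LA_dst + Delta
    lo->ranges.push_back(*best);
    hi->ranges.erase(best);
  }
  return plan;
}
```

The returned vector is an ordered list of range moves, and the `max_moves` cap is there only to guarantee termination in this sketch.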

Load Estimates (LE)
 •  Heuristic estimate of LE (impact on loadavg) is 2·bytes_written/s + disk_bytes_read/s
 •  Machine learned Load Estimates based on stats from sys/RS_METRICS for table and range:
 –  Range count
 –  Scans / updates per second
 –  Disk bytes read per second (disk_bytes_read/s)
 –  Bytes written per second (bytes_written/s)
 –  Disk and memory used
 •  More than 50 range server stats from SIGAR
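
The heuristic load estimate is simple enough to write down directly. In the sketch below, the struct `RangeRates` and the function name are hypothetical; only the formula 2·bytes_written/s + disk_bytes_read/s is taken from the slide.

```cpp
#include <cassert>

// Hypothetical container for per-range rates; field names mirror the slide's labels.
struct RangeRates {
  double bytes_written_per_sec;
  double disk_bytes_read_per_sec;
};

// Heuristic LE from the slide: writes are weighted twice as heavily as disk reads.
double heuristic_load_estimate(const RangeRates &r) {
  assert(r.bytes_written_per_sec >= 0.0 && r.disk_bytes_read_per_sec >= 0.0);
  return 2.0 * r.bytes_written_per_sec + r.disk_bytes_read_per_sec;
}
```

For example, a range written at 100 bytes/s and read from disk at 50 bytes/s gets an estimate of 250.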

Objective Function?
 •  Challenge for AI / Modeling
 •  How best to model domain for optimization?
 •  Average latency of system
 •  Total throughput of system
 •  Basic balancer simply minimizes sum of absolute deviation:
 $\sum_{s \in Servers} | loadavg_s - \overline{loadavg} |$
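
As a small worked example of the basic balancer's objective, the sketch below computes the sum of absolute deviations of each server's loadavg from the cluster mean. The function name is hypothetical, and taking the plain arithmetic mean as the reference value is an assumption about how the deviation is measured.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Sum over servers of |loadavg_s - mean loadavg|; zero means perfectly balanced.
double sum_abs_deviation(const std::vector<double> &loadavg) {
  assert(!loadavg.empty());
  const double mean =
      std::accumulate(loadavg.begin(), loadavg.end(), 0.0) / loadavg.size();
  double total = 0.0;
  for (double la : loadavg) total += std::fabs(la - mean);
  return total;
}
```

With load averages 4, 2, 1, 1 the mean is 2 and the objective is 2 + 0 + 1 + 1 = 4. A cluster where every server sits at 2 scores 0.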

Constraints for Load Balancing
 •  For the basic balancer a simple heuristic is used to stop balancing: Δs,r ≥ ΔTHRESH
 •  Estimating correct value for ΔTHRESH
 •  Mine the domain for constraints such as:
 –  Restrictions on range assignments based on table membership or server characteristics
 –  Constraints on number of moves could impact policy
 –  Constrain total moves over time periods
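
A constraint on total moves over a time period could be enforced with a sliding-window budget, sketched below. The class `MoveBudget`, its parameters, and the choice of whole-second timestamps are all hypothetical. The deck only names the constraint and does not say how it would be implemented. The sketch assumes timestamps are passed in non-decreasing order.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical sliding-window limiter: at most max_moves moves per window_secs.
class MoveBudget {
 public:
  MoveBudget(std::size_t max_moves, long window_secs)
      : max_moves_(max_moves), window_secs_(window_secs) {
    assert(window_secs > 0);
  }

  // Records a move at now_secs if the budget allows it; returns whether it did.
  bool try_record(long now_secs) {
    // Drop moves that have aged out of the window.
    while (!times_.empty() && times_.front() <= now_secs - window_secs_) {
      times_.pop_front();
    }
    if (times_.size() >= max_moves_) return false;
    times_.push_back(now_secs);
    return true;
  }

 private:
  std::size_t max_moves_;
  long window_secs_;
  std::deque<long> times_;  // timestamps of moves still inside the window
};
```

A balancer could call `try_record` before emitting each range move and stop planning once it returns false.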

Output Load Balancing Plan
 •  Batch optimization
 •  Output set of assignments of ranges from server to server (range moves)
 •  Possibly represent a sequence of assignments if order of moves is significant
 •  Explore ideas for sequential optimization / incremental algorithms and strategies

Test Load / Experiments
 •  Test load / experiments need to be devised
 •  Candidate load could include:
 –  storing and querying Twitter stream data
 –  meteorological / climate data
 –  bioinformatics data (UCSF currently running an HT cluster)
 •  Hypertable is a highly instrumented system providing web-based tools for monitoring and reporting cluster operational statistics

C++ Interface
 class LoadBalancer {
   public:
     LoadBalancer(ContextPtr context);
     virtual void transfer_monitoring_data(vector<RangeServerStatistics> &stats);
     virtual void balance(const String &algorithm=String()) = 0;
     virtual bool get_destination(const TableIdentifier &table, const RangeSpec &range, String &location);
     virtual bool move_complete(const TableIdentifier &table, const RangeSpec &range, int32_t error=0);
     virtual void register_plan(BalancePlanPtr &plan);
     virtual void deregister_plan(BalancePlanPtr &plan);
 };

Questions?
 •  Gordon Rios (gordon@zvents.com)
 Zvents, Inc. & Cork Constraint Computation Center (UCC Ireland)
 •  Doug Judd (doug@hypertable.com)
 Hypertable, Inc. (formerly Zvents, Inc.)
