OpenStack Object Storage (Swift) Whitepaper
February 2012
Contents

Introduction
Introducing Swift
    Swift Characteristics
    Swift is Scalable
    Swift is Extremely Durable
    Swift is Open Source Software
    Swift is Similar to AWS S3
    Swift is Built on Industry-standard Components
    Swift Can Be Deployed In-House or As-a-Service
Using Swift
    Swift Commands - The Basics
    Client Libraries
How Swift Works
    Building Blocks
    Proxy Servers
    The Ring
    Zones: Failure Boundaries
    Accounts & Containers
    Partitions
    Replication
    How these are all tied together
Swift Cluster Architecture
    Access Tier
    Storage Nodes
Configuring Networking
    Large-Scale Networking
    Medium-Scale Networking
    Management Network
Hardware Recommendations
    Proxy Nodes
    Storage Nodes
    Networking
    Sizing Your Swift Cluster
Planning a Deployment
    Datacenter Facility Planning
    Integrations
    Monitoring
    Administration & Maintenance
    Understanding TCO
    Rolling Swift Out in Your Organization
Managing Swift with SwiftStack
Like to Learn More?

2012 SwiftStack, Inc. All rights reserved.
Introduction

In this era of connected devices, the demands on storage systems are increasing exponentially. Users are producing and consuming more data than ever. Social media, online video, user-uploaded content, gaming, and software-as-a-service applications are all contributing to the vast need for easily consumable storage systems that can grow without bounds.

To accommodate these demands, storage systems must be able to handle web-scale workloads with many concurrent readers and writers to a data store. Some data is frequently written and retrieved, such as database files and virtual machine images. Other data, such as documents, images, and backups, is generally written once and rarely accessed. Web and mobile data assets also need to be accessible over the web via a URL to support today's web/mobile applications. A one-size-fits-all data storage solution is therefore neither practical nor economical.

Public cloud storage systems have risen to the task of handling web-scale workloads. Cloud storage-as-a-service offerings include Amazon.com's Simple Storage Service (S3) and Rackspace's Cloud Files, both of which have grown tremendously in usage. For instance, in April 2010, Amazon.com reported that 100 billion objects were stored in S3, 100% growth from the previous year. In October 2011, Amazon.com reported that 566 billion objects were stored in S3. Technology trend analyzers at BuiltWith.com now track more than 60,000 websites that serve content directly from S3. However, not every organization will utilize a public storage cloud such as S3, for cost, regulatory, or control reasons.

With the OpenStack Object Storage system, aka Swift, there is now an open-source option for organizations needing a highly durable object storage system that is deployed on industry-standard hardware in their own datacenter. These systems can serve as the core of a private storage cloud or a public storage-as-a-service offering.

The purpose of this whitepaper is to help those who are considering deploying an object storage system based on OpenStack Swift; it complements the official Swift documentation, which is available at http://swift.openstack.org. Not every topic related to getting Swift up and running in your environment is covered here, but this whitepaper provides an overview of the key areas to be aware of: what Swift is, how it works, how to deploy, manage and monitor Swift with the SwiftStack Platform, how to use Swift, and some general deployment considerations.
Introducing Swift

Swift is a multi-tenant, highly scalable and durable object storage system designed to store large amounts of unstructured data at low cost. Highly scalable means that it can scale from a few nodes and a handful of drives to thousands of machines with multiple petabytes of storage. Swift is designed to be horizontally scalable: there is no single point of failure.

Swift is used by Fortune 500 enterprises, web companies and service providers worldwide. It is typically used to store unstructured data such as documents, web content, backups, images and virtual machine snapshots. Originally developed as the engine behind Rackspace Cloud Files, it was open-sourced under the Apache 2 license as part of the OpenStack project in 2010. With more than 100 companies and thousands of developers now participating in the OpenStack project, the usage of Swift is increasing rapidly.

Swift is not a traditional file system or a raw block device. Instead, it enables you to store, retrieve and delete objects (with their associated metadata) in containers (buckets, in Amazon S3 terminology) via a RESTful HTTP API. Developers can either write directly to the Swift API or use one of the many client libraries that exist for all popular programming languages, such as Java, Python, Ruby and C#. Amazon S3 and Rackspace Cloud Files users should feel very familiar with Swift. For users who have not used an object storage system before, it will require a different approach and mindset than using a traditional filesystem.
Swift offers benefits to both application developers and IT operations teams.
Swift Characteristics

Swift can't be mounted like a folder in your operating system. There is no random access within a file's content, and there can be multiple concurrent writers, which makes it unsuitable for transactional applications such as traditional relational databases, for which a Storage Area Network (SAN) or Network Attached Storage (NAS) system may be a better fit. Also, since object storage systems don't provide raw data blocks that an operating system can form into a filesystem, Swift is unsuitable for booting an operating system.

The key characteristics and benefits of Swift include:
Swift is Scalable

To support thousands of concurrent users, today's application architects must take advantage of the latest in distributed architectures, using distributed NoSQL databases (CouchDB, Cassandra, MongoDB), distributed messaging/queuing systems (ActiveMQ, RabbitMQ) and distributed processing systems such as Hadoop. To that end, application architects need their storage system to scale along with their application.

Available space isn't a useful statistic on its own. A key benchmark is the storage system's concurrency. The ability to handle a great number of simultaneous connections from within a datacenter or across the web is critical to satisfying the needs of applications that are built for web-scale usage.

Swift is designed to have linear growth characteristics. As the system grows in usage and the number of requests increases, performance doesn't degrade. To scale up, the system grows where needed: adding storage nodes to increase storage capacity, adding proxy nodes as requests increase, and growing network capacity where choke points are detected.
As the source code is publicly available, it can be reviewed by many more developers than is the case for proprietary software. This means that potential bugs tend to be more visible and more rapidly corrected than with proprietary software. In the long term, open generally wins, and Swift might be considered the Linux of storage.
Swift Is Supported

As an OpenStack project, Swift has the benefit of a rich community, which includes more than 100 participating companies and 1,000+ developers. The following support options are available for Swift:

- Commercial support and tools are available through SwiftStack, which has experience deploying, running and supporting Swift at scale.
- Community support is provided through the OpenStack community, where best practices can be shared with other organizations and users that are using Swift.
- Swift's documentation is publicly available at http://swift.openstack.org.
Using Swift

Once deployed, all communication with Swift is done over a RESTful HTTP API. Application developers who'd like to take advantage of Swift for storing content, documents, files, images etc. can use one of the many client libraries that exist for all popular programming languages, including Java, Python, Ruby, C# and PHP. Existing backup, data protection and archiving applications which currently support either Rackspace Cloud Files or Amazon S3 can also use Swift as their storage back-end with minor modifications.
Swift Commands - The Basics

The base URL of a cluster has the form: swift.example.com/v1/

Object: Objects are where the actual data is stored in Swift. Object names may contain /, so pseudo-nested directories are possible.

To get a list of all containers in an account, use the GET command on the account:

    GET http://swift.example.com/v1/account/

To create a new container, use the PUT command with the name of the new container:

    PUT http://swift.example.com/v1/account/new_container

To list all objects in a container, use the GET command on the container:

    GET http://swift.example.com/v1/account/container/
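The URL layout behind these commands can be captured in a small helper. This is an illustrative sketch only; the `swift_url` function and host name are hypothetical, not part of Swift or its client libraries:

```python
def swift_url(base, account, container=None, obj=None):
    """Compose a Swift storage URL: <base>/<account>[/<container>[/<object>]]."""
    parts = [base.rstrip("/"), account]
    if container is not None:
        parts.append(container)
        if obj is not None:
            parts.append(obj)  # object names may themselves contain "/"
    return "/".join(parts)

# GET on the account path lists containers; PUT on a new container path creates it.
print(swift_url("http://swift.example.com/v1/", "account"))
# http://swift.example.com/v1/account
print(swift_url("http://swift.example.com/v1/", "account", "container", "photo.jpg"))
# http://swift.example.com/v1/account/container/photo.jpg
```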
Client Libraries

Several client libraries for Swift are available, including:

- C#/.NET: http://github.com/rackspace/csharp-cloudfiles
- Java: http://github.com/rackspace/java-cloudfiles
- PHP: http://github.com/rackspace/php-cloudfiles
- Python: http://github.com/rackspace/python-cloudfiles
- Ruby: http://github.com/rackspace/ruby-cloudfiles

In addition, a Ruby library is also available through the fog client: http://github.com/geemus/fog

The Fuse client can be used to map a filesystem to Swift: http://github.com/redbo/cloudfuse/

For more information on building client libraries for Swift, see: https://github.com/rackspace/docs-cloudfiles
Zones: Each Zone isolates data from other Zones. A failure in one Zone doesn't impact the rest of the cluster because data is replicated across the Zones.

Accounts & Containers: Each Account and each Container is an individual database distributed across the cluster. An Account database contains the list of Containers in that Account. A Container database contains the list of Objects in that Container.
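The zone-isolation idea can be sketched as simple replica placement: choose each of the three replicas from a different zone, so no single zone failure can take out all copies. This is an illustrative simplification, not Swift's actual placement algorithm:

```python
def place_replicas(zones, replica_count=3):
    """Pick one device from each of `replica_count` distinct zones."""
    if len(zones) < replica_count:
        raise ValueError("need at least as many zones as replicas")
    # Simplistic round-robin: take the first device from each of the first N zones.
    return [zone["devices"][0] for zone in zones[:replica_count]]

zones = [{"name": f"z{i}", "devices": [f"z{i}-disk0"]} for i in range(1, 6)]
print(place_replicas(zones))  # ['z1-disk0', 'z2-disk0', 'z3-disk0']
```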
Proxy Servers

The Proxy Servers are the public face of Swift and handle all incoming API requests. Once a Proxy Server receives a request, it determines the storage node based on the URL of the object, e.g. https://swift.example.com/v1/account/container/object. The Proxy Servers also coordinate responses, handle failures and coordinate timestamps.

Proxy servers use a shared-nothing architecture and can be scaled as needed based on projected workloads. A minimum of two Proxy Servers should be deployed for redundancy. Should one proxy server fail, the others will take over.
The Ring

The Ring maps Partitions to physical locations on disk. When other components need to perform any operation on an object, container, or account, they need to interact with the Ring to determine its location in the cluster.

The Ring maintains this mapping using zones, devices, partitions, and replicas. Each partition in the Ring is replicated three times by default across the cluster, and the locations for a partition are stored in the mapping maintained by the Ring. The Ring is also responsible for determining which devices are used for handoff should a failure occur.
[Figure: The Ring maps partitions to physical locations on disk. When a disk, node, or zone fails, replica data is distributed to the other zones to ensure there are three copies of the data.]
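A simplified sketch of the style of hash-based lookup the Ring performs: hash the object's full path and keep the top bits as a stable partition number. This is illustrative only; Swift's real ring also mixes in a cluster-wide hash suffix and maps each partition to specific devices and zones:

```python
import hashlib

PART_POWER = 16  # 2**16 partitions, fixed when the ring is built

def partition_for(account, container, obj, part_power=PART_POWER):
    """Hash the object's path and keep the top `part_power` bits,
    yielding a deterministic partition number for any given name."""
    path = f"/{account}/{container}/{obj}".encode()
    digest = hashlib.md5(path).digest()
    # Interpret the first 4 bytes as an unsigned int, keep the top bits.
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)

part = partition_for("AUTH_demo", "photos", "cat.jpg")
print(0 <= part < 2 ** PART_POWER)  # True
```

Because the mapping is deterministic, every proxy and storage node that holds the same ring agrees on where any object lives without consulting a central directory.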
To keep track of object data locations, each account in the system has a database that references all of its containers, and each container database references each object.
Partitions

A Partition is a collection of stored data, including Account databases, Container databases, and objects. Partitions are core to the replication system.

Think of a Partition as a bin moving through a fulfillment-center warehouse. Individual orders get thrown into the bin, and the system treats that bin as a cohesive entity as it moves throughout the system. A bin full of things is easier to deal with than lots of little things; it makes for fewer moving parts throughout the system.

The system replicators and object uploads/downloads operate on Partitions. As the system scales up, behavior continues to be predictable because the number of Partitions is fixed.

The implementation of a Partition is conceptually simple: a partition is just a directory sitting on a disk with a corresponding hash table of what it contains.
Replication

To ensure that there are three copies of the data everywhere, replicators continuously examine each Partition. For each local Partition, the replicator compares it against the replicated copies in the other Zones to see if there are any differences.

How does the replicator know if replication needs to take place? It does this by examining hashes. A hash file is created for each Partition, containing hashes of each directory in the Partition. For a given Partition, the hash files for each of the Partition's copies are compared. If the hashes differ, it is time to replicate, and the directory that needs to be replicated is copied over.

This is where Partitions come in handy. With fewer things in the system, larger chunks of data are transferred around (rather than lots of little TCP connections, which is inefficient) and there is a consistent number of hashes to compare.

The cluster has eventually consistent behavior where the newest data wins. If a zone goes down, one of the nodes containing a replica notices and proactively copies data to a handoff location.
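The hash comparison described above can be sketched as follows; the directory-to-hash maps stand in for Swift's per-partition hash files, and the names are illustrative:

```python
def dirs_to_sync(local_hashes, remote_hashes):
    """Return the directories whose content hashes differ between a local
    partition copy and a remote one, i.e. what must be pushed across."""
    return sorted(
        d for d in local_hashes
        if remote_hashes.get(d) != local_hashes[d]
    )

local = {"a3f": "9c1d", "07b": "55e2", "f00": "abcd"}
remote = {"a3f": "9c1d", "07b": "d1ff"}  # "07b" differs, "f00" missing remotely
print(dirs_to_sync(local, remote))  # ['07b', 'f00']
```

Comparing a fixed set of hashes, rather than walking every object, is what keeps replication cheap as the cluster grows.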
Upload

A client uses the REST API to make an HTTP request to PUT an object into an existing Container. The cluster receives the request. First, the system must figure out where the data is going to go. To do this, the Account name, Container name and Object name are all used to determine the Partition where this object should live. Then a lookup in the Ring figures out which storage nodes contain the Partitions in question.

The data is then sent to each storage node, where it is placed in the appropriate Partition. A quorum is required: at least two of the three writes must be successful before the client is notified that the upload was successful. Next, the Container database is updated asynchronously to reflect that there is a new object in it.
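The quorum rule amounts to: with three replica writes, report success once a majority has landed. A minimal illustration (the `write` callables are hypothetical stand-ins for replica writes):

```python
def quorum_write(writers, replica_count=3):
    """Attempt all replica writes; succeed if a majority (2 of 3) succeed."""
    successes = sum(1 for write in writers[:replica_count] if write())
    return successes >= replica_count // 2 + 1

ok = lambda: True     # a replica write that succeeds
fail = lambda: False  # a replica write that fails

print(quorum_write([ok, ok, fail]))    # True: 2 of 3 succeeded
print(quorum_write([ok, fail, fail]))  # False: only 1 of 3 succeeded
```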
Download

A request comes in for an Account/Container/object. Using the same consistent hashing, the Partition name is generated. A lookup in the Ring reveals which storage nodes contain that Partition. A request is made to one of the storage nodes to fetch the object; if that fails, requests are made to the other nodes.
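The read-side failover can be sketched as "try each replica in turn until one answers." The node names and `fetch` callable here are hypothetical:

```python
def fetch_with_failover(nodes, fetch):
    """Try each storage node in turn; return the first successful response."""
    for node in nodes:
        try:
            return fetch(node)
        except IOError:
            continue  # this replica failed; fall through to the next one
    raise IOError("all replicas failed")

def flaky_fetch(node):
    if node == "node1":
        raise IOError("disk failure")
    return f"object-bytes-from-{node}"

print(fetch_with_failover(["node1", "node2", "node3"], flaky_fetch))
# object-bytes-from-node2
```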
Access Tier

Large-scale deployments segment off an Access Tier. This tier is the Grand Central of the Object Storage system: it fields incoming API requests from clients and moves data in and out of the system. The tier is composed of front-end load balancers, SSL terminators and authentication services, and it runs the (distributed) brain of the object storage system: the proxy server processes.

Having the access servers in their own tier enables read/write access to be scaled out independently of storage capacity. For example, if the cluster is on the public Internet, requires SSL termination and has high demand for data access, many access servers can be provisioned. However, if the cluster is on a private network and is being used primarily for archival purposes, fewer access servers are needed. As this is an HTTP-addressable storage service, a load balancer can be incorporated into the access tier.
Factors to Consider

For most publicly facing deployments, as well as private deployments available across a wide-reaching corporate network, SSL will be used to encrypt traffic to the client. SSL adds significant processing load for establishing sessions between clients, so more capacity in the access layer will need to be provisioned. SSL may not be required for private deployments on trusted networks.
Storage Nodes
The next component is the storage servers themselves. Generally, most configurations should have each of the five Zones with an equal amount of storage capacity. Storage nodes use a reasonable amount of memory and CPU; metadata needs to be readily available to quickly return objects. The object stores run services not only to field incoming requests from the Access Tier, but also to run replicators, auditors, and reapers. Object stores can be provisioned with single gigabit or 10 gigabit network interfaces depending on expected workload and desired performance.

Currently, 2TB or 3TB SATA disks deliver good price/performance value. Desktop-grade drives can be used where there are responsive remote hands in the datacenter, and enterprise-grade drives can be used where this is not the case.
Factors to Consider

Desired I/O performance for single-threaded requests should be kept in mind. This system does not use RAID, so each request for an object is handled by a single disk. Disk performance impacts single-threaded response rates. To achieve apparent higher throughput, the object storage system is designed with concurrent uploads/downloads in mind. The network I/O capacity (1GbE, bonded 1GbE pair, or 10GbE) should match your desired concurrent throughput needs for reads and writes.
Configuring Networking

Below are two examples of deployments at two scales: larger deployments with a two-tier networking architecture, and smaller deployments with a single networking tier. Note that when a write comes into the proxy server, there is three times the traffic going to the object stores, to write the three replicas. Systems must be designed to account for the expected read/write traffic.
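That 3x write fan-out means back-end write bandwidth must be provisioned at roughly three times the front-end ingest rate. A rough sizing sketch, with illustrative numbers:

```python
REPLICAS = 3  # Swift's default replica count

def backend_write_gbps(frontend_gbps, replicas=REPLICAS):
    """Every incoming write is forwarded to `replicas` storage nodes."""
    return frontend_gbps * replicas

# A cluster ingesting 2 Gb/s of writes generates 6 Gb/s on the storage network.
print(backend_write_gbps(2))  # 6
```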
Large-Scale Networking

Aggregation: A pair of aggregation switches, with two links back to the access network / border network, is used to connect to two pools of the Access Tier and to each of the five Zone switches that connect the Object Stores. All connections to the Access Tier and the Zones are 10GbE.

Zone Network: Each Zone has a switch to connect itself to the aggregation network. It's possible to use a single, non-redundant switch, as the system is designed to sustain a Zone failure.
Medium-Scale Networking

A single network tier is used for smaller deployments in the range of 300-500TB. Either 1GbE or 10GbE switches can be used, depending on the throughput the cluster is expected to sustain. The Access Tier servers still have two interfaces, and separate VLANs are created for front-facing API requests and for the back-end network connecting the object server Zones.
Management Network

A management network is critical to maintaining the health of the cluster. A separate 1GbE management network is created for IPMI, monitoring, and out-of-band access to every machine in the cluster. However, it is typically possible to use the higher-bandwidth connections during provisioning for operating system installation.
Hardware Recommendations

Swift is designed to store and retrieve whole files via HTTP across a cluster of industry-standard x86 servers and drives, using replication to ensure data reliability and fault tolerance. While this model provides great flexibility (and low cost) from a hardware perspective, it requires some upfront planning, testing and validation to ensure that the hardware you select is suitable not just for Swift itself, but also for the workload that you are designing your cluster for. Your operations team may also have opinions on the hardware selection, as they may prefer to work with hardware they are already familiar with.
Proxy Nodes

Proxy nodes use a moderate amount of RAM and are network-I/O intensive. Typically, proxy nodes are 1U systems with a minimum of 12 GB of RAM. As these systems field each incoming API request, it is wise to provision them with two high-throughput (10GbE) interfaces: one for 'front-end' incoming requests and the other for 'back-end' access to the object storage nodes to put and fetch data. For small Swift deployments, the storage nodes can serve as proxy nodes.
Storage Nodes

Storage nodes are typically high-density 3U or 4U nodes with 16-36 SATA disks each. These nodes use a reasonable amount of memory and CPU. The storage nodes run services not only to field incoming requests from the proxy nodes, but also replication, auditing and other processes that ensure durability. Storage nodes can be provisioned with single gigabit or 10GbE network interfaces depending on expected workload and desired performance. For storage nodes, we recommend the following specifications:
CPU: 64-bit x86 CPU (Intel/AMD), quad-core or greater, running at least 2-2.5GHz.

RAM: A good rule of thumb is approximately 1 GB of RAM for each TB of disk; i.e., for a node with 24 drives, 36-48GB of RAM should be used. The memory is used by the many processes that field incoming object requests and for XFS inode caching.
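The rule of thumb above can be expressed as a quick sizing helper; the function name and numbers are illustrative only:

```python
def recommended_ram_gb(drive_count, drive_tb):
    """~1 GB of RAM per TB of raw disk on the node (rule of thumb)."""
    return drive_count * drive_tb

# 24 x 2TB drives -> roughly 48 GB of RAM
print(recommended_ram_gb(24, 2))  # 48
```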
Drives: Either 2TB or 3TB 7200 RPM SATA drives, which deliver good price/performance value. Desktop-grade drives can be used where there are responsive remote hands in the data center, and enterprise-grade drives can be used where that is not the case. We don't recommend using green drives: Swift is continuously ensuring data integrity, and the power-down functions of green drives may result in excess wear.
Extreme container update workload consideration: Where the application needs to ingest many millions of files into a single container, it may be necessary to use higher-performing media (RAID 10 with 15k drives, or SSDs) for the container indexes. This data set is relatively small, so little space is needed on the higher-performing media to store it.
Controller Cards

Swift replicates data across zones, so there is no need for data redundancy to be provided by the controller. Swift therefore uses standard SATA controller cards without RAID, such as the LSI 9211-8i 6Gb/s SAS/SATA HBA. However, if the controller card requires RAID volumes to be created, set up a RAID 0 group (without striping) for each drive.
Network Cards

Depending on the use case, a single gigabit Ethernet (1GbE) interface on each host may be all that is required. However, it is possible to configure bonded 1GbE or 10GbE if the workload demands it.
Networking

A typical deployment has a front-facing access network and a back-end storage network. When designing the network capacity, keep in mind that writes fan out in triplicate on the storage network: as there are three copies of each object, an incoming write is sent to three storage nodes. Network capacity for writes therefore needs to be considered in proportion to the overall workload.
Planning a Deployment

Datacenter Facility Planning

- Power provisioning
- Cooling
- Physical space requirements
- Networking capacity and planning
- Port layouts
- Rack layouts

SwiftStack can assist with best practices and datacenter facilities planning advice for a successful Swift implementation.
Integrations

When getting Swift up and running in your datacenter, there are several potential integrations with 3rd-party systems and services to consider. Each of these areas can be integrated with your Swift environment, but what is needed will differ based on your specific requirements and use case. While this whitepaper does not cover these areas in any depth, SwiftStack can provide advice and best practices on how to integrate with 3rd-party systems and services.
Monitoring

There are many tools available for application developers and IT operations teams to measure the health of applications and servers. While many of these tools are helpful, for a Swift cluster they may be more complex and provide more data than you need. To ensure that users can quickly measure the overall health of their SwiftStack environment, the SwiftStack Platform tracks the key metrics that allow you to quickly determine the status of your overall Swift cluster and of the individual nodes.

For the overall Swift cluster, the key metrics monitored by the SwiftStack Platform are node CPU utilization, the top 5 least-free disks, disk I/O and network I/O. For individual nodes, the same key metrics are reported, which can be used to tell the overall health of the node. In addition, external monitoring systems can be configured to consume SNMP polling of system information and SNMP traps provided through the SwiftStack platform.
Understanding TCO

Finally, when planning a deployment, understanding the total cost of ownership (TCO) for the cluster is critical, so that all direct and indirect costs are included. Costs should include:
- Design/Development
- Hardware
- Hardware Standup
- Datacenter Space
- Power/Cooling
- Network Access
- Ongoing Software Maintenance and Support
- Monitoring and Operational Support
Managing Swift with SwiftStack

The SwiftStack Platform:

- Drastically simplifies the process of getting Swift up and running in your datacenter
- Deploys Swift on nodes and configures the cluster
- Enables you to start with one node and add nodes as your data grows
- Provides a central management console for your Swift nodes and cluster
- Monitors and alerts for issues in nodes, disks and other resources
- Enables you to easily expand your cluster and tune for performance
- Provides diagnostics for issues, which simplifies support and administration

The SwiftStack Platform incorporates deployment and operational best practices for Swift and provides a single pane of glass for your entire Swift environment.
To get started with SwiftStack, the first step is to download the SwiftStack ISO, consisting of:

- Ubuntu
- OpenStack Swift
- SwiftStack Agents
After logging in to the SwiftStack Platform, it will guide you through the process of creating a new cluster, creating accounts and users, installing and provisioning Swift on cluster nodes, formatting drives, configuring zones, and the other tasks required to set up your Swift environment. Once your Swift environment has been deployed, the SwiftStack Platform helps provide the ongoing administration, management and monitoring of your Swift environment.