MemSQL Technical Overview

TECHNICAL OVERVIEW.
JUNE 2012
CONTACT INFO.
sales@memsql.com
WEB.
www.memsql.com
380 10th ST, STE 25, San Francisco, CA 94103
TABLE OF CONTENTS.
PRODUCT.
BIG DATA IS ALSO ABOUT SPEED.
HORIZONTAL SCALABILITY.
EASE OF USE.
DURABILITY & REPLICATION.
LOCK-FREE DATA STRUCTURES.
CODE GENERATION.
USE CASES.
PLANNING FOR GROWTH.
PRODUCT.
MemSQL is a memory-optimized, distributed database that enables an application to read and write data 30x faster than relational databases on disk. As data volumes expand and rates of access accelerate, disk-backed databases are no longer suitable to handle these increasingly intensive workloads. Shifting the locus of data to an in-memory tier is the easiest way to achieve massive throughput and scale with clear advantages:
1. Main memory is orders of magnitude faster to access than flash or hard disk.
2. Main memory computing enables predictable, sub-millisecond response time for storing and retrieving data in a high throughput environment, as well as real-time analytics over massive datasets.
3. Clustering technologies enable extremely resilient systems to guarantee uptime.
4. DRAM is abundant and affordable, with prices falling 40% year over year.
While caching data in memory alleviates some of the load from disk, it only complicates infrastructure with incompatible interfaces, invalidation of stale data, and lack of persistence. And while some NoSQL alternatives can store data in memory, the disadvantages are considerable: the data model moves into the application, data becomes difficult to query, and data can lose consistency. MemSQL is a modern, relational database that is optimized at every level for running in memory and is aware of its inherently distributed nature, yet still retains a familiar SQL interface.
BIG DATA IS ALSO ABOUT SPEED.

The volume of data that companies store is growing exponentially. In addition, that data is accessed more frequently by more users than ever before, meaning that the rate at which data must be stored and retrieved is also growing rapidly. The OLTP component of big data is about expanding the pipeline through which data can be transferred and accessed. MemSQL can push hundreds of thousands of ACID transactions per second on a single instance of commodity hardware. When combined into a MemSQL cluster, it is possible to achieve millions of transactions per second. MemSQL also delivers sub-millisecond response times per query, enabling real-time analytics on data sets that are in continuous flux as well as ensuring that applications are fast and responsive to end users.
HORIZONTAL SCALABILITY.
As a core capability, MemSQL is a distributed system that enables an engineering team to scale capacity and throughput, whether infrastructure resides in the cloud or in a private data center. Most importantly, MemSQL manages the complexity of data movement, including partitioning a data set across a cluster. The application tier interacts with MemSQL through a single, unified interface.
MemSQL uses leaf nodes and aggregators to distribute data across a cluster of machines. Data is partitioned across leaf nodes by selecting a set of columns as the partition key. An aggregator resides next to the application and synchronizes database metadata (information about tables, columns, and how data is distributed across leaf nodes) with other aggregators. Since every aggregator shares the same state, queries can be routed efficiently to one or more leaf nodes and aggregated into a final result.
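The routing described above can be illustrated with a minimal Python sketch. It assumes hash partitioning on the partition key, which this overview does not specify, and the `Aggregator` class and its methods are hypothetical names used only for illustration. Leaves are modeled as in-memory lists rather than separate machines.

```python
import zlib


class Aggregator:
    """Toy aggregator: routes rows to leaves by hashing the partition key.

    Hash partitioning is an assumption made for this sketch.
    """

    def __init__(self, num_leaves, partition_key):
        self.partition_key = partition_key              # tuple of column names
        self.leaves = [[] for _ in range(num_leaves)]   # each leaf holds rows

    def _leaf_for(self, key_values):
        # Deterministic hash, so every aggregator would pick the same leaf.
        encoded = repr(tuple(key_values)).encode("utf-8")
        return zlib.crc32(encoded) % len(self.leaves)

    def insert(self, row):
        key_values = [row[col] for col in self.partition_key]
        self.leaves[self._leaf_for(key_values)].append(row)

    def lookup(self, **key):
        # The query names the full partition key: only one leaf is touched.
        key_values = [key[col] for col in self.partition_key]
        leaf = self.leaves[self._leaf_for(key_values)]
        return [r for r in leaf if all(r[c] == v for c, v in key.items())]

    def count_all(self):
        # No key given: fan out to every leaf, then aggregate the results.
        return sum(len(leaf) for leaf in self.leaves)
```

A lookup that supplies the partition key, such as `Aggregator(4, ("user_id",)).lookup(user_id=7)`, inspects a single leaf, while `count_all()` visits every leaf and combines the partial counts into a final result.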
EASE OF USE.
MemSQL is designed to be easy to use by leveraging the MySQL learning curve, and an engineer familiar with MySQL will instantly know how to use MemSQL. Since MemSQL uses the MySQL protocol on the wire, an application connects to MemSQL using standard MySQL drivers and requires no code change at the application tier. Moreover, MemSQL facilitates data manageability with a suite of tools. For example, users can control a MemSQL cluster on the command line and through a web dashboard. Within the dashboard, a user can monitor cluster performance, spin up new nodes, repartition data across the cluster, and view, maintain, and roll back snapshots.
DURABILITY & REPLICATION.

MemSQL achieves durability through a combination of snapshotting and write-ahead logging (also known as journaling). By taking snapshots of the [...] continuously or at user-defined intervals, MemSQL can recover to a consistent state in the event of hardware failure. MemSQL runs with durability turned on by default, though it is possible to fully disable durability and achieve even higher throughput.
It is possible to replicate from one MemSQL database to another in a master/slave configuration. As a best practice, each database runs in a separate data center.
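As a rough illustration of how snapshotting and write-ahead logging combine, the following Python sketch implements a toy key-value store that logs each write before applying it and recovers by loading the last snapshot and replaying the log. The `DurableStore` class, the file names, and the JSON encoding are all assumptions made for this example and are not taken from MemSQL.

```python
import json
import os
import tempfile


class DurableStore:
    """Toy in-memory key-value store with a snapshot file and a write-ahead log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "wal.log")
        self.data = {}
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # 1. Load the most recent snapshot, if there is one.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.data = json.load(f)
        # 2. Replay the log records written after that snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Log first, then apply in memory (the "write-ahead" rule).
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the full state to a temp file, swap it in atomically,
        # then truncate the log because the snapshot now covers it.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self.log.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = DurableStore(d)
        store.put("a", 1)
        store.snapshot()
        store.put("b", 2)      # lives only in the log
        store.close()          # simulate a restart
        recovered = DurableStore(d)
        print(recovered.data)  # both keys are restored
        recovered.close()
```

Calling `fsync` on every write is the costly part of this scheme, which is consistent with the point above that disabling durability raises throughput. The sketch does not handle a partially written final log record.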
LOCK-FREE DATA STRUCTURES.

MemSQL leverages lock-free data structures to model the complex relationships of a dataset. Using a combination of lock-free hash tables and skip lists, MemSQL can serve data as quickly as Memcached. MemSQL integrates multi-version concurrency control (MVCC) with its lock-free indexes to guarantee that read operations will never block write operations. This behavior is well-suited for workloads that are write-heavy or have an equal number of read and write operations.
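The idea that readers do not block writers under MVCC can be sketched in a few lines of Python. This is a single-threaded illustration of versioned reads only. It is not lock-free, it does not reproduce MemSQL's hash tables or skip lists, and the `VersionedStore` class is a hypothetical name introduced for the example.

```python
import itertools


class VersionedStore:
    """Toy multi-version store: writers append versions, readers use a snapshot."""

    def __init__(self):
        self._clock = itertools.count(1)   # monotonically increasing versions
        self._versions = {}                # key -> list of (version, value)
        self.latest = 0                    # highest committed version

    def write(self, key, value):
        # Append a new version instead of overwriting in place,
        # so readers that started earlier still see the old value.
        version = next(self._clock)
        self._versions.setdefault(key, []).append((version, value))
        self.latest = version
        return version

    def begin_read(self):
        # A reader remembers the version that was current when it started.
        return self.latest

    def read(self, key, snapshot_version):
        # Return the newest value whose version is <= the reader's snapshot.
        for version, value in reversed(self._versions.get(key, [])):
            if version <= snapshot_version:
                return value
        return None
```

A reader that calls `begin_read()` before a later `write()` keeps seeing the older value for the rest of its read, so the writer never has to wait for it to finish.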
CODE GENERATION.
Code generation is at the heart of why MemSQL runs so fast. Rather than dynamically interpreting a SQL query, MemSQL precompiles the query into native instructions for utmost efficiency of the CPU. Every query is automatically parameterized: strings and integers are temporarily removed while MemSQL determines if a matching precompiled plan already exists. If the plan is present, parameters are re-substituted directly into the compiled shared object, essentially removing code interpretation on subsequent read and write queries.
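The parameterize-then-look-up flow can be sketched as follows. This Python example is only an approximation under stated assumptions: a regular expression stands in for a real SQL parser, a Python callable stands in for a compiled shared object, and `parameterize`, `PlanCache`, and `compile_fn` are hypothetical names that do not come from MemSQL.

```python
import re

# Matches single-quoted string literals or standalone integers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+\b")


def parameterize(sql):
    """Replace literals with '?' and return (template, extracted parameters)."""
    params = []

    def _swap(match):
        text = match.group(0)
        if text.startswith("'"):
            # Strip the quotes and unescape doubled quotes.
            params.append(text[1:-1].replace("''", "'"))
        else:
            params.append(int(text))
        return "?"

    return _LITERAL.sub(_swap, sql), params


class PlanCache:
    """Caches one 'compiled plan' per query template."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn   # hypothetical, expensive compile step
        self.plans = {}
        self.compilations = 0

    def execute(self, sql):
        template, params = parameterize(sql)
        plan = self.plans.get(template)
        if plan is None:
            # First time this query shape is seen: compile and cache it.
            plan = self.compile_fn(template)
            self.plans[template] = plan
            self.compilations += 1
        # Later queries with the same shape only pass new parameters.
        return plan(params)
```

Two queries that differ only in their literals, such as `WHERE id = 1` and `WHERE id = 2`, map to the same template, so the compile step runs once and the second query reuses the cached plan.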
USE CASES.
MemSQL is well-suited for applications that require high-throughput, low-latency processing. Examples include ad serving, real-time analytics, high volume web traffic, and machine-generated data.
PLANNING FOR GROWTH.

By scaling across multiple servers and delivering 30x faster performance than databases on disk, MemSQL ensures an engineering team can increase capacity and throughput for the application. Whether the application must process machine-generated data, withstand traffic spikes, or access highly volatile data, MemSQL offers a relational interface instantly familiar to any engineer.
THANK YOU.
To learn more about MemSQL, visit www.memsql.com. To discuss your high performance database needs with a MemSQL engineer, please contact us at 415-886-7652 or send an email to sales@memsql.com.
