Astronomy, Petabytes, and MySQL

MySQL Conference Santa Clara, CA April 16, 2008

Kian-Tat Lim Stanford Linear Accelerator Center

Outline

LSST
LSST Database
LSST Database + MySQL

LSST

What Is It? Why Build It?


LSST

What Is It?

Telescope

Proposed telescope to be built in Chile

Large
3.2 gigapixel camera
8.4 meter diameter mirror


Synoptic Survey

Wide Deep Fast

LSST

Why Build It?

Dark Matter and Energy

Photo: J. A. Tyson, W. Colley, E. L. Turner, and NASA


Variable Objects


Transient Objects


Moving Objects

Photo: D. Roddy, Lunar and Planetary Institute


LSST Database

What’s In It? How Big? How Often? What Queries? Unusual Needs

LSST Database

What’s In It?

Database: Components

Moving Objects Catalog
Object Catalog
Provenance
Statistics
Summaries
Source Catalog
Difference Image Source Catalog
Image Metadata
Calibration
Engineering and Facility Database

Astronomical Objects


Sources


Changes


Image Metadata


Calibration and Facility


LSST Database

How Big?

Sagans of Rows

49 billion objects
2.8 trillion sources

Lots of Columns

308 columns for objects
56 columns for sources
(for now)

Database Size

Grows to >14 PB
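
To get a feel for the row and column counts quoted on the preceding slides, here is a back-of-envelope sketch. The 8 bytes per column is purely an assumption for illustration; the slides do not give column widths. The result covers only the raw column payload of the object and source tables and makes no attempt to reproduce the >14 PB total.

```python
# Back-of-envelope size of the two largest catalogs.
# ASSUMPTION: every column is 8 bytes wide (not stated in the talk).
BYTES_PER_COLUMN = 8

def raw_table_bytes(rows, columns, bytes_per_column=BYTES_PER_COLUMN):
    """Raw payload size in bytes: rows x columns x bytes per column."""
    return rows * columns * bytes_per_column

# Row and column counts are the figures from the slides.
object_bytes = raw_table_bytes(49 * 10**9, 308)   # 49 billion objects
source_bytes = raw_table_bytes(28 * 10**11, 56)   # 2.8 trillion sources

TB = 10**12
PB = 10**15
print(f"objects: {object_bytes / TB:.1f} TB")
print(f"sources: {source_bytes / PB:.2f} PB")
```

Under that assumption the source table dominates, at roughly ten times the size of the object table.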

LSST Database

How Often?

Frequency

Nightly updates
Semi-annual data releases

LSST Database

What Queries?

Queries

• All about an object
• All objects meeting criteria
• All objects near objects meeting criteria
• All objects with interesting time series
• All pairs of objects with similar time series
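
As an illustration of the third query type ("all objects near objects meeting criteria"), here is a minimal brute-force sketch. The catalog layout (`id`, `ra`, `dec`, `mag`), the function names, and the search radius are all hypothetical; the slides do not describe a schema or an implementation.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, via the haversine formula."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def objects_near_matches(catalog, criteria, radius_deg):
    """Return objects within radius_deg of any *other* object meeting criteria."""
    seeds = [o for o in catalog if criteria(o)]
    return [o for o in catalog
            if any(o is not s and
                   angular_sep_deg(o["ra"], o["dec"], s["ra"], s["dec"]) <= radius_deg
                   for s in seeds)]

# Hypothetical toy catalog: positions in degrees, brightness as a magnitude.
catalog = [
    {"id": 1, "ra": 10.00, "dec": -30.0, "mag": 15.0},
    {"id": 2, "ra": 10.01, "dec": -30.0, "mag": 22.0},
    {"id": 3, "ra": 50.00, "dec": 10.0, "mag": 21.0},
]
neighbors = objects_near_matches(catalog, lambda o: o["mag"] < 16, radius_deg=0.05)
print([o["id"] for o in neighbors])  # [2]
```

The all-pairs comparison here is only workable on a toy list; it is meant to show the shape of the query, not a way to run it over billions of rows.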

LSST Database

Unusual Needs

Unusual Needs

Flexibility
Provenance

LSST Database + MySQL

Why MySQL? Scalability? Performance?

LSST Database + MySQL

Why MySQL?

MySQL

Relational database management system

Open Source

Vibrant community
Strong company support

Hardware

Runs on commodity hardware


In-Memory Tables

Needed for near-real-time processing


LSST Database + MySQL

Scalability?

“MySQL Grid”


Partitioning

Large tables partitioned spatially
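
One simple way to picture spatial partitioning is a fixed grid on the sky: declination stripes, each cut into right-ascension chunks, with every row assigned to the chunk that contains its position. The sketch below assumes such a grid; the stripe and chunk counts and the function name `chunk_id` are hypothetical, and the slides do not say which scheme LSST uses.

```python
def chunk_id(ra_deg, dec_deg, num_stripes=18, chunks_per_stripe=36):
    """Map a sky position to a partition number on a fixed grid.

    ASSUMPTION: equal-height declination stripes, each split into
    equal-width right-ascension chunks (illustrative only).
    """
    ra = ra_deg % 360.0                        # wrap RA into [0, 360)
    stripe_height = 180.0 / num_stripes
    chunk_width = 360.0 / chunks_per_stripe
    # min() keeps dec = +90 inside the last stripe and guards the RA edge.
    stripe = min(int((dec_deg + 90.0) / stripe_height), num_stripes - 1)
    chunk = min(int(ra / chunk_width), chunks_per_stripe - 1)
    return stripe * chunks_per_stripe + chunk

print(chunk_id(10.0, -30.0))   # 217
print(chunk_id(359.9, 90.0))   # 647
```

A grid like this keeps nearby objects in the same or adjacent partitions, which is the property a spatial query needs. Equal-angle chunks shrink in area toward the poles, so a production scheme would likely vary the chunk count per stripe.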

Replication

Dimension tables likely replicated

Needs: Distributor/Combiner

LSST will build prototype
Need long-term support
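
The distributor/combiner idea can be sketched as scatter-gather: send the same query to every partition, then merge the partial answers. The code below is a toy stand-in, not the LSST prototype; partitions are plain Python lists, and `run_on_partition` and `distribute_and_combine` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def run_on_partition(rows, predicate):
    """Stand-in for running the query on one database node."""
    matches = [r for r in rows if predicate(r)]
    brightest = min((r["mag"] for r in matches), default=None)
    return {"count": len(matches), "brightest": brightest}

def distribute_and_combine(partitions, predicate):
    """Distributor: fan the query out. Combiner: merge partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda rows: run_on_partition(rows, predicate),
                                 partitions))
    mags = [p["brightest"] for p in partials if p["brightest"] is not None]
    return {
        "count": sum(p["count"] for p in partials),   # counts add up
        "brightest": min(mags) if mags else None,     # min of per-node minima
    }

# Hypothetical data: three partitions, one of them empty.
partitions = [
    [{"mag": 20.0}, {"mag": 17.0}],
    [{"mag": 15.0}, {"mag": 25.0}],
    [],
]
result = distribute_and_combine(partitions, lambda r: r["mag"] < 21)
print(result)  # {'count': 3, 'brightest': 15.0}
```

Counts and minima combine trivially because they are decomposable aggregates; queries such as joins across partition boundaries need more machinery than this sketch shows.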


LSST Database + MySQL

Performance?

Per-Column Indexing

2X data size

Needs: Optimizer

Efficient use of multiple (20-30) indexes


Needs: Indexes

Bitmap/compressed indexes
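
To show why bitmap indexes suit queries that touch many indexed columns at once, here is a minimal sketch that stores one bitmap per value bin and answers a multi-column filter with a bitwise AND. Python integers stand in for the bitmaps, with no compression; the columns, the bins, and the function names are hypothetical.

```python
def build_bitmap_index(rows, column, bins):
    """Return {bin_label: int}, where bit i is set if row i falls in that bin."""
    index = {label: 0 for label, _ in bins}
    for i, row in enumerate(rows):
        for label, in_bin in bins:
            if in_bin(row[column]):
                index[label] |= 1 << i
    return index

def rows_from_bitmap(bitmap):
    """Decode a bitmap back into a sorted list of row numbers."""
    out, i = [], 0
    while bitmap:
        if bitmap & 1:
            out.append(i)
        bitmap >>= 1
        i += 1
    return out

# Hypothetical rows with two indexed columns.
rows = [
    {"mag": 15.0, "color": 0.2},
    {"mag": 22.0, "color": 0.1},
    {"mag": 16.0, "color": 1.1},
    {"mag": 14.0, "color": 0.3},
]
mag_index = build_bitmap_index(rows, "mag",
                               [("bright", lambda m: m < 18), ("faint", lambda m: m >= 18)])
color_index = build_bitmap_index(rows, "color",
                                 [("blue", lambda c: c < 0.5), ("red", lambda c: c >= 0.5)])

# "bright AND blue" is a single bitwise AND over two per-column indexes.
hits = rows_from_bitmap(mag_index["bright"] & color_index["blue"])
print(hits)  # [0, 3]
```

Combining a further indexed column costs one more AND, so the work grows with the number of predicates rather than with a chain of separate index lookups.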


Needs: Storage Engine

“Shared scan” for long-running full-table queries
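
The shared-scan idea is that several full-table queries ride along on one pass over the data instead of each reading the table separately. A minimal sketch, assuming queries are simple row predicates (the function name `shared_scan` and the sample queries are hypothetical):

```python
def shared_scan(table, queries):
    """Evaluate every query during a single pass over the table.

    Returns (results, rows_read) so the cost of the scan is visible.
    """
    results = {name: [] for name in queries}
    rows_read = 0
    for row in table:                      # the table is read exactly once
        rows_read += 1
        for name, predicate in queries.items():
            if predicate(row):
                results[name].append(row)
    return results, rows_read

# Hypothetical table and two concurrent full-table queries.
table = [{"id": i, "mag": m} for i, m in enumerate([15.0, 22.0, 16.0, 24.0])]
queries = {
    "bright": lambda r: r["mag"] < 18,
    "faint": lambda r: r["mag"] >= 22,
}
results, rows_read = shared_scan(table, queries)
print([r["id"] for r in results["bright"]])  # [0, 2]
print([r["id"] for r in results["faint"]])   # [1, 3]
print(rows_read)                             # 4
```

Running the two queries separately would read 8 rows in total; sharing the scan reads 4, and the saving grows with the number of concurrent queries.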

Summary

Building a petabyte DB
MySQL can be a core component
