Database Storage: Intro To Database Systems Andy Pavlo

Database Storage
04 Part II
Intro to Database Systems Andy Pavlo

15-445/15-645
Fall 2019 AP Computer Science
Carnegie Mellon University
2
ADMINISTRIVIA
Homework #1 is due September 11th @ 11:59pm
Project #1 will be released on September 11th
CMU 15-445/645 (Fall 2019)

3
U P C O M I N G D ATA B A S E E V E N T S
SalesForce Talk
→ Friday Sep 13th @ 12:00pm
→ CIC 4th Floor
Impira Talk
→ Monday Sep 16th @ 4:30pm
→ GHC 8102
Vertica Talk
→ Monday Sep 23rd @ 4:30pm
→ GHC 8102
CMU 15-445/645 (Fall 2019)
4
D I S K- O R I E N T E D A R C H I T E C T U R E
The DBMS assumes that the primary storage

location of the database is on non-volatile disk.
The DBMS's components manage the movement

of data between non-volatile and volatile storage.
CMU 15-445/645 (Fall 2019)

5
S L O T T E D PA G E S
Slot Array
The most common layout scheme is
called slotted pages. Header
The slot array maps "slots" to the

tuples' starting position offsets.
Tuple #4 Tuple #3
The header keeps track of:
Tuple #2 Tuple #1
→ The # of used slots
→ The offset of the starting location of the
last slot used. Fixed/Var-length
Tuple Data
CMU 15-445/645 (Fall 2019)
5
S L O T T E D PA G E S
Slot Array
The most common layout scheme is
called slotted pages. Header
The slot array maps "slots" to the

tuples' starting position offsets.
Tuple #4 Tuple #3
The header keeps track of:
Tuple #2 Tuple #1
→ The # of used slots
→ The offset of the starting location of the
last slot used. Fixed/Var-length
Tuple Data
CMU 15-445/645 (Fall 2019)
6
LO G -S T R U C T U R E D F I L E O R G A N I Z AT I O N
Page
Instead of storing tuples in pages, the
DBMS only stores log records. INSERT id=1,val=a
New Entries
INSERT id=2,val=b
The system appends log records to the DELETE id=4
file of how the database was modified: INSERT id=3,val=c

→ Inserts store the entire tuple. UPDATE val=X (id=3)
→ Deletes mark the tuple as deleted. UPDATE val=Y (id=4)
→ Updates contain the delta of just the
…
attributes that were modified.
CMU 15-445/645 (Fall 2019)

7
Page
To read a record, the DBMS scans the
log backwards and "recreates" the INSERT id=1,val=a
tuple to find what it needs. INSERT id=2,val=b
Reads
DELETE id=4
INSERT id=3,val=c
UPDATE val=X (id=3)
UPDATE val=Y (id=4)
…
CMU 15-445/645 (Fall 2019)
7
Page
log backwards and "recreates" the INSERT id=1,val=a
tuple to find what it needs. id=1

INSERT id=2,val=b
DELETE id=4
id=2
Build indexes to allow it to jump to id=3
INSERT id=3,val=c
locations in the log. UPDATE val=X (id=3)
id=4
UPDATE val=Y (id=4)
…
CMU 15-445/645 (Fall 2019)
7
Page
log backwards and "recreates" the id=1,val=a
id=2,val=b
tuple to find what it needs. id=3,val=X
id=4,val=Y
Build indexes to allow it to jump to
locations in the log.
Periodically compact the log.
CMU 15-445/645 (Fall 2019)

7
Page
log backwards and "recreates" the id=1,val=a
id=2,val=b
tuple to find what it needs. id=3,val=X
id=4,val=Y
Build indexes to allow it to jump to
locations in the log.
Periodically compact the log.
CMU 15-445/645 (Fall 2019)

8
T O D AY ' S A G E N D A
Data Representation
System Catalogs
Storage Models
CMU 15-445/645 (Fall 2019)

9
TUPLE STORAGE
A tuple is essentially a sequence of bytes.

It's the job of the DBMS to interpret those bytes
into attribute types and values.
The DBMS's catalogs contain the schema

information about tables that the system uses to
figure out the tuple's layout.
CMU 15-445/645 (Fall 2019)

10
D ATA R E P R E S E N TAT I O N
INTEGER/BIGINT/SMALLINT/TINYINT
→ C/C++ Representation
FLOAT/REAL vs. NUMERIC/DECIMAL
→ IEEE-754 Standard / Fixed-point Decimals
VARCHAR/VARBINARY/TEXT/BLOB
→ Header with length, followed by data bytes.
TIME/DATE/TIMESTAMP
→ 32/64-bit integer of (micro)seconds since Unix epoch
CMU 15-445/645 (Fall 2019)

11
VA R I A B L E P R E C I S I O N N U M B E R S
Inexact, variable-precision numeric type that uses

the "native" C/C++ types.
→ Examples: FLOAT, REAL/DOUBLE
Store directly as specified by IEEE-754.
Typically faster than arbitrary precision numbers

but can have rounding errors…
CMU 15-445/645 (Fall 2019)

12
Rounding Example Output

#include <stdio.h> x+y = 0.300000
0.3 = 0.300000
int main(int argc, char* argv[]) {
float x = 0.1;
float y = 0.2;
printf("x+y = %f\n", x+y);
printf("0.3 = %f\n", 0.3);
}
CMU 15-445/645 (Fall 2019)

12
Rounding Example Output

#include <stdio.h> x+y = 0.300000
0.3 = 0.300000
int#include
main(int<stdio.h>
argc, char* argv[]) {
float x = 0.1; x+y = 0.30000001192092895508
int main(int
float argc, char* argv[]) {
y = 0.2; 0.3 = 0.29999999999999998890
float x = =0.1;
printf("x+y %f\n", x+y);
float y = 0.2;
printf("0.3 = %f\n", 0.3);
} printf("x+y = %.20f\n", x+y);
printf("0.3 = %.20f\n", 0.3);
}
CMU 15-445/645 (Fall 2019)

13
FIXED PRECISION NUMBERS
Numeric data types with arbitrary precision and

scale. Used when round errors are unacceptable.
→ Example: NUMERIC, DECIMAL
Typically stored in a exact, variable-length binary

representation with additional meta-data.
→ Like a VARCHAR but not stored as a string
Demo: Postgres, SQL Server, Oracle
CMU 15-445/645 (Fall 2019)

14
POSTGRES: NUMERIC
# of Digits
typedef unsigned char NumericDigit;
Weight of 1st Digit typedef struct {
int ndigits;
Scale Factor int weight;
int scale;
Positive/Negative/NaN int sign;
NumericDigit *digits;
Digit Storage } numeric;
CMU 15-445/645 (Fall 2019)

14
POSTGRES: NUMERIC
# of Digits
typedef unsigned char NumericDigit;
Weight of 1st Digit typedef struct {
int ndigits;
Scale Factor int weight;
int scale;
Positive/Negative/NaN int sign;
NumericDigit *digits;
Digit Storage } numeric;
CMU 15-445/645 (Fall 2019)

15
L A R G E VA L U E S
Tuple
Most DBMSs don't allow a tuple to Header a b c d e
exceed the size of a single page.
To store values that are larger than a

page, the DBMS uses separate Overflow Page
overflow storage pages. VARCHAR DATA
→ Postgres: TOAST (>2KB)
→ MySQL: Overflow (>½ size of page)
→ SQL Server: Overflow (>size of page)
CMU 15-445/645 (Fall 2019)

16
E X T E R N A L VA L U E S T O R A G E
Tuple
Some systems allow you to store a Header a b c d e
really large value in an external file.
Treated as a BLOB type.
→ Oracle: BFILE data type
→ Microsoft: FILESTREAM data type External File
The DBMS cannot manipulate the
contents of an external file. Data
→ No durability protections.
→ No transaction protections.
CMU 15-445/645 (Fall 2019)

16
E X T E R N A L VA L U E S T O R A G E
Tuple
Some systems allow you to store a Header a b c d e
really large value in an external file.
Treated as a BLOB type.
→ Oracle: BFILE data type
→ Microsoft: FILESTREAM data type External File
The DBMS cannot manipulate the
contents of an external file. Data
→ No durability protections.
→ No transaction protections.
CMU 15-445/645 (Fall 2019)

17
S Y S T E M C ATA L O G S
A DBMS stores meta-data about databases in its

internal catalogs.
→ Tables, columns, indexes, views
→ Users, permissions
→ Internal statistics
Almost every DBMS stores their a database's

catalog in itself.
→ Wrap object abstraction around tuples.
→ Specialized code for "bootstrapping" catalog tables.
CMU 15-445/645 (Fall 2019)

18
S Y S T E M C ATA L O G S
You can query the DBMS’s internal

INFORMATION_SCHEMA catalog to get info about
the database.
→ ANSI standard set of read-only views that provide info
about all of the tables, views, columns, and procedures in
a database
DBMSs also have non-standard shortcuts to

retrieve this information.
CMU 15-445/645 (Fall 2019)

19
A C C E S S I N G TA B L E S C H E M A
List all the tables in the current database:

SELECT * SQL-92
FROM INFORMATION_SCHEMA.TABLES
WHERE table_catalog = '<db name>';
\d; Postgres
SHOW TABLES; MySQL
.tables; SQLite
CMU 15-445/645 (Fall 2019)

20
A C C E S S I N G TA B L E S C H E M A
List all the tables in the student table:

SELECT * SQL-92
FROM INFORMATION_SCHEMA.TABLES
WHERE table_name = 'student'
\d student; Postgres
DESCRIBE student; MySQL
.schema student; SQLite
CMU 15-445/645 (Fall 2019)

23
O B S E R VAT I O N
The relational model does not specify that we

have to store all of a tuple's attributes together in a
single page.
This may not actually be the best layout for some

workloads…
CMU 15-445/645 (Fall 2019)

24
WIKIPEDIA EXAMPLE
CREATE TABLE useracct ( CREATE TABLE pages (
userID INT PRIMARY KEY, pageID INT PRIMARY KEY,
userName VARCHAR UNIQUE, title VARCHAR UNIQUE,
⋮ latest INT
); ⮱REFERENCES revisions (revID),
);
CREATE TABLE revisions (

revID INT PRIMARY KEY,
userID INT REFERENCES useracct (userID),
pageID INT REFERENCES pages (pageID),
content TEXT,
updated DATETIME
);
CMU 15-445/645 (Fall 2019)
25
O LT P
SELECT P.*, R.*
On-line Transaction Processing: FROM pages AS P
→ Simple queries that read/update a small INNER JOIN revisions AS R
amount of data that is related to a single ON P.latest = R.revID
entity in the database. WHERE P.pageID = ?
This is usually the kind of application UPDATE useracct

that people build first. SET lastLogin = NOW(),
hostname = ?
WHERE userID = ?
INSERT INTO revisions

VALUES (?,?…,?)
CMU 15-445/645 (Fall 2019)
26
OLAP
SELECT COUNT(U.lastLogin),
On-line Analytical Processing: EXTRACT(month FROM
→ Complex queries that read large portions U.lastLogin) AS month
FROM useracct AS U
of the database spanning multiple entities. WHERE U.hostname LIKE '%.gov'
GROUP BY
EXTRACT(month FROM U.lastLogin)
You execute these workloads on the
data you have collected from your
OLTP application(s).
CMU 15-445/645 (Fall 2019)

Operation Complexity W O R K L O A D C H A R A C T E R I Z AT I O N
Complex
OLAP
HTAP
OLTP
Simple
Writes Reads
Workload Focus [SOURCE]
CMU 15-445/645 (Fall 2019)
28
D ATA S T O R A G E M O D E L S
The DBMS can store tuples in different ways that

are better for either OLTP or OLAP workloads.
We have been assuming the n-ary storage model

(aka "row storage") so far this semester.
CMU 15-445/645 (Fall 2019)

29
N-ARY STORAGE MODEL (NSM)
The DBMS stores all attributes for a single tuple

contiguously in a page.
Ideal for OLTP workloads where queries tend to

operate only on an individual entity and insert-
heavy workloads.
CMU 15-445/645 (Fall 2019)

30

Header userID userName userPass hostname lastLogin ←Tuple #1

Header - - - - - ←Tuple #4
CMU 15-445/645 (Fall 2019)

30

NSM Disk Page

Header userID userName userPass hostname lastLogin
Header - - - - -
CMU 15-445/645 (Fall 2019)

31

SELECT * FROM useracct
WHERE userName = ? Lecture 7
AND userPass = ?
Index
NSM Disk Page

Header - - - - -
CMU 15-445/645 (Fall 2019)

31

SELECT * FROM useracct
WHERE userName = ? Lecture 7
AND userPass = ?
Index
INSERT INTO useracct
VALUES (?,?,…?)
NSM Disk Page
Header userID
- userName
- userPass
- hostname
- lastLogin
-
CMU 15-445/645 (Fall 2019)

32

EXTRACT(month FROM U.lastLogin) AS month
FROM useracct AS U
WHERE U.hostname LIKE '%.gov'
GROUP BY EXTRACT(month FROM U.lastLogin)
CMU 15-445/645 (Fall 2019)

32

FROM useracct AS U
NSM Disk Page

CMU 15-445/645 (Fall 2019)

32

FROM useracct AS U
NSM Disk Page

CMU 15-445/645 (Fall 2019)

32

FROM useracct AS U
NSM Disk Page

CMU 15-445/645 (Fall 2019)

32

FROM useracct AS U
NSM Disk Page

Useless Data
CMU 15-445/645 (Fall 2019)
33
N-ARY STORAGE MODEL
Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple.
Disadvantages
→ Not good for scanning large portions of the table and/or
a subset of the attributes.
CMU 15-445/645 (Fall 2019)

34
DECOMPOSITION STORAGE MODEL (DSM)
The DBMS stores the values of a single attribute

for all tuples contiguously in a page.
→ Also known as a "column store".
Ideal for OLAP workloads where read-only

queries perform large scans over a subset of the
table’s attributes.
CMU 15-445/645 (Fall 2019)

35

CMU 15-445/645 (Fall 2019)

35

DSM Disk Page

hostname hostname hostname hostname hostname hostname
userID lastLogin hostname hostname hostname hostname hostname hostname

userName
userPass
CMU 15-445/645 (Fall 2019)

36
DECOMPOSITION STORAGE MODEL

(DSM)
FROM useracct AS U
CMU 15-445/645 (Fall 2019)

36
DECOMPOSITION STORAGE MODEL

(DSM)
FROM useracct AS U
DSM Disk Page

CMU 15-445/645 (Fall 2019)

37
T U P L E I D E N T I F I C AT I O N
Choice #1: Fixed-length Offsets

→ Each value is the same length for an attribute.
Choice #2: Embedded Tuple Ids
→ Each value is stored with its tuple id in a column.
Offsets Embedded Ids

A B C D A B C D
0 0 0 0 0
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
CMU 15-445/645 (Fall 2019)
38
Advantages
→ Reduces the amount wasted I/O because the DBMS only
reads the data that it needs.
→ Better query processing and data compression (more on
this later).
Disadvantages
→ Slow for point queries, inserts, updates, and deletes
because of tuple splitting/stitching.
CMU 15-445/645 (Fall 2019)

39
DSM SYSTEM HISTORY
1970s: Cantor DBMS

1980s: DSM Proposal
1990s: SybaseIQ (in-memory only)
2000s: Vertica, VectorWise, MonetDB
2010s: Everyone
CMU 15-445/645 (Fall 2019)

40
CONCLUSION
The storage manager is not entirely independent

from the rest of the DBMS.
It is important to choose the right storage model

for the target workload:
→ OLTP = Row Store
→ OLAP = Column Store
CMU 15-445/645 (Fall 2019)

41
D ATA B A S E S T O R A G E
Problem #1: How the DBMS represents the

database in files on disk.
Problem #2: How the DBMS manages its memory

and move data back-and-forth from disk. ← Next
CMU 15-445/645 (Fall 2019)

Database Storage: Intro To Database Systems Andy Pavlo

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Database Storage: Intro To Database Systems Andy Pavlo

Uploaded by

Copyright:

Available Formats

Database Storage

Intro to Database Systems Andy Pavlo

Homework #1 is due September 11th @ 11:59pm

Project #1 will be released on September 11th

CMU 15-445/645 (Fall 2019)

The DBMS assumes that the primary storage

The DBMS's components manage the movement

CMU 15-445/645 (Fall 2019)

The slot array maps "slots" to the

The slot array maps "slots" to the

The system appends log records to the DELETE id=4

file of how the database was modified: INSERT id=3,val=c

CMU 15-445/645 (Fall 2019)

tuple to find what it needs. INSERT id=2,val=b

tuple to find what it needs. id=1

Periodically compact the log.

CMU 15-445/645 (Fall 2019)

Periodically compact the log.

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

A tuple is essentially a sequence of bytes.

The DBMS's catalogs contain the schema

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

Inexact, variable-precision numeric type that uses

Store directly as specified by IEEE-754.

Typically faster than arbitrary precision numbers

CMU 15-445/645 (Fall 2019)

Rounding Example Output

CMU 15-445/645 (Fall 2019)

Rounding Example Output

CMU 15-445/645 (Fall 2019)

FIXED PRECISION NUMBERS

Numeric data types with arbitrary precision and

Typically stored in a exact, variable-length binary

Demo: Postgres, SQL Server, Oracle

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

To store values that are larger than a

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

CMU 15-445/645 (Fall 2019)

A DBMS stores meta-data about databases in its

Almost every DBMS stores their a database's

CMU 15-445/645 (Fall 2019)

You can query the DBMS’s internal

DBMSs also have non-standard shortcuts to

CMU 15-445/645 (Fall 2019)

List all the tables in the current database:

SHOW TABLES; MySQL

CMU 15-445/645 (Fall 2019)

List all the tables in the student table:

DESCRIBE student; MySQL

.schema student; SQLite

CMU 15-445/645 (Fall 2019)

The relational model does not specify that we

This may not actually be the best layout for some

CMU 15-445/645 (Fall 2019)

CREATE TABLE revisions (

This is usually the kind of application UPDATE useracct

INSERT INTO revisions

CMU 15-445/645 (Fall 2019)