You are on page 1of 5

Writing data

Your client requests PostgreSQL to write data after inserting, updating, or

deleting a row on a table. PostgreSQL needs to bring the page into memory (if

not there), dirty it, and eventually write the page to disk. To improve

performance, it usually writes the transaction first to the WAL log, and the

page gets written to disk by a separate process.

Transactions also complicate the equation, since PostgreSQL might need to

keep several different versions of the data in memory. This is done using

multi-version concurrency control (MVCC), described later.

Learn how PostgreSQL processes a write request below.

● Client issues write request.


● User backend checks for data in shared_buffers.
● If not, reads into shared_buffers.
● User backend dirties data in shared_buffers.
● User backend records transaction in WAL.
● On commit, walwriter commits wal_buffers to pg_xlog.
● bgwriter (clock sweep) or checkpoint (forced) writes dirtied buffers to disk.
Multi-version concurrency control
Given that several clients may access the same data simultaneously, database

systems need to enable concurrency control. Relational database systems

provide the concept of transactions, or units of work that are executed

atomically. This can change several rows in different tables, which makes

concurrency control even harder.

Many database systems use locks to implement concurrency control, which is

feasible, but can result in decreased concurrency as early transactions block

newer ones. PostgreSQL uses multi-version concurrency control (MVCC),

keeping several versions of the same data in memory when needed, and

providing the appropriate version to each transaction. This provides for

concurrent access, but leads to problems if you mix long-lived transactions

with lots of short ones.

Learn more about concurrency control in PostgreSQL below.


Why MVCC?
Maintains data consistency internally.
Enables each transaction to see a snapshot of data (a database version) as it
was some time ago while querying a database.
Prevents transactions from viewing inconsistent data.
Provides transaction isolation in concurrent transactions.
Ensures that readers do not block writers, and writers do not block readers.

Logical and physical layouts


PostgreSQL lays out many of its structures like a file system. Here we will cover

the most important logical structures, and how they might map to physical

structures. Some logical items, like tablespaces and users, do not map to a

particular physical structure.

Learn more about PostgreSQL layout structures below.

instance

● Ties to a directory (data directory)

● Corresponds to one TCP/IP port (for postmaster)

● One postmaster

● One set of shared_buffers


database

● Multiple databases per instance

● Each database maps to a directory under base in the instance’s data directory

(named by its oid, which you can find on the pg_database table)

schema/namespace

● Purely logical grouping of relations

● A schema only exists in catalogs, with no real physical structure

tablespace

● A directory on disk where PostgreSQL can store relations

● Used for tiering storage

user

● Can be assigned across databases

● Does not have physical structure (other than in catalogs)

You might also like