
Demystifying PostgreSQL

Presented by
Asher Snyder
co-founder of NOLOH

www.postgresql.org
Latest PostgreSQL version 9.0.1

@ashyboy
What is PostgreSQL?
• Completely Open Source Database System
– Started in 1995
– Open source project
– Not owned by any one company
– Controlled by the community
– Can’t be bought (Looking at you ORACLE)
• ORDBMS
– Object-relational database management system
• RDBMS with object-oriented database model
– That’s right, it’s not just an RDBMS
– You can create your own objects
– You can inherit
• Fully ACID compliant
– Atomicity, Consistency, Isolation, Durability. Guarantees that
database transactions are processed reliably.
• ANSI SQL compliant
Notable Features

• Transactions
• Functions (Stored procedures)
• Rules
• Views
• Triggers
• Inheritance
• Custom Types
• Referential Integrity
• Array Data Types
• Schemas
• Hot Standby (as of 9.0)
• Streaming Replication (as of 9.0)
Support

• Excellent Personal Support!


– Vibrant Community
• Active Mailing List
• Active IRC - #postgreSQL on irc.freenode.net
– Absolutely amazing

• Complete, Extensive and Detailed Documentation


– http://www.postgresql.org/docs/
• Regular and frequent releases and updates
– Public Roadmap
• Support for older releases
– Updates are currently released for versions as
old as 5 years
Support (cont)

• Numerous Shared Hosts


– Such as A2hosting (http://www.a2hosting.com)
• Numerous GUI Administration Tools
– pgAdmin (http://www.pgadmin.org/)
– phpPgAdmin (http://phppgadmin.sourceforge.net/)
– Notable commercial tools
• Navicat (http://www.navicat.com)
• EMS (http://sqlmanager.net)
– Many many more
• http://wiki.postgresql.org/wiki/Community_Guide_to_PostgreSQL_GUI_Tools
Installation

• Despite what you’ve heard, Postgres is NOT hard to install
On Ubuntu or Debian:
$ apt-get install postgresql

On Gentoo:
$ emerge postgresql

Manual Installation:
$ adduser postgres
$ mkdir /usr/local/pgsql/data
$ chown postgres /usr/local/pgsql/data
$ su - postgres
$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
$ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data
Installation (cont)

• Regardless of which installation method you choose,
make sure to modify the postgresql.conf and pg_hba.conf
configuration files if you want to allow outside access.
pg_hba.conf - Controls which hosts are allowed to connect
# TYPE DATABASE USER CIDR-ADDRESS METHOD
host all all 0.0.0.0/0 md5
Allows connections from any IPv4 address, using md5 password authentication
postgresql.conf - PostgreSQL configuration file
# - Connection Settings
listen_addresses = '*' # IP address to listen on
#listen_addresses = 'localhost' # default
Allows PostgreSQL to listen on any address
Getting Started

Start using your distro’s init script

$ /etc/init.d/postgresql start

Alternatively, you can start it manually

$ /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data
Creating Your Database

Launch psql

$ psql
Create your first PostgreSQL database

CREATE DATABASE first_db;


Connect to your newly created database

\c first_db
Tables
Tables - Creating

CREATE TABLE users (
    "user_id" SERIAL,
    "email" TEXT,
    "firstname" TEXT,
    "lastname" TEXT,
    "password" TEXT,
    PRIMARY KEY("user_id")
);

• What’s a SERIAL?
– Short for INTEGER NOT NULL with default value of
nextval('users_user_id_seq')
• Similar to AUTO_INCREMENT property of other databases.
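
As a rough sketch of what SERIAL expands to, the statements below build an equivalent column by hand. The table name users_manual and its sequence name are hypothetical, chosen only so the example does not collide with the users table above.

```sql
-- Sequence that supplies the ids (SERIAL creates one implicitly)
CREATE SEQUENCE users_manual_user_id_seq;

CREATE TABLE users_manual (
    "user_id" INTEGER NOT NULL DEFAULT nextval('users_manual_user_id_seq'),
    "email" TEXT,
    PRIMARY KEY("user_id")
);

-- Tie the sequence to the column so it is dropped together with it
ALTER SEQUENCE users_manual_user_id_seq OWNED BY users_manual.user_id;
```

Because the id comes from a sequence, explicitly inserted ids do not advance it; that is why the inserts that follow rely on DEFAULT.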
Tables - Inserting

We can explicitly define each field and the value associated with it.
This will set user_id to the default value.

INSERT INTO users (email, firstname, lastname, password)
VALUES ('asnyder@noloh.com', 'Asher', 'Snyder', 'Pass');

Alternatively, we can omit the column names and insert
based on the column order, using DEFAULT for the user_id.

INSERT INTO users VALUES (DEFAULT, 'john@example.com',
'John', 'Doe', 'OtherPass');

Let’s see the results

SELECT * FROM users;
 user_id |       email       | firstname | lastname | password
---------+-------------------+-----------+----------+-----------
       1 | asnyder@noloh.com | Asher     | Snyder   | Pass
       2 | john@example.com  | John      | Doe      | OtherPass
Views
Views

• Views allow you to store a query for easy retrieval later.
– Query a view as if it were a table
– Allows you to name your query
– Use views as types
– Abstract & encapsulate table structure changes
• Allows for easier modification & extension of your database

Create basic view


CREATE VIEW v_get_all_users AS
SELECT * FROM users;

Query the view

SELECT * FROM v_get_all_users;
 user_id |       email       | firstname | lastname | password
---------+-------------------+-----------+----------+-----------
       1 | asnyder@noloh.com | Asher     | Snyder   | Pass
       2 | john@example.com  | John      | Doe      | OtherPass
Views (cont)

Alter Table
ALTER TABLE users ADD COLUMN "time_created"
TIMESTAMP WITHOUT TIME ZONE DEFAULT now();

Now, if we were to query the table we would see a timestamp showing us when
the user was created.

SELECT * FROM users;

 user_id |       email       | firstname | lastname | password  |        time_created
---------+-------------------+-----------+----------+-----------+----------------------------
       1 | asnyder@noloh.com | Asher     | Snyder   | Pass      | 2010-10-27 14:30:07.335936
       2 | john@example.com  | John      | Doe      | OtherPass | 2010-10-27 14:30:07.335936

As you can see, this is not very useful for humans. This is where a view can
come in to make your life easier.
Views (cont)

Alter View
CREATE OR REPLACE VIEW v_get_all_users
AS
SELECT user_id,
       email,
       firstname,
       lastname,
       password,
       time_created,
       to_char(time_created, 'FMMonth FMDDth, YYYY FMHH12:MI:SS AM') AS friendly_time
FROM users;
Views (cont)

Now, when we query the view we can actually interpret time_created

SELECT * FROM v_get_all_users;

 user_id |       email       | firstname | lastname | password  |        time_created
---------+-------------------+-----------+----------+-----------+----------------------------
       1 | asnyder@noloh.com | Asher     | Snyder   | Pass      | 2010-10-27 14:30:07.335936
       2 | john@example.com  | John      | Doe      | OtherPass | 2010-10-27 15:20:05.235936

|         friendly_time
+-------------------------------
| October 27th, 2010 2:30:07 PM
| October 27th, 2010 3:20:05 PM
Views (cont) – Joined View
Finally, let’s create a joined view

Create companies table


CREATE TABLE companies (
"company_id" SERIAL,
"company_name" TEXT,
PRIMARY KEY("company_id")
);
Add company
INSERT INTO companies VALUES(DEFAULT, 'NOLOH LLC.');

Add company_id to users


ALTER TABLE users ADD COLUMN company_id INTEGER;

Update users
UPDATE users SET company_id = 1;
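
Referential Integrity was listed among the notable features, and the new company_id column is a natural place for it. A minimal sketch, assuming we want PostgreSQL to reject users that point at a company that does not exist; the constraint name is a hypothetical choice.

```sql
-- Every users.company_id must match an existing companies.company_id (or be NULL)
ALTER TABLE users
    ADD CONSTRAINT users_company_id_fkey
    FOREIGN KEY (company_id) REFERENCES companies (company_id);

-- With the constraint in place, a statement like this would be rejected
-- as long as no company with id 999 exists:
-- UPDATE users SET company_id = 999 WHERE user_id = 1;
```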
Views (cont) – Joined View
Alter view
CREATE OR REPLACE VIEW v_get_all_users
AS
SELECT user_id,
       email,
       firstname,
       lastname,
       password,
       time_created,
       to_char(time_created, 'FMMonth FMDDth, YYYY FMHH12:MI:SS AM') AS friendly_time,
       t2.company_id,
       t2.company_name
FROM users t1
LEFT JOIN companies t2 ON (t1.company_id = t2.company_id);
Views (cont)
Query view

SELECT * FROM v_get_all_users;

 user_id |       email       | firstname | lastname | password  |        time_created
---------+-------------------+-----------+----------+-----------+----------------------------
       1 | asnyder@noloh.com | Asher     | Snyder   | Pass      | 2010-10-27 14:30:07.335936
       2 | john@example.com  | John      | Doe      | OtherPass | 2010-10-27 15:20:05.235936

|         friendly_time         | company_id | company_name
+-------------------------------+------------+--------------
| October 27th, 2010 2:30:07 PM |          1 | NOLOH LLC.
| October 27th, 2010 3:20:05 PM |          1 | NOLOH LLC.

Nice! Now, instead of having to modify a query each time, we can just use
v_get_all_users. We can even use this VIEW as a return type when
creating our own database functions.
Functions
Functions
• Also known as Stored Procedures
• Allows you to carry out operations that would normally
take several queries and round-trips in a single function
within the database
• Allows database re-use, as other applications can interact
directly with your stored procedures instead of going through
a middle tier or duplicating code
• Can be used in other functions
• First class citizens
– Query functions like tables
– Create functions in language of your choice
• SQL, PL/pgSQL, C, Python, etc.
– Allowed to modify tables and perform multiple operations
– Defaults
– In/Out parameters
Functions (cont)
Create basic function
CREATE FUNCTION f_add_company(p_name TEXT) RETURNS INTEGER
AS
$func$
    INSERT INTO companies (company_name) VALUES ($1)
    RETURNING company_id;
$func$
LANGUAGE SQL;

Call f_add_company
SELECT f_add_company('Google');
f_add_company
---------------
2
(1 row)
Functions (cont) – Syntax Explained
• Return type
– In this case we used INTEGER
• Can be anything you like, including your views and your own custom types
– Can even return an ARRAY of a type, such as INTEGER[]
• Do not confuse this with returning a SET of a type.
• Returning multiple rows of a particular type is done
through RETURNS SETOF. For example, RETURNS SETOF
v_get_all_users.
• $func$ is our delimiter separating the function body. It
can be any string you like. For example $body$ instead of
$func$.
• $1 refers to the parameter at that position. In PL/pgSQL
(and, as of PostgreSQL 9.2, in SQL functions) we can also use
the parameter name instead. In our example this would be p_name.
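
To make the SETOF case concrete, here is a minimal sketch of a function that uses the v_get_all_users view as its return type. The function name f_get_users_by_company is hypothetical and does not appear elsewhere in these slides.

```sql
-- Returns every row of the view that belongs to the given company
CREATE FUNCTION f_get_users_by_company(p_company_id INTEGER)
RETURNS SETOF v_get_all_users
AS
$func$
    SELECT * FROM v_get_all_users WHERE company_id = $1;
$func$
LANGUAGE SQL;

-- Set-returning functions can be queried like tables
SELECT email, friendly_time FROM f_get_users_by_company(1);
```

Because the return type is tied to the view, the function's result columns follow the view definition.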
Functions (cont) – PL/pgSQL

• If you’re using PostgreSQL < 9.0 make sure you add PL/pgSQL support

CREATE TRUSTED PROCEDURAL LANGUAGE "plpgsql"
    HANDLER "plpgsql_call_handler"
    VALIDATOR "plpgsql_validator";
Functions (cont) – PL/pgSQL

CREATE OR REPLACE FUNCTION f_add_company(p_name TEXT) RETURNS INTEGER
AS
$func$
DECLARE
    return_var INTEGER;
BEGIN
    SELECT INTO return_var company_id FROM companies
        WHERE lower(company_name) = lower($1);
    IF NOT FOUND THEN
        INSERT INTO companies (company_name) VALUES ($1)
            RETURNING company_id INTO return_var;
    END IF;
    RETURN return_var;
END;
$func$
LANGUAGE plpgsql;
Functions (cont)

Call f_add_company
SELECT f_add_company('Zend');
f_add_company
---------------
3
(1 row)

Call f_add_company with repeated entry


SELECT f_add_company('Google');
f_add_company
---------------
2
(1 row)

• We can see that using functions in our database lets us
build safeguards and business logic into the database itself,
allowing for increased modularity and re-use.
Triggers
Triggers
• Specifies a function to be called BEFORE or AFTER any
INSERT, UPDATE, or DELETE operation
– Similar to an event handler (but clearly confined) in event
based programming
– BEFORE triggers fire before the row is actually inserted,
updated, or deleted.
– AFTER triggers fire after the statement is executed and
the row has been changed
• The trigger function takes NO parameters and returns type
trigger
• Most commonly used for logging or validation
• Can be defined on an entire statement or on a per-row basis
– Statement level triggers should always return NULL
• As of 9.0, triggers can be limited to specific columns and
to specific WHEN conditions
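
For the validation use case, a BEFORE row trigger can reject bad data before it reaches the table. The sketch below is only an illustration against the users table from earlier; the names f_validate_email and tr_validate_email and the very loose "contains an @" rule are assumptions, not part of the original example.

```sql
CREATE FUNCTION f_validate_email()
RETURNS trigger AS
$func$
BEGIN
    -- Reject rows whose email is missing or has no @ sign
    IF NEW.email IS NULL OR position('@' in NEW.email) = 0 THEN
        RAISE EXCEPTION 'invalid email: %', NEW.email;
    END IF;
    -- Returning NEW lets the INSERT or UPDATE proceed with this row
    RETURN NEW;
END;
$func$
LANGUAGE plpgsql;

CREATE TRIGGER tr_validate_email BEFORE INSERT OR UPDATE
ON users FOR EACH ROW
EXECUTE PROCEDURE f_validate_email();
```

Raising an exception aborts the statement, so the invalid row is never written.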
Triggers (cont)
Create logs table
CREATE TABLE logs (
"log_id" SERIAL,
"log" TEXT,
PRIMARY KEY("log_id")
);

Create trigger handler


CREATE FUNCTION "tr_log_handler"()
RETURNS trigger AS
$func$
DECLARE
    log_string TEXT;
BEGIN
    -- Start the entry with the user and the time of the change
    log_string := 'User ' || OLD.user_id || ' changed ' || CURRENT_TIMESTAMP || ' ';
    -- Append a note for each column whose value changed
    IF NEW.email != OLD.email THEN
        log_string := log_string || 'email changed from '
            || OLD.email || ' to ' || NEW.email || '. ';
    END IF;
    IF NEW.firstname != OLD.firstname THEN
        log_string := log_string || 'firstname changed from '
            || OLD.firstname || ' to ' || NEW.firstname || '. ';
    END IF;
    IF NEW.lastname != OLD.lastname THEN
        log_string := log_string || 'lastname changed from '
            || OLD.lastname || ' to ' || NEW.lastname || '. ';
    END IF;
    INSERT INTO logs (log) VALUES (log_string);
    -- The return value of an AFTER row trigger is ignored
    RETURN NEW;
END;
$func$
LANGUAGE plpgsql;
Triggers (cont)
Create trigger
CREATE TRIGGER "tr_log" AFTER UPDATE
ON users FOR EACH ROW
EXECUTE PROCEDURE "tr_log_handler"();

Update user
UPDATE users SET firstname = 'Ash' WHERE user_id = 1;

Display logs
SELECT * FROM logs;
log_id | log
--------+---------------------------------------------------------------------------------
1 | User 1 changed 2010-08-30 23:43:23.771347-04 firstname changed from Asher to Ash.
(1 row)
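
The column-specific and WHEN options available as of 9.0 can be sketched as follows. This is an illustrative variant that reuses tr_log_handler; the trigger name tr_log_email is hypothetical.

```sql
-- Fires only for UPDATEs that list the email column,
-- and only when the stored value actually changes
CREATE TRIGGER "tr_log_email" AFTER UPDATE OF email
ON users FOR EACH ROW
WHEN (OLD.email IS DISTINCT FROM NEW.email)
EXECUTE PROCEDURE "tr_log_handler"();
```

If this were installed alongside tr_log, an email change would be logged twice, so in practice one would pick one of the two triggers.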
Full-Text Search
Full-Text Search

• Allows documents to be preprocessed into a searchable form
• Search through text to find matches
• Sort matches based on relevance
– Apply weights to certain attributes to
increase/decrease relevance
• Faster than LIKE, ILIKE, and regular-expression
matching
• Define your own dictionaries
– Define your own words, synonyms, phrase
relationships, word variations
Full-Text Search

• tsvector
– Vectorized text data
• tsquery
– Search predicate
• Converted to Normalized lexemes
– Ex. run, runs, ran and running are forms of the same lexeme

• @@
– Match operator
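
A small sketch of the difference between a plain cast and normalization, assuming the built-in 'english' text search configuration:

```sql
-- A plain cast only splits on whitespace; the words are kept as written
SELECT 'running runs'::tsvector;
-- 'running' 'runs'

-- to_tsvector reduces words to lexemes and records their positions
SELECT to_tsvector('english', 'running runs');
-- 'run':1,2
```

This is why casts only match exact words, while to_tsvector and to_tsquery match on normalized lexemes.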
Full-Text Search (cont)
Match Test 1
SELECT 'zendcon 2010 is a great conference'::tsvector
@@ 'conference & great'::tsquery;
?column?
---------
t
(1 row)

Match Test 2
SELECT 'zendcon 2010 is a great conference'::tsvector
@@ 'conference & bad'::tsquery;
?column?
---------
f
(1 row)
Full-Text Search (cont) – to_
to_tsvector & plainto_tsquery
SELECT to_tsvector('zendcon 2010 is a great conference')
@@ plainto_tsquery('conference & great');
?column?
---------
t
(1 row)

Search through tables


SELECT * FROM companies WHERE to_tsvector(company_name) @@
to_tsquery('unlimited');
company_id | company_name
------------+--------------
(0 rows)
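
When searching a table this way, every row has to be converted with to_tsvector at query time. A common approach is an expression index; the sketch below assumes the 'english' configuration and uses a hypothetical index name.

```sql
-- The two-argument form of to_tsvector is required in an index expression
CREATE INDEX companies_name_fts_idx
    ON companies
    USING gin (to_tsvector('english', company_name));

-- The query must use the same expression for the index to be considered
SELECT * FROM companies
WHERE to_tsvector('english', company_name) @@ to_tsquery('english', 'google');
```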
Full-Text Search

• setweight
– Possible weights are A, B, C, D
• By default the ranking functions treat these as 1.0, 0.4, 0.2, 0.1 respectively
• ts_rank
– Normal ranking function
• ts_rank_cd
– uses the cover density method of ranking, as
specified in Clarke, Cormack, and Tudhope’s
“Relevance Ranking for One to Three Term
Queries” in the journal, “Information Processing
and Management”, 1999.
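
As a brief sketch of how weights attach to lexemes and how the default values can be overridden, assuming the 'english' configuration and a made-up sample string:

```sql
-- setweight labels every lexeme in the vector with the given weight
SELECT setweight(to_tsvector('english', 'copyright law'), 'A');
-- 'copyright':1A 'law':2A

-- The optional first argument supplies custom weights in the order {D, C, B, A}
SELECT ts_rank('{0.1, 0.2, 0.4, 1.0}',
               setweight(to_tsvector('english', 'copyright law'), 'A'),
               to_tsquery('english', 'copyright'));
```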
Full-Text Search (cont) - Ranking
setweight & ts_rank_cd
SELECT case_id, file_name,
       ts_rank_cd(setweight(to_tsvector(coalesce(t1.file_name, '')), 'A') ||
                  setweight(to_tsvector(coalesce(t1.contents, '')), 'B'), query) AS rank
FROM cases.cases t1, plainto_tsquery('copyright') query
WHERE to_tsvector(file_name || ' ' || contents) @@ query
ORDER BY rank DESC LIMIT 10;

case_id | file_name | rank
---------+------------------------------------------------------+------
101 | Harper & Row v Nation Enter 471 US 539.pdf | 84.8
113 | IN Re Literary Works Elec Databases 509 F3d 522.pdf | 76
215 | Lexmark Intl v Static Control 387 F3d 522.pdf | 75.2
283 | Feist Pubs v Rural Tel 499 US 340.pdf | 67.2
216 | Lucks Music Library v Ashcroft 321 F Supp 2d 107.pdf | 59.2
342 | Blue Nile v Ice 478 F Supp 2d 1240.pdf | 50.8
374 | Perfect 10 v Amazon 487 F3d 701.pdf | 43.6
85 | Pillsbury v Milky Way 215 USPQ 124.pdf | 43.6
197 | St Lukes Cataract v Sanderson 573 F3d 1186.pdf | 42
272 | Religious Technology v Netcom 907 F Supp 1361.pdf | 42
(10 rows)
Communication
Communication - Native

• Use native pgsql functions


– pg_connect, pg_query, pg_fetch_row, etc.
Connect to database
$dbconn = pg_connect("dbname=mary");

// Alternatively, connect with host & port information
$dbconn2 = pg_connect("host=localhost port=5432 dbname=mary");

Query database
$users = pg_query($dbconn, 'SELECT * FROM users');
while ($user = pg_fetch_row($users)) {
    print_r($user);
}
Communication – 3rd Party

• PDO
• Doctrine
• Propel
• Redbean
• Zend Framework
• phpDataMapper
• PHP Framework Implementation
– Most PHP Frameworks provide some sort of
communication and connection functionality
Communication – NOLOH

Create Connection
//Create Connection
$firstDB = new DataConnection(Data::Postgres, 'first_db');

Query database
$users = $firstDB->ExecSQL('SELECT * FROM users');

Query database with params


$firstDB->ExecSQL('INSERT INTO companies (company_name) VALUES($1)', $company);

Query function
$user = $firstDB
->ExecFunction('f_add_user', $email, $first, $last, $pass, $company);

Query view
$users = $firstDB->ExecView('v_get_all_users');

Data::$Links – Persistent method to connect and access your database


Data::$Links->FirstDb = new DataConnection(Data::Postgres, 'first_db');
Data::$Links->FirstDb->ExecSQL('SELECT * FROM users');
Questions
