SQL
WHERE
SELECT
column_name1,
column_name2
FROM table_name
WHERE condition
SYNTAX
SELECT
*
FROM payment
WHERE amount = 10.99
SYNTAX
SELECT
first_name,
last_name
FROM customer
WHERE first_name = 'ADAM'
SYNTAX
Result
How many payments were made by the customer
with customer_id = 100?
SELECT
*
FROM payment
WHERE amount > 10.99
SYNTAX
SELECT
*
FROM payment
WHERE amount < 10.99
SYNTAX
SELECT
*
FROM payment
WHERE amount < 10.99
ORDER BY amount DESC
SYNTAX
SELECT
*
FROM payment
WHERE amount <= 10.99
ORDER BY amount DESC
SYNTAX
SELECT
*
FROM payment
WHERE amount >= 10.99
ORDER BY amount DESC
SYNTAX
SELECT
*
FROM payment
WHERE amount != 10.99
SYNTAX
SELECT
*
FROM payment
WHERE amount <> 10.99
SYNTAX
SELECT
first_name,
last_name
FROM customer
WHERE first_name is null
Side note!
SYNTAX
Case sensitive or not?
SELECT
first_name,
last_name
FROM customer
WHERE first_name is not null
Side note!
SYNTAX
Case sensitive or not?
select
first_name,
last_name
FROM customer
WHERE first_name = 'ADAM'
Side note!
SYNTAX
Case sensitive or not?
seLEcT
first_name,
last_name
FROM customer
WHERE first_name = 'ADAM'
Side note!
SYNTAX
Case sensitive or not?
select
FIRst_nAme,
last_name
FROM customer
WHERE first_name = 'ADAM'
Side note!
SYNTAX
Case sensitive or not?
select
"first_name",
last_name
FROM customer
WHERE first_name = 'ADAM'
SYNTAX
select
"First_name",
last_name
FROM customer
WHERE first_name = 'ADAM'
Result
The inventory manager asks you how many rentals have
not been returned yet (return_date is null).
The sales manager asks you for a list of all the
payment_ids with an amount less than or equal to $2.
Include payment_id and the amount.
SELECT
*
FROM payment
WHERE amount = 10.99
AND customer_id = 426
SYNTAX
SELECT
*
FROM payment
WHERE amount = 10.99
OR amount = 9.99
SYNTAX
SELECT
*
FROM payment
WHERE amount = 10.99 OR 9.99
SYNTAX
SELECT
*
FROM payment
WHERE amount = 10.99
OR amount = 9.99
AND customer_id = 426
SYNTAX
SELECT
*
FROM payment
WHERE amount = 10.99
OR (amount = 9.99
AND customer_id = 426)
SYNTAX
SELECT
*
FROM payment
WHERE (amount = 10.99
OR amount = 9.99)
AND customer_id = 426
SYNTAX
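The effect of the parentheses in the three queries above can be checked on a toy table. This is a minimal sketch, assuming an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for the course database; the four rows and the helper ids() are invented for illustration.

```python
import sqlite3

# Toy stand-in for the payment table (rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (payment_id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    [(1, 426, 10.99), (2, 426, 9.99), (3, 100, 10.99), (4, 100, 9.99)],
)

def ids(where):
    """Return the payment_ids that match a WHERE clause, in ascending order."""
    sql = "SELECT payment_id FROM payment WHERE " + where + " ORDER BY payment_id"
    return [row[0] for row in conn.execute(sql)]

# AND binds more tightly than OR, so the first two clauses are equivalent.
no_parens = ids("amount = 10.99 OR amount = 9.99 AND customer_id = 426")
and_first = ids("amount = 10.99 OR (amount = 9.99 AND customer_id = 426)")
or_first = ids("(amount = 10.99 OR amount = 9.99) AND customer_id = 426")

print(no_parens, and_first, or_first)
```

Without parentheses, payment 3 (customer 100, amount 10.99) is still returned, because the customer filter only applies to the 9.99 branch.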
SELECT
first_name,
last_name
FROM customer
WHERE first_name = 'ADAM'
The support manager asks you for a list of all the payments of
the customers 322, 346 and 354 where the amount is either less
than $2 or greater than $10.
Result
BETWEEN … AND …
SELECT
payment_id,
amount
FROM payment
WHERE amount NOT BETWEEN 1.99 AND 6.99
SYNTAX
SELECT
payment_id,
amount
FROM payment
-- date format: 'YYYY-MM-DD'
WHERE payment_date BETWEEN '2020-01-24' AND '2020-01-26'
SYNTAX
SELECT
payment_id,
amount
FROM payment
WHERE payment_date BETWEEN '2020-01-24 0:00' AND '2020-01-26 0:00'
SYNTAX
SELECT
payment_id,
amount
FROM payment
WHERE payment_date BETWEEN '2020-01-24 0:00' AND '2020-01-26 12:00'
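A date used as the upper bound of BETWEEN stands for midnight at the start of that day, so later payments on that day fall outside the range. The sketch below illustrates this under some assumptions: SQLite (via Python's sqlite3) stores the timestamps as ISO text, which sorts chronologically, and the four rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (payment_id INTEGER, payment_date TEXT)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [
        (1, "2020-01-24 09:00:00"),
        (2, "2020-01-25 18:30:00"),
        (3, "2020-01-26 00:00:00"),
        (4, "2020-01-26 15:45:00"),
    ],
)

def ids(lower, upper):
    """Return payment_ids whose payment_date lies between both bounds (inclusive)."""
    sql = (
        "SELECT payment_id FROM payment "
        "WHERE payment_date BETWEEN ? AND ? ORDER BY payment_id"
    )
    return [row[0] for row in conn.execute(sql, (lower, upper))]

# Upper bound at midnight: the afternoon payment on the 26th is excluded.
until_midnight = ids("2020-01-24 00:00:00", "2020-01-26 00:00:00")
# Upper bound at the end of the day: the whole 26th is included.
whole_day = ids("2020-01-24 00:00:00", "2020-01-26 23:59:59")

print(until_midnight, whole_day)
```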
There have been 6 complaints of customers about their
payments.
customer_id: 12,25,67,93,124,234
The concerned payments are all the payments of these
customers with amounts 4.99, 7.99 and 9.99 in January 2020.
Write a SQL query to get a list of the concerned payments!
Result
It should be 7 payments!
There have been some faulty payments and you need to help to
find out how many payments have been affected.
How many payments have been made on January 26th and 27th
2020 (including entire 27th) with an amount between 1.99 and 3.99?
Result
LIKE
SELECT
*
FROM actor
WHERE first_name LIKE 'A%'
Note!
It's case-sensitive
SYNTAX
SELECT
first_name
FROM actor
WHERE first_name LIKE 'a%'
Note!
It's case-sensitive
SYNTAX
SELECT
first_name
FROM actor
WHERE first_name ILIKE 'a%'
Note!
ILIKE is case-INsensitive
SYNTAX
SELECT
*
FROM actor
WHERE first_name LIKE '%A%'
SYNTAX
SELECT
*
FROM actor
WHERE first_name LIKE '_A%'
SYNTAX
SELECT
*
FROM actor
WHERE first_name LIKE '__A%'
SYNTAX
SELECT
*
FROM actor
WHERE first_name NOT LIKE '%A%'
SYNTAX
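The wildcard patterns above can be compared side by side on a few names. This is a sketch, assuming SQLite via Python's sqlite3 and four invented actor names. Note that SQLite's LIKE ignores case for ASCII letters by default, unlike the case-sensitive LIKE described in these slides, so the data and the patterns are kept in upper case to avoid the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (first_name TEXT)")
conn.executemany(
    "INSERT INTO actor VALUES (?)",
    [("ADAM",), ("DAN",), ("BOB",), ("CHAD",)],
)

def names(clause):
    """Return the first names matching a LIKE / NOT LIKE clause, sorted."""
    sql = "SELECT first_name FROM actor WHERE first_name " + clause + " ORDER BY first_name"
    return [row[0] for row in conn.execute(sql)]

starts_with_a = names("LIKE 'A%'")   # % matches any sequence of characters
contains_a = names("LIKE '%A%'")
second_is_a = names("LIKE '_A%'")    # _ matches exactly one character
third_is_a = names("LIKE '__A%'")
no_a = names("NOT LIKE '%A%'")

print(starts_with_a, contains_a, second_is_a, third_is_a, no_a)
```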
SELECT
*
FROM payment
WHERE amount = 10.99
AND customer_id = 426
You need to help the inventory manager to find out:
Result
How many customers are there with a first name that is
3 letters long and either an 'X' or a 'Y' as the last letter in the last
name?
Result
Commenting & Aliases
✓ Comment to make code more readable & understandable
✓ Use multiple-line comments: /* […] */
Note!
Use comments to
explain your code!
SYNTAX
-- 2020/07/01 by Nikolai
-- Query that filters amount
SELECT
*
FROM payment
WHERE amount = 10.99
Note!
Comments will not be
processed as code!
SYNTAX
/* 2020/07/01 by Nikolai
Query to filter by amount*/
SELECT
*
FROM payment
WHERE amount = 10.99
SYNTAX
SELECT
*
FROM payment
WHERE amount = 10.99
--AND customer_id = 426
SYNTAX
SELECT
*
FROM payment
--WHERE amount = 10.99
--AND customer_id = 426
SYNTAX
SELECT
*
FROM payment
/*WHERE amount = 10.99
AND customer_id = 426*/
SYNTAX
SELECT
payment_id AS invoice_no
FROM payment
SYNTAX
SELECT
COUNT(*)
FROM payment
WHERE amount = 10.99
AND customer_id = 426
for today
1. How many movies are there that contain 'Saga'
in the description and where the title starts either
with 'A' or ends with 'R'?
Use the alias 'no_of_movies'.
SELECT
COUNT(*) AS no_of_movies
FROM film
WHERE description LIKE '%Saga%'
AND (title LIKE 'A%'
OR title LIKE '%R')
Solution 2
2. Create a list of all customers where the first name
contains 'ER' and has an 'A' as the second letter.
Order the results by the last name descendingly.
SELECT
*
FROM customer
WHERE first_name LIKE '%ER%'
AND first_name LIKE '_A%'
ORDER BY last_name DESC
Solution 3
3. How many payments are there where the amount
is either 0 or between 3.99 and 7.99 and that
at the same time happened on 2020-05-01?
SELECT
COUNT(*)
FROM payment
WHERE (amount = 0
OR amount BETWEEN 3.99 AND 7.99)
AND payment_date BETWEEN '2020-05-01' AND '2020-05-02'
Solution 4
4. How many distinct last names of the
customers are there?
SELECT
COUNT(DISTINCT last_name)
FROM customer
Day 3
Aggregation functions
SUM
Aggregation functions
AVG
Most common
aggregation functions
SUM()
AVG()
MIN()
MAX()
COUNT()
SYNTAX
SELECT
SUM(amount)
FROM payment
SYNTAX
SELECT
COUNT(*)
FROM payment
What we can't do…
SELECT
SUM(amount),
payment_id
FROM payment
No mixing possible!
Only possible with grouping!
Multiple aggregations is possible!
SELECT
SUM(amount),
COUNT(*),
AVG(amount)
FROM payment
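Several aggregates can be read from a single query, as the slide above states. A minimal sketch, assuming SQLite via Python's sqlite3 and three invented amounts chosen so that the results are exact:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (payment_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO payment VALUES (?, ?)", [(1, 2.0), (2, 4.0), (3, 6.0)])

# One pass over the table returns one row holding all five aggregates.
total, count, average, lowest, highest = conn.execute(
    "SELECT SUM(amount), COUNT(*), AVG(amount), MIN(amount), MAX(amount) FROM payment"
).fetchone()

print(total, count, average, lowest, highest)
```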
Your manager wants to know which of the two employees (staff_id)
is responsible for more payments.
Result
Solution
SELECT
MIN(replacement_cost),
MAX(replacement_cost),
ROUND(AVG(replacement_cost),2) AS AVG,
SUM(replacement_cost)
FROM film
GROUP BY
SUM
SYNTAX
SELECT
customer_id,
SUM(amount)
FROM payment
GROUP BY customer_id
SYNTAX
SELECT
customer_id,
SUM(amount)
FROM payment
WHERE customer_id >3
GROUP BY customer_id
SYNTAX
SELECT
customer_id,
SUM(amount)
FROM payment
WHERE customer_id >3
GROUP BY customer_id
ORDER BY customer_id
SYNTAX
SELECT
customer_id,
SUM(amount)
FROM payment
GROUP BY customer_id
Every column:
In GROUP BY or in aggregate functions
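The order of the clauses (WHERE filters rows, GROUP BY forms the groups, ORDER BY sorts the result) can be tried on a toy table. A sketch, assuming SQLite via Python's sqlite3 and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(1, 5.0), (4, 2.0), (4, 3.0), (5, 10.0), (5, 1.0)],
)

# customer_id is in GROUP BY, amount is inside an aggregate function.
totals = conn.execute(
    """
    SELECT customer_id, SUM(amount)
    FROM payment
    WHERE customer_id > 3        -- row filter, applied before grouping
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

print(totals)
```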
There are two competitions between the two employees.
Result
Your manager wants to get a better understanding of the
films.
That's why you are asked to write a query to see the
• Minimum
• Maximum
• Average (rounded)
• Sum
of the replacement cost of the films.
Write a SQL query to get the answers!
Result
HAVING
HAVING
COUNT(*)>400
Note!
HAVING filters groups, so it is
normally used with GROUP BY!
SYNTAX
SELECT
customer_id,
SUM(amount)
FROM payment
GROUP BY customer_id
HAVING SUM(amount)>200
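HAVING is applied after the groups have been aggregated, which is why it can refer to SUM(amount). A sketch, assuming SQLite via Python's sqlite3 and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(1, 150.0), (1, 100.0), (2, 50.0), (3, 201.0)],
)

# Groups are built first; HAVING then keeps only groups with a total above 200.
big_spenders = conn.execute(
    """
    SELECT customer_id, SUM(amount)
    FROM payment
    GROUP BY customer_id
    HAVING SUM(amount) > 200
    ORDER BY customer_id
    """
).fetchall()

print(big_spenders)
```

Customer 1 qualifies only through the sum of two payments, which a WHERE filter on individual rows could not express.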
In 2020, April 28, 29 and 30 were days with very high revenue.
That's why we want to focus in this task only on these days
(filter accordingly).
Find out what is the average payment amount grouped by
customer and day – consider only the days/customers with
more than 1 payment (per customer and day).
Order by the average amount in a descending order.
Result
Day 4
In the email system there was a problem with names where
either the first name or the last name is more than 10 characters
long.
Find these customers and output the list of these first and last
names in all lower case.
Result
In this challenge you have only the email address and the last
name of the customers.
You need to extract the first name from the email address and
concatenate it with the last name. It should be in the form:
"Last name, First name".
Write a SQL query to find out!
Result
Extract the last 5 characters of the email address first.
How can you extract just the dot '.' from the email address?
Result
SUBSTRING
SUBSTRING
SYNTAX
Length,
How many characters?
EXTRACT (day)
EXTRACT
EXTRACT (seconds)
Date/time types
Result
TO_CHAR
TO_CHAR (YYYY-MM)
TO_CHAR
TO_CHAR (Month)
SYNTAX
date/time/interval/number Format
You need to sum payments and group in the following formats:
Result
You need to create a list for the support team of all rental
durations of the customer with customer_id 35.
Result
Day 5
Mathematical functions and operators
SUM (replacement_cost) + 1
Result
CASE
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
WHEN conditionN THEN resultN
ELSE result
END
(Conditions & results)
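CASE checks its WHEN branches from top to bottom and returns the result of the first one that matches. A sketch, assuming SQLite via Python's sqlite3; the film rows and the class labels are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (title TEXT, length INTEGER)")
conn.executemany("INSERT INTO film VALUES (?, ?)", [("A", 45), ("B", 95), ("C", 200)])

# Film A satisfies both conditions, but only the first matching branch is used.
classes = conn.execute(
    """
    SELECT title,
           CASE
               WHEN length < 60 THEN 'short'
               WHEN length < 120 THEN 'medium'
               ELSE 'long'
           END AS length_class
    FROM film
    ORDER BY title
    """
).fetchall()

print(classes)
```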
Result
You need to find out how many flights have departed in the
following seasons:
• Winter: December, January, February
• Spring: March, April, May
• Summer: June, July, August
• Fall: September, October, November
Result
You want to create a tier list in the following way:
1. Rating is 'PG' or 'PG-13' or length is more than 210 min:
'Great rating or long (tier 1)'
2. Description contains 'Drama' and length is more than 90 min:
'Long drama (tier 2)'
3. Description contains 'Drama' and length is not more than 90 min:
'Short drama (tier 3)'
4. Rental_rate less than $1:
'Very cheap (tier 4)'
If one movie can be in multiple categories it gets the higher tier assigned.
How can you filter to only those movies that appear in one of these 4 tiers?
Result
COALESCE
CAST TO TEXT
SYNTAX
INNER JOIN
OUTER JOIN
LEFT JOIN
RIGHT JOIN
What are Joins
✓ Theory + practice
✓ Aliases help with writing & reading the code more easily
SYNTAX
✓ INNER JOIN: Only rows where reference column value is in both tables
You need to work on the seats table and the boarding_passes table.
Result
FULL OUTER JOIN
Sabine YES
Peter NO
Manuel YES
Simon NO
FULL OUTER JOIN
employee city sales
Sandra Frankfurt 500
Sabine Munich 300
Peter Hamburg 200
Manuel Hamburg 400
employee city sales bonus
Michael Munich 100
Sandra Frankfurt 500 YES
Frank Frankfurt 100
Sabine Munich 300 YES
Manuel YES
Simon NO
FULL OUTER JOIN
employee city sales
Sandra Frankfurt 500
Sabine Munich 300
Peter Hamburg 200
Manuel Hamburg 400
employee city sales bonus
Michael Munich 100
Sandra Frankfurt 500 YES
Frank Frankfurt 100
Sabine Munich 300 YES
Simon NO
FULL OUTER JOIN
employee city sales
Sandra Frankfurt 500
Sabine Munich 300
Peter Hamburg 200
Manuel Hamburg 400 bonus.employee sales.employee city sales bonus
Michael Munich 100 Sandra Sandra Frankfurt 500 YES
Frank Frankfurt 100 Sabine Sabine Munich 300 YES
Manuel YES
Simon NO
LEFT OUTER JOIN
employee city sales
Sandra Frankfurt 500
Sabine Munich 300 LEFT
Peter Hamburg 200
Manuel Hamburg 400
b.employee s.employee city sales bonus
Michael Munich 100
Sandra Sandra Frankfurt 500 YES
Frank Frankfurt 100
Sabine Sabine Munich 300 YES
Result
Try to find out which line (A, B, …, H) has been chosen most
frequently.
Result
RIGHT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
What are the customers (first_name, last_name, phone number and their
district) from Texas?
Are there any (old) addresses that are not related to any customer?
Result
Multiple join conditions
Two join tables
Write a query to get first_name, last_name, email and the country from all
customers from Brazil.
Result
Day 7
UNION
FULL OUTER JOIN
employee city sales
Sandra Frankfurt 500
Sabine Munich 300
Peter Hamburg 200
Manuel Hamburg 400 bonus.employee sales.employee city sales bonus
Michael Munich 100 null Sandra Frankfurt 500 YES
Frank Frankfurt 100 null Sabine Munich 300 YES
Combining rows
Combining multiple
select statements
UNION
Duplicates are
removed!
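The difference between UNION and UNION ALL can be checked directly. A sketch, assuming SQLite via Python's sqlite3; the two table names and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_team (name TEXT)")
conn.execute("CREATE TABLE bonus_team (name TEXT)")
conn.executemany("INSERT INTO sales_team VALUES (?)", [("Sandra",), ("Sabine",)])
conn.executemany("INSERT INTO bonus_team VALUES (?)", [("Sabine",), ("Simon",)])

# UNION stacks the rows of both SELECTs and drops duplicate rows.
union_rows = [row[0] for row in conn.execute(
    "SELECT name FROM sales_team UNION SELECT name FROM bonus_team ORDER BY name"
)]
# UNION ALL keeps every row, including the repeated 'Sabine'.
union_all_rows = [row[0] for row in conn.execute(
    "SELECT name FROM sales_team UNION ALL SELECT name FROM bonus_team ORDER BY name"
)]

print(union_rows, union_all_rows)
```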
Correlated subqueries
The subquery does not run independently: it is evaluated for every single row!
name sales city
Peter 200 Dallas
Max 100 Berlin
Anna 400 Berlin
SELECT first_name, sales FROM employees e1
WHERE sales >
(SELECT AVG(sales) FROM employees e2
WHERE e1.city=e2.city )
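The query above can be run against the three example rows from the slide (Peter, Max, Anna). A sketch, assuming SQLite via Python's sqlite3 and a column called name, as in the example table, in place of first_name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, sales INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Peter", 200, "Dallas"), ("Max", 100, "Berlin"), ("Anna", 400, "Berlin")],
)

# For each outer row e1, the inner query averages only the rows of e1's city.
above_city_average = conn.execute(
    """
    SELECT name, sales
    FROM employees e1
    WHERE sales > (SELECT AVG(sales)
                   FROM employees e2
                   WHERE e1.city = e2.city)
    """
).fetchall()

print(above_city_average)
```

Anna is above the Berlin average of 250; Peter equals the Dallas average of 200 and is therefore not returned.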
Show only those movie titles, their associated film_id and replacement_cost
with the lowest replacement_cost in each rating category. Also show the
rating.
Result
Show only those movie titles, their associated film_id and the length that have
the highest length in each rating category – also show the rating.
Result
Correlated subqueries
name sales city min
Result
Show only those films with the highest replacement costs in their rating
category plus show the average replacement cost in their rating category.
Result
Show only those payments with the highest payment for each customer's first
name - including the payment_id of that payment.
How would you solve it if you did not need to see the payment_id?
Result
Day 8
Chapter 8: Challenge
Question 1:
Level: Simple
Topic: DISTINCT
Task: Create a list of all the different (distinct) replacement costs of the films.
Answer: 9.99
Question 2:
Level: Moderate
Task: Write a query that gives an overview of how many films have replacement
costs in the following cost ranges
Question: How many films have a replacement cost in the "low" group?
Answer: 514
Question 3:
Level: Moderate
Topic: JOIN
Task: Create a list of the film titles including their title, length and category name
ordered descendingly by the length. Filter the results to only the movies in the
category 'Drama' or 'Sports'.
Question: In which category is the longest film and how long is it?
Question 4:
Level: Moderate
Task: Create an overview of how many movies (titles) there are in each category
(name).
Question: Which category (name) is the most common among the films?
Question 5:
Level: Moderate
Task: Create an overview of the actors first and last names and in how many movies
they appear.
Question 6:
Level: Moderate
Topic: LEFT JOIN & FILTERING
Task: Create an overview of the addresses that are not associated to any customer.
Answer: 4
Question 7:
Level: Moderate
Task: Create an overview of the cities and how much sales (sum of amount) have
occurred there.
Question 8:
Task: Create an overview of the revenue (sum of amount) grouped by a column in the
format "country, city".
Question 9:
Level: Difficult
Topic: Uncorrelated subquery
Task: Create a list with the average of the sales amount each staff_id has per
customer.
Question 10:
Task: Create a query that shows average daily revenue of all Sundays.
Answer: 1410.65
Question 11:
Task: Create a list of movies - with their length and their replacement cost - that are
longer than the average length in each replacement cost group.
Question: Which two movies are the shortest in that list and how long are they?
Question 12:
Task: Create a list that shows how much the average customer spent in total (customer
life-time value) grouped by the different districts.
Question: Which district has the highest average customer life-time value?
Question 13:
Task: Create a list that shows all payments including the payment_id, amount and the
film category (name) plus the total amount that was made in this category. Order the
results ascendingly by the category (name) and as second order criterion by the
payment_id ascendingly.
Question: What is the total revenue of the category 'Action' and what is the lowest
payment_id in that category 'Action'?
Answer: Total revenue in the category 'Action' is 4375.85 and the lowest payment_id
in that category is 16055.
Task: Create a list with the top overall revenue of a film title (sum of amount per title)
for each category (name).
CREATE Table
Schemas
ALTER Data Definition
DROP
Managing tables
CREATE INSERT
DROP DELETE
Views
Data Structures Data itself
Creating database
Very simple
Very simple
Storage size
When to use
Differences Allowed values
which one?
Possible operations
How to store ZIP codes?
Data Types
Other
https://www.postgresql.org/docs/current/datatype.html
Data Types
Numeric
BIGINT: 8 bytes, -9223372036854775808 to +9223372036854775807, large integers
numeric(precision, scale), e.g. numeric(4,2)
Data Types
Strings
yes no
1 0
on off
Data Types
Others
boolean: state of true or false. Example: is_in_stock with TRUE, FALSE, NULL
array: stores a list of values. Example: text[] or int[], depending on type
name phone
Frank {'+41-39190643'}
Maya +42-66764453
Column name
Data type
Constraints
Constraints
What is a constraint?
REFERENCES Ensures referential integrity (only values of another column can be used)
CHECK Ensures that the values in a column satisfy a specific condition
Constraints
TABLE
What constraints do we have?
CONSTRAINTS
PRIMARY KEY
FOREIGN KEY
REFERENCING table
CHILD table
REFERENCED table
PARENT table
Primary & Foreign Key
Notes
SERIAL DEFAULT
INSERT
RENAME columns
DROP
ALTER TABLE
DROP ADD
ALTER TABLE
NOT NULL
ALTER TABLE
UPDATE
UPDATE <table>
SET <column>=value
UPDATE songs
SET genre='Country music'
UPDATE songs
SET genre='Pop music'
WHERE song_id=4
UPDATE songs
SET price=song_id+0.99
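Whether an UPDATE has a WHERE clause decides how many rows it changes. A sketch, assuming SQLite via Python's sqlite3; the three songs rows are invented, and cursor.rowcount is used to read the number of changed rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (song_id INTEGER, genre TEXT)")
conn.executemany(
    "INSERT INTO songs VALUES (?, ?)",
    [(3, "Rock"), (4, "Rock"), (5, "Jazz")],
)

# With WHERE: only the matching row is changed.
changed_one = conn.execute(
    "UPDATE songs SET genre = 'Pop music' WHERE song_id = 4"
).rowcount

# Without WHERE: every row in the table is changed.
changed_all = conn.execute("UPDATE songs SET genre = 'Country music'").rowcount

genres = [row[0] for row in conn.execute("SELECT genre FROM songs ORDER BY song_id")]
print(changed_one, changed_all, genres)
```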
Update all rental prices that are 0.99 to 1.99.
CONSTRAINT
ALTER TABLE
CONSTRAINT
ALTER TABLE
Physical storage needed! Alternative: Create a view and just store the statement!
Data can change!
CREATE VIEW … AS
Problem: That table will not be updated if data in the underlying tables change!
Managing views
VIEW
UPDATE songs
SET genre='Country music'
Import & Export
Window
Write a query that returns the list of movies including
- film_id,
- title,
- length,
- category,
- average length of movies in that category.
Result
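The window-function pattern behind tasks of this kind can be sketched on a toy table: an aggregate with OVER (PARTITION BY …) is attached to every row without collapsing the rows, and OVER (ORDER BY …) gives a running total. This assumes SQLite 3.25 or newer via Python's sqlite3; the three film rows are invented and are not the course data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER, title TEXT, length INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO film VALUES (?, ?, ?, ?)",
    [(1, "A", 100, "Drama"), (2, "B", 200, "Drama"), (3, "C", 90, "Sports")],
)

# Every row keeps its own columns and gains the average of its category.
with_category_avg = conn.execute(
    """
    SELECT film_id, title, length, category,
           AVG(length) OVER (PARTITION BY category) AS avg_length
    FROM film
    ORDER BY film_id
    """
).fetchall()

# Running total of length in film_id order.
running_total = [row[0] for row in conn.execute(
    "SELECT SUM(length) OVER (ORDER BY film_id) FROM film ORDER BY film_id"
)]

print(with_category_avg, running_total)
```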
Write a query that returns all payment details including
- the number of payments that were made by this customer
and that amount
Result
Write a query that returns the running total of how late the flights are
(difference between actual_arrival and scheduled arrival) ordered by flight_id
including the departure airport.
As a second query, calculate the same running total but partition also by the
departure airport.
Result
Write a query that returns the customers' name, the country and how many
payments they have. For that use the existing view customer_list.
Afterwards create a ranking of the top customers with most sales for each
country. Filter the results to only the top 3 customers per country.
Result
Write a query that returns the revenue of the day and the revenue of the
previous day.
Result
Write a query that calculates now the share of revenue each staff_id makes per
customer. The result should look like this:
Hint
You need to use a subquery.
Day 12
CUBE & ROLLUP
GROUP BY
CUBE (column1, column2, column3)
GROUP BY
GROUPING SETS (
(column1, column2, column3),
(column1, column2),
(column1, column3),
(column2, column3),
(column1),
(column2),
(column3),
()
)
CUBE & ROLLUP
GROUP BY
ROLLUP (column1, column2, column3)
GROUP BY
GROUPING SETS (
(column1, column2, column3),
(column1, column2), -- Hierarchy
(column1),
()
)
Write a query that calculates a booking amount rollup for the hierarchy of
quarter, month, week in month and day.
Hint
Pattern for week in month is 'w'.
Self-join
Referencing employee_id
Self-join
Standard join with itself
SELECT
t1.column1,
t2.column1
[,…]
FROM table1 t1
LEFT JOIN table1 t2
ON t1.column1=t2.column1
Self-join
SELECT
t1.column1,
t2.column1
[,…]
FROM employee t1
LEFT JOIN employee t2
ON t1.column1=t2.column1
Self-join
SELECT
emp.employee_id,
emp.name AS employee,
mng.name AS manager
FROM employee emp
LEFT JOIN employee mng
ON emp.manager_id=mng.employee_id
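The self-join above can be run on a small employee table. A sketch, assuming SQLite via Python's sqlite3; the three employees are invented, with employee 1 as the manager of the other two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Liam", None), (2, "Emma", 1), (3, "Noah", 1)],
)

# The same table is used twice under two aliases: emp (employee) and mng (manager).
pairs = conn.execute(
    """
    SELECT emp.employee_id,
           emp.name AS employee,
           mng.name AS manager
    FROM employee emp
    LEFT JOIN employee mng
      ON emp.manager_id = mng.employee_id
    ORDER BY emp.employee_id
    """
).fetchall()

print(pairs)
```

The LEFT JOIN keeps Liam in the output even though he has no manager; an INNER JOIN would drop that row.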
Find all the pairs of films with the same length!
CROSS JOIN
Cartesian product
All combinations of rows
CROSS JOIN
Table number: 1, 2, 3
Table letter: A, B
Result:
number letter
1 A
2 A
3 A
1 B
2 B
3 B
CROSS JOIN
Table number: 1, 1, 3
Table letter: A, B
Result:
number letter
1 A
1 A
3 A
1 B
1 B
3 B
CROSS JOIN
SELECT
t1.column1,
t2.column1
FROM table1 t1
CROSS JOIN table2 t2
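The number/letter example can be reproduced with the syntax above. A sketch, assuming SQLite via Python's sqlite3; the table names numbers and letters are chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (number INTEGER)")
conn.execute("CREATE TABLE letters (letter TEXT)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO letters VALUES (?)", [("A",), ("B",)])

# No join condition: every row of numbers is paired with every row of letters.
combinations = conn.execute(
    """
    SELECT n.number, l.letter
    FROM numbers n
    CROSS JOIN letters l
    ORDER BY l.letter, n.number
    """
).fetchall()

print(combinations)  # 3 rows x 2 rows = 6 combinations
```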
NATURAL JOIN
Just like a normal JOIN
SELECT
*
FROM payment
NATURAL LEFT JOIN customer
NATURAL JOIN
Just like a normal JOIN
SELECT
*
FROM payment
NATURAL INNER JOIN customer
Day 13
Challenge 14:
Task 1
Difficulty: Moderate
Task 1.1
In your company there hasn't been a database table with all the employee information
yet.
You need to set up the table called employees in the following way:
first_name,
last_name ,
job_position,
start_date DATE,
birth_date DATE
Task 1.2
Task 2
Difficulty: Moderate
- Add the column end_date with an appropriate data type (one that you think makes
sense).
- Add a constraint called birth_check that doesn't allow birth dates that are in the
future.
Task 3
Difficulty: Moderate
Task 3.1
There will be most likely an error when you try to insert the values.
So, try to insert the values and then fix the error.
Columns:
(emp_id,first_name,last_name,position_title,salary,start_date,birth_date,store_id,department_id,manager_id,end_date)
Values:
(1,'Morrie','Conaboy','CTO',21268.94,'2005-04-30','1983-07-10',1,1,NULL,NULL),
(2,'Miller','McQuarter','Head of BI',14614.00,'2019-07-23','1978-11-09',1,1,1,NULL),
(3,'Christalle','McKenny','Head of Sales',12587.00,'1999-02-05','1973-01-09',2,3,1,NULL),
(4,'Sumner','Seares','SQL Analyst',9515.00,'2006-05-31','1976-08-03',2,1,6,NULL),
(5,'Romain','Hacard','BI Consultant',7107.00,'2012-09-24','1984-07-14',1,1,6,NULL),
(6,'Ely','Luscombe','Team Lead Analytics',12564.00,'2002-06-12','1974-08-01',1,1,2,NULL),
(7,'Clywd','Filyashin','Senior SQL Analyst',10510.00,'2010-04-05','1989-07-23',2,1,2,NULL),
(8,'Christopher','Blague','SQL Analyst',9428.00,'2007-09-30','1990-12-07',2,2,6,NULL),
(9,'Roddie','Izen','Software Engineer',4937.00,'2019-03-22','2008-08-30',1,4,6,NULL),
(10,'Ammamaria','Izhak','Customer Support',2355.00,'2005-03-17','1974-07-27',2,5,3,2013-04-14),
(11,'Carlyn','Stripp','Customer Support',3060.00,'2013-09-06','1981-09-05',1,5,3,NULL),
(12,'Reuben','McRorie','Software Engineer',7119.00,'1995-12-31','1958-08-15',1,5,6,NULL),
(13,'Gates','Raison','Marketing Specialist',3910.00,'2013-07-18','1986-06-24',1,3,3,NULL),
(14,'Jordanna','Raitt','Marketing Specialist',5844.00,'2011-10-23','1993-03-16',2,3,3,NULL),
(15,'Guendolen','Motton','BI Consultant',8330.00,'2011-01-10','1980-10-22',2,3,6,NULL),
(16,'Doria','Turbat','Senior SQL Analyst',9278.00,'2010-08-15','1983-01-11',1,1,6,NULL),
(17,'Cort','Bewlie','Project Manager',5463.00,'2013-05-26','1986-10-05',1,5,3,NULL),
(18,'Margarita','Eaden','SQL Analyst',5977.00,'2014-09-24','1978-10-08',2,1,6,2020-03-16),
(19,'Hetty','Kingaby','SQL Analyst',7541.00,'2009-08-17','1999-04-25',1,2,6,'NULL'),
(20,'Lief','Robardley','SQL Analyst',8981.00,'2002-10-23','1971-01-25',2,3,6,2016-07-01),
(21,'Zaneta','Carlozzi','Working Student',1525.00,'2006-08-29','1995-04-16',1,3,6,2012-02-19),
(22,'Giana','Matz','Working Student',1036.00,'2016-03-18','1987-09-25',1,3,6,NULL),
(23,'Hamil','Evershed','Web Developper',3088.00,'2022-02-03','2012-03-30',1,4,2,NULL),
(24,'Lowe','Diamant','Web Developper',6418.00,'2018-12-31','2002-09-07',1,4,2,NULL),
(25,'Jack','Franklin','SQL Analyst',6771.00,'2013-05-18','2005-10-04',1,2,2,NULL),
(26,'Jessica','Brown','SQL Analyst',8566.00,'2003-10-23','1965-01-29',1,1,2,NULL)
Task 3.2
Task 4
Difficulty: Moderate
Task 4.1
Jack Franklin gets promoted to 'Senior SQL Analyst' and his salary rises to 7200.
Task 4.3
All SQL Analysts including Senior SQL Analysts get a raise of 6%.
Task 4.4
Question:
What is the average salary of a SQL Analyst in the company (excluding Senior SQL
Analyst)?
Answer:
8834.75
Task 5
Difficulty: Advanced
Task 5.1
Write a query that adds a column called manager that contains first_name and
last_name (in one column) in the data output.
Secondly, add a column called is_active with 'false' if the employee has left the
company already, otherwise the value is 'true'.
Task 5.2
Task 6
Difficulty: Moderate
Write a query that returns the average salary for each position with appropriate
rounding.
Question:
Answer:
6028.00
Task 7
Difficulty: Moderate
Answer:
6107.41
Task 8
Difficulty: Advanced
Task 8.1
emp_id,
first_name,
last_name,
position_title,
salary
and a column that returns the average salary for every job_position.
Answer:
Task 9:
Difficulty: Advanced
Write a query that returns a running total of the salary development ordered by the
start_date.
In your calculation, disregard the fact that people have left the company (write the
query as if they were still working for the company).
Question:
Answer:
180202.70
Task 10:
Create the same running total but now also consider the fact that people were leaving
the company.
Note:
Don't worry if you can't solve it you are not expected to do so.
Question:
Answer:
166802.84
Hint 1:
Use the view v_employees_info.
Hint 2:
Create two separate queries: one with all employees and one with the people that have
already left.
Hint 3:
In the first query use start_date and in the second query use end_date instead of the
start_date
Hint 4:
Multiply the salary of the second query with (-1).
Hint 5:
Create a subquery from that UNION and use a window function in that to create the
running total.
Task 11
Task 11.1
Write a query that outputs only the top earner per position_title including first_name
and position_title and their salary.
Question:
Answer:
Task 11.2
Remove those employees from the output of the previous query that have the same
salary as the average of their position_title.
These are the people that are the only ones with their position_title.
Task 12
Difficulty: Pro
- sum of salary,
- number of employees,
- average salary
- division,
- department,
- position_title.
Write a query that returns all employees (emp_id) including their position_title,
department their salary and the rank of that salary partitioned by department.
Question:
emp_id 26
Task 14
Difficulty: Pro
Write a query that returns only the top earner of each department including
Question:
Answer:
emp_id 8
Day 14
USER-DEFINED FUNCTIONS
Extend functionality
SQL Python
PL/pgSQL C
BEGIN
Unit of work
Bank transfer
100
BEGIN TRANSACTION;
TRANSACTIONS
BEGIN WORK;
TRANSACTIONS
BEGIN;
TRANSACTIONS
BEGIN;
BEGIN;
COMMIT;
TRANSACTIONS
BEGIN;
OPERATION1;
OPERATION2;
COMMIT;
The two employees Miller McQuarter and Christalle McKenny have agreed
to swap their positions incl. their salary.
ROLLBACK
BEGIN;
OPERATION1;
OPERATION2;
COMMIT;
ROLLBACK
BEGIN;
BEGIN;
OPERATION1;
OPERATION2;
OPERATION3;
OPERATION4;
ROLLBACK;
COMMIT;
ROLLBACK
BEGIN;
OPERATION1;
OPERATION2;
SAVEPOINT op2;
OPERATION3;
OPERATION4;
BEGIN;
OPERATION1;
OPERATION2;
SAVEPOINT op2;
OPERATION3;
SAVEPOINT op3;
OPERATION4;
Savepoints work only within a current transaction
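How a savepoint undoes only part of a transaction can be sketched with the bank-transfer idea. This assumes SQLite via Python's sqlite3, opened with isolation_level=None so that BEGIN and COMMIT are issued explicitly; the table contents and the savepoint name op1 are invented for illustration.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE acc_balance (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO acc_balance VALUES (1, 500), (2, 100)")

conn.execute("BEGIN")
conn.execute("UPDATE acc_balance SET amount = amount - 100 WHERE id = 1")
conn.execute("SAVEPOINT op1")
# A faulty operation that should be undone ...
conn.execute("UPDATE acc_balance SET amount = amount + 999 WHERE id = 2")
# ... by going back to the savepoint; the first UPDATE is kept.
conn.execute("ROLLBACK TO SAVEPOINT op1")
conn.execute("UPDATE acc_balance SET amount = amount + 100 WHERE id = 2")
conn.execute("COMMIT")

balances = conn.execute("SELECT id, amount FROM acc_balance ORDER BY id").fetchall()
print(balances)
```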
Downside:
They cannot execute transactions
✓ Stored procedures:
They support transactions
STORED PROCEDURES
Bank transfer
100
COMMIT;
END;
$$
STORED PROCEDURES
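CREATE PROCEDURE sp_transfer (tr_amount INT, sender INT, recipient INT)
LANGUAGE plpgsql
AS
$$
BEGIN
-- subtract from sender's balance
UPDATE acc_balance
SET amount = amount - tr_amount
WHERE id = sender;
-- add to recipient's balance (assumed counterpart; not shown on the slide)
UPDATE acc_balance
SET amount = amount + tr_amount
WHERE id = recipient;
COMMIT;
END;
$$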
STORED PROCEDURES
Assign
✓ SELECT,
✓ INSERT,
Assign
✓ UPDATE,
Create new roles role ✓ DELETE,
✓ TRUNCATE,
✓ USAGE
User management
Assign
Assign ✓ SELECT,
Create new roles Analyst ✓ USAGE
CREATE USER
GRANT privilege
ON database_object
TO USER | ROLE | PUBLIC
Privileges
✓ SELECT,
✓ INSERT,
✓ UPDATE,
✓ DELETE, Priviliges
✓ TRUNCATE,
✓ USAGE,
✓ ALL
GRANT SELECT
ON customer
TO nikolai
Privileges
✓ SELECT,
✓ INSERT,
✓ UPDATE, TABLES
✓ DELETE,
✓ TRUNCATE,
✓ USAGE,
✓ ALL
GRANT SELECT
ON customer
TO nikolai
Privileges
✓ SELECT,
✓ INSERT,
✓ UPDATE, TABLES
✓ DELETE,
✓ TRUNCATE,
✓ USAGE,
✓ ALL
GRANT SELECT
ON ALL TABLES IN SCHEMA schema_name
TO nikolai
GRANT ALL
ON ALL TABLES IN SCHEMA schema_name
TO nikolai
Privileges can be granted by a superuser or by the owner of the object.
GRANT SELECT
ON ALL TABLES IN SCHEMA schema_name
TO nikolai WITH GRANT OPTION
REVOKE privilege
ON database_object
FROM USER | ROLE | PUBLIC
REVOKE privilege
ON database_object
FROM USER | ROLE | PUBLIC
GRANTED BY USER | ROLE
Privilege   Applies to
DELETE      TABLE
TRUNCATE    TABLE
CONNECT     DATABASE
USAGE       SCHEMA
Privileges
How to grant access?
Typical statements
CREATE USER
GRANT ALL
ON ALL TABLES IN SCHEMA public
TO amar;
GRANT ALL
ON DATABASE greencycles
TO amar;
GRANT createdb
REVOKE ROLE
GRANT SELECT
ON ALL TABLES IN SCHEMA
<schema_name>
TO nikolai
Privileges
Privileges (✓ SELECT, ✓ INSERT, ✓ UPDATE, ✓ DELETE, ✓ TRUNCATE, ✓ USAGE) are assigned to roles; roles are assigned to users.
CREATE ROLE <user_name>
WITH LOGIN PASSWORD 'pwd123'
Using indexes
Read operations
Write operations
Understanding indexes
Types of indexes
Guidelines
Demo
Using indexes
SELECT
product_id
FROM sales
WHERE customer_id = 5
Table scan
3, P0625, 5, visa
4, P0432, 8, mastercard
6, P0058, 5, mastercard
SELECT
product_id
FROM sales
WHERE customer_id = 5
Index on customer_id:
Location  Value
1         4
2         5
5         8
Rows of sales, sorted by customer_id:
1, P0494, 4, visa
2, P0221, 5, visa
3, P0625, 5, visa
6, P0058, 5, mastercard
4, P0432, 8, mastercard
✓ Indexes help to make data reads faster!
❖ Additional storage
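The difference between a table scan and an index lookup can be made visible with a query plan. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for PostgreSQL's EXPLAIN (an assumption; the plan wording differs between systems and versions). The five rows are the ones shown on the slide; the column names transaction_id and payment_type and the index name idx_sales_customer are hypothetical.

```python
import sqlite3

SALES_ROWS = [
    (1, "P0494", 4, "visa"),
    (2, "P0221", 5, "visa"),
    (3, "P0625", 5, "visa"),
    (4, "P0432", 8, "mastercard"),
    (6, "P0058", 5, "mastercard"),
]

QUERY = "SELECT product_id FROM sales WHERE customer_id = 5"


def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail text.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


def query_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (transaction_id INTEGER, product_id TEXT, "
        "customer_id INTEGER, payment_type TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", SALES_ROWS)

    before = plan(conn, QUERY)  # no index yet: the whole table is scanned
    conn.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")
    after = plan(conn, QUERY)   # the planner can now search the index

    products = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return before, after, products


before, after, products = query_plans()
print(before)
print(after)
print(sorted(products))  # ['P0058', 'P0221', 'P0625']
```

The result of the query is the same either way; only the access path changes, at the price of the extra storage the index needs.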
Using indexes
✓ Different types of indexes for different situations
✓ Breaks data down into pages or blocks
✓ Should be used for high-cardinality (unique) columns
✓ Not entire table (costly in terms of storage)
[Figure: index pages AB, AC, AD, AE with leaf entries ABA…, ABB…, ACA…, ACB…]
❖ Bitmap index
✓ Particularly good for data warehouses
✓ Good for many repeating values (dimensionality)
Location  Value       Bitmap
1         visa        11100
4         mastercard  00011
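How the bitmaps 11100 and 00011 come about can be sketched in a few lines of Python. This is only an illustration of the idea, not how a database engine stores bitmap indexes; the function name build_bitmap_index and the list payment_types (the payment column of the five example rows, in sorted order) are assumptions made for this example.

```python
def build_bitmap_index(values):
    """Return one bit string per distinct value; bit i is 1 if row i holds that value."""
    bitmaps = {}
    for position, value in enumerate(values):
        bits = bitmaps.setdefault(value, ["0"] * len(values))
        bits[position] = "1"
    return {value: "".join(bits) for value, bits in bitmaps.items()}


payment_types = ["visa", "visa", "visa", "mastercard", "mastercard"]
print(build_bitmap_index(payment_types))
# {'visa': '11100', 'mastercard': '00011'}
```

With only a few distinct values, the index needs just one short bit string per value, which is why this kind of index suits columns with many repeating values.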
Guidelines
Database
SQL
o Retrieve data
o Analyze data
o Define data
o Change data
Why learn SQL?
✓ Data is everywhere and mostly in databases
Database
Tables
Schema 2
Managed by
Database management system (DBMS)
PostgreSQL
Graphical Interface
PgAdmin
Different DBMS
Different dialects
One language
✓ Very popular
PgAdmin
The Project
GreenCYCLES
YOU: Data Analyst
BUSINESS: Online Movie Rental Shop
SELECT
column_name
FROM table_name
Example
SELECT
first_name
FROM actor
Multiple columns
SELECT
first_name,
last_name
FROM actor
All columns
SELECT
*
FROM actor
Remarks!
1. Formatting doesn't matter!
SELECT
first_name,
last_name
FROM actor

SELECT first_name,last_name
FROM actor
SELECT
first_name,
last_name
FROM actor
ORDER BY first_name
DESC / ASC
SELECT
first_name,
last_name
FROM actor
ORDER BY first_name DESC
DESC / ASC
SELECT
first_name,
last_name
FROM actor
ORDER BY first_name ASC
Multiple columns
SELECT
first_name,
last_name
FROM actor
ORDER BY first_name, last_name
DESC / ASC
SELECT
first_name,
last_name
FROM actor
ORDER BY first_name DESC, last_name
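Sorting by several columns is easiest to understand on a tiny table. The sketch below runs the last query through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption); the four actor rows are made up. Rows are sorted by first_name descending, and rows with the same first_name are then sorted by last_name ascending, since ASC is the default.

```python
import sqlite3


def order_by_two_columns():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actor (first_name TEXT, last_name TEXT)")
    conn.executemany(
        "INSERT INTO actor VALUES (?, ?)",
        [("ADAM", "HOPPER"), ("ZERO", "CAGE"), ("ADAM", "GRANT"), ("BELA", "WALKEN")],
    )  # hypothetical rows
    rows = conn.execute(
        "SELECT first_name, last_name FROM actor "
        "ORDER BY first_name DESC, last_name"
    ).fetchall()
    conn.close()
    return rows


print(order_by_two_columns())
# [('ZERO', 'CAGE'), ('BELA', 'WALKEN'), ('ADAM', 'GRANT'), ('ADAM', 'HOPPER')]
```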
SELECT DISTINCT
SELECT DISTINCT
column_name1
FROM table_name
Example
SELECT DISTINCT
first_name
FROM actor
Example
SELECT DISTINCT
first_name
FROM actor
ORDER BY first_name
Multiple columns
SELECT DISTINCT
first_name,
last_name
FROM actor
ORDER BY first_name
Note!
Distinct in terms of
all selected columns!
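The note can be checked on a small example. This sketch uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption) with three made-up actor rows, two of which are identical. The second query also sorts by last_name, which the slide does not do, purely to make the output order predictable.

```python
import sqlite3


def distinct_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actor (first_name TEXT, last_name TEXT)")
    conn.executemany(
        "INSERT INTO actor VALUES (?, ?)",
        [("ADAM", "GRANT"), ("ADAM", "GRANT"), ("ADAM", "HOPPER")],
    )  # hypothetical rows
    one_column = conn.execute("SELECT DISTINCT first_name FROM actor").fetchall()
    two_columns = conn.execute(
        "SELECT DISTINCT first_name, last_name FROM actor "
        "ORDER BY first_name, last_name"
    ).fetchall()
    conn.close()
    return one_column, two_columns


one_column, two_columns = distinct_demo()
print(one_column)   # [('ADAM',)]
print(two_columns)  # [('ADAM', 'GRANT'), ('ADAM', 'HOPPER')]
```

ADAM appears twice in the second result because the combination of first_name and last_name differs; only the fully identical row is removed.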
Result
A marketing team member asks you about the
different prices that have been paid.
SELECT
column_name1,
column_name2
FROM table_name
LIMIT n
SYNTAX
SELECT
first_name
FROM actor
LIMIT 4
Example
SELECT
first_name
FROM actor
ORDER BY first_name
LIMIT 4
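LIMIT is applied after ORDER BY, so the query above returns the first four names of the sorted result, not four arbitrary rows that are then sorted. A minimal sketch with Python's sqlite3 module as a stand-in for PostgreSQL (an assumption) and six made-up first names:

```python
import sqlite3


def first_four_sorted():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actor (first_name TEXT)")
    conn.executemany(
        "INSERT INTO actor VALUES (?)",
        [("PENELOPE",), ("NICK",), ("ED",), ("JENNIFER",), ("JOHNNY",), ("BETTE",)],
    )  # hypothetical rows
    rows = conn.execute(
        "SELECT first_name FROM actor ORDER BY first_name LIMIT 4"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(first_four_sorted())  # ['BETTE', 'ED', 'JENNIFER', 'JOHNNY']
```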
SELECT
COUNT(*)
FROM table_name
SYNTAX
SELECT
COUNT(column_name)
FROM table_name
Note!
Nulls will not be
counted in that case!
Example
SELECT
COUNT(first_name)
FROM actor
Example
SELECT
COUNT(*)
FROM actor
Example
SELECT
COUNT(DISTINCT first_name)
FROM actor
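The three COUNT variants give different numbers as soon as a column contains NULLs or duplicates. The sketch below compares them using Python's sqlite3 module as a stand-in for PostgreSQL (an assumption); the four rows, one of them with a NULL first_name, are made up.

```python
import sqlite3


def count_variants():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actor (first_name TEXT)")
    conn.executemany(
        "INSERT INTO actor VALUES (?)",
        [("ADAM",), ("ADAM",), ("BELA",), (None,)],
    )  # hypothetical rows; None is stored as NULL
    counts = conn.execute(
        "SELECT COUNT(*), COUNT(first_name), COUNT(DISTINCT first_name) FROM actor"
    ).fetchone()
    conn.close()
    return counts


print(count_variants())  # (4, 3, 2)
```

COUNT(*) counts all four rows, COUNT(first_name) skips the NULL, and COUNT(DISTINCT first_name) counts ADAM only once.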
for today
1. Create a list of all the distinct districts
customers are from.
SELECT DISTINCT
district
FROM address
Solution 2
2. What is the latest rental date?
SELECT
rental_date
FROM rental
ORDER BY rental_date DESC
LIMIT 1
Solution 2
3. How many films does the company have?
SELECT
COUNT(*)
FROM film
Solution 2
4. How many distinct last names of the
customers are there?
SELECT
COUNT(DISTINCT last_name)
FROM customer