You are on page 1of 11

CSCA20 Worksheet Databases

1 Format for SQLite Commands


We will create tables and retrieve data from the tables using Python and SQLite. You can find a
list of Python and SQLite commands at the end of this handout as well as the tables we will use
as examples. You may even find it helpful to pull it off so that you can have it at your side.
As always, to get more/better information on what is available to you, browse the documenta-
tion!

2 Creating Tables
Lets write a function to read in a file and populate the table Precipitation that we saw earlier.

def create_precipitation(db, data_file):


"""(str, reader) -> NoneType
Populate the database db with the contents of data_file.
Create a table called Precipitation, with four columns: City
(text), Snow (real), Total (integer), Days (integer).

Parameters:
db: name/path to a database

data_file: contains one city, snowfall amount, total


precipitation amount, and number of days per line, separated by
comma.
"""

# connect to database

# get cursor

# create a new table

# process each line in the input file


for line in data_file:
# get the data from the file

# write the data into the database

# commit the changes to the database

1
# close cursor and connection

3 Accessing Data
How can we access the data in the database?
def print_all_precipitation(db):
"""(str) -> NoneType
Print all records in database db (name/path to a database) in
Precipitation table, one per line.
"""

conn = sqlite3.connect(db)
cur = conn.cursor()
cur.execute(select * from Precipitation)
for record in cur: print(cur.fetchall()) record = cur.fetchone()
print(record) while record is not None:
print(record)
record = cur.fetchone()

cur.close()
conn.close()

4 Simple Queries
Lets write the following functions to retrieve data from the table.
def city_snow(db):
"""(str) -> list of tuple
Return a list of tuples [(city, snow), ...] from the Precipitation
table in the database db (name/path to a database)."""

conn = sqlite3.connect(db)
cur = conn.cursor()
# execute query
# get all results
cur.close()
conn.close()
return result

def city_total(db):
"""(str) -> list of tuple

Return a list of tuples [(city, total), ...] from the


Precipitation table in the database db (name/path to a database)."""

conn = sqlite3.connect(db)
cur = conn.cursor()
# execute query
# get all results

2
cur.close()
conn.close()
return result

Notice that the only line changing in the two functions is the one with cur.execute(). We can
take advantage of this by writing a general run query function!
def run_query(db, q):
"""(str, str) -> list of tuple
Return the result of executing the query q in database db
(name/path to a database)"""

conn = sqlite3.connect(db)
cur = conn.cursor()
cur.execute(q)
result = cur.fetchall()
cur.close()
conn.close()
return result

We can now re-write our functions using run query():


def city_snow(db):
"""(str) -> list of tuple
Return a list of tuples [(city, snow), ...] from the Precipitation
table in the database db (name/path to a database)."""

def city_total(db):
"""(str) -> list of tuple

Return a list of tuples [(city, total), ...] from the


Precipitation table in the database db (name/path to a database)."""

return run_query(db, select City, Total from Precipitation)

Much better!

4.1 Adding a where condition


We can restrict the rows that we select by adding where conditions. Think of this as adding a
filter on the rows that the select returns. conditions is one or more boolean requirements of the
data. For example, it might be that we only want those cities with precipitation on more than 180
days of the year. We would execute the select statement:

cur.execute(select City, Days from Precipitation where Days > 180)

or (notice the tuple of one element!):

3
cur.execute(select City, Days from Precipitation where Days > ?,
(180,))
Lets write a function that returns all cities from Precipitation table where the amount of snow
falling is larger than 200:
def city_snow_over_200(db):
"""Return a list of cities from the Precipitation table in the
database db, where the amount of snowfall is > 200 cm.

Parameters: db: str; name/path to a database."""

It might be nice to be able to make a more generalized function that allows us to specify an
amount for the snow and total precipitation as an argument in the function. Complete the function
snow total over below.
def snow_total_over(db, x, y):
"""(str, number, number) -> list of tuple
Return the cities and snowfall amounts of the cities with at least
x cm of snowfall and y total precipitation from the Precipitation table of
database db.
"""

conn = sqlite3.connect(db)
cur = conn.cursor()

(*)

result = cur.fetchall()
cur.close()
conn.close()
return result

Q. Can we use the helper function run query for line (*)?

A.

Lets improve run query() to allow us to pass query arguments. Notice the header: we are
introducing a new way to pass parameters, args=None means if there isnt a third argument
passed to run query() then assume that the value is None.
def run_query_2(db, q, args=None):
"""(str, str, tuple) -> list of tuple
Return the results of running query q with arguments args on
database db."""

conn = sqlite3.connect(db)
cur = conn.cursor()
if args is None:

else:

data = cur.fetchall()

4
cur.close()
conn.close()
return data

Lets use the new run query() function.

def snow_total_over_2(db, x, y):


"""(str, number, number) -> list of tuple
Return the cities and snowfall amounts of the cities with at least
x cm of snowfall and y total precipitation from the Precipitation table of
database db.
"""

Notice that we can still use run query 2 as we did run query for function city snow:
def city_snow_3(db):
"""(str) -> list of tuple
Return a list of tuples [(city, snow), ...] from the Precipitation
table in the database db (name/path to a database)."""

return run_query_2(db, select City, Snow from Precipitation)

5
5 Joins
Sometimes we want to select data from two or more different tables. Consider the following two
tables, one which lists library card ID number with name and one which lists library card ID
number with books on loan from the library.

Table: Library ID Table: Library Loans


ID Name ID Book
99110022 Homer 99110022 NULL
99110033 Marge 99110033 Updos for the Modern Woman
99110044 Bart 99110044 The Etymology of Slang
99110055 Lisa 99110055 Quantum Physics for Minors
99110055 LSAT Prep Guide

Suppose that we want to list the books taken out of the library for each student. This requires
using one table to look up into another. We can do this by joining the tables. The join of two
tables is their cross product. This means you match up every row of the left table with every row
of the right table. The symbol for join is o
n. Here is the result of Library ID on Library Loans:

ID Name ID Book
99110022 Homer 99110022 NULL
99110022 Homer 99110033 Updos for the Modern Woman
99110022 Homer 99110044 The Etymology of Slang
99110022 Homer 99110055 Quantum Physics for Minors
99110022 Homer 99110055 LSAT Prep Guide
99110033 Marge 99110022 NULL
99110033 Marge 99110033 Updos for the Modern Woman
99110033 Marge 99110044 The Etymology of Slang
99110033 Marge 99110055 Quantum Physics for Minors
99110033 Marge 99110055 LSAT Prep Guide
99110044 Bart 99110022 NULL
99110044 Bart 99110033 Updos for the Modern Woman
99110044 Bart 99110044 The Etymology of Slang
99110044 Bart 99110055 Quantum Physics for Minors
99110044 Bart 99110055 LSAT Prep Guide
99110055 Lisa 99110022 NULL
99110055 Lisa 99110033 Updos for the Modern Woman
99110055 Lisa 99110044 The Etymology of Slang
99110055 Lisa 99110055 Quantum Physics for Minors
99110055 Lisa 99110055 LSAT Prep Guide

Notice that in this case, the only lines that are useful to us are the ones where the library IDs
match. To use this table we need to learn:

1. How do we refer to the columns of this table?

6
2. How would we select the set of Names and Books taken out by a person?

Consider another table Temperature related to our Precipitation table (see page 11 of this
handout) populated from the file temperature.txt. Firstly, lets add it to our database:

def create_temperature(db, data_file):


"""(str, reader) -> NoneType
Populate the database db with the contents of data_file.
Create a table called Temperature, with nine columns: City
(text), AvgHigh (real), AvgLow (real), ColdMonth (text),
ColdAvgHigh (real), ColdAvgLow (real), WarmMonth (text),
WarmAvgHigh (real), WarmAvgLow (real).

Parameters:
db: name/path to a database

data_file: contains one city, average highest temperature,


average lowest temperatue, coldest month, average highest
temperature in coldest month, average lowest temperature in
coldest month, warmest month, average highest temperature in
warmest month, average lowest temperature in warmest month, per
line, separated by comma.
"""

# connect to database
conn = sqlite3.connect(db)
# get cursor
cur = conn.cursor()

# create a new table

# process each line in the input file


for line in data_file:
# get the data from the file

# write the data into the database

7
# commit the changes to the database
conn.commit()

# close cursor and connection


cur.close()
conn.close()

We will also use a table Geography populated by the data in the file geography.txt. Can you
do this one yourself?

def create_geography(db, data_file):


"""(str, reader) -> NoneType
Populate the database db with the contents of data_file.
Create a table called Geography, with two columns: City (text),
Province (text).

Parameters:
db: name/path to a database

data_file: contains one city, province per line, separated by


comma.
"""

5.1 Using joins with select


We will now practise writing select statements with joins.

Write a select statement to pass to run query2 that returns the names of cities which have
an average low temperature less than 1 and at least 250cm of snow. What tables do you need
to join here?

8
Write a select statement to pass to run query2 that returns the names of cities and
provinces, such that the city has average high temperature over 10 and total precipitation
less than 900. What tables do you need to join here?

5.2 Using distinct


We can join a table with itself! What problem do you have when trying to write such a select
statement?

We can solve this by

Q. Write a query that returns the names of cities which have the same average low temperatures.

Notice that there are duplicates, but the order is permuted we can get rid of these by doing
some post processing (working with the results after). Now consider the following example:
Q. Write a query that returns the names of the provinces which have cities with negative average
low temperature.

Notice that this returns duplicates. We can get rid them by using select distinct.

5.3 Aggregation
There are several built-in column functions that we can use: they are avg, min, max, sum, count.
They are called aggregate functions and are used as select function name(column) etc...
Q. Use avg to find the average snowfall amount in table Precipitation.

5.4 Using group by


Suppose that we want to get the average days of precipitation for each province. This means we
want to group the Days values according to the province the city is in, and then take the average
of the Days belonging to the same province. For example, Ontario cities are Ottawa and Toronto
and the total days precipitation are 159 and 139, giving an average of 149.
We use the group by statement:

select Geography.Province, avg(Precipitation.Days)


from Precipitation join Geography
on Precipitation.City = Geography.City
group by Geography.Province

Q. Write a query that returns the lowest average low temperature in the coldest month in each
province which has a city with snowfall of at least 150.

9
Python Commands
import sqlite3: use SQLite to access a database.

conn = sqlite3.connect(C:\A20\week10\weather.db): connect to the database stored


in file C:\A20\week10\weather.db and name this connection conn.

cur = conn.cursor(): Get a cursor to work with the database and name it cur.

cur.execute("SQL command"): Execute the given SQL command (string) on the database
accessed with the cursor cur.

cur.fetchall(): Get a list of rows of the table (tuples) from the columns that have been
selected.

cur.fetchone(): Get one row of the table (tuple) from the columns that have been selected.

cur.close(): Close the cursor.

conn.commit(): Make our changes to the database permanent by committing them.

conn.close(): Close the connection.


SQL Commands
create table TableName (ColumnName 1 TYPE 1,
ColumnName 2 TYPE 2, . . ., ColumnName N TYPE N)

drop table TableName

insert into TableName values(Val 1, Val 2, . . . , Val n)

insert into TableName values(?, ?, . . ., ?), (Val 1, Val 2, . . ., Val n)

select * from TableName

select Col 1, Col 2, . . ., Col n from TableName

select Col 1, Col 2, . . ., Col n from TableName


where condition 1 and condition 2 and . . . and condition n

select TableName1.ColName, TableName2.ColName from TableName1 join TableName2


on condition where condition 1 and condition 2 and . . . and condition n

select distinct TableName1.ColName, TableName2.ColName from TableName1 join TableName2


on condition where condition 1 and condition 2 and . . . and condition n

select TableName1.ColName1, TableName2.ColName2


from TableName1 join TableName2 on condition
where condition 1 and condition 2 and . . . and condition n
group by ColName i

update TableName set Col 1 = val 1, Col 2 = val 2, . . .


where condition 1 and condition 2 and . . .

delete from TableName where condition 1 . . .

10
precipitation.txt geography.txt

City Snow Total Days City Province


St.Johns 322.1 1482 217 St.Johns Newfoundland
Charlottetown 338.7 1201 177 Charlottetown PEI
Halifax 261.4 1474 170 Halifax Nova Scotia
Fredericton 294.5 1131 156 Fredericton New Brunswick
Quebec 337.0 1208 178 Quebec Quebec
Montreal 214.2 940 162 Montreal Quebec
Ottawa 221.5 911 159 Ottawa Ontario
Toronto 135.0 819 139 Toronto Ontario
Winnipeg 114.8 504 119 Winnipeg Manitoba
Regina 107.4 364 109 Regina Saskatchewan
Edmonton 129.6 461 123 Edmonton Alberta
Calgary 135.4 399 111 Calgary Alberta
Vancouver 54.9 1167 164 Vancouver BC
Victoria 46.9 858 153 Victoria BC
Whitehorse 145.2 269 122 Whitehorse Yukon
Yellowknife 143.9 267 118 Yellowknife NWT

temperature.txt

City AvgHigh AvgLow ColdMonth ColdAvgHigh ColdAvgLow WarmMonth WarmAvgHigh WarmAvgL


St.Johns 8.6 0.8 February -1.4 -8.7 July 20.2 10.5
Charlottetown 9.5 0.8 January -3.4 -12.2 July 23.1 13.6
Halifax 10.7 1.4 February -1.5 -10.6 July 23.4 13.2
Fredericton 11.0 -0.6 January -4 -15.4 July 25.6 12.9
Quebec 9.0 -1.0 January -7.7 -17.3 July 24.9 13.2
Montreal 10.9 1.2 January -5.8 -14.9 July 26.2 15.4
Ottawa 10.7 0.8 January -6.3 -15.5 July 26.4 15.1
Toronto 12.6 5.2 January -1.3 -7.9 July 26.5 17.6
Winnipeg 8.1 -3.4 January -13.2 -23.6 July 26.1 13.4
Regina 8.9 -3.8 January -11 -22.1 July 26.3 11.9
Edmonton 8.7 -1.5 January -8.2 -17 July 23.0 12.0
Calgary 10.3 -2.6 January -3.6 -15.7 July 23.2 9.5
Vancouver 13.5 6.1 January 5.7 0.1 August 21.7 12.9
Victoria 13.9 5.1 January 6.5 0.3 July 21.8 10.7
Whitehorse 4.1 -6.2 January -14.4 -23.2 July 20.3 7.6
Yellowknife -0.8 -9.7 January -23.9 -32.2 July 20.8 12.0

11

You might also like