CS3011 Database Systems Lab (January - April, 2015)

Assignment 1: Basic SQL


Marks: 80
This assignment must be done individually.
Start Date: 06-01-2015
Due Date: 22-01-2015 06:00 pm
NOTE: For late submissions, 10% is deducted for each day (weekends included) that the assignment is
late.

1. Overview
In this project, you will design a relational database for storing information about the National Basketball
Association [1]. We are providing you with a small subset of their data, called the NBA dataset, in the form of
multiple CSV files [2]. These CSV files contain information about basketball players, teams, coaches, etc. The
different attributes are described at: http://en.wikipedia.org/wiki/Basketball_statistics

2. Getting Started
Download and install PostgreSQL (preferably 9.3 or 9.4) from www.postgresql.org. To interact with
the PostgreSQL server, you can use the command-line client (psql) or the GUI client (pgadmin3). You can
find useful tutorials on the PostgreSQL website [3].

3. Exercises
Exercise 1 [0 points]: Create a database called nbadb
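One possible way to do this from psql is sketched below. It assumes that your PostgreSQL user has permission to create databases; the \c line is a psql meta-command, not SQL.

```sql
-- Create the database for this assignment.
CREATE DATABASE nbadb;

-- psql meta-command: connect to the new database before creating any tables.
\c nbadb
```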

Exercise 2 [10 points]: The provided dataset has 8 CSV files. In each CSV file, the first row contains the list of
attribute names and the remaining rows contain the attribute values. You should create a relational table for
each of the CSV files. For example, the file teams.txt contains the attributes: team, location, name, leag.
You should create a table with schema: teams (team, location, name, leag), where the attributes (team,
leag) together form the primary key.
You should specify all the necessary integrity constraints, such as NOT NULL, CHECK, UNIQUE, PRIMARY
KEY, FOREIGN KEY, etc. (see http://www.postgresql.org/docs/9.3/static/ddl-constraints.html). Save all your
DDL statements in a file nba_create_table.sql (this file will be part of your submission). Note: While
creating the schema, use the same attribute names as given in the CSV files.
You can execute all the statements in the .sql file by using the command: \i nba_create_table.sql
Example:
CREATE TABLE teams (
    team CHAR(3) NOT NULL,
    location VARCHAR(20) NOT NULL,
    name VARCHAR(20),
    leag CHAR(1) NOT NULL,
    PRIMARY KEY (team, leag));

[1] http://en.wikipedia.org/wiki/National_Basketball_Association
[2] http://www.cs.cmu.edu/~awm/15781/project/data.html#H:_NBA_statistics_data
[3] http://www.postgresql.org/docs/9.3/static/tutorial.html

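The teams example uses only NOT NULL and PRIMARY KEY. As a rough sketch of how the other constraint kinds mentioned in Exercise 2 can be written, consider the table below. The table team_notes, its columns note_id and year, and the year bound are all hypothetical and are not part of the NBA dataset; only the reference to teams (team, leag) comes from the example.

```sql
-- Hypothetical table used only to illustrate constraint syntax.
CREATE TABLE team_notes (
    note_id INTEGER PRIMARY KEY,
    team CHAR(3) NOT NULL,
    leag CHAR(1) NOT NULL,
    -- Column-level CHECK: rejects rows whose year is before 1900 (arbitrary bound).
    year INTEGER NOT NULL CHECK (year >= 1900),
    -- Table-level UNIQUE: at most one note per team, league, and year.
    UNIQUE (team, leag, year),
    -- Composite FOREIGN KEY: must match an existing (team, leag) pair in teams.
    FOREIGN KEY (team, leag) REFERENCES teams (team, leag)
);
```

A foreign key can reference only columns that carry a PRIMARY KEY or UNIQUE constraint, so the referenced table has to be created first.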
Exercise 3 [5 points]: Insert all the information given in the CSV files to the corresponding tables. Save the
statements in a file named nba_load_data.sql (to be submitted).
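Two common ways to load the rows are sketched below: explicit INSERT statements, or the psql meta-command \copy, which reads a file on the client side. The values in the INSERT are made up for illustration and are not taken from the dataset. The \copy line assumes that teams.txt is comma-separated, has a header row, and sits in the directory from which psql was started.

```sql
-- Option A: one INSERT per row (these values are hypothetical placeholders).
INSERT INTO teams (team, location, name, leag)
VALUES ('XYZ', 'Springfield', 'Examples', 'N');

-- Option B: psql meta-command that bulk-loads a client-side CSV file.
-- HEADER true skips the first row, which holds the attribute names.
\copy teams FROM 'teams.txt' WITH (FORMAT csv, HEADER true)
```

If a table has FOREIGN KEY constraints, load the tables it references first; otherwise the rows will be rejected.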

Exercise 4 [60 points]: Write SQL queries for the following. Save all the statements executed in a single
file named nba_query.sql (to be submitted).
1. The schema defined above for the teams table allows any single-character value for the
attribute leag. However, this attribute can take only one of three values: (A, N, F). Create a
constraint in the file nba_create_table.sql so that the database allows only these three values for
the leag column in all the tables where this column is present. [5 points]
2. Use the table player to count the number of players in each position. [5 points]
3. Find the top-5 most productive years, where the productivity of a year is the total number of games
played (gp) in that year by all the players, counting both regular seasons and playoffs. Break ties by
preferring chronologically earlier years, and print only the years. [10 points]
4. In the table player_regular_season_career, add a new column eff (efficiency rating), which is defined
as follows:
eff = (pts + reb + ast + stl + blk - ((fga - fgm) + (fta - ftm) + turnover))
Among the players who have played more than 500 games, find the top-10 most efficient players. [10
points]
5. Find the number of players who have played more regular season games in the year 1990 than
regular season games in any other year in their career. [15 points]
6. Use the table player_regular_season_career to find the all-time best players. Use the two attributes gp
(games played) and eff (efficiency rating) to compare players. For two players p1 and p2, we define that
p1 dominates p2 if and only if p1 has both a higher gp value and a higher eff value than p2.
Find the set P of players (ilkid, firstname, lastname, gp, eff) such that no player in P is
dominated by any other player in the table player_regular_season_career. Return the output in
ascending order of ilkid. [15 points]
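To make the dominance definition in item 6 concrete, here is a small sketch on made-up data. The relation toy and its three rows are hypothetical and unrelated to the NBA dataset. Row c is dominated by row a (10 > 7 and 5 > 4), while a and b are not dominated by anyone, because no row beats either of them on both attributes at once.

```sql
-- Hypothetical three-row relation, built inline so nothing needs to be created.
WITH toy (id, gp, eff) AS (
    VALUES ('a', 10, 5),
           ('b', 8, 7),
           ('c', 7, 4)
)
SELECT t.id, t.gp, t.eff
FROM toy AS t
-- Keep t only if no other row is strictly higher on both gp and eff.
WHERE NOT EXISTS (
    SELECT 1
    FROM toy AS o
    WHERE o.gp > t.gp
      AND o.eff > t.eff
)
ORDER BY t.id;
-- Returns rows a and b.
```

A self-join with NOT EXISTS is only one way to express this; other formulations are possible.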
Exercise 5 [5 points]: Create a file called nba_delete_all.sql to drop all the tables, views, and any other
objects you created in nbadb.
To restore the nbadb database content, you can use the following sequence of commands:

\i nba_delete_all.sql
\i nba_create_table.sql
\i nba_load_data.sql
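For this sequence to be repeatable, nba_delete_all.sql should not fail when an object is already missing. A minimal sketch of such a script follows; the view name some_helper_view is hypothetical, and your own script needs one statement for each object you actually created.

```sql
-- IF EXISTS turns a missing object into a notice instead of an error,
-- so the script can be run repeatedly.
DROP VIEW IF EXISTS some_helper_view;

-- CASCADE also drops objects that depend on the table, such as
-- foreign-key constraints in other tables that reference it.
DROP TABLE IF EXISTS teams CASCADE;
```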

4. What to Submit

nba_create_table.sql
nba_load_data.sql
nba_query.sql
nba_delete_all.sql
results.pdf

For the five SQL queries in Exercise 4 (i.e., 4.2 to 4.6), you should submit the results in the file results.pdf.
You should submit your files as a .zip file through Moodle.