
Databases:

overview

Prof. Fabio Crestani


fabio.crestani@usi.ch
What is Data Management?
This course is about understanding the basic
notions of managing data

But what do we mean by managing, and what kind of data do we want to manage?

Let us start with data!

2
Data? Information? Knowledge?
From Wikipedia:

“Data as an abstract concept can be viewed as the lowest level of abstraction, from which information and then knowledge are derived”

Data is collected and analysed to create information suitable for making decisions, while knowledge is derived from extensive amounts of experience dealing with information on a subject

3
Data vs. Information
Data                     Information
raw facts                data with context
no context               processed data
just numbers and text    value added to data:
                         • summarized
                         • organized
                         • analyzed

4
Data vs. Information
Data: 6292

Information:
• 62/9/2 -> 62/09/02 -> The d.o.b. of one of my friends
• CHF 6292 -> The average starting monthly salary of a
postdoc.
• 6962 -> The area code of my town: Viganello!
•…

5
Data vs. Information
Data (stock price, last 10 days): 6.34, 6.45, 6.39, 6.62, 6.57, 6.64, 6.71, 6.82, 7.12, 7.06

Information: [chart: SIRIUS SATELLITE RADIO INC., Stock Price ($5.80 to $7.20) over the Last 10 Days (days 1 to 10)]

6
From data to information
Data → Information → Knowledge
Data

Summarizing the data
Averaging the data
Selecting part of the data
Graphing the data
Adding context
Adding value

Information
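A minimal sketch of the steps above, using the ten stock prices from the earlier Data vs. Information slide. It assumes the values are listed in chronological order (day 1 first); the function name `summarize` and the wording of the printed sentence are illustrative only.

```python
from statistics import mean

# Raw data: the ten prices from the "Data" column of the stock-price slide.
# Assumption: they are listed in chronological order, day 1 first.
prices = [6.34, 6.45, 6.39, 6.62, 6.57, 6.64, 6.71, 6.82, 7.12, 7.06]


def summarize(values):
    """Summarize, average and select parts of the raw numbers."""
    return {
        "average": round(mean(values), 3),           # averaging the data
        "lowest": min(values),                       # selecting part of the data
        "highest": max(values),
        "change": round(values[-1] - values[0], 2),  # last day minus first day
    }


summary = summarize(prices)

# Adding context turns the numbers into a statement a reader can act on.
print(
    f"Over the last {len(prices)} days the stock price averaged "
    f"${summary['average']:.2f}, ranged from ${summary['lowest']:.2f} "
    f"to ${summary['highest']:.2f}, and changed by ${summary['change']:+.2f}."
)
```

The list `prices` is data; the printed sentence is closer to information, because it is summarized and carries context (what was measured, over which period).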
7
From information to knowledge

Information

How is the info tied to outcomes?
Are there any patterns in the info?
What info is relevant to the problem?
How does this info affect the system?
What is the best way to use the info?
How can we add more value to the info?

Knowledge

8
What type of data?
In Databases we study how to access and manage
structured data
• Textual data (e.g. data about books)
• Numeric data (e.g. number of students per course)

In Information Retrieval (IR) we study how to access and manage unstructured data
• Textual documents (e.g. full text of books or articles)
• Web documents
• Multimedia documents (audio, speech, images, video,
etc.)

9
What type of information?
In Databases you learn how to access and
manage structured information
• Information held in DBMS
• Information held in KB and KR systems

Types (structured inf.)

Name     Age   Salary     Date joined
String   Int   Int        Date
Donald   25    £50.000    01/03/01
Mickey   52    £100.000   01/02/99

10
What type of information?
In Information Retrieval you learn how to access
and manage unstructured information
• Information held in Digital Libraries
• Information held in image archives
• Information held in audio or video archives
• Information held on the Web

“Jules Verne wrote 20,000 Leagues Under The Sea and Around The World In 80 Days. He died in 1905.”

Words (unstructured inf.)

11
Different ways to access it

SQL
SELECT Name FROM Employee WHERE Age BETWEEN 30 AND 40
Artificial language
Complete description
Exact description

Google
Top 10 Gaining Queries, Week of Aug. 1, 2005 (Google Trends)
1. nasa
2. diane lane
3. my chemical romance
4. green day
5. gorillaz
6. fall out boy
7. psp
8. rachel mcadams
9. slipknot
10. space shuttle

12
… and so is what you get back

SELECT Name FROM Employee WHERE Age BETWEEN 30 AND 40

gives Names

This is a known result type (list of Names)
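The query can be tried out with a small sketch using Python's built-in `sqlite3` module (the course itself uses MySQL; SQLite is swapped in here only because it needs no server). The table layout and the rows are assumptions: Donald and Mickey come from the earlier typed-table slide, while Minnie and Goofy are invented so that the query has something to return.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Assumed schema: only the two columns the query needs.
conn.execute("CREATE TABLE Employee (Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Employee (Name, Age) VALUES (?, ?)",
    [
        ("Donald", 25),  # from the typed-table slide
        ("Mickey", 52),  # from the typed-table slide
        ("Minnie", 35),  # invented row
        ("Goofy", 40),   # invented row, sits exactly on the upper bound
    ],
)

# The query from the slide, unchanged. BETWEEN includes both bounds.
rows = conn.execute(
    "SELECT Name FROM Employee WHERE Age BETWEEN 30 AND 40"
).fetchall()

# Each row is a one-element tuple; unpack it to get a plain list of names.
names = [name for (name,) in rows]
print(names)

conn.close()
```

Whatever the data, the result is always a list of names: the shape of the answer is known before the query is run.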

13
Types of search systems

                      Structured Data    Unstructured Data
Data                  Typed              Untyped
Model                 Deterministic      Prob./Sim.
Matching              Exact              Partial
Query specification   Complete           Incomplete
Query language        Artificial         Natural
Items wanted          Matching           Relevant
Error sensitivity     High               Low
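The contrast between exact and partial matching can be illustrated with a toy sketch. It is not how a real DBMS or search engine works: the similarity score (Jaccard word overlap) is one simple choice among many, and the function names and the second and third documents are invented for illustration.

```python
# Structured side: typed records and exact, deterministic matching.
employees = [
    {"name": "Donald", "age": 25},
    {"name": "Mickey", "age": 52},
]


def exact_match(records, low, high):
    """Return the names whose age lies in [low, high]; a record matches or it does not."""
    return [r["name"] for r in records if low <= r["age"] <= high]


# Unstructured side: free text and partial matching ranked by similarity.
documents = [
    "Jules Verne wrote 20,000 Leagues Under The Sea",
    "Around The World In 80 Days is a novel by Jules Verne",  # invented wording
    "Green Day released a new album",                         # invented document
]


def tokens(text):
    """Lower-case the text and split it into a set of words."""
    return set(text.lower().split())


def ranked_match(docs, query):
    """Rank documents by Jaccard overlap with the query, best first."""
    q = tokens(query)
    scored = []
    for doc in docs:
        d = tokens(doc)
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        if score > 0:  # keep anything that overlaps at all
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


print(exact_match(employees, 20, 30))
print(ranked_match(documents, "verne sea"))
```

The exact query either returns a record or it does not, so a small mistake in the bounds changes the answer completely. The ranked query still returns something useful for an incomplete query such as "verne sea", just in a different order.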

14
Managing data
Managing data means performing a long series of possible
operations on the data in order to use them and
preserve them
• Storing
• Sorting
• Querying
• Organising
• …
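The operations listed above can be sketched on a tiny example with Python's built-in `sqlite3` module. The `Enrolment` table and all student names are invented for illustration; the idea of counting students per course comes from the earlier slide on numeric data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Storing: create a table and insert rows (all names are invented).
conn.execute("CREATE TABLE Enrolment (Student TEXT, Course TEXT)")
conn.executemany(
    "INSERT INTO Enrolment (Student, Course) VALUES (?, ?)",
    [
        ("Anna", "Databases"),
        ("Luca", "Databases"),
        ("Marco", "Information Retrieval"),
        ("Anna", "Information Retrieval"),
        ("Sara", "Databases"),
    ],
)

# Sorting: list every student once, in alphabetical order.
sorted_students = [s for (s,) in conn.execute(
    "SELECT DISTINCT Student FROM Enrolment ORDER BY Student"
)]

# Querying: select only the rows that satisfy a condition.
db_students = [s for (s,) in conn.execute(
    "SELECT Student FROM Enrolment WHERE Course = ? ORDER BY Student",
    ("Databases",),
)]

# Organising: group the rows to get the number of students per course.
per_course = dict(conn.execute(
    "SELECT Course, COUNT(*) FROM Enrolment GROUP BY Course"
))

print(sorted_students)
print(db_students)
print(per_course)

conn.close()
```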

15
Questions

16
Aims of the course
The course has two aims:
1. Teach you the basics of data management
2. Teach you how to use one DBMS for data
management

These two aims are clearly related!

17
Course outline
Two parts running in parallel:
1. DBMS
• Basic notions and architectures
• The Entity-Relationship model
• DBMS design
• Relational algebra and calculus
• The Relational model
• Relation between DBMS, Information Retrieval and
Data Mining

2. Applied DBMS
• Practical applications with MySQL
• Mini project

18
Now … the trivial stuff!

Course coordinator: Prof. Fabio Crestani
• Office: Informatics Bldg, level 2
• Ext. 4657
• Email: fabio.crestani@usi.ch
• Available by appointment
• Email is preferable to appointments, but it is also useful for
setting appointment times

19
The trivial stuff

Teaching assistants (TA):

Emad Aghajani

He is a new PhD student and is not in Lugano yet; he will arrive in a couple of weeks

More info soon …

20
Material on iCorsi2
All the material will be available on iCorsi2!

The key for enrolment is: DB16

21
The course structure
Course structure:
• 18-20 theoretical lectures, taught by the Prof
• 6-8 practical lectures, taught by the TAs
• 4 course tests on theory and practice (more later)

22
Course material
The course is based on:

Fundamentals of Database Systems (6th edition) by Elmasri and Navathe, Addison-Wesley, 2010

I suggest you buy this book (notice that this is the 6th, not the 7th, edition)!

My slides are taken directly from the content of the book

23
Course marks
Marking:
• 3 course tests
• 1 individual mini project

No final exam!!

Marking scheme:
• 75% course test (25% each)
• 25% project

Marks will be rounded based on “discretionary points” …

24
Mini project
The mini project is the design and implementation
of a database based on some specific data
requirements

Details will be given in due time

25
Differences from other courses
Notice that this course is quite different from:

1. Databases (2nd year, Bachelor INF)
2. Information Retrieval (3rd year, Bachelor INF)
3. Data Analytics (1st year, Master INF & ECO+INF)

It is intended only for students of the Master ECO+INF

26
Questions

27
A few words about yourselves
What is your background?
• Are you all Master in INF+ECO students?
• Do you have experience with programming?
• Have you done any CS course before?
• Do you know some programming?

What is your specific interest in this course?
• What topics are you most interested in?
• Is there any topic you would like me to cover in more
detail? Or maybe invite an external speaker to talk
about it?

28
Questions

29
Attendance and behaviour
Attendance is mandatory:
• I will not take names of people attending, but part of the final
mark will reflect attendance
• If you cannot come to a lecture, please let me know by email
Your behaviour during lectures will affect your mark in
both positive and negative ways
• If you do not behave in class I will evaluate “pessimistically”
your assignments and exam
• If you are active and interested in class I will evaluate
“optimistically” your assignments and exam
• In extreme cases, e.g. if you do not behave in class or miss
too many lectures without justification, you might not be
admitted to the exam!

30
A note on the use of laptops
I prefer that you do not use your laptop during the
theoretical lectures
• Laptops might be distracting to you and the people around
you
• Most of the time there is no need for them, and I will
tell you when they are needed
• Copies of the lecture material are distributed only after the
class

You will have plenty of time to use laptops during the practical lectures!

31
Finally
It is up to you to make the course interesting and
fun, by:
• Asking questions or asking for clarifications
• Discussing points of general interest
• Pointing out something interesting you know about the
topics of the course
•…

32
Questions

33
