CS102 Database Management System

General Information
Course Instructor: Muhammad Abulaish
Email: abulaish@sau.ac.in
Tel (O): 24195 (148)
Office: Room No. 310
Teaching Assistant (TA): Jayati Gulati <gulati.jayati@gmail.com>
Lecture: 11:30 am to 1:00 pm and 3:00 pm to 4:30 pm (Tuesday)
Lab: Open
Course Web Page: www.abulaish.com/dbms20m.html

Course Structure & Evaluation
The course has three parts:
  Lectures (to discuss basic concepts and algorithms)
  Lab (SQL & PL/SQL)
  Presentations (self study on some recent advances in databases and their applications)
Lecture slides will be available on the course web page.
Evaluation:
  Quiz: 30% (at least three quizzes)
  Presentations: 10% (groups of at most three students)
  Lab assignment: 10%
  Final Exam: 50%

Prerequisite
Knowledge of:
  Data Structures and Algorithms (mainly tree and graph)
  Discrete Mathematics (mainly set theory)

Teaching Materials
Text Book
  R. Elmasri and S. B. Navathe: Fundamentals of Database Systems, Pearson Education.
Reference Books
  A. Silberschatz, H. F. Korth and S. Sudarshan: Database System Concepts, McGraw-Hill.
  Web resources (mainly for SQL & PL/SQL)

Topics
Data Modeling
  ER Model
  EER (Enhanced ER) Model
Database Design
  Relational Database Elements & Constraints
  Database Decomposition & Normalization
Database Language
  Relational Algebra and Relational Calculus
  Query Optimization
Database Storage and Indexing
  File Organization and Indexing (Dynamic Multilevel Indexes using B-Tree and B+-Tree)
Emerging Databases
  Data warehouse, Graph database & NoSQL database
11/11/2020

Data
  Data is the Latin plural of datum.
  Used to represent unprocessed facts and figures without any added interpretation or analysis.
  Generally associated with some entity and often viewed as the lowest level of abstraction from which information and knowledge are derived.
  Data may be unstructured, semi-structured, or structured.

Information
  Information is interpreted (processed) data so that it has meaning for the user.
  "The price of petrol has risen from Rs. 74 to Rs. 80 per liter" is information for a person who tracks petrol prices.
  Data becomes information when it is processed for some purpose and adds value for the recipient.
  A set of raw sales figures is data.
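
The step from data to information can be illustrated with a minimal Python sketch. The sales figures and the `summarize` helper are hypothetical, chosen only to show raw numbers being processed into something meaningful for a reader; they do not come from the course material.

```python
from statistics import mean

# Hypothetical raw sales figures (data): unprocessed numbers without interpretation.
raw_sales = [120, 95, 143, 110, 132]


def summarize(figures):
    """Process raw figures into a summary that carries meaning for the recipient."""
    return {
        "total": sum(figures),
        "average": mean(figures),
        # 1-based position of the largest figure in the list
        "best_position": figures.index(max(figures)) + 1,
    }


summary = summarize(raw_sales)
print(summary)
```

The list on its own is data; the summary becomes information only for someone who has a use for totals and averages, which is the point the slide makes about purpose and recipient.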

Knowledge
  Knowledge is a fluid mix of information, experience and insight that may benefit the individual or the organization.
  "When petrol prices go up by Rs. 6 per liter, it is likely that bus fare will rise by 12%" is knowledge.
  Or, knowledge is meta-information about the patterns hidden in the data.
  The patterns must be discovered automatically!

Summarized View
  Data: as in databases, spreadsheets, text files…
  Information: processed data
  Knowledge: fluid mix of information, experience, and insight
  The boundaries between data, information, and knowledge are fuzzy.
  What is data to one person is information to someone else.

Data Categories & Mining Terminologies
  Data are stored in Documents (a file)
  Unstructured, Semi-structured, Structured

Another Data Categorization
  Quantitative vs Categorical
  Quantitative data
    Discrete
The Huber Taxonomy of Data Set Sizes

Algorithmic Complexity

Algorithm                                                    Complexity
Plot a scatterplot                                           O(n^(1/2))
Calculate means, variances, kernel density estimates         O(n)
Calculate fast Fourier transforms                            O(n log(n))
Calculate singular value decomposition of an r x c matrix;   O(nc)
  solve a multiple linear regression
Solve most clustering algorithms                             O(n^2)

No. of Operations for Algorithms of Various Computational Complexities and Various Data Set Sizes

n         n^(1/2)   n         n log(n)   n^(3/2)   n^2
tiny      10        10^2      2x10^2     10^3      10^4
small     10^2      10^4      4x10^4     10^6      10^8
medium    10^3      10^6      6x10^6     10^9      10^12
large     10^4      10^8      8x10^8     10^12     10^16
huge      10^5      10^10     10^11      10^15     10^20
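
The operation counts above can be recomputed with a short Python sketch. The data set sizes are read from the n column of the table. The base-10 logarithm is an assumption inferred from the tabulated n log(n) values (for example 6x10^6 at n = 10^6). The names `SIZES`, `COMPLEXITIES` and `operation_counts` are illustrative only.

```python
import math

# Data set sizes n, taken from the "n" column of the operations table.
SIZES = {"tiny": 10**2, "small": 10**4, "medium": 10**6,
         "large": 10**8, "huge": 10**10}

# Complexity classes from the table. The logarithm is assumed to be base 10,
# since that is what the tabulated n log(n) values imply.
COMPLEXITIES = {
    "n^(1/2)": lambda n: math.sqrt(n),
    "n": lambda n: float(n),
    "n log(n)": lambda n: n * math.log10(n),
    "n^(3/2)": lambda n: float(n) ** 1.5,
    "n^2": lambda n: float(n) ** 2,
}


def operation_counts(n):
    """Return the approximate number of operations for each complexity class."""
    return {name: f(n) for name, f in COMPLEXITIES.items()}


for label, n in SIZES.items():
    counts = operation_counts(n)
    print(f"{label:<8}" + "  ".join(f"{v:9.1e}" for v in counts.values()))
```

Each printed row corresponds to one row of the table, in scientific notation rather than powers of ten.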

Computational Feasibility on a Pentium PC (10 MegaFLOPs)

Computational Feasibility on a Silicon Graphics Onyx Workstation (300 MegaFLOPs)

Computational Feasibility on an Intel Paragon XP/S A4 (4.2 GigaFLOPs)

n        n^(1/2)             n                   n log(n)            n^(3/2)             n^2
tiny     2.4x10^-9 seconds   2.4x10^-8 seconds   4.8x10^-8 seconds   2.4x10^-7 seconds   2.4x10^-6 seconds
small    2.4x10^-8 seconds   2.4x10^-6 seconds   9.5x10^-6 seconds   2.4x10^-4 seconds   .024 seconds
medium   2.4x10^-7 seconds   2.4x10^-4 seconds   .0014 seconds       .24 seconds         4.0 minutes
large    2.4x10^-6 seconds   .024 seconds        .19 seconds         4.0 minutes         27.8 days
huge     2.4x10^-5 seconds   2.4 seconds         24 seconds          66.7 hours          761 years

Computational Feasibility on a TeraFLOP Grand Challenge Computer (1000 GigaFLOPs)

n        n^(1/2)          n                n log(n)           n^(3/2)          n^2
tiny     10^-11 seconds   10^-10 seconds   2x10^-10 seconds   10^-9 seconds    10^-8 seconds
small    10^-10 seconds   10^-8 seconds    4x10^-8 seconds    10^-6 seconds    10^-4 seconds
medium   10^-9 seconds    10^-6 seconds    6x10^-6 seconds    .001 seconds     1 second
large    10^-8 seconds    10^-4 seconds    8x10^-4 seconds    1 second         2.8 hours
huge     10^-7 seconds    .01 seconds      .1 seconds         16.7 minutes     3.2 years
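
The running times in these tables follow from dividing the operation count by the machine speed. The Python sketch below uses the four speeds quoted in the slide titles and assumes that one operation costs one floating-point operation. The helper names `seconds_needed` and `humanize` are illustrative, and plain division may differ slightly from a few tabulated entries.

```python
# Machine speeds in floating-point operations per second, from the slide titles.
MACHINE_FLOPS = {
    "Pentium PC": 10e6,
    "Silicon Graphics Onyx Workstation": 300e6,
    "Intel Paragon XP/S A4": 4.2e9,
    "TeraFLOP Grand Challenge Computer": 1000e9,
}


def seconds_needed(operations, flops):
    """Estimated running time, assuming one operation equals one FLOP."""
    return operations / flops


def humanize(seconds):
    """Express a duration in the largest unit that gives a value of at least 1."""
    units = (("years", 365 * 86400), ("days", 86400),
             ("hours", 3600), ("minutes", 60))
    for unit, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.2g} seconds"


teraflop = MACHINE_FLOPS["TeraFLOP Grand Challenge Computer"]
paragon = MACHINE_FLOPS["Intel Paragon XP/S A4"]

print(humanize(seconds_needed(1e20, teraflop)))  # huge data set, n^2: 3.2 years
print(humanize(seconds_needed(1e16, teraflop)))  # large data set, n^2: 2.8 hours
print(humanize(seconds_needed(1e12, paragon)))   # medium data set, n^2: 4.0 minutes
```

The same two functions can be applied to the Pentium PC and Onyx Workstation speeds to estimate the corresponding running times on those machines.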

Types of Computers for Interactive Feasibility (Response Time < 1 Second)

n        n^(1/2)             n                   n log(n)            n^(3/2)             n^2
tiny     Personal Computer   Personal Computer   Personal Computer   Personal Computer   Personal Computer
small    Personal Computer   Personal Computer   Personal Computer   Personal Computer   Super Computer
medium   Personal Computer   Personal Computer   Personal Computer   Super Computer      Teraflop Computer
large    Personal Computer   Workstation         Super Computer      Teraflop Computer   ---
huge     Personal Computer   Super Computer      Teraflop Computer   ---                 ---

Types of Computers for Feasibility (Response Time < 1 Week)

n        n^(1/2)             n                   n log(n)            n^(3/2)             n^2
tiny     Personal Computer   Personal Computer   Personal Computer   Personal Computer   Personal Computer
small    Personal Computer   Personal Computer   Personal Computer   Personal Computer   Personal Computer
medium   Personal Computer   Personal Computer   Personal Computer   Personal Computer   Personal Computer
large    Personal Computer   Personal Computer   Personal Computer   Personal Computer   Teraflop Computer
huge     Personal Computer   Personal Computer   Personal Computer   Super Computer      ---

Data Mining Applications
  Almost every information system

Resources
  Graph database (neo4j): https://neo4j.com/developer/graph-database/
  NoSQL database (MongoDB): https://www.mongodb.com/nosql-explained
  Database related conferences
Resources…