Data Warehousing and Big Data
Data vs. Information
• Data:
– Raw facts; building blocks of information
– Unprocessed information
• Information:
– Data processed to reveal meaning
Database and the DBMS
• Centralized:
– Supports data located at a single site
• Distributed:
– Supports data distributed across several sites
Can be classified by use:
▪ Insertion
▪ Deletion
▪ Modification
The relational model
Relational Data Structure
▪ Relation, Attribute, Domain, Tuple, Relational Database,
Relational Keys
Integrity Constraints
▪ Null, Entity Integrity and Referential Integrity
Integrity constraint
• Integrity constraints are conditions which specify the
circumstances in which the data in a database are correct
• Examples
• no bank account can have a negative balance
• the department of each member of the academic staff has always to be
known
• each bank account has to belong to at least one customer
• a book cannot be borrowed from the university library for more than
30 days
• all the members of the academic staff are entitled to the same annual
leave
Types of integrity constraints
• Domain rules
– The age of each member of the academic staff has to be in the range
[0…100]
• Referential integrity constraints (known as base table rules)
– The department of each member of the academic staff has always to
be known
• Functional dependencies
– All the members of the academic staff are entitled to the same annual
holiday leave
• Other general rules
– No bank account can have a negative balance
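Several of these rules can be declared directly in a table definition so that the DBMS rejects incorrect data. The following is a minimal sketch using Python's built-in sqlite3 module; the table and column names (department, staff, account) are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    age      INTEGER CHECK (age BETWEEN 0 AND 100),            -- domain rule
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)   -- department must be known and must exist
);
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    balance    NUMERIC NOT NULL CHECK (balance >= 0)           -- general rule
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Computing')")
conn.execute("INSERT INTO staff VALUES (1, 'Ada', 40, 1)")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "age outside [0, 100]": is_rejected("INSERT INTO staff VALUES (2, 'Bob', 130, 1)"),
    "department unknown":   is_rejected("INSERT INTO staff VALUES (3, 'Cy', 30, NULL)"),
    "department missing":   is_rejected("INSERT INTO staff VALUES (4, 'Di', 30, 99)"),
    "negative balance":     is_rejected("INSERT INTO account VALUES (1, -5)"),
}
for rule, rejected in results.items():
    print(rule, "->", "rejected" if rejected else "accepted")
```

Each violating INSERT raises sqlite3.IntegrityError, so the incorrect row never reaches the table.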
SQL
• SQL is a declarative (non-procedural) data definition and manipulation language
for relational databases, based on the relational algebra and relational calculus
• SQL statements specify database queries and retrieve
information from the database
• SQL emerged as a descendant of SEQUEL
(System R, IBM, 1970s)
• Many commercial variations (SQL is a de facto
standard for commercial relational DBMSs)
7/18/2018
Main Parts of the language
• Data manipulation sublanguage
– Select, delete, insert and update data in relations
• Data definition sublanguage
– Create, delete and modify relations, create indexes,
authorisation
• Integrity constraint specification sublanguage
• Transaction control sublanguage
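The sublanguages work together in practice. The sketch below, which assumes a hypothetical account table and a helper named transfer (neither is from the slides), uses data definition to create the table, an integrity constraint to forbid negative balances, data manipulation to move money, and transaction control to undo a transfer that fails halfway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition plus an integrity constraint.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
# Data manipulation.
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst; both updates succeed or neither does."""
    try:
        # Transaction control: the with-block commits on success
        # and rolls back if any statement raises an exception.
        with conn:
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(1, 2, 30)       # enough funds, so the transfer is committed
failed = transfer(1, 2, 500)  # would make account 1 negative, so it is rolled back
balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
print(ok, failed, balances)
```

In the failed transfer the credit to the destination account is applied first and then undone by the rollback, which is the point of grouping the two updates into one transaction.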
SQL: data manipulation
Select, Insert, Update and Delete SQL DML
statements.
Examples:
1) Select all albums of Elvis Presley and sort them descending
by the date of release.
• SELECT * FROM album WHERE interpreter = 'Elvis Presley'
ORDER BY released DESC
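The query can be tried against a small in-memory SQLite database. This is a minimal sketch: the title column and all sample rows are invented for illustration, and only the table name album and the columns interpreter and released come from the example. Note that standard SQL spells the sorting clause ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album (title TEXT, interpreter TEXT, released TEXT)")

# INSERT: illustrative rows only, not real discography data.
conn.executemany(
    "INSERT INTO album VALUES (?, ?, ?)",
    [
        ("Sample Album A", "Elvis Presley", "1956-01-01"),
        ("Sample Album B", "Elvis Presley", "1960-01-01"),
        ("Sample Album C", "Other Artist", "1958-01-01"),
        ("Sample Album D", "Elvis Presley", "1969-01-01"),
    ],
)

# SELECT: filter by interpreter, newest release first.
rows = conn.execute(
    "SELECT * FROM album WHERE interpreter = 'Elvis Presley' "
    "ORDER BY released DESC"
).fetchall()
titles = [row[0] for row in rows]
print(titles)

# UPDATE and DELETE: rowcount reports how many rows were affected.
updated = conn.execute(
    "UPDATE album SET released = '1961-01-01' WHERE title = 'Sample Album B'"
).rowcount
deleted = conn.execute(
    "DELETE FROM album WHERE interpreter = 'Other Artist'"
).rowcount
print(updated, deleted)
```

Storing dates as ISO 8601 text (YYYY-MM-DD) is a choice made for this sketch; it lets plain text ordering coincide with chronological ordering.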
SQL: data definition
SQL Identifiers
Data Types
Integrity Constraints
▪ Required Data
▪ Domain Constraints
▪ Entity Integrity
▪ Referential Integrity
▪ Data Definition (Create Database, Tables,
Change a Table Definition, remove a table etc.)
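The data definition statements listed above can be sketched as follows. The staff table and its columns are assumptions made for illustration. SQLite has no CREATE DATABASE statement (connecting creates the database), so that step appears only as a comment.

```python
import sqlite3

# In SQLite, opening a connection creates the database;
# other DBMSs use a CREATE DATABASE statement instead.
conn = sqlite3.connect(":memory:")

# Create a table, declaring integrity constraints with the columns.
conn.execute("""
CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,               -- entity integrity
    name     VARCHAR(50) NOT NULL,              -- required data
    salary   DECIMAL(8,2) CHECK (salary >= 0)   -- domain constraint
)
""")

# Change the table definition by adding a column.
conn.execute("ALTER TABLE staff ADD COLUMN email VARCHAR(100)")
columns = [row[1] for row in conn.execute("PRAGMA table_info(staff)")]
print(columns)

# Remove the table.
conn.execute("DROP TABLE staff")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)
```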
View
• The dynamic result of one or more relational
operations operating on the base relations to
produce another relation.
• A view is a virtual relation that does not necessarily
exist in the database but can be produced upon
request by a particular user, at the time of request.
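The dynamic nature of a view can be seen in a short sketch. The staff table and the view name computing_staff are hypothetical; the point is that the view stores no rows of its own and is re-evaluated against the base relation each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,
    name     TEXT,
    dept     TEXT,
    salary   INTEGER
);
INSERT INTO staff VALUES (1, 'Ada', 'Computing', 5000),
                         (2, 'Bob', 'Maths', 4000);

-- The view is defined by a query over the base relation.
CREATE VIEW computing_staff AS
    SELECT staff_id, name FROM staff WHERE dept = 'Computing';
""")

before = conn.execute("SELECT name FROM computing_staff ORDER BY staff_id").fetchall()

# A change to the base relation is visible through the view immediately.
conn.execute("INSERT INTO staff VALUES (3, 'Cy', 'Computing', 4500)")
after = conn.execute("SELECT name FROM computing_staff ORDER BY staff_id").fetchall()

print(before, after)
```

The view also hides the salary column, which is one common reason for giving particular users a view instead of the base relation.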
Entity-relationship modelling
▪ Entity Type
▪ Relationship Types
▪ Attributes
▪ Simple and Composite Attributes
▪ Single Valued and multi valued attributes
▪ Keys
▪ Strong and Weak Entity types
Enhanced entity-relationship modelling
Specialization/Generalization
Superclass
▪ An entity type that includes distinct subclasses that
need to be represented in a data model.
Subclass
▪ A Subclass is an entity type that has a distinct role and
is also a member of the Superclass.
Diagram
Specialization
▪ The process of maximizing the differences between
members of an entity by identifying their distinguishing
characteristics.
Generalization
▪ The process of minimizing the differences between entities
by identifying their common features.
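One possible way to carry a superclass and its subclasses into relational tables is to give each subclass its own table that shares the superclass primary key. This mapping is an assumption chosen for illustration, not something the slides prescribe, and the names staff, manager and secretary are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce REFERENCES
conn.executescript("""
-- Superclass: features common to all members (generalization).
CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
-- Subclasses: distinguishing features only (specialization).
CREATE TABLE manager (
    staff_id INTEGER PRIMARY KEY REFERENCES staff(staff_id),
    bonus    INTEGER
);
CREATE TABLE secretary (
    staff_id     INTEGER PRIMARY KEY REFERENCES staff(staff_id),
    typing_speed INTEGER
);
INSERT INTO staff VALUES (1, 'Ada'), (2, 'Bob');
INSERT INTO manager VALUES (1, 1000);
INSERT INTO secretary VALUES (2, 80);
""")

# A subclass member is also a superclass member, so a join recovers all its attributes.
managers = conn.execute(
    "SELECT s.name, m.bonus FROM staff s JOIN manager m ON m.staff_id = s.staff_id"
).fetchall()
print(managers)

# A subclass row without a matching superclass row is refused.
try:
    conn.execute("INSERT INTO manager VALUES (99, 500)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
print(orphan_allowed)
```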
Natural Key: a key that is naturally present in the relational table, formed from attributes that already exist in the data.
Surrogate Key: a primary key introduced artificially for the data warehouse, with no business meaning of its own.
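The difference between the two kinds of key can be sketched with a source table and a warehouse table. The names customer_src, customer_dim and customer_key are hypothetical, and using an auto-generated integer as the surrogate key is an assumption about how the warehouse assigns it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Source table: the email address already identifies a customer (natural key).
CREATE TABLE customer_src (
    email TEXT PRIMARY KEY,
    name  TEXT NOT NULL
);
-- Warehouse table: the key is generated and carries no business meaning (surrogate key).
CREATE TABLE customer_dim (
    customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
    email        TEXT NOT NULL,
    name         TEXT NOT NULL
);
""")
conn.execute("INSERT INTO customer_src VALUES ('ada@example.com', 'Ada')")

# Loading the warehouse: the surrogate key is assigned by the DBMS, not taken from the source.
conn.execute("INSERT INTO customer_dim (email, name) SELECT email, name FROM customer_src")
keys = conn.execute("SELECT customer_key, email FROM customer_dim").fetchall()
print(keys)
```

The natural key is kept as an ordinary column in the warehouse table so that rows can still be traced back to the source.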
Normalization
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4 rules: