Unit - I
Introduction
File processing systems
Database systems and the evolution of database technology.
Aims and importance of database technology
Data independence
Data sharing
Data integrity
Data redundancy control
Data Modelling:
Conceptual model, Logical Model
Physical model, External model
Terminologies
Data : Data is a collection of facts, such as numbers, words,
measurements, observations or just descriptions of things.
Data can be qualitative or quantitative.
Qualitative: It is descriptive data (it describes something)
Your favorite holiday destination
Behaviour of a person
Smell of a perfume
Quantitative: It is numerical data (numbers)
Discrete data is counted; continuous data is measured and can take any value
within a range.
Petals on a flower (Discrete)
Customers in a shop (Discrete)
Height (Continuous)
Website traffic (Continuous)
Water temperature (Continuous)
No. of players in a team(Discrete)
Terminologies
Entity : Real world object that can be easily identified.
Student, Product, Book, Department etc.
Attribute : Properties of an entity.
Student ( Id, Name, Address, Email, Birth date, Mobile)
Product ( Product Id, Name, Price, Unit, Colour )
Book ( Book Id, Author, Publisher, No. of pages, Price,Edition)
Department (Dept Id, Name, Location, Pin code )
Information : Data that has been processed so that it is meaningful to the end users.
Attendance report, Product Stock, Subject wise Book details, Dept. list etc.
Data Types : Data having predefined characteristics.
E.g. Integer, Real (Float), Character, String, Date, etc.
Integer Values: 1, 2, ……….
Real (Float) Values: 1.1, 2.9, 3.85, …………
Character Values: A … Z, a ….z , # , @ , …………
String Values: Aakash, Harsh, Gujarat, India, …………..
Date Values : 01-JAN-80, 12-AUG-23 , …………
File Processing Systems
• File Systems:
– Store large amount of data
– Store data over long periods of time
• However :
– No guarantee against data loss unless the data is backed up
– No support for query languages
– No efficient access to data unless its location is known
– Applications depend on the data definitions (Structures)
– Separate files for each entity
– Limited control over concurrent access
File Processing Systems
• In a File System, data is stored directly in a set of files.
It contains flat files that have no relation to other
files.
Terminologies
Database
It is a computerized record-keeping system.
It is a collection of coherent and meaningful data.
E.g.
University Database: Student, Faculty, Course, Subjects, Result etc.
B2C Database: Customers, Products, Orders, Suppliers etc.
DBMS (Database Management System)
A system that allows creating, inserting, deleting, updating and
processing of data.
E.g. dBase, FoxPro etc.
RDBMS (Relational Database Management System)
The relational model was proposed by Edgar F. Codd at IBM in 1970.
A DBMS that is based on relational model and stores data in the
form of related tables.
E.g. Oracle, Microsoft SQL Server, Sybase, IBM’s DB2
DBMS manages interaction between End users and Database.
Aims & Importance of Database Technology
• Controlling Redundancy : Redundancy can be reduced.
– E.g. Students Data with Admin Level, Department Level,
& Printing Dept.
• Restricting Unauthorized Access
– Each user must follow authentication process.
– Each user has specific rights (Authorization) to
access the resources.
storage of the record.
– Optimization : Determining the most efficient way to
execute a given query by considering the possible query
plans. Parameters such as CPU time, number of input/output
operations and number of joins are used for optimization.
• Admin : Specific student fee collection information.
• Faculty : Specific student attendance sheet & their marks
• Press: All students subjects marks & mark sheet details.
• Enforce Integrity Constraints :
– Data type : Number, Char, Date, Varchar2 etc.
– Field Length : Length of field as per data placed in it.
– Primary Key : To uniquely identify each row in a table.
– Foreign Key : To reference the value from other table.
– Not NULL : Value for the field is compulsory
– Check Constraints : Restricted values in field of a table.
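A minimal sketch of how some of these constraints behave, using Python's built-in sqlite3 module in place of Oracle. The `student` table, its `stud_age` column and the sample values are assumptions made for illustration only.

```python
import sqlite3

# In-memory SQLite database; SQLite stands in for the Oracle-style SQL used in these notes.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        stud_id   INTEGER PRIMARY KEY,                        -- primary key: unique row identifier
        stud_name VARCHAR(40) NOT NULL,                       -- NOT NULL: value is compulsory
        stud_age  INTEGER CHECK (stud_age BETWEEN 16 AND 60)  -- check: restricted values
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'Harsh', 20)")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_key = rejected("INSERT INTO student VALUES (1, 'Aakash', 21)")  # same primary key
missing_name = rejected("INSERT INTO student VALUES (2, NULL, 21)")       # NULL in NOT NULL column
bad_age = rejected("INSERT INTO student VALUES (3, 'Amit', 5)")           # fails the CHECK
print(duplicate_key, missing_name, bad_age)
```

Each of the three inserts is refused by the database itself, so the application does not have to repeat these checks.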
Aims & Importance of Database Technology
• Reduce Application Development Time
– Once the database is up and running, application development
time is shorter.
• Flexibility
– Change the structure of database.
– Easily add , remove, modify columns in table structure.
– RDBMS like SQL Server , Oracle, MS Access use SQL as their standard
database language.
• Features of SQL
SQL is used to access and manipulate databases. It can
– Create new objects in a database
– Insert records in a database
– Update records in a database
– Delete records from a database
– Retrieve records from a database
• SQL Engine
It is a server-side process that takes SQL queries from the application and
executes them.
Rules For SQL
• SQL starts with a verb (i.e. A SQL action word)
– E.g. CREATE , INSERT , UPDATE , DELETE
E.g. INSERT INTO EMP VALUES (1 , 'HARSH');
database.
• INSERT – Insert data into a table.
• ROLLBACK – Restore the database to its state as of the last COMMIT.
Data Types
Create Statement
• It is used to create a new Table (Entity or Relation) in
database.
• Each table has multiple Columns (Attributes or
Fields).
• Each Column definition contains three things :
• Column-name,
• Data-type &
• Size
• Rules for Creating Tables
• A name can have a maximum of 30 characters.
• Letters A-Z, a-z and digits 0-9 are allowed.
• A name should begin with a letter.
• The special characters _ , $ and # are allowed.
• SQL reserved words are not allowed.
Create Statement
• Syntax
• CREATE TABLE table_name
(
column_name1 data_type(size),
………………………. ………………….
column_nameN data_type(size) );
E.g.
CREATE TABLE STUDENT_MASTER
(Stud_Id NUMBER(2) , Stud_Name VARCHAR2(40),
Stud_Birth_Date DATE, Stud_Mobile NUMBER(10),
PRIMARY KEY (Stud_Id));
To Display Table Structure
• DESCRIBE Table-name
OR
• DESC Table-name
E.g.
DESCRIBE Student_Master;
DESC Student_Master;
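A rough sketch of creating a table and then inspecting its structure, run through Python's sqlite3 module. SQLite has no DESCRIBE command, so `PRAGMA table_info` is used here as the nearest counterpart; that substitution is an assumption of this example, not part of the Oracle syntax above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NUMBER and VARCHAR2 are Oracle types; SQLite accepts the names but does not enforce the sizes.
conn.execute("""
    CREATE TABLE Student_Master (
        Stud_Id         NUMBER(2),
        Stud_Name       VARCHAR2(40),
        Stud_Birth_Date DATE,
        Stud_Mobile     NUMBER(10),
        PRIMARY KEY (Stud_Id)
    )
""")

# PRAGMA table_info is SQLite's rough counterpart of DESCRIBE.
# Each row is (cid, name, type, notnull, dflt_value, pk).
structure = conn.execute("PRAGMA table_info(Student_Master)").fetchall()
for cid, name, col_type, notnull, default, pk in structure:
    print(name, col_type, "PRIMARY KEY" if pk else "")
```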
SQL Queries Are Not Case-Sensitive
• A query may be written in
– Uppercase or
– Lowercase or
– A combination of uppercase and lowercase
E.g.
DESC Student_Master;
desc Student_Master;
DesC Student_Master;
DESC STUDENT_MasTer;
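Insert Statement
Syntax
• To insert values of all fields/attributes
– INSERT INTO table_name
VALUES (value1, value2, ... valueN);
• To insert values of selected columns (&name is a substitution variable whose value is prompted for at run time)
– INSERT INTO table_name
(column1, column2, ... columnN)
VALUES
(&column1, &column2, ... &columnN);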
E.g.
INSERT INTO STUDENT_MASTER
VALUES (&Stud_Id1 , ‘&Stud_Name' ,
‘&Stud_Birth_Date', &Stud_Mobile)
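The two forms of INSERT can be tried out with Python's sqlite3 module, as sketched below. The `?` placeholders are bound at run time and are only loosely comparable to `&` substitution variables (there is no interactive prompt); the sample rows, including the mobile number, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student_Master (
        Stud_Id INTEGER PRIMARY KEY, Stud_Name TEXT,
        Stud_Birth_Date TEXT, Stud_Mobile INTEGER
    )
""")

# Form 1: values for all columns, in table order.
conn.execute("INSERT INTO Student_Master VALUES (1, 'HARSH', '1980-01-01', 9876543210)")

# Form 2: named columns only; the ? placeholders are filled from the tuple at run time.
row = (2, "AMIT", "2023-08-12")
conn.execute(
    "INSERT INTO Student_Master (Stud_Id, Stud_Name, Stud_Birth_Date) VALUES (?, ?, ?)",
    row,
)

rows = conn.execute("SELECT * FROM Student_Master ORDER BY Stud_Id").fetchall()
print(rows)
```

The column left out of the second insert (`Stud_Mobile`) is stored as NULL.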
Update Statement
• It is used to change the values of existing records in a table.
Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, ... columnN = valueN
[ WHERE conditions ];
• Remove selected rows from table
DELETE FROM table_name WHERE Condition;
E.g.
DELETE FROM Student_Master WHERE Stud_Id = 5;
DELETE FROM Student_Master WHERE Stud_Id > 5;
DELETE FROM Student_Master WHERE Stud_Name = 'AMIT';
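As a small illustration of how the WHERE clause limits the effect of UPDATE and DELETE, the sketch below runs both against a cut-down `Student_Master` table in SQLite. The two-column table and its rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Master (Stud_Id INTEGER PRIMARY KEY, Stud_Name TEXT)")
conn.executemany(
    "INSERT INTO Student_Master VALUES (?, ?)",
    [(3, "AMIT"), (5, "HARSH"), (6, "JAYA"), (7, "KAVITA")],
)

# UPDATE with a WHERE clause changes only the matching rows.
conn.execute("UPDATE Student_Master SET Stud_Name = 'AAKASH' WHERE Stud_Id = 5")

# DELETE with a WHERE clause removes only the matching rows.
deleted = conn.execute("DELETE FROM Student_Master WHERE Stud_Id > 5").rowcount

remaining = conn.execute(
    "SELECT Stud_Id, Stud_Name FROM Student_Master ORDER BY Stud_Id"
).fetchall()
print(deleted, remaining)
```

Without a WHERE clause, both statements would apply to every row in the table.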
ALTER TABLE table_name
ADD
( column_name1 data-type(size),
column_name2 data-type(size),
…………………….
column_nameN data-type(size) );
Syntax:
ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE table_name MODIFY
( column_name1 New-data-type(New-size),
column_name2 New-data-type(New-size),
…………………….
column_nameN New-data-type(New-size) );
ALTER TABLE table_name
RENAME
COLUMN old_name TO new_name;
ALTER TABLE table_name
ADD PRIMARY KEY (Column_name);
SELECT [ DISTINCT ] column_list
FROM table_name
[ WHERE conditions ];
• Show all product names and prices.
• Change the price of product_id = 1 to 450.
• Display product names whose price is equal to 200.
• Remove the product whose quantity is less than 2.
• Change the price to 340 & Quantity to 10 for
product_name = Keyboard.
• Display product name & price where the price is greater than 100 and less
than 500.
• Add a new column Product_color with data type VARCHAR and size 40.
• Display all product names and their prices, sorted by price.
• Display all product names and prices where the price is greater than 500.
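A few of these tasks are sketched below in SQLite via Python, as one possible approach. The `Product` table layout (`product_id`, `product_name`, `price`, `quantity`) and every sample row are assumptions, since the table definition is not shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        product_id INTEGER PRIMARY KEY, product_name TEXT,
        price INTEGER, quantity INTEGER
    )
""")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?)",
    [(1, "Mouse", 400, 5), (2, "Keyboard", 300, 1), (3, "Monitor", 6000, 3), (4, "Cable", 200, 8)],
)

# Change the price of product_id = 1 to 450.
conn.execute("UPDATE Product SET price = 450 WHERE product_id = 1")
# Change the price to 340 and quantity to 10 for the Keyboard.
conn.execute("UPDATE Product SET price = 340, quantity = 10 WHERE product_name = 'Keyboard'")

# Name and price where price is between 100 and 500 (exclusive), sorted by price.
mid_range = conn.execute(
    "SELECT product_name, price FROM Product WHERE price > 100 AND price < 500 ORDER BY price"
).fetchall()
# Product names whose price is equal to 200.
equal_200 = conn.execute("SELECT product_name FROM Product WHERE price = 200").fetchall()
print(mid_range, equal_200)
```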
GROUP BY Clause
• Syntax
SELECT <columnName1>, <columnName2>, ……<columnNameN>,
AGGREGATE_FUNCTION (<Expression>)
FROM <TableName>
WHERE <Condition>
GROUP BY <columnName1>, <columnName2>, ……<columnNameN>;
• To display the city-wise total number of students
SELECT city, count(stud_id)
FROM Student_Master
GROUP BY City;
HAVING Clause
• It imposes a condition on the groups created by the GROUP BY clause,
filtering them further.
• It filters on fields that are used with an aggregate function.
• E.g. To display only those cities having a total number of students
more than two.
SELECT city, count(stud_id)
FROM Student_Master
GROUP BY City
HAVING count(stud_id) > 2;
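The difference between plain GROUP BY and GROUP BY with HAVING can be seen by running both queries on a small sample, as in this sketch using Python's sqlite3 module. The student rows and city values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student_Master (stud_id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO Student_Master VALUES (?, ?)",
    [(1, "Surat"), (2, "Surat"), (3, "Surat"), (4, "Anand"), (5, "Baroda"), (6, "Baroda")],
)

# One result row per city, with the number of students in it.
per_city = conn.execute(
    "SELECT city, COUNT(stud_id) FROM Student_Master GROUP BY city ORDER BY city"
).fetchall()

# HAVING keeps only the groups whose aggregate satisfies the condition.
busy_cities = conn.execute(
    "SELECT city, COUNT(stud_id) FROM Student_Master "
    "GROUP BY city HAVING COUNT(stud_id) > 2"
).fetchall()
print(per_city, busy_cities)
```

WHERE filters individual rows before grouping, while HAVING filters the groups after the aggregate has been computed.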
Constraints
Primary Key
uniquely identify each row in the table.
( Stud_Id NUMBER (2) PRIMARY KEY ,
Stud_Name VARCHAR(20),
Stud_Birth_Date DATE,
Stud_Mobile NUMBER(10) )
/
Constraints
• Foreign Key
• It defines the relationship between the tables.
• It is a column / columns whose values are derived from the primary
key or unique key of some other table.
• The table in which the foreign key is defined is called a Foreign Table
or Detail Table.
• The table that defines the primary key or unique key and is
referenced by the foreign key is called the Primary Table or Master Table.
• Features:
• The child table may contain duplicates and NULLs in the foreign key unless otherwise specified.
• A parent record can be deleted provided no child record exists.
• The master table's key cannot be updated if a child record exists.
• A record cannot be inserted in the detail table if the corresponding record does not
exist in the master table.
Constraints
• Foreign Key
FOREIGN KEY ( column-name1, column-name2, …….. column-nameN)
REFERENCES table-name (column-name1, column-name2, …….. column-nameN)
CREATE TABLE Client_M
(Client_No NUMBER(3) PRIMARY KEY,
Client_Name VARCHAR2(40))
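The foreign key rules can be observed with a master table and a detail table, sketched here in SQLite via Python. `Order_M` is a hypothetical detail table introduced for this example, and `PRAGMA foreign_keys = ON` is needed because SQLite does not enforce foreign keys by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.execute("CREATE TABLE Client_M (Client_No INTEGER PRIMARY KEY, Client_Name VARCHAR(40))")
conn.execute("""
    CREATE TABLE Order_M (                      -- hypothetical detail table
        Order_No  INTEGER PRIMARY KEY,
        Client_No INTEGER,
        FOREIGN KEY (Client_No) REFERENCES Client_M (Client_No)
    )
""")
conn.execute("INSERT INTO Client_M VALUES (1, 'HARSH')")
conn.execute("INSERT INTO Order_M VALUES (100, 1)")  # parent exists: accepted

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

orphan_insert = rejected("INSERT INTO Order_M VALUES (101, 99)")      # no client 99 in master
parent_delete = rejected("DELETE FROM Client_M WHERE Client_No = 1")  # a child row still exists
print(orphan_insert, parent_delete)
```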
– Object-Oriented Data Model
Hierarchical Data Models
• Hierarchy means a pyramid-like structure or the tree data structure in
computers
• Every node must have exactly one parent node except the root node.
– E.g. In an organization, every employee, except the topmost person, has exactly
one reporting authority
• Advantage:
– Simplicity
– Data Access : Very fast for data near the top of the hierarchy.
• Disadvantages:
– Searching for data at the bottom of the hierarchy is slow.
Network Data Models
• The network data model uses the concept of networks in
Mathematics or the graph data structure in computers.
• Allows more connections between nodes.
• A flexible way of representing entities and their relationships.
• Performance can be high because of use of pointers for traversal
• Advantage:
– Improved data access.
• Disadvantages:
– Much more complex than the hierarchical data model.
Relational Data Models
• Developed by Dr. E. F. Codd at IBM in 1970.
• It is based on the mathematical theory of relations.
• Uses a declarative style of programming.
• Does not use pointers. Performance is lower compared to the Hierarchical and
Network models.
• Relationships are expressed using data values in relations (tables)
themselves
• Performance is improved with Indexes & Optimization techniques
• It's an essential part of developing a normalized database structure
• Not as good for other kinds of data (e.g. multimedia)
Object-Oriented Data Models
• An object-oriented data model is closely aligned with an application program’s object
model.
• Data and relationships are contained in a single structure.
• OODBMS should be used when
– There is a business need,
– High performance is required and
– Complex data is being used.
• Inheritance – A class (subclass) inherits the characteristics of another
class (super class).
– E.g. Employee inherits attributes and methods from Person.
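The Employee/Person example can be written as two Python classes, as in this minimal sketch. The attribute names (`name`, `birth_date`, `dept`) and the `describe` method are illustrative choices, not taken from any particular OODBMS.

```python
class Person:
    """Super class: attributes and methods shared by every person."""

    def __init__(self, name, birth_date):
        self.name = name
        self.birth_date = birth_date

    def describe(self):
        return f"{self.name} (born {self.birth_date})"


class Employee(Person):
    """Subclass: inherits from Person and adds its own attribute."""

    def __init__(self, name, birth_date, dept):
        super().__init__(name, birth_date)  # reuse the Person initialiser
        self.dept = dept


emp = Employee("Harsh", "01-JAN-80", "Sales")
print(emp.describe())  # describe() is inherited from Person
```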
Difference Between DBMS & RDBMS
DBMS | RDBMS
DBMS stands for Database Management System. | RDBMS stands for Relational Database Management System.
DBMS deals with a small quantity of data. | RDBMS deals with a large quantity of data.
DBMS supports a single user at a time. | RDBMS supports multiple users at a time.
Relationships between two tables or files are maintained programmatically. | Relationships between two tables or files can be specified at the time of table creation.
•Changes in Logical schema may include
•Addition or Removal of attributes
•Addition or Removal of relationships
• Sharing data between different locations.
Sharing Data Between Functional Units
• Data sharing means that people in
different functional areas use a common pool
of data, each for their own applications.
• Without Data Sharing,
– Admission Section Department & Comp. Centre
may have their own data files.
– Each group benefits only from its own data.
• With Data Sharing
– Combined data are more valuable
– Access rights for data
Sharing data between different levels of users
• Three different levels of users :
– Operations
– Middle management and
– Executive
• Level of Users : Operations
– Its basic characteristics include:
• A focus on data, storage and flows at the operational level.
• Efficient transaction processing.
• Level of Users : Middle Management
• Inquiry and Report Generation
• Integration and Planning of functionality
• Level of Users : Executive
• Managers or Decision makers
• Strategic reports & Quick response
Sharing data between different locations
• A company with several locations has
important data distributed over a wide
geographical area.
• Two ways of data sharing
– A Centralized Database
– A Distributed Database
• A centralized database
– It is physically confined to a single location,
controlled by a single computer.
A Centralized Database
Distributed Database
A set of databases in a distributed system that can appear to applications as a single data
source.
Difference between Centralised and
Decentralized Database
Data Integrity
– It is the assurance that data will always be correct, consistent
and accessible during storage, transfer and retrieval.
– Ways to ensure data integrity
• Domain Integrity
• Entity Integrity Constraint
• Referential Integrity
• Database consistency
– Domain Integrity
• It means choosing the correct data type and length for a column.
• It ensures that all the data items in a column fall within a defined set
of valid values.
• Each column in a table has a defined set of values, such as
– For Mobile Number , Set of all numbers and length is 10 digit
– For Person Name , Set of all characters and length is 50 char. and
– For Birth Date , Set of all dates and length is 8
Data Integrity
– Entity Integrity Constraints
• Each row in a relation must be uniquely identifiable.
• The primary key guarantees the uniqueness of each row; it cannot be NULL.
– Referential integrity
• If a table has a foreign key, then every value of the foreign key
must exist in the referenced table.
• E.g. Each product mentioned in a bill must exist in the product
master table.
– Database consistency
• The database must be consistent before and after a transaction
• All database integrity constraints must be satisfied
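A sketch of how a transaction keeps the database consistent: if any step breaks an integrity constraint, the whole transaction is rolled back to the last committed state. The `account` table, its CHECK rule and the amounts are assumptions made for this example, run in SQLite via Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()  # consistent state before the transaction

try:
    # Transfer 150 from account 1 to account 2 as one transaction.
    conn.execute("UPDATE account SET balance = balance + 150 WHERE acc_no = 2")
    conn.execute("UPDATE account SET balance = balance - 150 WHERE acc_no = 1")  # violates CHECK
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()  # undo the half-finished transfer

balances = conn.execute("SELECT acc_no, balance FROM account ORDER BY acc_no").fetchall()
print(balances)
```

The first UPDATE succeeds on its own, but because the second one fails, the rollback discards both and the balances are unchanged.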
Data Redundancy Control
– Redundancy in DBMS is having several copies of the
same data in the database.
problems.
1001 Jaya A-3 Diamond Chok Surat Gujarat S1 Soap 20
applied to the current database design. If the design passes the test,
it is said to conform to the normal form.
SALES
Customer_ID | Customer_Name | Cust_Addr1 | Cust_Addr2 | Cust_Addr3 | Store_Id | Product | Price
1001 | Jaya | A-3 Diamond Chok | Surat | Gujarat | S1 | Soap | 20
1007 | Kavita | E-4 Shiv Soc. | Anand | Gujarat | S2 | Comb | 50
1010 | Shreya | G-2 Apex Flat | Baroda | Gujarat | S3 | Shampoo | 80
1008 | Jaya | 5, Shrey Apt. | Mumbai | Maharashtra | S4 | Hair Oil | 60
Normalization : 2NF (Second Normal Form)
– A relation R is in 2NF if
• R is in 1NF (all attribute values are atomic)
• All non-key attributes are fully functionally
dependent on the primary key.
Normalization – 2NF
PRODUCT
Product_id | Product_Name | Price
P1 | Soap | 20
P2 | Toothpaste | 50
P3 | Shampoo | 80
P5 | Conditioner | 100

CUST_SALES
Customer_ID | Product_id | Store_id
1001 | P1 | S1
1007 | P2 | S2
1010 | P3 | S3

CUSTOMER
Customer_ID | Customer_Name | Cust_Addr1 | Cust_Addr2 | Cust_Addr3
1001 | Jaya | A-3 Diamond Chok | Surat | Gujarat
1007 | Kavita | E-4 Shiv Soc. | Anand | Gujarat
1010 | Shreya | G-2 Apex Flat | Baroda | Gujarat
1008 | Jaya | 5, Shrey Apt. | Mumbai | Maharashtra
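After splitting the data into PRODUCT, CUST_SALES and CUSTOMER, the combined sales view can be rebuilt with joins while each product's name and price is stored only once. The sketch below loads the rows shown above into SQLite via Python; as a simplification it keeps only the `Cust_Addr2` address column, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT  (Product_id TEXT PRIMARY KEY, Product_Name TEXT, Price INTEGER);
    CREATE TABLE CUSTOMER (Customer_ID INTEGER PRIMARY KEY, Customer_Name TEXT, Cust_Addr2 TEXT);
    CREATE TABLE CUST_SALES (
        Customer_ID INTEGER REFERENCES CUSTOMER (Customer_ID),
        Product_id  TEXT    REFERENCES PRODUCT (Product_id),
        Store_id    TEXT
    );
    INSERT INTO PRODUCT VALUES
        ('P1', 'Soap', 20), ('P2', 'Toothpaste', 50), ('P3', 'Shampoo', 80), ('P5', 'Conditioner', 100);
    INSERT INTO CUSTOMER VALUES
        (1001, 'Jaya', 'Surat'), (1007, 'Kavita', 'Anand'), (1010, 'Shreya', 'Baroda'), (1008, 'Jaya', 'Mumbai');
    INSERT INTO CUST_SALES VALUES (1001, 'P1', 'S1'), (1007, 'P2', 'S2'), (1010, 'P3', 'S3');
""")

# Join the three tables back together to get one row per sale.
sales = conn.execute("""
    SELECT c.Customer_ID, c.Customer_Name, s.Store_id, p.Product_Name, p.Price
    FROM CUST_SALES s
    JOIN CUSTOMER c ON c.Customer_ID = s.Customer_ID
    JOIN PRODUCT  p ON p.Product_id  = s.Product_id
    ORDER BY c.Customer_ID
""").fetchall()
print(sales)
```

A price change now needs a single UPDATE on PRODUCT instead of one per sales row.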
Data Model
• “It defines how data items are connected to each
other and how they are processed and stored
inside the system.”
• “Describe the relationships between different
parts of the data.”
Conceptual Data Model
• A conceptual data model identifies the
highest-level relationships between the
different entities.
• Features of conceptual data model include:
– Includes the important entities and the
relationships among them.
– No attribute is specified.
– No primary key is specified.
Logical Data Model
• A Logical data model describes the data in as
much detail as possible, without regard to
how they will be physically implemented in the
database.
• Features of a logical data model include:
– Includes all entities and relationships.
– All attributes for each entity are specified.
– The primary key for each entity is specified.
– Foreign keys are specified.
Logical Data Model
Time
Date
Day
Month
Year
Month_Name
Day_Name
Physical Data Model
• A physical database model shows all table
structures, including column name, column data
type, column constraints, primary key, foreign
key, and relationships between tables.
• Features of a physical data model include:
– Specification of all tables and columns.
– Foreign keys to identify relationships between tables.
– Physical data model will be different for different
RDBMS.
• For example, data type for a column may be different
between MySQL, Oracle and SQL Server.
Physical Data Model
Time
Date : Integer
Day : Integer
Month : Integer
Year : Integer
Month_Name : VARCHAR(30)
Day_Name : VARCHAR(30)
External Data Model
• It gives the end user's point of view of the database.
• There may be constraints within a database on the
information presented to a given user.
• For example, one user is not interested in
another user's data.
• The HR department is not interested in seeing
sales data.
• The external model subdivides the requirements
and constraints so that each can be examined within its
own external model framework.
• This means the model is divided between, for example, the HR department and Sales.
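In a relational database, external models are commonly realised as views. The sketch below, in SQLite via Python, defines one view for HR and one for Sales over the same table; the `employee` table, its columns and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER, sales_total INTEGER
    );
    INSERT INTO employee VALUES (1, 'Harsh', 30000, 120), (2, 'Amit', 28000, 95);

    -- One view per group of users: each exposes only the columns that group needs.
    CREATE VIEW hr_view    AS SELECT emp_id, emp_name, salary      FROM employee;
    CREATE VIEW sales_view AS SELECT emp_id, emp_name, sales_total FROM employee;
""")

# cursor.description lists the columns visible through the view.
hr_columns = [d[0] for d in conn.execute("SELECT * FROM hr_view").description]
sales_rows = conn.execute("SELECT * FROM sales_view ORDER BY emp_id").fetchall()
print(hr_columns, sales_rows)
```

Both views read the same stored rows, so the data is not duplicated; only what each user group sees is different.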