Information
Data are raw facts given to the computer to be processed to get a relevant result. Processed results are
called information. Information can be defined as an organized and validated collection of data.
Thus, the term data processing means the process of collecting all items of data together to produce
meaningful information. It can be done either manually or by the use of computers. If data processing is done
with the help of computers, it is known as EDP (Electronic Data Processing).
Hence, the information that we obtain after processing the data must possess the following
characteristics.
It must be accurate.
It must be available in time as required.
It must be complete so that more inference (conclusion) can be drawn.
It should be precise in meaning.
It should be relevant to the context.
Database
A database is a collection of logically related data files organized to facilitate access by one or more
application programs and to minimize data redundancy. A database refers to a collection of records or files that
are stored in a logically related format, making them easy to associate and retrieve. Therefore, a database is a
collection of information that is organized so that it can easily be accessed, managed and updated. Examples
of databases are a telephone directory, a library’s card catalog, etc. A database consists of four major elements:
data elements or data items, relationships, constraints (rules) and schema (description of data in terms
of a data model).
Functions of Database
In a general file processing system, records are stored permanently in various files. There are numerous
application programs which can extract records and add records to the appropriate files. This approach
has advantages and disadvantages. In particular, it cannot control data redundancy (duplication of data) or
provide the other facilities that a database offers.
Table: A table is a collection of related data held in a table format within a database. It consists of columns and
rows. In a Relational Database Management System (RDBMS) like MS-Access, the data are kept in table form. You
can create many tables in a database. Each table should have a primary key (unique identification
of records).
Field: A field is a piece of information about an element. A field is represented by a column. Every field has
a title called the field title.
Record: A record is the information about one element such as a person, a student, an employee, a client, etc. A
record can hold many pieces of information under different headings or titles. Rows are also called records or
tuples, and they correspond to entities.
Tuple: A tuple is the collection of values of the attributes of a table for a single instance. It can also be
called a 'row' of a table. In simple terms, a row-wise collection of data is known as a tuple.
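The terms table, field and record can be seen together in a small sketch. It uses Python's built-in sqlite3 module rather than MS-Access, and the table name student and its columns are assumptions chosen only for illustration.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Table "student" (hypothetical) with three fields (columns).
conn.execute(
    "CREATE TABLE student ("
    "roll INTEGER PRIMARY KEY, "  # primary key: unique identification of records
    "name TEXT, "
    "class_no INTEGER)"
)

# Each inserted row is one record (tuple).
conn.executemany(
    "INSERT INTO student (roll, name, class_no) VALUES (?, ?, ?)",
    [(1, "Ram", 12), (2, "Shyam", 11)],
)

cursor = conn.execute("SELECT roll, name, class_no FROM student ORDER BY roll")
field_titles = [column[0] for column in cursor.description]  # the field titles
records = cursor.fetchall()  # a list of tuples, one per record

print(field_titles)
print(records)
```

Each tuple in records is one row of the table, and each position in the tuple corresponds to one field title.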
Features of RDBMS
All data in an RDBMS are stored in tables.
Ensures that all data stored are in the form of rows and columns.
Facilitates a primary key, which helps in unique identification of the rows.
Allows index creation for retrieving data at a higher speed.
Facilitates a common column to be shared between two or more tables.
Facilitates multi-user access, which can be controlled for individual users.
Enables creation of a virtual table (view) to hide sensitive data and simplify queries.
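Two of the features listed above, the index and the virtual table (view), can be sketched with Python's built-in sqlite3 module. The employee table, its columns and the sample rows are hypothetical; the view is used here to keep the salary column out of sight.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ram", "Sales", 30000.0), (2, "Sita", "Accounts", 35000.0)],
)

# Index: lets the engine look up rows by department without scanning the table.
conn.execute("CREATE INDEX idx_employee_department ON employee (department)")

# View (virtual table): exposes every column except the sensitive salary.
conn.execute(
    "CREATE VIEW employee_public AS "
    "SELECT emp_id, name, department FROM employee"
)

public_rows = conn.execute(
    "SELECT * FROM employee_public ORDER BY emp_id"
).fetchall()
print(public_rows)
```

A user who is only given the view can query it like a table but never sees the salary values.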
Advantages / Functions of RDBMS
Data security: data is immune to program crashes.
Concurrent access: atomic updates via transactions.
Fault tolerance: replicated databases for instant failover on machine/disk crashes.
Data integrity: keeps data meaningful (correctness and consistency).
Scalability: can handle small/large quantities of data in a uniform manner.
Reporting: it is easy to write SQL programs to generate arbitrary (any desired) reports.
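The idea of an atomic update via a transaction can be sketched as follows. This is a minimal example using Python's built-in sqlite3 module; the account table, the transfer function and the balances are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)
failed = transfer(conn, "A", "B", 500)  # would make A negative, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the second call the credit to B is executed first, but because the debit from A violates the CHECK rule, the whole transaction is undone and B keeps its earlier balance.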
Disadvantages of RDBMS
Cost: The expense of setting up and maintaining a database system is relatively high and is one of the
drawbacks of relational databases. Special software is required for setting up a relational database, and this
can be expensive.
Managing Huge Volumes of Data: The complexity of information is another shortcoming of relational
databases. The data arranged within relational databases are based on common characteristics. Data such
as multimedia products, complicated images, numbers and designs do not fit this structure easily and need
systems that can accommodate complex applications and are easily scalable.
Structured Limits: Relational databases impose limits on field lengths. While designing the database, it is
necessary to specify the amount of data you intend to store in each field. Since some values
may be longer than the specified length, this might lead to loss of data.
Abundance of Information: Advances in the complexity of information cause another drawback to relational
databases. Relational databases are made for organizing data by common characteristics. Complex images,
numbers, designs and multimedia products defy (refuse to obey) easy categorization leading the way for a new
type of database called object-relational database management systems. These systems are designed to handle
the more complex applications and have the ability to be scalable.
DBMS | RDBMS
DBMS does not support distributed databases. | Most RDBMSs support distributed databases.
DBMS does not follow normalization; only a single user can access the data. | RDBMS follows normalization; multiple users can access the data simultaneously.
In DBMS there is no security of data. | In RDBMS there are multiple levels of security.
Each table is given an extension in DBMS. | Many tables are grouped in one database in RDBMS.
Examples of DBMS are file systems, XML, etc. | Examples of RDBMS are MySQL, SQL Server, Oracle, MS-Access, etc.
Advantages:
Searching is fast and easy if the parent is known.
It is an easier model than other database models.
The relationship between various layers is logically simple.
This system provides strong database security.
A hierarchical database system maintains data independence, i.e. if data is altered in one location, it does
not affect the other locations.
There is always a parent-child relationship, and data integrity is maintained.
Very efficient in handling ‘one-to-many’ relationships.
Disadvantages:
It is an old model of database.
Modification and addition of a child are difficult.
It increases redundancy.
The physical implementation of the database is complicated.
It cannot handle ‘many-to-many’ relationships.
Fig: Relational Database Model (database College; one table with fields Std_id, name, class, Section and a
second table with fields Sub_id, Sub1, Sub2, Std_id)
Advantages:
Rapid database processing.
Easy searching of data.
Referential integrity can be applied.
It provides excellent data security.
This model is conceptually simpler than the other models.
It provides easier database design, implementation, management and use.
It possesses a powerful database management system.
Disadvantages:
It is more complex to maintain than other database models.
Many rules have to be applied.
It is not user friendly.
Components of OODM:
Object: An object is the abstraction of a real-world entity; an object represents only one occurrence of an
entity.
Attributes: Attributes describe the properties of an object. For example, a person is an object, whereas name,
age and DOB are its properties or attributes.
Class: A collection of similar objects with shared (common) attributes
and behaviour (methods) is called a class.
Method: A method represents a real-world action, such as finding a selected person’s name, changing a person’s
name or printing a person’s address.
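The four components can be illustrated with an ordinary Python class. This is only a sketch of the concepts, not an object-oriented database; the class Person, its attributes and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Class: a template shared by all similar objects."""

    # Attributes: properties that describe each object.
    name: str
    age: int
    address: str

    # Methods: behaviour that models real-world actions.
    def change_name(self, new_name: str) -> None:
        self.name = new_name

    def address_label(self) -> str:
        return f"{self.name}, {self.address}"


# Object: one occurrence of the entity.
person = Person(name="Ram", age=17, address="Kathmandu")
person.change_name("Ram Sharma")
label = person.address_label()
print(label)
```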
Students appear in an examination. The registration number, name, subjects, etc. are attributes of
student, and subjects, schedules, etc. are attributes of examination. The registration number is used as the key
attribute.
Concept of Normalization
Normalization is the process of organizing data in a database to reduce redundancy. It also includes
creating tables and establishing relationships between those tables using rules designed to protect the
data and to make the database flexible.
The essence of normalization is to split your data into several tables that will be connected to each other
based on the data within them.
Unnormalized table
Student info: Studentid, Firstname, Lastname, Class, Subject, Marks, Roll
Normalized tables
Personal info: Personalid, Firstname, Lastname, Class
Subject table: Studentid, Subject, Roll
Marks table: Subject, Marks
Fig: Normalization of data
In the above figure, the first table is not in normalized form. Even though we only want to enter the marks, we
have to enter all the information, so there is a chance of redundancy of data and the table is ineffective. The
table is therefore split into three other tables to make the data independent, and the tables depend on each
other only through keys. The above normalization helps us to make sure that:
Dependence between the data is identified.
Redundancy in the database is minimized.
The data model is made more flexible and easier to maintain.
Types of Normalization
1. First normal form (1NF)
When the table has no repeating group of data then it is said to be in first normal form. That means for each
cell in a table, there can be only one value. This value should be atomic in the sense that it can’t be decomposed
into smaller pieces.
Name Roll Class Sec Sub1 Marks1 Sub2 Marks2 Sub3 Marks3
Ram 2 12 A English 78 Comp 90 Maths 78
Shyam 1 11 B English 90 Comp 89 Maths 67
Hari 1 12 A English 67 Comp 98 Maths 90
The above table is not in first normal form because the subject and marks attributes are repeated. To bring it
into first normal form, we break the table up in the following way.
Name Roll Class Sec Subjects Marks
Ram 2 12 A English 78
Ram 2 12 A Comp 90
Ram 2 12 A Maths 78
Shyam 1 11 B English 90
Shyam 1 11 B Comp 89
Shyam 1 11 B Maths 67
Hari 1 12 A English 67
Hari 1 12 A Comp 98
Hari 1 12 A Maths 90
In the above, the whole table is split into three tables: marks, subject and student. Interrelated data are
placed together in one table. Name depends on roll + class + sec; subject name depends on class, not on roll.
Name, subject and marks are interrelated.
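The conversion to first normal form described earlier in this section can be sketched in Python. The input rows are taken from the table with the repeating Sub/Marks columns; the function name to_first_normal_form is an assumption for illustration.

```python
# Rows with repeating groups: Name, Roll, Class, Sec, then (subject, marks) pairs.
wide_rows = [
    ("Ram", 2, 12, "A", "English", 78, "Comp", 90, "Maths", 78),
    ("Shyam", 1, 11, "B", "English", 90, "Comp", 89, "Maths", 67),
    ("Hari", 1, 12, "A", "English", 67, "Comp", 98, "Maths", 90),
]


def to_first_normal_form(rows):
    """Return one row per (student, subject) so every cell holds a single value."""
    result = []
    for name, roll, class_no, sec, *pairs in rows:
        # pairs is [subject1, marks1, subject2, marks2, ...]
        for i in range(0, len(pairs), 2):
            subject, marks = pairs[i], pairs[i + 1]
            result.append((name, roll, class_no, sec, subject, marks))
    return result


long_rows = to_first_normal_form(wide_rows)
for row in long_rows:
    print(row)
```

Three wide rows with three subject groups each become nine rows with a single Subjects column and a single Marks column.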
Database Language
1. Data Manipulation Language (DML)
A data manipulation language (DML) is a family of computer languages including commands permitting
users to manipulate data in a database. This manipulation involves inserting data into database tables,
retrieving existing data, deleting data from existing tables and modifying existing data. DML is mostly
incorporated in SQL databases.
DML resembles simple English and enables efficient user interaction with the system. The
functional capability of DML is organized in manipulation commands such as SELECT, UPDATE, INSERT INTO and
DELETE FROM.
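The four DML commands can be tried out with Python's built-in sqlite3 module. This is a minimal sketch; the student table and its rows are hypothetical, and CREATE TABLE is used only to set up something to work on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Not DML: this statement only defines the table used below.
conn.execute(
    "CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# INSERT INTO: add new records.
conn.execute("INSERT INTO student (roll, name, marks) VALUES (1, 'Ram', 78)")
conn.execute("INSERT INTO student (roll, name, marks) VALUES (2, 'Shyam', 90)")

# UPDATE: modify existing data.
conn.execute("UPDATE student SET marks = 80 WHERE roll = 1")

# DELETE FROM: remove existing records.
conn.execute("DELETE FROM student WHERE roll = 2")

# SELECT: retrieve existing data.
remaining = conn.execute("SELECT roll, name, marks FROM student").fetchall()
print(remaining)
```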
Fig: Clients connected through the Internet to a server that hosts the database
Slower response for certain queries. | Faster response for certain queries.
Lower hardware and software cost, and less complex. | Higher hardware and software cost, and more complex.
Poor data and failure recovery. | Better data and failure recovery.
Data Integrity
Database integrity means the correctness and consistency of data.
Requirements of data Integrity (Integrity constraints)
Integrity is another form of database protection.
Security means that the data must be protected from unauthorized operations.
Integrity is related to the quality of data. Maintaining integrity gives better quality of data.
Integrity is maintained with the help of integrity constraints:
i) Domain integrity
ii) Entity integrity
iii) Referential integrity
These constraints are the rules that are designed to keep data consistent and correct. They act like a check
on the incoming data.
It is very important that a database maintains the quality of the data stored in it. DBMS provides several
mechanisms to enforce integrity of the data.
It keeps the database free from rule violations.
It provides validity of data.
It provides accuracy and consistency of data in the database.
It provides a logical method to design the database.
It is a basic element of a database for giving the right information.
Types of Integrity:
There are mainly three types of data integrity: entity integrity, referential integrity and domain
integrity.
1. Entity Integrity
Entity integrity is a constraint on the primary key value. It states that no attribute of a primary key
can contain a null value. If a primary key contains a null value, it is not possible to uniquely identify a record
in a relation. Entity integrity ensures that it is easy to identify each entity in the database. This integrity
rule can be applied to table columns to enforce different types of data integrity. Entity integrity includes the
following rules:
Null Rule
Unique Column Values
Primary Key Values
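The null rule and the unique-value rule on a primary key can be observed in a small sketch with Python's built-in sqlite3 module. The student table, the reg_no column and the helper try_insert are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is written out because SQLite, for historical reasons, otherwise
# accepts NULL in a primary key column that is not INTEGER PRIMARY KEY.
conn.execute(
    "CREATE TABLE student (reg_no TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('R-001', 'Ram')")


def try_insert(reg_no, name):
    """Return 'accepted', or 'rejected: ...' if the key rules are violated."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (reg_no, name))
        return "accepted"
    except sqlite3.IntegrityError as error:
        return f"rejected: {error}"


null_key = try_insert(None, "Shyam")        # breaks the null rule
duplicate_key = try_insert("R-001", "Hari")  # breaks the unique-value rule
new_key = try_insert("R-002", "Sita")        # satisfies both rules

print(null_key)
print(duplicate_key)
print(new_key)
```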
2. Referential Integrity
Referential integrity is a constraint on the foreign key value. It states that if a foreign key exists in a
relation, the foreign key value must match the primary key value of some tuple in its parent relation. Otherwise
the foreign key value must be completely null.
A referential integrity rule is a rule defined on a key (a column or set of columns) in one table that
guarantees that the values in the key match the values of a key in a related table (the referenced value).
Referential integrity is a database concept that ensures that relationships between tables remain
consistent.
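A foreign key that must either match a parent row or be null can be sketched with Python's built-in sqlite3 module. The department and employee tables and the helper try_insert_employee are hypothetical. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign key checks are off by default
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER REFERENCES department (dept_id))"  # foreign key
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")


def try_insert_employee(emp_id, name, dept_id):
    """Return True if the row is accepted, False if it breaks referential integrity."""
    try:
        conn.execute(
            "INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, dept_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


matching_parent = try_insert_employee(1, "Ram", 10)     # department 10 exists
null_reference = try_insert_employee(2, "Shyam", None)  # a null foreign key is allowed
missing_parent = try_insert_employee(3, "Hari", 99)     # there is no department 99

print(matching_parent, null_reference, missing_parent)
```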
3. Domain Integrity
Domain integrity refers to the validity of data. Validity of data means the type of data, range of data
and format of data, etc. allowed in a column.
Data integrity can be compromised in a number of ways:
Human errors when data is entered.
Errors that occur when data is transmitted from one computer to another.
Software bugs or viruses.
There are many ways to minimize these threats to data integrity. These include:
Backing up data regularly.
Controlling access to data via security mechanisms.
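Domain integrity, described at the start of this subsection, can be sketched with CHECK rules on the type, range and format of column values. The example uses Python's built-in sqlite3 module; the result table, its columns, the permitted values and the helper accepts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result ("
    "roll INTEGER NOT NULL CHECK (typeof(roll) = 'integer'), "  # type of data
    "marks INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100), "  # range of data
    "sec TEXT NOT NULL CHECK (sec IN ('A', 'B')))"              # format of data
)


def accepts(roll, marks, sec):
    """Return True if the row satisfies every domain rule of the table."""
    try:
        conn.execute("INSERT INTO result VALUES (?, ?, ?)", (roll, marks, sec))
        return True
    except sqlite3.IntegrityError:
        return False


checks = {
    "valid row": accepts(1, 78, "A"),
    "wrong type": accepts("abc", 78, "A"),
    "out of range": accepts(2, 150, "A"),
    "bad format": accepts(3, 78, "Z"),
}
print(checks)
```

Only the first row is stored; the other three are rejected before they can enter the table.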
Data security
Data security is one of the challenging jobs of Database Administrators (DBA). Secured data can be
transferred from one server to another server at great distances. For the prevention of data piracy and
unauthorized data mining, proper security measures must be implemented in the system. A common method of data
security is the use of a username and password: username authentication and password verification
control access to the data. So, data security is a preventive measure that a Database Administrator (DBA) must
take for the protection of data from unauthorized access, theft, corruption, etc.
To protect the database, we must take security measures at several levels:
Physical: The site or sites containing the computer systems must be physically secured against damage
from natural disasters and intruders. Physical security also includes regular maintenance,
insurance, protection from theft, etc.
Human: Database users must be authorized carefully. Data piracy by users is also a factor that may
damage the database.
Operating system: We can also make the database secure through the security policy of the operating system.
Today there are many security options provided by the O.S.
Network: Security can also be applied in the network, including its physical layer, since all databases are
connected through the network.
System Security
System security is the process by which the various resources and information of a system are protected against
destruction and unauthorized access.
Note:
Primary Key: A key that can be used to uniquely identify a row in a table is called a primary key. Any column
whose values are unique can act as a key. Therefore, a primary key is an attribute or combination of attributes
that uniquely identifies each instance of an entity. A primary key’s main features are:
It must contain a unique value for each row of data.
It cannot contain null values.
Foreign Key: When the primary key of a parent entity exists in a child entity, the key is called a foreign key.
A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table
or the same table. In simpler words, the foreign key is defined in a second table, but it refers to the primary
key or a unique key in the first table. For example, a table called Employees has a primary key called
employee_id. Another table called Employee details has a foreign key which references employee_id in order
to uniquely identify the relationship between the two tables.
Candidate Key: When a relation has more than one attribute or attribute combination possessing the unique
identification property, each of them is called a candidate key.
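The idea of a candidate key can be sketched as a search for the smallest column combinations whose values are unique. The functions is_unique and candidate_keys and the sample employee rows are hypothetical. Uniqueness in a handful of sample rows only suggests a key; in a real design a candidate key is a rule that must hold for all possible data.

```python
from itertools import combinations


def is_unique(rows, columns):
    """True if no two rows share the same values in the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, all_columns):
    """Minimal column combinations whose values are unique in the given rows."""
    found = []
    for size in range(1, len(all_columns) + 1):
        for combo in combinations(all_columns, size):
            # A superset of a key already found is not minimal, so skip it.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


employees = [
    {"employee_id": 1, "email": "ram@example.com", "department": "Sales"},
    {"employee_id": 2, "email": "sita@example.com", "department": "Sales"},
    {"employee_id": 3, "email": "hari@example.com", "department": "Accounts"},
]
keys = candidate_keys(employees, ["employee_id", "email", "department"])
print(keys)
```

In this sample both employee_id and email identify a row on their own, so both are candidate keys; one of them would be chosen as the primary key.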