
What is Data?

Data is a collection of raw, unorganized facts that must be processed to
become meaningful. On its own, data may be simple, but it stays hard to
interpret until it is organized. Generally, data comprises facts,
observations, perceptions, numbers, characters, symbols, images, etc.
What is Information?
Information is a set of data that has been processed in a meaningful way
according to a given requirement. It is processed, structured, or
presented in a given context to make it meaningful and useful.
Information assigns meaning to data and improves its reliability. When
data is transformed into information, details that are not useful are
left out.
Database
A database is a collection of interrelated data that is organized
so that the data can be retrieved, inserted, and deleted
efficiently. The data is organized in the form of tables,
schemas, views, reports, etc.

For example: a college database organizes the data about
the admin, staff, students, faculty, etc.

Using the database, you can easily retrieve, insert, and
delete the information.
Database Management System
A database management system (DBMS) is software used to
manage a database. For example, MySQL and Oracle are
popular database management systems used in many
different applications.
A DBMS provides an interface to perform various operations
such as creating a database, storing data in it, updating data,
creating tables in the database, and a lot more.
It provides protection and security to the database. In the
case of multiple users, it also maintains data consistency.
A DBMS allows users to perform the following tasks:

Data Definition: creating, modifying, and removing the definitions
that describe the organization of data in the database.
Data Updation: inserting, modifying, and deleting the actual data in
the database.
Data Retrieval: retrieving data from the database so that
applications can use it for various purposes.
User Administration: registering and monitoring users, maintaining
data integrity, enforcing data security, dealing with concurrency
control, monitoring performance, and recovering information
corrupted by unexpected failure.
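The first three tasks above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, and the city "Noida" are assumptions made for the example, not part of the notes.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that organizes the data.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Data updation: insert, modify, and delete the actual data.
conn.executemany(
    "INSERT INTO student (student_id, name, city) VALUES (?, ?, ?)",
    [(101, "Chaitanya", "Agra"), (102, "Ajeet", "Delhi"), (103, "Rahul", "Gurgaon")],
)
conn.execute("UPDATE student SET city = ? WHERE student_id = ?", ("Noida", 102))
conn.execute("DELETE FROM student WHERE student_id = ?", (103,))
conn.commit()

# Data retrieval: read the data back for use by an application.
rows = conn.execute(
    "SELECT student_id, name, city FROM student ORDER BY student_id"
).fetchall()
print(rows)
```

User administration is not shown because SQLite has no user accounts; server-based systems handle that task with their own tools.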
Characteristics of DBMS
It uses a digital repository established on a server to store and
manage the information.
It can provide a clear and logical view of the process that
manipulates data.
A DBMS provides automatic backup and recovery procedures.
It supports the ACID properties (atomicity, consistency, isolation,
durability), which keep data in a healthy state in case of failure.
It can simplify complex relationships between data.
It is used to support manipulation and processing of data.
It is used to provide security of data.
It lets users view the database from different viewpoints according to
their requirements.
Advantages of DBMS
Controls database redundancy (repetition): It can control data
redundancy because all the data is stored in one single database file.
Data sharing: In a DBMS, the authorized users of an organization can share
the data among themselves.
Easy maintenance: It is easy to maintain because of the centralized
nature of the database system.
Reduced time: It reduces development time and maintenance needs.
Backup: It provides backup and recovery subsystems that automatically
back up data to protect against hardware and software failures and
restore the data if required.
Multiple user interfaces: It provides different types of user interfaces,
such as graphical user interfaces and application program interfaces.
Disadvantages of DBMS
Cost of hardware and software: Running DBMS software requires a
high-speed processor and a large amount of memory.
Size: It occupies a large amount of disk space and needs a large
amount of memory to run efficiently.
Complexity: A database system creates additional complexity
and requirements.
Higher impact of failure: A failure has a large impact, because
in most organizations all the data is stored in a single
database. If that database is damaged by a power failure or
database corruption, the data may be lost forever.
Advantages of DBMS over file system
Drawbacks of File system
Data redundancy: Data redundancy refers to the duplication of data.
Suppose we are managing the data of a college where a student is
enrolled in two courses; the same student details will then be stored
twice, which takes more storage than needed. Data redundancy often
leads to higher storage costs and poor access time.
Data inconsistency: Data redundancy leads to data inconsistency. Take
the same example: a student is enrolled in two courses, and the
student’s address is stored twice. If the student asks to change the
address and it is changed in one place but not in all the records, the
data becomes inconsistent.
Data isolation: Because data is scattered across various files, and the
files may be in different formats, writing new application programs to
retrieve the appropriate data is difficult.
Dependency on application programs: Changing the files leads to
changes in the application programs.
Atomicity issues: Atomicity of a transaction means “all or nothing”:
either all the operations in a transaction execute or none of them do.
This is difficult to guarantee in a file processing system.
Data security: Data should be secured from unauthorised access. For
example, a student in a college should not be able to see the payroll
details of the teachers. Such security constraints are difficult to
apply in file processing systems.
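The “all or nothing” idea behind atomicity can be sketched with Python's built-in sqlite3 module. The account table, the names A and B, and the amounts are hypothetical and chosen only for illustration; the point is that a failed second step also undoes the first step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 30)
# This transfer would make A's balance negative, so the credit to B is undone too.
failed = transfer(conn, "A", "B", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

With plain files, the program itself would have to detect the half-finished transfer and repair it.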
Advantage of DBMS over file system
There are several advantages of a database management system
over a file system. A few of them are as follows:
No redundant data: Redundancy is removed by data
normalization. Avoiding data duplication saves storage and
improves access time.
Data consistency and integrity: As discussed earlier, the root
cause of data inconsistency is data redundancy. Since data
normalization takes care of data redundancy, data
inconsistency is taken care of as well.
Data security: It is easier to apply access constraints in
database systems so that only authorized users are able to
access the data. Each user has a different set of access rights,
so data is protected from issues such as identity theft, data
leaks, and misuse of data.
Privacy: Limited access means privacy of data.
Easy access to data: Database systems manage data in such a
way that it is easily accessible with fast response times.
Easy recovery: Since database systems keep backups of
data, it is easier to do a full recovery of data in case of a failure.
Flexible: Database systems are more flexible than file
processing systems.
There are four common types of database model that are useful for
different types of data or information. Depending upon your
specific needs, one of these models can be used.

1. Hierarchical databases.
2. Network databases.
3. Relational databases.
4. Object-oriented databases.
1. Hierarchical databases
It is one of the oldest database models, developed by IBM for its
Information Management System (IMS). In a hierarchical database model,
the data is organized into a tree-like structure.
This type of database model is rarely used nowadays. Its structure is like
a tree, with nodes representing records and branches representing
fields. The Windows Registry used in Windows XP is an example of a
hierarchical database: configuration settings are stored as tree
structures with nodes.
The following figure shows the generalized structure of the hierarchical
database model, in which data is stored in a tree-like structure
(data is represented or stored in root, parent, and child nodes).
Advantages

• The model allows easy addition and deletion of information.
• Data at the top of the hierarchy is very fast to access.
• It worked well with linear data storage media such as
tapes.
• It suits anything that works through one-to-many
relationships. For example, there is a president with many
managers below them, and those managers have many
employees below them, but each employee has only one
manager.
Disadvantages

• It requires data to be stored repeatedly in many different
entities.
• Linear data storage media such as tapes are no longer in
common use.
• Searching for data requires the DBMS to run through the
entire model from top to bottom until the required
information is found, making queries very slow.
• This model supports only one-to-many relationships;
many-to-many relationships are not supported.
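The president, manager, and employee example and the top-to-bottom search described above can be sketched as a small tree in Python. The node names are hypothetical, and a real hierarchical DBMS stores its records very differently; this only illustrates the shape of the model.

```python
# Each node has exactly one parent; its children hang below it (one-to-many).
tree = {
    "name": "President",
    "children": [
        {"name": "Manager A", "children": [
            {"name": "Employee 1", "children": []},
            {"name": "Employee 2", "children": []},
        ]},
        {"name": "Manager B", "children": [
            {"name": "Employee 3", "children": []},
        ]},
    ],
}

def find_path(node, target, path=()):
    """Depth-first search from the root; returns the names on the way to target, or None."""
    path = path + (node["name"],)
    if node["name"] == target:
        return list(path)
    for child in node["children"]:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None

path = find_path(tree, "Employee 3")
print(path)
```

Every lookup starts at the root, which is why data near the top is quick to reach while records deep in the tree take longer to find.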
2. Network databases
This is an extension of the hierarchical model. In this
model, data is organised more like a graph, and a node is
allowed to have more than one parent node.
Because more relationships are established in this model,
the data is more interconnected, which makes accessing
the data easier and faster. This database model was
used to map many-to-many data relationships.
This was the most widely used database model before
the relational model was introduced.
Advantages

• The network model is conceptually simple and easy to design.
• The network model can represent redundancy in data more
effectively than the hierarchical model.
• The network model can handle one-to-many and many-to-many
relationships, which is a real help in modelling real-life
situations.
• Data access is easier and more flexible than in the hierarchical model.
• The network model is better than the hierarchical model at
isolating programs from the complex physical storage details.
Disadvantages

• All the records are maintained using pointers, and hence
the whole database structure becomes very complex.
• The insertion, deletion, and updating of any record
require a large number of pointer adjustments.
• Structural changes to the database are very difficult.
3. Relational Database

The relational model was proposed by E. F. Codd in 1970. The
software systems used to maintain relational databases are known as
relational database management systems (RDBMS). In this model, data
is organised in rows and columns, i.e., two-dimensional tables,
and relationships are maintained by storing a common field.
In the relational model, three key terms are heavily used: relations,
attributes, and domains. A relation is a table with rows and
columns. The named columns of the relation are called attributes,
and the domain is the set of values the attributes can take. The
following figure gives an overview of the relational database model.
Terminology used in Relational Model

• Tuple: Each row in a table is known as a tuple.
• Cardinality of a relation: The number of tuples in a
relation determines its cardinality. In this case, the
relation has a cardinality of 4.
• Degree of a relation: Each column in the table is called
an attribute. The number of attributes in a relation
determines its degree. The relation in the figure has a degree
of 3.
Keys of a relation

• Primary key: the key that uniquely identifies each tuple (row) in
a table. It cannot contain null values.
• Foreign key: a key that refers to the primary key of some other
table. It permits only those values that appear in the primary key
of the table to which it refers.
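The two key rules above can be sketched with Python's built-in sqlite3 module. The department and employee tables and their values are hypothetical examples. Note that SQLite enforces foreign keys only after the PRAGMA shown below is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE department (dept_id TEXT PRIMARY KEY, dept_name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " emp_name TEXT NOT NULL,"
    " dept_id TEXT REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES ('D001', 'Production')")
conn.execute("INSERT INTO employee VALUES (101, 'Rick', 'D001')")  # D001 exists: accepted

def try_insert(sql):
    """Run one INSERT and report whether the key constraints accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Primary key rule: emp_id 101 is already taken.
duplicate_key = try_insert("INSERT INTO employee VALUES (101, 'Maggie', 'D001')")
# Foreign key rule: D999 does not appear in department.dept_id.
missing_parent = try_insert("INSERT INTO employee VALUES (123, 'Maggie', 'D999')")
print(duplicate_key)
print(missing_parent)
```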

Some examples of relational databases are as follows.

Oracle: Oracle Database is commonly referred to as Oracle RDBMS
or simply as Oracle. It is a multi-model database management
system produced and marketed by Oracle Corporation.
MySQL: MySQL is an open-source relational database management
system (RDBMS) based on Structured Query Language (SQL). MySQL
runs on virtually all platforms, including Linux, UNIX, and Windows.
Microsoft SQL Server: Microsoft SQL Server is an RDBMS that
supports a wide variety of transaction processing, business
intelligence, and analytics applications in corporate IT environments.
PostgreSQL: PostgreSQL, often simply Postgres, is an
object-relational database management system (ORDBMS) with an
emphasis on extensibility and standards compliance.
DB2: DB2 is an RDBMS designed to store, analyze, and retrieve data
efficiently.
Advantage
• The relational model is one of the most widely used database models.
• In the relational model, changes in the database structure do not affect
data access.
• Presenting information as tables consisting of rows and
columns makes it much easier to understand.
• The relational database supports both data independence and
structural independence, which makes database design,
maintenance, administration, and usage much easier than in the other
models.
• Complex queries can be written to access or modify the data
in the database.
• It is easier to maintain security compared to other models.
Disadvantages

• Mapping objects to a relational database is very difficult.
• The object-oriented paradigm is missing from the relational model.
• Data integrity is difficult to ensure with a relational database.
• The relational model is not suitable for huge databases but is
suitable for small databases.
• Hardware overheads are incurred, which makes it costly.
• Ease of design can lead to bad design.
• A relational database system hides the implementation
complexities and the physical data storage details from the
users.
4. Object-oriented databases

An object database is a system in which information is represented
in the form of objects, as used in object-oriented programming.
Object-oriented databases differ from relational databases,
which are table-oriented. The object-oriented data model is based
on object-oriented programming language concepts, which are
now in wide use. Inheritance, polymorphism, overloading, object
identity, encapsulation, and information hiding with methods that
provide an interface to objects are among the key concepts of
object-oriented programming that have found applications in data
modelling. The object-oriented data model also supports a rich
type system, including structured and collection types.
The following figure shows the difference between the
relational and object-oriented database models.
Advantages

• An object database can handle different types of data, while a
relational database handles a single type of data.
• Object-oriented databases provide code reusability, real-world
modelling, and improved reliability and flexibility.
• An object-oriented database has low maintenance costs
compared to other models.
Disadvantages

• There is no universally defined data model for an OODBMS,
and most models lack a theoretical foundation.
• In comparison to RDBMSs, the use of OODBMSs is still
relatively limited.
• There is a lack of support for security: OODBMSs do
not provide adequate security mechanisms.
• The system is more complex than that of traditional DBMSs.
RDBMS
RDBMS stands for relational database management system. A
relational model can be represented as a table of rows and
columns. A relational database has the following major components:
1. Table
2. Record or Tuple
3. Field or Column name or Attribute
4. Domain
5. Instance
6. Schema
7. Keys
1. Table
A table is a collection of data represented in rows and
columns. Each table in a database has a name. For example, the
following table “STUDENT” stores the information of students in
the database.
Table: STUDENT

Student_Id   Student_Name   Student_Addr       Student_Age
101          Chaitanya      Dayal Bagh, Agra   27
102          Ajeet          Delhi              26
103          Rahul          Gurgaon            24
104          Shubham        Chennai            25
2. Record or Tuple
Each row of a table is known as a record. It is also known as a
tuple. For example, the following row is a record that we
have taken from the above table.

3. Field or Column name or Attribute
The above table “STUDENT” has four fields (or attributes):
Student_Id, Student_Name, Student_Addr, and
Student_Age.
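As a rough illustration, the STUDENT table can be built with Python's built-in sqlite3 module to show how records, fields, degree, and cardinality relate. The column types (INTEGER, TEXT) are assumptions, since the notes do not state them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT ("
    " Student_Id INTEGER PRIMARY KEY,"
    " Student_Name TEXT,"
    " Student_Addr TEXT,"
    " Student_Age INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?)",
    [
        (101, "Chaitanya", "Dayal Bagh, Agra", 27),
        (102, "Ajeet", "Delhi", 26),
        (103, "Rahul", "Gurgaon", 24),
        (104, "Shubham", "Chennai", 25),
    ],
)

cursor = conn.execute("SELECT * FROM STUDENT ORDER BY Student_Id")
fields = [column[0] for column in cursor.description]  # attribute (column) names
records = cursor.fetchall()                            # each tuple is one record

degree = len(fields)        # number of attributes
cardinality = len(records)  # number of tuples
print(fields, degree, cardinality)
```

For this table both the degree and the cardinality happen to be 4.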
4. Domain
A domain is the set of permitted values for an attribute in a table. For
example, a domain of month-of-year can accept January,
February, …, December as values, a domain of dates can accept all
possible valid dates, etc. We specify the domain of an attribute while
creating a table.

An attribute cannot accept values that are outside its
domain. For example, in the above table “STUDENT”, the
Student_Id field has an integer domain, so that field cannot accept
values that are not integers; for example, Student_Id cannot have
values like “First”, 10.11, etc.
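One way to sketch domain enforcement is with CHECK constraints in Python's built-in sqlite3 module. SQLite is lenient about column types by default, so the typeof() check below is an assumption made to mimic a strict integer domain. The Birth_Month column is hypothetical and stands in for the month-of-year domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT ("
    " Student_Id INTEGER CHECK (typeof(Student_Id) = 'integer'),"
    " Birth_Month TEXT CHECK (Birth_Month IN ("
    "  'January','February','March','April','May','June',"
    "  'July','August','September','October','November','December')))"
)

def accepts(student_id, month):
    """Try to insert one row; True if every value lies inside its domain."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?)", (student_id, month))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    accepts(105, "January"),      # both values are inside their domains
    accepts("First", "January"),  # text is outside the integer domain
    accepts(10.11, "January"),    # a fractional number is outside the integer domain
    accepts(106, "Smarch"),       # not a month name
]
print(results)
```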
5. Instance
The data stored in a database at a particular moment in time is
called an instance of the database.
For example, let’s say we have a single table, student, in the database.
Today the table has 100 records, so today’s instance of the database
has 100 records. If we add another 100 records to this table by
tomorrow, tomorrow’s instance will have 200 records. In short, the data
stored in the database at a particular moment is called the instance,
and it changes over time as we add or delete data.

6. Schema
A database schema defines the variable declarations in the tables
that belong to a particular database; the values of these variables
at a moment in time are called the instance of that database.
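The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The student table and its two rows are hypothetical. The schema is read from SQLite's sqlite_master catalog, which stores each table's CREATE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

def schema(conn):
    """The table definitions: these stay the same as rows come and go."""
    return [row[0] for row in conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")]

def instance(conn):
    """The rows stored right now: this is what changes over time."""
    return conn.execute("SELECT student_id, name FROM student ORDER BY student_id").fetchall()

schema_before = schema(conn)
conn.execute("INSERT INTO student VALUES (1, 'Ajeet')")
instance_today = instance(conn)
conn.execute("INSERT INTO student VALUES (2, 'Rahul')")
instance_tomorrow = instance(conn)
schema_after = schema(conn)

print(len(instance_today), len(instance_tomorrow), schema_before == schema_after)
```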
Normalization in DBMS:
Normalization is the process of organizing the data in a database to
avoid data redundancy and the insertion, update, and deletion
anomalies.
Anomalies in DBMS
Three types of anomalies can occur when a database is not
normalized: insertion, update, and deletion anomalies. Let’s take
an example to understand this.

Example: Suppose a manufacturing company stores employee details in a table
named employee that has four attributes: emp_id for the employee’s id, emp_name
for the employee’s name, emp_address for the employee’s address, and
emp_dept for the department in which the employee works. At some
point in time the table looks like this:
emp_id   emp_name   emp_address   emp_dept
101      Rick       Delhi         D001
101      Rick       Delhi         D002
123      Maggie     Agra          D890
166      Glenn      Chennai       D900
166      Glenn      Chennai       D004


Update anomaly: In the above table we have two rows for
employee Rick, as he belongs to two departments of the
company. If we want to update Rick’s address, we have to
update it in both rows or the data will become inconsistent.
If the correct address is updated in one row but not in the
other, then according to the database Rick has two different
addresses, which is incorrect.
Insert anomaly: Suppose a new employee joins the company
who is under training and not yet assigned to any department.
We would not be able to insert this employee into the table if
the emp_dept field doesn’t allow nulls.

Delete anomaly: Suppose that at some point the company
closes department D890. Deleting the rows that have
emp_dept as D890 would also delete the information about
employee Maggie, since she is assigned only to this department.

To overcome these anomalies we need to normalize the data.
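One possible way to normalize the employee table is to split it into two tables, sketched here with Python's built-in sqlite3 module. The table name employee_dept, the new address "Mumbai", and trainee 200 are assumptions made for the example. The notes do not prescribe this particular decomposition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Employee facts are stored once; department membership lives in its own table.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, emp_address TEXT)"
)
conn.execute(
    "CREATE TABLE employee_dept ("
    " emp_id INTEGER REFERENCES employee(emp_id),"
    " emp_dept TEXT,"
    " PRIMARY KEY (emp_id, emp_dept))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Rick", "Delhi"), (123, "Maggie", "Agra"), (166, "Glenn", "Chennai")],
)
conn.executemany(
    "INSERT INTO employee_dept VALUES (?, ?)",
    [(101, "D001"), (101, "D002"), (123, "D890"), (166, "D900"), (166, "D004")],
)

# Update: Rick's address now changes in exactly one row.
updated = conn.execute(
    "UPDATE employee SET emp_address = 'Mumbai' WHERE emp_id = 101"
).rowcount
# Insert: a trainee without a department needs no emp_dept value at all.
conn.execute("INSERT INTO employee VALUES (200, 'Trainee', 'Pune')")
# Delete: closing D890 removes the assignment, not Maggie's record.
conn.execute("DELETE FROM employee_dept WHERE emp_dept = 'D890'")

maggie = conn.execute(
    "SELECT emp_name, emp_address FROM employee WHERE emp_id = 123"
).fetchone()
rick_rows = conn.execute(
    "SELECT e.emp_address, d.emp_dept FROM employee e"
    " JOIN employee_dept d ON e.emp_id = d.emp_id"
    " WHERE e.emp_id = 101 ORDER BY d.emp_dept"
).fetchall()
print(updated, maggie, rick_rows)
```

Each of the three anomalies from the example disappears because an employee's name and address are no longer repeated once per department.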


Office automation
Office automation refers to the varied computer machinery
and software used to digitally create, collect, store,
manipulate, and relay office information needed for
accomplishing basic tasks. Raw data storage, electronic
transfer, and the management of electronic business
information comprise the basic activities of an office
automation system.[1] Office automation helps in optimizing
or automating existing office procedures.
The backbone of office automation is a LAN, which allows users to
transfer data, mail and even voice across the network. All office
functions, including dictation, typing, filing, copying, fax, Telex,
microfilm and records management, telephone and telephone
switchboard operations, fall into this category. Office automation
was a popular term in the 1970s and 1980s as the desktop
computer exploded onto the scene.
Advantages are:
Office automation can get many tasks accomplished faster.
It eliminates the need for a large staff.
Less storage is required to store data.
Multiple people can update data simultaneously in the event of
changes in schedule.
DBA (Database Administrator)
A database administrator (DBA) is a person or a group of people
responsible for managing all the activities related to a
database system. This job requires a high level of expertise.
It is rare for a single person to manage all the database
system activities, so companies usually have a group of people
who take care of the database system.
Role, Duties and Responsibilities of a Database Administrator (DBA)
Installing and configuring the database: The DBA is responsible for
installing the database software. The DBA configures the database
software and upgrades it when needed. There are many database
software products in the industry, such as Oracle, Microsoft SQL
Server, and MySQL, so the DBA decides how the installation and
configuration of this software will take place.

1. Deciding the hardware device
Depending on the cost, performance, and efficiency of the
hardware, the DBA has the duty of deciding which
hardware device will suit the company’s requirements.
2. Managing Data Integrity
Data integrity should be managed accurately because it protects
the data from unauthorized use. The DBA manages the relationships
between the data to maintain data consistency.

3. Decides Data Recovery and Backup method
If a company has a big database, it is likely that the database
will fail at some point. The DBA is required to take a backup of
the entire database at regular intervals. The DBA has to decide
how much data should be backed up and how frequently the
backup should be taken. The DBA also recovers the database if
it is lost.
4. Tuning Database Performance
Database performance plays an important role in any business. If
users are not able to fetch data quickly, the company may lose
business. By tuning and modifying SQL commands, a DBA can
improve the performance of the database.
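As a rough illustration of tuning, the sketch below uses Python's built-in sqlite3 module to compare how SQLite plans the same query before and after an index is added. The table, the generated data, and the index name idx_student_city are hypothetical. Adding an index is only one of several tuning techniques a DBA might use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [(f"student{i}", f"city{i % 50}") for i in range(10_000)],
)

query = "SELECT name FROM student WHERE city = ?"

def plan(conn):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("city7",)).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the plan text

plan_before = plan(conn)  # without an index the whole table is scanned
conn.execute("CREATE INDEX idx_student_city ON student (city)")
plan_after = plan(conn)   # with the index SQLite can search by city directly

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, so the sketch prints it rather than assuming a fixed format.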

5. Capacity Issues
All databases have limits on how much data they can store, and
physical memory also has limitations. The DBA has to decide the
limit and capacity of the database and handle all the issues
related to it.
6. Database design
The logical design of the database is created by the DBA. The
DBA is also responsible for physical design, external model design,
and integrity control.

7. Database accessibility
The DBA writes subschemas to decide the accessibility of the
database. The DBA decides who the users of the database are and
which data is to be used by which user. No user has the power to
access the entire database without the permission of the DBA.
8. Decides validation checks on data
The DBA has to decide which data should be used and what kind of
data is accurate for the company. The DBA therefore puts validation
checks on data to make it more accurate and consistent.

9. Monitoring performance
Even if the database is working properly, the DBA still has work
to do: the DBA has to monitor the performance of the database,
including CPU and memory usage.
10. Decides content of the database
A database system holds many kinds of content. The DBA decides
the fields, the types of fields, and the range of values of the
content in the database system. One can say that the DBA decides
the structure of the database files.

11. Provides help and support to users
If any user needs help at any time, it is the duty of the DBA to
provide it. The DBA gives complete support to users who are new
to the database.
12. Database implementation
A database has to be implemented before anyone can start
using it, so the DBA implements the database system. The
DBA has to supervise the database loading at the time of its
implementation.

13. Improve query processing performance
Queries made by the users should be executed quickly.
As discussed above, users need fast retrieval of
answers, so the DBA improves query processing by improving
their
