
Unit 3

MBA/BBA/B.Com/B.Tech/UGC NET

By
Dr. Anand Vyas
Managing Data Resources - The need
for data management
• Improved Data Quality. As a master data management (MDM)
application streamlines the data, it also eliminates bad data.
• Reduces Time and Cost.
• Avoids Data Duplication.
• Increased Data Accuracy.
• Better Data Compliance.
• Informed Decision Making.
• Handling Change Requests.
• Enables Easy Data Edits.
Challenges of data management
• Sheer Volume of Data. It is estimated that
2.5 quintillion bytes of data are created every day.
• Multiple Data Stores.
• Data Quality.
• Lack of Processes and Systems.
• Data Integration.
• Lack of Skilled Resources.
• Data Governance.
• Data Security.
Data independence
• Data independence is defined as a property of
a DBMS that lets you change the database
schema at one level of a database system
without requiring a change to the schema at
the next higher level. Data independence
helps you keep data separated from all
programs that make use of it.
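A minimal sketch of this idea, assuming Python's built-in sqlite3 module. The employee table, the employee_names view, and the list_names function are hypothetical names chosen for illustration. The program reads only through the view, so adding a column to the underlying table does not force any change to the program.

```python
import sqlite3

# In-memory database; table, view, and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (id, name) VALUES (1, 'Kiran')")

# The program reads through this view rather than the table itself.
conn.execute("CREATE VIEW employee_names AS SELECT id, name FROM employee")

def list_names(connection):
    """Application code: depends only on the view's columns."""
    query = "SELECT id, name FROM employee_names ORDER BY id"
    return connection.execute(query).fetchall()

before = list_names(conn)

# Schema change at the lower level: a new column is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
conn.execute(
    "INSERT INTO employee (id, name, department) VALUES (2, 'Lata', 'Finance')"
)

# The same, unchanged function still works after the schema change.
after = list_names(conn)
print(before)
print(after)
conn.close()
```

The view plays the role of the higher-level schema here. Only the view definition would need attention if a column it exposes were renamed.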
Data redundancy, Data consistency
• Data redundancy occurs when the same
piece of data exists in multiple places,
whereas data inconsistency occurs when the same
data exists in different formats in multiple
tables. Unfortunately, data redundancy can
cause data inconsistency, which can leave a
company with unreliable or meaningless
information.
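The link between redundancy and inconsistency can be sketched in a few lines of Python. The two "tables" below are plain lists of dictionaries with made-up data, and phones_on_record is a hypothetical helper. The same phone number is stored in several places, and an update that reaches only one copy leaves the copies disagreeing.

```python
# Hypothetical example data: the customer's phone number is stored
# redundantly in two separate "tables" (lists of dicts).
orders = [
    {"order_id": 1, "customer": "Asha", "phone": "555-0101"},
    {"order_id": 2, "customer": "Asha", "phone": "555-0101"},
]
invoices = [
    {"invoice_id": 10, "customer": "Asha", "phone": "555-0101"},
]

def phones_on_record(customer):
    """Collect every distinct phone number stored for a customer."""
    rows = orders + invoices
    return {row["phone"] for row in rows if row["customer"] == customer}

# Redundant but still consistent: one distinct value across all copies.
before_update = phones_on_record("Asha")

# An update reaches only one copy, so the copies now disagree.
orders[0]["phone"] = "555-0199"
after_update = phones_on_record("Asha")

print(len(before_update), "distinct value before the update")
print(len(after_update), "distinct values after the update")
```

Storing the phone number once, in a single customer table, would remove the redundancy and with it this source of inconsistency.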
Data administration
• Data administration is a high-level function
that is responsible for the overall
management of data resources in an
organization. It includes database planning,
analysis, design, implementation, and
maintenance; data protection; data
performance assurance; and user training,
education, and consulting support.
Database Management System –
Concepts
• Database Management Systems (DBMS)
are software systems used to store, retrieve,
and run queries on data. A DBMS serves as an
interface between an end-user and a
database, allowing users to create, read,
update, and delete data in the database.
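The four operations named above (create, read, update, delete) can be sketched with Python's built-in sqlite3 module, which is one small DBMS among many. The students table and its contents are hypothetical.

```python
import sqlite3

# In-memory database; the "students" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Create: insert two records.
insert_sql = "INSERT INTO students (id, name, course) VALUES (?, ?, ?)"
conn.execute(insert_sql, (1, "Ravi", "MBA"))
conn.execute(insert_sql, (2, "Meera", "BBA"))

# Read: query the stored records.
select_sql = "SELECT id, name, course FROM students ORDER BY id"
rows_after_insert = conn.execute(select_sql).fetchall()

# Update: change one field of one record.
conn.execute("UPDATE students SET course = ? WHERE id = ?", ("B.Tech", 2))

# Delete: remove one record.
conn.execute("DELETE FROM students WHERE id = ?", (1,))

rows_final = conn.execute(select_sql).fetchall()
print(rows_after_insert)
print(rows_final)
conn.close()
```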
TYPES OF DBMS
The four types of DBMS are:
1. Hierarchical DBMS
In a hierarchical database model, data is organized in a tree-like structure. Data is stored in a hierarchical
(top-down or bottom-up) format. Data is represented using a parent-child relationship. In a hierarchical
DBMS, a parent may have many children, but each child has only one parent.
2. Network Model
The network database model allows each child to have multiple parents. It helps you to address the
need to model more complex relationships, such as the orders/parts many-to-many relationship. In this
model, entities are organized in a graph which can be accessed through several paths.
3. Relational model
Relational DBMS is the most widely used DBMS model because it is one of the easiest to use. This model is
based on normalizing data in the rows and columns of tables. Data in the relational model is stored in fixed
structures and manipulated using SQL.
4. Object-Oriented Model
In the object-oriented model, data is stored in the form of objects. The structures are called classes, which
display the data within them. The model defines a database as a collection of objects that store both data member
values and operations.
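The difference between the hierarchical and network models can be sketched with plain Python dictionaries. This is only an illustration of the two structures, not how a real DBMS stores data, and all names and data are hypothetical.

```python
# Hierarchical model: each child records exactly one parent (a tree).
parent_of = {
    "Sales": "Company",
    "HR": "Company",
    "Domestic": "Sales",
    "Export": "Sales",
}

def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

# Network model: a part can belong to many orders and an order can
# contain many parts (the orders/parts many-to-many relationship).
order_parts = {
    "order-1": {"bolt", "nut"},
    "order-2": {"bolt", "washer"},
}

def orders_containing(part):
    """Return every order (parent) that a part (child) belongs to."""
    return sorted(o for o, parts in order_parts.items() if part in parts)

print(path_to_root("Export"))     # ['Export', 'Sales', 'Company']
print(orders_containing("bolt"))  # ['order-1', 'order-2']
```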
Fields, Records, Table, View, Reports
and Queries.
Fields = Individual
Data warehouse Characteristics
• Subject Oriented: A data warehouse is subject oriented because it provides
information around a subject rather than the organization’s ongoing operations.
These subjects can be products, customers, suppliers, sales, revenue, etc. A data
warehouse does not focus on the ongoing operations; rather, it focuses on
modelling and analysis of data for decision making.
• Integrated: A data warehouse is constructed by integrating data from
heterogeneous sources such as relational databases, flat files, etc. This integration
enhances the effective analysis of data.
• Time Variant: The data collected in a data warehouse is identified with a particular
time period. The data in a data warehouse provides information from the historical
point of view.
• Non-volatile: Non-volatile means the previous data is not erased when new data is
added to it. A data warehouse is kept separate from the operational database and
therefore frequent changes in the operational database are not reflected in the data
warehouse.
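The time-variant and non-volatile properties can be sketched as an append-only load. The warehouse list, the load_snapshot function, the dates, and the stock figures are all hypothetical. The operational value is overwritten in place, while the warehouse keeps every dated snapshot.

```python
from datetime import date

# Append-only store of (snapshot_date, product, stock) rows.
warehouse = []

def load_snapshot(snapshot_date, operational_rows):
    """Copy the current operational values into the warehouse, dated."""
    for product, stock in operational_rows.items():
        warehouse.append((snapshot_date, product, stock))

# The operational system holds only the current value.
operational = {"pen": 100}
load_snapshot(date(2024, 1, 1), operational)

# The operational value is overwritten; the earlier warehouse row is kept.
operational["pen"] = 80
load_snapshot(date(2024, 2, 1), operational)

# Historical view: every snapshot for one product, in load order.
history = [
    (d.isoformat(), stock) for d, product, stock in warehouse if product == "pen"
]
print(history)  # [('2024-01-01', 100), ('2024-02-01', 80)]
```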
Uses of Data warehouse
• Data warehouses are used for analytical
purposes and business reporting. Data
warehouses typically store historical data by
integrating copies of transaction data from
disparate sources. Data warehouses can also
use real-time data feeds for reports that use
the most current, integrated information.
Data Analysis
• Two forms of data analysis can be used to
extract models that describe important
classes or to predict future data trends.
These two forms are as follows:
• Classification
• Prediction
Data mining – Characteristics
Data mining is defined as extracting information from huge sets of
data. In other words, data mining is the procedure of
mining knowledge from data. The information or knowledge so
extracted can be used for any of the following applications:

Market Analysis
Fraud Detection
Customer Retention
Production Control
Science Exploration
The characteristics of data mining are: prediction of likely outcomes, a focus on large
datasets and databases, and automatic pattern prediction based on behavior analysis.
Techniques of Data Mining
Data Mining Techniques
One of the most important tasks in data mining is to select the correct data mining technique.
The technique has to be chosen based on the type of business and the type of problem
the business faces. A generalized approach has to be used to improve the accuracy and cost
effectiveness of data mining techniques. There are seven main data mining
techniques, listed below. Many other data mining techniques exist,
but these seven are the ones most frequently used by business people.

1. Statistics
2. Clustering
3. Visualization
4. Decision Tree
5. Association Rules
6. Neural Networks
7. Classification
• Induction Decision Tree Technique
• A decision tree is a predictive model that, as the name implies,
looks like a tree. In this technique, each branch of the tree is viewed as a
classification question and the leaves of the tree are considered as
partitions of the dataset related to that particular classification. This
technique can be used for exploratory analysis, data pre-processing and
prediction work.
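A minimal sketch of how a decision tree answers a query. The tree below is written by hand for illustration and is not induced from data; the feature names, thresholds, and labels are hypothetical. Each inner node asks one classification question and each leaf is a class label.

```python
# Hypothetical hand-built tree for a loan decision. Each inner node asks
# "is record[feature] >= threshold?"; each leaf is a class label.
tree = {
    "question": ("income", 50000),
    "yes": {
        "question": ("debt", 20000),
        "yes": "review",
        "no": "approve",
    },
    "no": "reject",
}

def classify_record(node, record):
    """Walk from the root to a leaf, answering one question per branch."""
    while isinstance(node, dict):
        feature, threshold = node["question"]
        node = node["yes"] if record[feature] >= threshold else node["no"]
    return node

print(classify_record(tree, {"income": 60000, "debt": 5000}))   # approve
print(classify_record(tree, {"income": 60000, "debt": 25000}))  # review
print(classify_record(tree, {"income": 30000, "debt": 0}))      # reject
```

In practice the questions and thresholds would be learned from a dataset rather than chosen by hand.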

• Association Rule Technique
• This technique helps to find the association between two or more items. It
helps to reveal the relations between the different variables in databases.
It discovers hidden patterns in the data sets, which are used to identify
the variables, and combinations of variables, that occur together most
frequently.
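Association between items is commonly measured with support and confidence. These two standard measures are not named in the notes above, so treat this as an illustrative sketch; the transactions are made-up data. Support is the fraction of transactions containing an itemset, and the confidence of "A implies B" is support(A and B) divided by support(A).

```python
# Hypothetical market-basket data: one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """How often the consequent appears when the antecedent appears."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)

print(support({"bread", "butter"}))       # 0.5
print(confidence({"bread"}, {"butter"}))  # 2 of the 3 bread baskets
```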
• Neural Network
• A neural network is another important technique used by people these days. This
technique is most often used in the early stages of data mining
technology. Artificial neural networks emerged from the artificial
intelligence community.
• Statistical Techniques
• Statistics is a branch of mathematics that relates to the collection and
description of data. Many analysts do not consider statistical techniques to be data mining
techniques, but they still help to discover patterns and build predictive models.

• Clustering Technique
• Clustering is one of the oldest techniques used in data mining. Clustering analysis is the
process of identifying data that are similar to each other. This helps to understand the
differences and similarities between the data.
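One common way to group similar data is k-means clustering. The notes do not name a specific algorithm, so this is an illustrative choice, shown here for one-dimensional values with hypothetical data and starting centers. Each value is assigned to its nearest center, and each center is then moved to the mean of its group.

```python
def kmeans_1d(values, centers, rounds=10):
    """Very small k-means for 1-D data, starting from the given centers."""
    for _ in range(rounds):
        # Assignment step: put each value in the group of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical data with two obvious groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)   # [2.0, 11.0]
print(clusters)  # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```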

• Visualization
• Visualization is the most useful technique for discovering data patterns. This technique is
used at the beginning of the data mining process. Much research is currently under way to
produce interesting projections of databases, which is called projection pursuit.

• Classification
• Classification is the most commonly used data mining technique. It uses
a set of pre-classified samples to create a model which can classify a large set of data.
This technique helps in deriving important information about data and metadata (data about data).
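A minimal sketch of classification from pre-classified samples, using a one-nearest-neighbour rule. This is one simple classifier chosen for illustration, and the sample points and labels are hypothetical. A new point receives the label of the closest known sample.

```python
# Hypothetical pre-classified samples: (features, label).
samples = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 8.0), "high"),
    ((9.0, 7.5), "high"),
]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_label(point):
    """Label a new point with the class of its closest known sample."""
    _, label = min(samples, key=lambda s: squared_distance(s[0], point))
    return label

print(nearest_label((2.0, 1.0)))  # low
print(nearest_label((7.0, 9.0)))  # high
```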
Business Intelligence Database
Management System (Lab)
