MIS Unit 3

Management Information System
Lecture by
Dr. Ruchi Garg
Lloyd Business School
Greater Noida
Need of Data Management
 Data management helps minimize potential errors by establishing processes and policies for usage and building trust in the data being used to make decisions across the organization.
 With reliable, up-to-date data, companies can respond more efficiently to market changes and customer needs.
Need of Data Management
 Data management (DM) is important because it offers a streamlined, standardized system for organizations to find, manage, access, store, and secure their data.
 DM ensures companies can easily find, analyze, and utilize data to make the best, most informed decisions and operate from a data-driven perspective.
Challenges of Data Management
 Sheer Volume of Data
 Multiple Data Storages
 Data Quality
 Data Integration
 Lack of Skilled Resources
 Data Governance
 Data Security
 Data Automation
 Data Analysis
 Going from Unstructured to Structured Data
 The more data is collected, the more monitoring and validation are required.
 Large organizations may end up with dozens of business solutions, each with its own data repository, such as databases, CRM, and ERP systems.
 Unnecessary data must be discarded while high-quality, accurate data is retained.
 When data is gathered from many sources, inconsistency in the data is unavoidable. Inadequate data management processes and systems contribute to inaccurate data.
 The ultimate purpose of having quality, ready-to-use data is to make it available for further analysis and processing by business intelligence tools, so that senior management can make more informed decisions.
 There is a severe shortage of experienced data management specialists available for immediate recruitment.
 Data governance establishes the rules and regulations for the state of an organization’s information. A data governance framework is comparable to a constitution in that it aids the implementation of policies, rules, and laws for data-related procedures.
 Data automation: with little human participation, automation streamlines the whole cycle from collection to analysis.
 More than 80% of the data collected is unstructured, which is a major problem.
Data Independence
 Data independence is the type of data transparency that matters for a centralized DBMS.
 It refers to the immunity of user applications to changes made in the definition and organization of data.
 Ideally, application programs should not be exposed to details of data representation and storage.
Data Independence
 Data independence is a property of a DBMS that lets you change the database schema at one level of a database system without having to change the schema at the next higher level.
 Data independence helps keep data separate from the programs that use it.
Data Independence
 Physical Data Independence
 Logical Data Independence
Examples of Physical Data Independence
 Because of physical data independence, none of the changes below affects the conceptual layer:
 Using a new storage device, such as a hard drive or magnetic tape
 Modifying the file organization technique in the database
 Switching to different data structures
 Changing the access method
 Changing compression techniques or hashing algorithms
 Moving the database to a new location, say from the C drive to the D drive
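As a minimal sketch of one change from the list above (a new access method), the snippet below uses Python's built-in sqlite3 module. The table `student`, its columns, and the helper `students_in` are hypothetical names chosen only for illustration. The application query is run before and after an index is added, and it returns the same rows both times.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student (name, city) VALUES (?, ?)",
    [("Asha", "Noida"), ("Ravi", "Delhi"), ("Meera", "Noida")],
)

def students_in(city):
    # Application code names tables and columns, never storage details.
    rows = conn.execute(
        "SELECT name FROM student WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]

before = students_in("Noida")

# Physical-level change: add an index, which gives the DBMS a new access path.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

after = students_in("Noida")
print(before == after)
```

The index may change how the rows are located, but the query text and its result stay the same, which is the point of physical data independence.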
Examples of Logical Data Independence
 Because of logical data independence, none of the changes below affects the external layer:
 Adding, modifying, or deleting an attribute, entity, or relationship without rewriting existing application programs
 Merging two records into one
 Breaking an existing record into two or more records
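The last item in the list, breaking a record into two, can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the tables `student`, `student_core`, and `student_contact`, the view `student_directory`, and the function `directory` are all hypothetical. The application reads only the view (the external layer), so it keeps working after the underlying table is split.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    INSERT INTO student VALUES (1, 'Asha', '555-0101'), (2, 'Ravi', '555-0102');
    CREATE VIEW student_directory AS SELECT id, name, phone FROM student;
""")

def directory():
    # Application code reads only the external view.
    return conn.execute(
        "SELECT name, phone FROM student_directory ORDER BY id"
    ).fetchall()

before = directory()

# Conceptual-level change: break one record type into two,
# then redefine the view so the external layer looks unchanged.
conn.executescript("""
    DROP VIEW student_directory;
    CREATE TABLE student_core (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student_contact (student_id INTEGER, phone TEXT);
    INSERT INTO student_core SELECT id, name FROM student;
    INSERT INTO student_contact SELECT id, phone FROM student;
    DROP TABLE student;
    CREATE VIEW student_directory AS
        SELECT c.id, c.name, t.phone
        FROM student_core AS c
        JOIN student_contact AS t ON t.student_id = c.id;
""")

after = directory()
print(before == after)
```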
Data Redundancy
 Data redundancy is the storage of the same data in more than one place within a data store or database.
 When intentional, it provides a number of benefits and supports numerous use cases.
 However, data redundancy is often unintentional and results in many complications.
Data Redundancy
 Duplication of data
 A common example of data redundancy is when a name and address are stored together in separate columns of a table.
 If the link between these data points is repeated in every new database entry, it leads to unnecessary duplication across the entire table.
Data Redundancy
 Redundant data is a bad idea because when you modify data (update/insert/delete), you need to do it in more than one place.
 This opens up the possibility that the data becomes inconsistent across the database (see Data Consistency).
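A small sketch of how this inconsistency can arise, using Python's built-in sqlite3 module. The `orders` table and its contents are hypothetical. The customer's address is stored once per order, and an update that touches only one copy leaves two different addresses for the same customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Redundant design: the address is repeated in every order row.
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Asha", "12 Park Road"), (2, "Asha", "12 Park Road")],
)

# The customer moves, but only one of the two copies is updated.
conn.execute("UPDATE orders SET address = '7 Lake View' WHERE order_id = 1")

addresses = conn.execute(
    "SELECT DISTINCT address FROM orders WHERE customer = 'Asha' ORDER BY address"
).fetchall()
print(len(addresses))  # more than one address now exists for the same customer
```

Storing the address once, in a separate customer table referenced by a key, would leave only one place to update.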
Data Consistency
 Data consistency means that each user sees a consistent view of the data,
including visible changes made by the user's own transactions and transactions
of other users.
Data Administration
 Data administration is the process by which data is monitored, maintained, and managed by a data administrator and/or an organization. Data administration allows an organization to control its data assets, as well as their processing and interactions with different applications and business processes.
 Data administration is a high-level function responsible for the overall management of data resources in an organization, including database planning, analysis, design, implementation, and maintenance; data protection; data performance assurance; and user training, education, and consulting support.
Database Management System
 A DBMS is a computerized data-keeping system.
 Users of the system can either manipulate the data in the database or manage the database structure itself.
 Basic Terms
 Field
 Record
 Table
 View
 Report
 Query
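The basic terms above can be seen together in a short sketch with Python's built-in sqlite3 module. The `employee` table, the `sales_staff` view, and the sample rows are hypothetical, and the "report" here is simply formatted text built from a query result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table: a named collection of records. Each column (emp_id, name, ...) is a field.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Record: one row of the table, holding a value for every field.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Sales", 52000.0),
        (2, "Ravi", "HR", 48000.0),
        (3, "Meera", "Sales", 61000.0),
    ],
)

# View: a stored query that is presented as a virtual table.
conn.execute(
    "CREATE VIEW sales_staff AS SELECT name, salary FROM employee WHERE dept = 'Sales'"
)

# Query: a request for data that meets given conditions.
rows = conn.execute("SELECT name, salary FROM sales_staff ORDER BY name").fetchall()

# Report: query results formatted for people to read.
report = "\n".join(f"{name:<10}{salary:>12.2f}" for name, salary in rows)
print(report)
```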
Types of DBMS
 Relational database systems
 Hierarchical database systems
 Network database systems
 Object-oriented database systems
Types of DBMS: Relational
 A relational database contains multiple tables of data with rows and columns that relate to each other through special key fields.
 These databases are more flexible than flat file structures, and provide functionality for reading, creating, updating, and deleting data.
 Relational databases use Structured Query Language (SQL), a standard language that provides an easy programming interface for database interaction.
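A minimal sketch of two tables related through a key field, written with Python's built-in sqlite3 module. The `customer` and `invoice` tables and their rows are hypothetical. The SQL query joins the tables on `customer_id` and totals the invoices for each customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        invoice_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id),  -- key field
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO invoice VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# The key field customer_id relates each invoice to its customer.
totals = conn.execute("""
    SELECT c.name, SUM(i.amount)
    FROM customer AS c
    JOIN invoice AS i ON i.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.name
""").fetchall()
print(totals)
```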
Types of DBMS: Hierarchical
 It resembles a tree structure, similar to the folder architecture in your computer system.
 The relationships between records are predefined in a one-to-many manner between 'parent' and 'child' nodes: a parent can have several children, but each child has exactly one parent.
 They require the user to traverse the hierarchy in order to access the needed data.
 Because of these limitations, this type is confined to specific uses.
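The parent and child structure can be sketched in plain Python. The node names and the helpers `path_to` and `children_of` are hypothetical. Each child records exactly one parent, and reaching a node means walking the hierarchy from the root.

```python
# Hypothetical hierarchy: each child maps to exactly one parent.
parent_of = {
    "Sales": "Company",
    "Finance": "Company",
    "North Region": "Sales",
    "South Region": "Sales",
}

def path_to(node):
    # Walk upward to the root, then reverse to get the access path from the root.
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return list(reversed(path))

def children_of(node):
    # A parent may have many children.
    return sorted(child for child, parent in parent_of.items() if parent == node)

print(path_to("North Region"))
print(children_of("Sales"))
```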

Types of DBMS: Network
 A network database is a type of database model in which multiple member records or files can be linked to multiple owner files, and vice versa.
 The model can be viewed as an upside-down tree in which each member's information is a branch linked to the owner, which is the bottom of the tree.
 Relationships form a net-like structure in which a single element can point to multiple data elements and can itself be pointed to by multiple data elements.
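A plain Python sketch of the owner and member links described above. The supplier and part names are hypothetical. Unlike the hierarchical model, a member ("Bolt") can belong to more than one owner.

```python
from collections import defaultdict

# Hypothetical owner -> member links.
links = [("Supplier A", "Bolt"), ("Supplier A", "Nut"), ("Supplier B", "Bolt")]

members_of = defaultdict(set)  # owner -> members it points to
owners_of = defaultdict(set)   # member -> owners that point to it

for owner, member in links:
    members_of[owner].add(member)
    owners_of[member].add(owner)

print(sorted(members_of["Supplier A"]))
print(sorted(owners_of["Bolt"]))
```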
Types of DBMS: Object-Oriented
 In object-oriented databases, information is represented as objects, with different types of relationships possible between two or more objects.
 Such databases use an object-oriented programming language for development.

Data Warehouse
 A data warehouse (DW) is a central repository of information that can be analyzed to make more informed decisions.
 It is also known as an enterprise data warehouse.
 It is a system used for reporting and data analysis and is considered a core component of business intelligence.
 DWs are central repositories of integrated data from one or more disparate sources.
 Data flows into a data warehouse from transactional systems, relational databases, and other sources.
Purpose of Data Warehouse
 A data warehouse is a type of data management system that is designed to enable and
support business intelligence (BI) activities, especially analytics.
 Data warehouses are solely intended to perform queries and analysis and often contain
large amounts of historical data.
Characteristics of Data Warehouse
 Subject-oriented –
A data warehouse is always subject-oriented: it delivers information about a specific theme rather than about the organization’s ongoing operations. The data warehousing process is designed to handle a specific, well-defined theme, such as sales, distribution, or marketing.
 Integrated –
A data warehouse brings together data from different databases. The data must be stored in the warehouse in a common, generally accepted format.
 Time-Variant –
Data is maintained over different intervals of time, such as weekly, monthly, or annually. A data warehouse covers a much longer time horizon than an online transaction processing (OLTP) system.
 Non-Volatile –
As the name suggests, data that resides in a data warehouse is permanent. Data is not erased or deleted when new data is inserted.
Uses of Data Warehouse
 A data warehouse is specially designed for data analytics, which involves reading large amounts of data to understand relationships and trends across the data.
 A database, by contrast, is used to capture and store data, such as recording the details of a transaction.
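The contrast can be sketched with Python's built-in sqlite3 module. The `sales_fact` table and its rows are hypothetical, and a single in-memory database stands in for what would normally be separate operational and warehouse systems. Recording a sale is a single-row write. The analytical query reads all historical rows and summarizes them by year and region.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (sale_date TEXT, region TEXT, amount REAL)")

# Capture: each transaction is stored as it happens, one row at a time.
history = [
    ("2023-01-15", "North", 120.0),
    ("2023-02-10", "North", 80.0),
    ("2023-02-20", "South", 200.0),
    ("2024-01-05", "North", 150.0),
]
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", history)

# Analytics: read the whole history and look for trends across time and region.
yearly = conn.execute("""
    SELECT substr(sale_date, 1, 4) AS year, region, SUM(amount)
    FROM sales_fact
    GROUP BY year, region
    ORDER BY year, region
""").fetchall()
print(yearly)
```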
Data Mining
 Data mining is the process of extracting and discovering patterns in large data sets
involving methods at the intersection of machine learning, statistics, and database
systems.
 Data mining is the process of finding anomalies, patterns and correlations within large
data sets to predict outcomes.
Data Mining Techniques
Classification
This technique is used to obtain important and relevant information about data and metadata. It helps to classify data into different classes. Data mining frameworks themselves can be classified in the following ways:
Type of data sources mined:
This classification is based on the type of data handled, for example multimedia, spatial data, text data, time-series data, or the World Wide Web.
Database involved:
This classification is based on the data model involved, for example an object-oriented database, a transactional database, or a relational database.
Kind of knowledge discovered:
This classification depends on the type of knowledge discovered or the data mining functionality, for example discrimination, classification, clustering, or characterization. Some frameworks are extensive and offer several data mining functionalities together.
Data mining techniques used:
This classification is based on the data analysis approach used, such as neural networks, machine learning, genetic algorithms, visualization, statistics, or a data warehouse-oriented or database-oriented approach.
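As an illustration of classifying data into classes, the sketch below uses a 1-nearest-neighbour rule in plain Python. This is only one of many possible classifiers and is not named in the slides. The training points, the labels "low" and "high", and the function `classify` are hypothetical.

```python
import math

# Hypothetical labelled examples: (features, class label).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 8.0), "high"),
    ((9.0, 7.5), "high"),
]

def classify(point):
    # Assign the label of the closest labelled example (1-nearest neighbour).
    _, label = min(training, key=lambda item: math.dist(item[0], point))
    return label

print(classify((2.0, 1.5)))
print(classify((7.0, 9.0)))
```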
Clustering
 Clustering is the division of information into groups of connected objects.
 Clustering analysis is a data mining technique for identifying similar data.
 This technique helps to recognize the differences and similarities between data items. Clustering is very similar to classification, but it groups chunks of data together based on their similarities rather than assigning them to predefined classes.
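One common way to group similar data is k-means. The sketch below is a simplified one-dimensional version in plain Python. The function `kmeans_1d`, the sample values, and the starting centers are hypothetical, and fixing the starting centers keeps the result repeatable.

```python
def kmeans_1d(values, centers, rounds=10):
    # Repeatedly assign each value to its nearest center, then move each
    # center to the mean of the values assigned to it.
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(values, centers=[1.0, 12.0])
print(centers)
print(groups)
```

No class labels are supplied. The two groups emerge only from how close the values are to each other.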
Regression
 Regression analysis is a data mining process used to identify and analyze the relationship between variables in the presence of other factors.
 It is used to estimate the likely value of a specific variable. Regression is primarily a form of planning and modeling.
 For example, we might use it to project certain costs, depending on other factors such as availability, consumer demand, and competition. Primarily, it estimates the relationship between two or more variables in the given data set.
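A minimal sketch of simple linear regression by ordinary least squares, in plain Python. The `demand` and `cost` figures are made up so that cost is exactly 2 × demand + 5, and `fit_line` is a hypothetical helper. The fit recovers the slope and intercept, which can then be used to project a cost.

```python
def fit_line(xs, ys):
    # Ordinary least squares for one predictor: y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: cost = 2 * demand + 5.
demand = [10, 20, 30, 40]
cost = [25, 45, 65, 85]

slope, intercept = fit_line(demand, cost)
projected = slope * 50 + intercept  # projected cost at a demand of 50
print(slope, intercept, projected)
```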
Outlier Detection
 This data mining technique relates to the observation of data items in a data set that do not match an expected pattern or expected behavior. It is also known as outlier analysis or outlier mining. An outlier is a data point that diverges too much from the rest of the data set. Most real-world data sets contain outliers. Outlier detection plays a significant role in the data mining field and is valuable in numerous domains, such as network intrusion detection, credit or debit card fraud detection, and detecting outlying values in wireless sensor network data.
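One simple way to flag such points is a z-score rule: mark values that lie more than a chosen number of standard deviations from the mean. The sketch below uses Python's statistics module. The function `find_outliers`, the threshold of 2.0, and the sample amounts are assumptions for illustration, and the function expects data that is not all identical.

```python
import statistics

def find_outliers(values, threshold=2.0):
    # Flag values whose distance from the mean exceeds
    # `threshold` population standard deviations.
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) / spread > threshold]

# Hypothetical card transaction amounts with one suspicious value.
amounts = [50, 52, 49, 51, 48, 50, 500]
outliers = find_outliers(amounts)
print(outliers)
```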
Sequential Patterns
The sequential pattern technique is specialized for evaluating sequential data to discover sequential patterns. It consists of finding interesting subsequences in a set of sequences, where the value of a sequence can be measured by criteria such as length and occurrence frequency.
In other words, this data mining technique helps to discover or recognize similar patterns in transaction data over a period of time.
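A deliberately restricted sketch in plain Python: it only counts consecutive pairs of items and keeps those that occur in at least `min_support` sequences. Full sequential pattern mining algorithms handle longer and non-adjacent subsequences. The purchase sequences and the function `frequent_pairs` are hypothetical.

```python
from collections import Counter

# Hypothetical purchase sequences, one per customer, in time order.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["laptop", "mouse"],
    ["phone", "case", "earbuds"],
]

def frequent_pairs(sequences, min_support):
    support = Counter()
    for seq in sequences:
        # Count each consecutive pair at most once per sequence.
        support.update(set(zip(seq, seq[1:])))
    return {pair: n for pair, n in support.items() if n >= min_support}

patterns = frequent_pairs(sequences, min_support=3)
print(patterns)
```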
Prediction
 Prediction uses a combination of other data mining techniques, such as trends, clustering, and classification. It analyzes past events or instances in the right sequence to predict a future event.
Association Rules
 This data mining technique helps to discover a link between two or more items. It finds hidden patterns in the data set.
 Association rules are if-then statements that help show the probability of interactions between data items within large data sets in different types of databases. Association rule mining has several applications and is commonly used to find sales correlations in transactional data or in medical data sets.
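The strength of an if-then rule is usually expressed with support and confidence. The sketch below computes both in plain Python for the rule "if bread then butter". The baskets and the helpers `support` and `confidence` are hypothetical.

```python
# Hypothetical market baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

def support(items):
    # Fraction of baskets that contain every item in `items`.
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    # Of the baskets containing the antecedent, the fraction that also
    # contain the consequent.
    return support(antecedent | consequent) / support(antecedent)

rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here bread and butter appear together in 2 of 4 baskets, and 2 of the 3 baskets with bread also contain butter.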
Thank You
