
Lecture 5

Database Technology (DB), Data Warehouse (DWH) and Data Mining (DM)
Topics to cover
• File concept – Traditional method
• Database Concept – Modern method
• History of Database
• Database Architecture
• Exercise on FS and DBS
File concept – traditional approach
Materials Management Function: 3 different files and systems
• Purchase system, Stock accounting and Quality accounting each keep their own file
• Held in all three files: Item code, Item Name, Qty. received, Qty. rejected, Qty. accepted, Goods received Note No., Goods returned Qty.
• Held in two of the files: PO No., Supplier code, Supplier name
• Held in one file only: Delivery Schedule
Resulting problems
• Data redundancy and inconsistency: repetition of common data results in repeated storage of the common data on 3 systems
• Difficulty in access to the data: finding the delivery schedule is difficult for a person working on Quality because of the separate file system
• Concurrent access anomalies (difference): concurrent access to files may not ensure that the information is current
• Security problems: security implementation becomes difficult due to the different programs developed at different times for each system
• Integrity: integrity rules are written when the programmes are written, so it becomes difficult to incorporate changes or new integrity rules
File concept – Traditional method
• Separate systems for each function
• Separate file for each system
• Data redundancy and
inconsistency
• Difficulty in access to the data
• Concurrent access anomalies
• Security problems
• Integrity of the data
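The redundancy and inconsistency problem can be illustrated with a minimal sketch. It assumes each system keeps its own copy of the common item data; the item code, item name and quantities are hypothetical values, not taken from the lecture.

```python
# Hypothetical records: each system stores its own copy of the common item data.
purchase_file = {"I001": {"item_name": "Cement", "qty_received": 100}}
stock_file = {"I001": {"item_name": "Cement", "qty_received": 100}}
quality_file = {"I001": {"item_name": "Cement", "qty_received": 100}}


def rename_item(file, item_code, new_name):
    """Update the item name in ONE system's file only."""
    file[item_code]["item_name"] = new_name


# The purchase program corrects the name; the other two programs are not told.
rename_item(purchase_file, "I001", "Portland Cement")

# Collect the name each system now reports for the same item code.
names = {
    f["I001"]["item_name"]
    for f in (purchase_file, stock_file, quality_file)
}
inconsistent = len(names) > 1
print(inconsistent)
```

After the update, the three files disagree about the same item, which is the inconsistency that a single shared database is meant to avoid.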
Database Concept
• Definition: A Database is a collection of records
stored in a computer in a systematic way, so that a
computer program can consult it to answer
questions. For better retrieval and sorting, each
record is usually organized as a set of data elements
(facts). The items retrieved in answer to queries
become information that can be used to make
decisions. The computer program used to manage
and query a database is known as a database
management system (DBMS).
• A database is a structured body of related
information. The software used to manage and
manipulate that structured information is called a
DBMS (Database Management System). A database
is one component of a DBMS.
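As a minimal sketch of a program consulting a database to answer a question, the example below uses Python's built-in sqlite3 module as the DBMS. The table name, column names and rows are assumptions made for illustration only.

```python
import sqlite3

# An in-memory SQLite database; SQLite plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")

# Each record is organized as a set of data elements (columns).
conn.execute(
    "CREATE TABLE item ("
    "item_code TEXT PRIMARY KEY, item_name TEXT, qty_received INTEGER)"
)
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)",
    [("I001", "Cement", 100), ("I002", "Steel", 40)],  # hypothetical rows
)

# The query is the "question"; the rows returned are the information.
rows = conn.execute(
    "SELECT item_name FROM item WHERE qty_received > ? ORDER BY item_name",
    (50,),
).fetchall()
print(rows)
```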
Database for Materials Management
• Database: Item code, Item name, supplier code, supplier name, P.O. No., Rate, Delivery Schedule, Quantity received, GRR No., Qty. returned, GRN No., Qty. rejected, Type of rejection, Reason for rejection, Rating
• Purchase Managers View: Quantity accounting by P.O. and supplier, Purchase analysis
• Stores Managers View: Stock statement, stock ledger, Fast moving and slow moving, Valuation of stock and analysis
• Quality Control Managers View: Supplier quality rating, statement of reliability index, Type and reason analysis of rejection
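One way to give each manager a separate view over a single shared set of data is to define SQL views. The sketch below assumes a single hypothetical receipt table with a handful of the fields named above. The view names, sample rows and the rejection percentage formula are illustrative assumptions, not the lecture's design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE receipt (
    item_code TEXT, supplier_name TEXT, po_no TEXT,
    qty_received INTEGER, qty_rejected INTEGER
);
-- Hypothetical sample data.
INSERT INTO receipt VALUES
    ('I001', 'Supplier A', 'PO-1', 100, 5),
    ('I002', 'Supplier A', 'PO-1', 50, 0),
    ('I001', 'Supplier B', 'PO-2', 80, 20);

-- Purchase manager: quantity accounting by P.O. and supplier.
CREATE VIEW purchase_view AS
    SELECT po_no, supplier_name, SUM(qty_received) AS qty_received
    FROM receipt GROUP BY po_no, supplier_name;

-- Quality control manager: share of received quantity that was rejected.
CREATE VIEW quality_view AS
    SELECT supplier_name,
           SUM(qty_rejected) * 100.0 / SUM(qty_received) AS pct_rejected
    FROM receipt GROUP BY supplier_name;
""")

purchase_rows = conn.execute(
    "SELECT * FROM purchase_view ORDER BY po_no"
).fetchall()
quality_rows = conn.execute(
    "SELECT * FROM quality_view ORDER BY supplier_name"
).fetchall()
print(purchase_rows)
print(quality_rows)
```

Both views read the same stored rows, so a correction made once is visible to every manager.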
Database Terminology
• Entities and Relationships
• Tables
• Columns or Attributes
• Rows, Records, Tuples
Entities and Relationships
• Entities are the things in the real world that we will store information
about in the database.
• Examples
– Information about employees and the department they work in. Here
employee and department are entities
– Information about equipment or manpower working on a
project.
– A company doing a construction project for a client.
• Relationships are the links between these entities.
• Examples
– One employee works for one department
– Many employees work for one department.
– One piece of equipment works on one project or many projects
– One company doing many projects
– One project for one client.
• Types of Relationships
– One to one
– One to many
– Many to many
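These relationship types are commonly represented in a relational schema as follows: a one-to-many link becomes a foreign key on the "many" side, and a many-to-many link becomes a separate linking table. The sketch below assumes hypothetical table names and rows based on the employee/department and equipment/project examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One-to-many: many employees work for one department.
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY, name TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);

-- Many-to-many: equipment can work on many projects and vice versa,
-- so the pairs are stored in a linking table.
CREATE TABLE equipment (equip_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project (project_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE equipment_project (
    equip_id INTEGER REFERENCES equipment(equip_id),
    project_id INTEGER REFERENCES project(project_id),
    PRIMARY KEY (equip_id, project_id)
);

-- Hypothetical sample data.
INSERT INTO department VALUES (1, 'Planning');
INSERT INTO employee VALUES (1, 'Asha', 1), (2, 'Ravi', 1);
INSERT INTO equipment VALUES (1, 'Excavator'), (2, 'Crane');
INSERT INTO project VALUES (1, 'Highway'), (2, 'Flyover');
INSERT INTO equipment_project VALUES (1, 1), (1, 2), (2, 2);
""")

# How many employees belong to department 1?
staff = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE dept_id = 1"
).fetchone()[0]

# Which projects does equipment 1 work on?
projects_for_excavator = [
    row[0]
    for row in conn.execute(
        "SELECT p.name FROM project p "
        "JOIN equipment_project ep ON ep.project_id = p.project_id "
        "WHERE ep.equip_id = 1 ORDER BY p.name"
    )
]
print(staff, projects_for_excavator)
```

A one-to-one relationship could be sketched the same way by adding a UNIQUE constraint to the foreign key column.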
Corporate level Construction Project Management of a client company like NHAI
NHAI
– NH1: Contractor 1; Location 1
– NH2: Contractor 2, Contractor 3; Location 2
– NH3: Contractor 4, Contractor 5; Location 3
Entities
1. Project
2. Contractor
3. Location
Relationships
1. One project will have one location: one-to-one
2. Many contractors work on one project: many-to-one
3. Many project managers work on many projects: many-to-many
Project Level Construction Database Entity and Relationships for a Contractor
Project
– Client
– Sub Contractor 1, Sub Contractor 2
– Equipment 1, Equipment 2
– Locations 1, Locations 2
– Task 1, Task 2
Entities
1. Project
2. Client
3. Sub Contractor
4. Equipment
5. Locations/Sub Projects
6. Task
Relationship
1. One project, one client: one-to-one
2. One project, many sub contractors: one-to-many
3. One project, many locations or sub projects: one-to-many
4. Many pieces of equipment work on many tasks: many-to-many
Tables; Columns or Attributes; Rows, Records, Tuples

Project Table
The headings across the top are the columns or attributes; each line below them is a row, record or tuple.

Project ID   Project Name       Client       Location      Project Cost   Project Duration
Pune01       Highway Widening   NHAI         Pune-Satara   25             12 months
Mub01        Flyover            Pune Corp.   Mumbai        14             8 months
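The Project Table can be sketched in Python's built-in sqlite3 module to show how the terms map onto a real table. The column identifiers are assumed spellings of the slide's headings, and the cost is stored as a plain number because the slide does not give its unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project ("
    "project_id TEXT PRIMARY KEY, project_name TEXT, client TEXT, "
    "location TEXT, project_cost INTEGER, project_duration TEXT)"
)
# The two rows shown on the slide (cost unit not stated there).
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Pune01", "Highway Widening", "NHAI", "Pune-Satara", 25, "12 months"),
        ("Mub01", "Flyover", "Pune Corp.", "Mumbai", 14, "8 months"),
    ],
)

cur = conn.execute("SELECT * FROM project ORDER BY project_id")
columns = [d[0] for d in cur.description]  # columns or attributes
rows = cur.fetchall()                      # rows, records or tuples
print(columns)
for row in rows:
    print(row)
```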


Examples of Database
• 4D
• Adabas
• Adaptive Server Enterprise
• Apache Derby
• Corel Paradox
• Dataflex
• Dataphor
• DB2
• FileMaker
• Firebird
• Helix database
• HSQLDB
• Cloudscape
• Information Management System
• Informix
• Ingres
• InterSystems Caché
• Kx
• Microsoft Access
• Microsoft SQL Server
• MySQL
• Netezza
• OpenOffice.org Base
• Oracle
• OpenLink Virtuoso
• PostgreSQL
• Progress
• Rel (DBMS)
• SQLite
• SQL Anywhere Studio
• Teradata
• VistaDB

Source: http://en.wikipedia.org/wiki/Database
Data Warehouse and Data Mining

Prof. Vijaya Desai


From DB based Operational system to
Analytical Data warehouse…
Reasons
• The processing load of reporting reduced the response
time of the operational systems,
• The database designs of operational systems were not
optimised for information analysis and reporting,
• Most organisations had more than one operational system,
so company-wide reporting could not be supported from a
single system, and
• Development of reports in operational systems often
required writing specific computer programs, which was
slow and expensive
Data Warehousing Architecture
• Sources: External Data Sources, Operational Databases
• Extract, Clean, Transform, Load, Refresh
• Data Warehouse, with Metadata repository
• Serves: OLAP, Visualisation, Data Mining
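The Extract, Clean, Transform and Load steps in the architecture can be sketched as a small pipeline. The function names, the sample rows and the specific cleaning rule (dropping rows with a missing quantity) are assumptions for illustration; the Refresh step, which would rerun the pipeline periodically, is not shown.

```python
def extract():
    """Pull raw rows from the sources (hypothetical operational data)."""
    return [
        {"item": " cement ", "qty": "100", "source": "purchase"},
        {"item": "Steel", "qty": "40", "source": "stock"},
        {"item": "Steel", "qty": None, "source": "stock"},  # dirty row
    ]


def clean(rows):
    """Drop rows whose quantity is missing."""
    return [r for r in rows if r["qty"] is not None]


def transform(rows):
    """Normalise item names and convert quantities to integers."""
    return [
        {
            "item": r["item"].strip().title(),
            "qty": int(r["qty"]),
            "source": r["source"],
        }
        for r in rows
    ]


def load(warehouse, rows):
    """Append the prepared rows to the warehouse store."""
    warehouse.extend(rows)
    return warehouse


warehouse = load([], transform(clean(extract())))
print(warehouse)
```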
Centralised data warehouse

Federated data warehouse


Tiered data warehouse
The star structure
[Figure: star schema with a central Facts table linked to Project, Region, Environment and Resource tables. Attributes shown include Type, Cost, Duration, Client, Nation, District, Revenue, Expenses, Equipment and Manpower.]
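A star structure is usually implemented as one central fact table whose rows point to surrounding dimension tables. The sketch below is an assumed, simplified layout with only two dimensions (project and region). The table names, column names and figures are hypothetical and do not reproduce the slide's exact design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (assumed columns).
CREATE TABLE dim_project (project_id INTEGER PRIMARY KEY, project_type TEXT);
CREATE TABLE dim_region (
    region_id INTEGER PRIMARY KEY, nation TEXT, district TEXT
);

-- Central fact table: one row per project/region combination.
CREATE TABLE facts (
    project_id INTEGER REFERENCES dim_project(project_id),
    region_id INTEGER REFERENCES dim_region(region_id),
    cost REAL
);

-- Hypothetical sample data.
INSERT INTO dim_project VALUES (1, 'Highway'), (2, 'Flyover');
INSERT INTO dim_region VALUES (1, 'India', 'Pune'), (2, 'India', 'Mumbai');
INSERT INTO facts VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 14.0);
""")

# Analytical query: total cost per project type, joining fact to dimension.
cost_by_type = conn.execute("""
    SELECT p.project_type, SUM(f.cost)
    FROM facts f
    JOIN dim_project p ON p.project_id = f.project_id
    GROUP BY p.project_type
    ORDER BY p.project_type
""").fetchall()
print(cost_by_type)
```

Analysis queries join the fact table to whichever dimensions they need, which keeps each query to a small number of simple joins.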
Multidimensional Database Model
• SALES: dimensions Customer, Store, Time, Product
• FINANCE: dimensions Store, Time, Product
The data is found at the intersection of dimensions.
Multidimensional Database Model
• Cubes: LABOUR and Prod. Output
• Dimensions shown: Type of Labor, Soil Condition, Time, Equipment, Project
The data is found at the intersection of dimensions.
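The idea that data sits at the intersection of dimensions can be sketched with a dictionary keyed by one value per dimension. The example uses the Customer, Store, Time and Product dimensions of the SALES cube; the keys, amounts and the roll_up helper are hypothetical.

```python
from collections import defaultdict

# Key: (customer, store, time, product); value: sales amount (hypothetical).
sales = {
    ("C1", "Store A", "2024-Q1", "Wine"): 120,
    ("C2", "Store A", "2024-Q1", "Pasta"): 30,
    ("C1", "Store B", "2024-Q2", "Wine"): 80,
}

# One cell of the cube: the intersection of four dimension values.
cell = sales[("C1", "Store A", "2024-Q1", "Wine")]


def roll_up(cube, axis):
    """Sum the cube's values along every dimension except the given one."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)


by_store = roll_up(sales, 1)  # axis 1 is the Store dimension
print(cell, by_store)
```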


Data Mining
Data mining functions
• Associations
– 85 percent of customers who buy a certain brand of wine also buy a certain
type of pasta
• Sequential patterns
– 32 percent of female customers who order a red jacket within six months
buy a gray skirt
• Classifying
– Frequent customers are those with incomes about $50,000 and having two
or more children
• Clustering
– Market segmentation
• Predicting
– Predict the revenue value of a new customer based on that customer's
personal demographic variables
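An association such as "customers who buy wine also buy pasta" is typically measured by its confidence: the share of baskets containing the first item that also contain the second. The sketch below computes that figure on a hypothetical list of baskets; the data and the function name are assumptions and do not reproduce the 85 percent figure quoted above.

```python
# Hypothetical shopping baskets, one set of items per customer visit.
baskets = [
    {"wine", "pasta"},
    {"wine", "pasta", "bread"},
    {"wine"},
    {"pasta"},
    {"wine", "pasta"},
]


def confidence(baskets, antecedent, consequent):
    """Fraction of baskets with `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


wine_to_pasta = confidence(baskets, "wine", "pasta")
print(wine_to_pasta)
```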
Advantages and Disadvantages
Advantages
• Enhances end-user access to a wide variety of data.
• Increases data consistency.
• Increases productivity and decreases computing costs.
• Is able to combine data from different sources, in one place.
• It provides an infrastructure that could support changes to data and
replication of the changed data back into the operational systems.

Disadvantages or Concerns in using a data warehouse


• Extracting, cleaning and loading data could be time consuming.
• Data warehousing project scope might increase.
• Problems with compatibility with systems already in place, e.g.
transaction processing systems.
• Providing training to end-users, who end up not using the data
warehouse.
• Security could develop into a serious issue, especially if the data
warehouse is web accessible.
• A data warehouse is a HIGH maintenance system.
Case study on DW and DM
Why Data Warehouse
• DBMSs widely used to maintain transactional data of TPSs.
• Supports OLTP for update, add, edit kind of operations of current data
at a very low level of analysis
• Unable to support Complex Analysis of Historical Information
• Reports were developed on request
• Reports provided little analysis capability
• Information to support day-to-day service
• Data stored at transaction level
• Database design: Normalized
• Attempts to use these data for analysis, exploration, identification of
trends, etc. have led to Decision Support Systems, which are rarely
enough
• Trend towards Data Warehousing
• Data Warehousing – consolidation of data from several databases
which are in turn maintained by individual business units along with
historical and summary information
