
Chapter 5

Managing Data
Difficulties in Managing Data
- Amount of data increases exponentially:
According to the annual survey of the global digital output by International Data Corporation, the total
amount of global data was expected to pass 1.2 zettabytes.

- Data are scattered and collected by many individuals using various methods and devices:
These data are frequently stored in numerous servers and locations and in different computing systems, databases,
formats, and human and computer languages.

- Data come from many sources:


Internal sources (e.g., corporate databases and company documents)
Personal sources (e.g., personal thoughts, opinions, and experiences)
External sources (e.g., commercial databases, government reports, and corporate Web sites)
Downloaded from the Web (in the form of clickstream data).
Clickstream data are those data that visitors and customers produce when they visit a Web site.
Clickstream data provide a trail of the users’ activities in the Web site, including user behavior and browsing
patterns.
Example: Credit card swipes/ RFID tags/ Digital video/ E-mails/ Radiology scans/ Blogs

- Data security, quality and integrity are critical:


They are easily jeopardized.

- Information systems that do not communicate with each other can result in inconsistent data

- Data degrade over time:


Examples: customers move to new addresses/ employees are hired and fired.

- Data rot:
Problems with media on which the data are stored.
Over time, temperature, humidity, and exposure to light can cause physical problems with storage media and thus
make it difficult to access the data. The second aspect of data rot is that finding the machines needed to access the
data can be difficult.
Data Governance
Data Governance: an approach to managing data across an entire organization.
Formal sets of policies that are designed to ensure that the data are collected, handled and protected in a
certain, well-defined fashion.
The objective is to make information available, transparent, and useful for the people who are authorized to access it,
from the moment it enters an organization until it is outdated and deleted.

- Master data management:


is a process that spans all organizational business processes and applications. It provides companies with the ability to
store, maintain, exchange, and synchronize a consistent, accurate, and timely “single version of the truth” for the
company’s master data.
http://www.ncsi.gov.om/

The most important data in the organization that will be used by all the departments.
Example: consider a telecommunications company such as Omantel. Its most important data, and the data most likely
to be stored by most departments, are customer data. Customer data can be used by the marketing department, by
human resources, and by other departments involved in projects or in finding new ways of reaching customers.
Once you identify your master data, you can focus on them: you can make the way those data are collected and
maintained uniform across all departments, and so ensure that your most important data are taken care of.

Master data:
A set of core data [customer, employee, vendor, geographic location] that span all enterprise information
systems.

Transaction data: (produced every day)


Data that are generated and captured by operational systems, describe the business’s activities, or transactions.
In contrast, master data are applied to multiple transactions and are used to categorize, aggregate, and evaluate the
transaction data.
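
The distinction can be illustrated with a minimal Python sketch. The customer records, regions, and amounts below are hypothetical values invented for illustration. The master data are stored once, and each transaction refers to them by customer ID, so the transactions can be aggregated by a master-data attribute.

```python
from collections import defaultdict

# Master data: one core record per customer (hypothetical values).
customers = {
    "C1": {"name": "Customer A", "region": "Muscat"},
    "C2": {"name": "Customer B", "region": "Sohar"},
}

# Transaction data: generated every day by operational systems (hypothetical values).
transactions = [
    {"customer_id": "C1", "amount": 20.0},
    {"customer_id": "C2", "amount": 5.0},
    {"customer_id": "C1", "amount": 12.5},
]


def total_by_region(customers, transactions):
    """Aggregate transaction amounts by the region stored in the master data."""
    totals = defaultdict(float)
    for t in transactions:
        region = customers[t["customer_id"]]["region"]
        totals[region] += t["amount"]
    return dict(totals)


region_totals = total_by_region(customers, transactions)
print(region_totals)
```

Because the region is stored only in the master record, correcting it there changes how every related transaction is categorized.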
The Database Approach
Database management system (DBMS) provides all users with access to all the data.
- DBMSs minimize the following problems:
 Data redundancy: The same data are stored in many places.
 Data isolation: Applications cannot access data associated with other applications.
 Data inconsistency: Various copies of the data do not agree.

- DBMSs maximize the following:


 Data Security: keeping the organization’s data safe from theft, modification, and/or destruction.
 Data integrity: Data must meet constraints (e.g., student grade point averages cannot be negative).
 Data independence: Applications and data are independent of one another. Applications and data are not linked
to each other, meaning that applications are able to access the same data.
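
As a rough illustration of data integrity, the sketch below uses SQLite (through Python's standard sqlite3 module) as a stand-in for a DBMS. The table name, column names, and sample rows are assumptions made for this example. The CHECK constraint expresses the rule that a grade point average cannot be negative, and the DBMS rejects a row that violates it.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " gpa REAL CHECK (gpa >= 0)"  # integrity rule: GPA cannot be negative
    ")"
)
conn.execute("INSERT INTO student VALUES (1, 'Student A', 3.2)")


def try_insert(conn, row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(conn, (2, "Student B", 2.9))
rejected = try_insert(conn, (3, "Student C", -1.0))
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(accepted, rejected, row_count)
```

The rule lives in the database rather than in each application, so every application that writes to the table is held to the same constraint.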

Data Hierarchy

Bit: a binary digit, or a “0” or a “1” - The smallest unit of data a computer can handle.
Byte: eight bits and represents a single character (e.g., a letter, number, or symbol)
Field: a group of related characters (e.g., student’s name, age, mobile number)
Record: a group of logically related fields (e.g., student in a university database)

File (or table): a group of related records


Database: a group of related files.
[Figure: data hierarchy of a university database]
Database: University
File: Student; Faculty
Record: John Jones, MIS; Kelly Rainer, Professor
Field: Name: John Jones, Major: MIS; Name: Kelly Rainer, Position: Professor
Byte: J (1001010), M (1001101); K (1001011), P (1010000)
Bit: 0, 1
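
The bit patterns shown for the letters J, M, K, and P can be reproduced with a short Python sketch. It assumes ASCII character codes. The patterns in the figure are seven bits long; stored in an eight-bit byte, each one gains a leading 0. The function names are chosen for this example.

```python
def to_bits(ch, width=7):
    """Return the binary pattern of a character's ASCII code, padded to `width` bits."""
    return format(ord(ch), "0{}b".format(width))


def from_bits(bits):
    """Convert a string of 0s and 1s back into the character it encodes."""
    return chr(int(bits, 2))


patterns = {ch: to_bits(ch) for ch in "JMKP"}
as_byte = to_bits("J", width=8)  # the same character stored in a full eight-bit byte

print(patterns)
print(as_byte, from_bits(as_byte))
```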

Designing the Database


Data model: a diagram that represents the entities in the database and their relationships.

 Entity: a person, place, thing, or event about which information is maintained. [A record generally describes an
entity]
 Attribute: a particular characteristic of a particular entity.
 Primary key (Key field): a field that uniquely identifies a record, so that it can be retrieved and updated.
 Secondary Key is another field that has some identifying information but typically does not identify the record with
complete accuracy.
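
A minimal sketch of these terms in Python, using three of the student records that appear in the student table in these notes. Each dictionary is a record describing one entity, each dictionary key is an attribute, the student ID serves as the primary key, and the name serves as a secondary key. The variable and function names are assumptions made for this example.

```python
# Each record describes one entity (a student); each dictionary key is an attribute.
students = [
    {"student_id": 1234, "name": "Ahmed Al-Harthi", "address": "Sur"},
    {"student_id": 1235, "name": "Amal Al-Hatmi", "address": "Rustaq"},
    {"student_id": 1237, "name": "Ahmed Al-Harthi", "address": "Muscat"},
]

# Primary key index: each student_id maps to exactly one record.
by_id = {s["student_id"]: s for s in students}


def find_by_name(records, name):
    """Secondary key lookup: the name narrows the search but may match several records."""
    return [s for s in records if s["name"] == name]


one = by_id[1235]
several = find_by_name(students, "Ahmed Al-Harthi")
print(one["address"], len(several))
```

Two students share the name Ahmed Al-Harthi, so the name alone cannot identify a record with complete accuracy, while the student ID can.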
Entity-Relationship Modeling

Database designers plan and create the database through a process called entity-relationship (ER) modeling.
ER diagrams consist of entities, attributes, and relationships. [illustrating relationships between database
entities]
 Entity classes: groups of entities of a certain type
 Instance (record): the representation of a particular entity
 Identifiers (Attribute): attributes that are unique to that entity instance.

 One-to-One [1:1]
 One-to-Many [1:M]
 Many-to-Many [M:M]
Entity-Relationship Diagram

Database Management Systems


Database management system (DBMS): software that provides users with tools to add, delete, access, and
analyze data stored in one location.
Examples:
 Microsoft Access
 Oracle

Relational database model: based on the concept of two-dimensional tables.

Data Warehouses and Data Marts


Data warehouse: a repository of current and historical data to support decision makers in the organization.
- Organized by business dimension or subject (for example: by customer, product, price, and region)
- Consistent
- Historical: can be used for identifying trends, forecasting, and making comparisons over time.
- Multidimensional: A Data Cube
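
The idea of a data cube can be sketched in Python as a set of facts that are summed along chosen business dimensions. The products, regions, years, and amounts below are hypothetical, and the roll_up function is an illustrative helper, not part of any warehouse product.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, year, amount).
facts = [
    ("Nuts", "East", 2023, 50),
    ("Nuts", "West", 2023, 60),
    ("Bolts", "East", 2023, 40),
    ("Nuts", "East", 2024, 70),
    ("Bolts", "West", 2024, 90),
]

DIMENSIONS = ("product", "region", "year")


def roll_up(facts, *dims):
    """Sum the amounts, grouping by the chosen dimensions of the cube."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[i] for i in positions)
        totals[key] += fact[3]
    return dict(totals)


by_product = roll_up(facts, "product")
by_region_year = roll_up(facts, "region", "year")
print(by_product)
print(by_region_year)
```

Grouping by year supports the historical comparisons mentioned above, and grouping by product or region gives the subject-oriented views.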

[Figures: Relational Databases; Multidimensional Database]
Benefits of Data Warehousing:

 End users can access data quickly and easily via Web browsers because they are located in one place.

 End users can conduct extensive analysis with data in ways that may not have been possible before.

 End users have a consolidated view of organizational data.

These benefits can improve business knowledge/ provide competitive advantage/ enhance customer service and
satisfaction/ facilitate decision making/ and streamline business processes.

Problems with Data Warehousing:


 Very expensive to build and to maintain [around R.O. 400,000].

 Incorporating data from obsolete (old) mainframe systems can be difficult and expensive.

 People in one department may be reluctant to share data with another department.

Data Marts:
Data mart: a small data warehouse, designed for the end-user needs in a strategic business unit (SBU) or a
department.
Example: a marketing and sales data mart to deal with customer information.
Advantages:
 Far less costly than a data warehouse (around R.O. 40,000)

 Can be implemented more quickly (around 3 months)

 More rapid response and easier to learn and navigate.

Knowledge Management
Knowledge: information that is contextual, relevant, and actionable.
Another term for knowledge: Intellectual capital (or intellectual assets)

Explicit knowledge: codified (documented) in a form that can be distributed to others.


Example: (CEPS student’s handbook)
Tacit knowledge: a set of insights, expertise, and skills.
Knowledge that people carry in their heads but that is difficult to write down in a document.
Example: our own opinion/views, experience

Best Practices: the most effective and efficient ways of doing things.

Knowledge management (KM): a process of accumulating and creating knowledge efficiently, so that it can
be applied effectively throughout the organization.
KM is not a technology. It is a process supported by IS.

“Knowledge management involves efficiently connecting those who know with those who need to know
and converting personal knowledge into organizational knowledge.” (Peter Drucker)

Knowledge Management System (KMS)


KMS: the use of information technologies to systematize, enhance, and expedite intrafirm and interfirm
knowledge management and knowledge sharing.

Organizations can realize many benefits with KMSs:


They make best practices readily available to employees, which leads to:
- improved overall organizational performance,
- improved customer service,
- more efficient product development,
- improved employee morale and retention.
Implementing effective KMSs presents several challenges:
- Employees must be willing to share their personal tacit knowledge
- The organization must continually maintain and upgrade its knowledge base.
- Companies must be willing to invest in the resources needed to carry out these operations.

KMS Cycle:

1. Create knowledge. Knowledge is created as people determine new ways of doing things or develop know-
how. Sometimes external knowledge is brought in.

2. Capture knowledge. New knowledge must be identified as valuable and be represented in a reasonable
way.

3. Refine knowledge. New knowledge must be placed in context so that it is actionable. This is where tacit
qualities (human insights) must be captured along with explicit facts.

4. Store knowledge. Useful knowledge must then be stored in a reasonable format in a knowledge repository
so that others in the organization can access it.

5. Manage knowledge. Like a library, the knowledge must be kept current. It must be reviewed regularly to
verify that it is relevant and accurate.

6. Disseminate knowledge. Knowledge must be made available in a useful format to anyone in the
organization who needs it, anywhere and anytime.

Entity-Relationship Modeling
Database designers plan and create the database through a process called entity-relationship (ER) modeling.
ER diagrams consist of entities, attributes, and relationships. [illustrating relationships between database
entities].
Entities:
An entity is an object or concept about which you want to store information.
Entity classes: groups of entities of a certain type.

Attributes:
An attribute describes the property of an entity.
There are four types of attributes:
1. Key attribute
2. Composite attribute
3. Multivalued attribute
4. Derived attribute

1- Key attribute: A key attribute can uniquely identify an entity from an entity set.

Key attributes have values that cannot be repeated.


2- Composite attribute: An attribute that is a combination of other attributes is known as composite.

3- Multivalued attribute: An attribute that can hold multiple values is known as multivalued attribute.

Example: a person can have more than one phone number. So, the phone number attribute is considered
multivalued.

4- Derived attribute: A derived attribute is one whose value is dynamic and derived from another attribute.

What does dynamic mean? A person's age, for example, is a derived attribute because it changes over time
and can be derived from another attribute, a date (the date of birth).
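
The four attribute types can be sketched with Python dataclasses. The Person and Address classes, their field names, and the sample values are assumptions made for this example. The address is used as a typical composite attribute, and age is computed from the date of birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: a combination of other attributes."""
    street: str
    city: str


@dataclass
class Person:
    person_id: int                  # key attribute: unique for each person
    address: Address                # composite attribute
    date_of_birth: date             # stored attribute from which age is derived
    phone_numbers: list = field(default_factory=list)  # multivalued attribute

    def age(self, today):
        """Derived attribute: computed from date_of_birth, so it changes over time."""
        years = today.year - self.date_of_birth.year
        birthday_not_reached = (today.month, today.day) < (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years - 1 if birthday_not_reached else years


# Hypothetical sample values.
p = Person(1, Address("Street 1", "Muscat"), date(2000, 6, 15), ["111", "222"])
print(p.age(date(2024, 6, 14)), p.age(date(2024, 6, 15)), len(p.phone_numbers))
```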

Relationships
A relationship shows how two entities share information in the database.
Three types of Relationships:
1. One to One
2. One to Many
3. Many to Many
1- One-to-one (1-1): One entity from entity set X can be associated with at most one entity of entity set Y
and vice versa.

2- One-to-Many (1-M): One entity from entity set X can be associated with multiple entities of entity set Y,
but an entity from entity set Y can be associated with at most one entity of entity set X.

3- Many-to-Many (M-M): One entity from X can be associated with more than one entity from Y and vice
versa.
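
The three definitions can be checked against sample data with a small Python sketch. The cardinality function is a hypothetical helper written for this example. It only classifies the associations it is given, whereas in a real design the relationship type is a rule decided by the designer. The first sample uses the department and student IDs from the tables in these notes; the other samples are invented.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify (x, y) associations as '1:1', '1:M', 'M:1', or 'M:M'."""
    ys_of_x = defaultdict(set)
    xs_of_y = defaultdict(set)
    for x, y in pairs:
        ys_of_x[x].add(y)
        xs_of_y[y].add(x)
    x_has_many = any(len(ys) > 1 for ys in ys_of_x.values())
    y_has_many = any(len(xs) > 1 for xs in xs_of_y.values())
    if x_has_many and y_has_many:
        return "M:M"
    if x_has_many:
        return "1:M"
    if y_has_many:
        return "M:1"
    return "1:1"


# (department ID, student ID): one department has many students.
dept_student = [(1, 1234), (2, 1235), (1, 1236), (3, 1237)]
# Hypothetical (student, course) enrollments: many on both sides.
student_course = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]
# Hypothetical (person, passport) pairs: one on both sides.
person_passport = [("p1", "pass1"), ("p2", "pass2")]

print(cardinality(dept_student), cardinality(student_course), cardinality(person_passport))
```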
Designing the Database

Primary key (Key field): a field that uniquely identifies a record, so that it can be retrieved and updated.

- A field (or key) represents a column in a table.


- The values in the primary key column are unique and cannot be repeated.
- Each table should have a key field.
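
A DBMS can enforce the uniqueness of a primary key. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in for a DBMS; the table and column names are assumptions made for this example. A second row with the same key value is rejected, and the key retrieves exactly one record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department ("
    " dept_id INTEGER PRIMARY KEY,"  # key field: values must be unique
    " dept_name TEXT NOT NULL"
    ")"
)
conn.execute("INSERT INTO department VALUES (1, 'INFS')")

# Inserting a second row with the same primary key value is refused.
try:
    conn.execute("INSERT INTO department VALUES (1, 'ACCT')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The primary key retrieves exactly one record.
row = conn.execute(
    "SELECT dept_name FROM department WHERE dept_id = ?", (1,)
).fetchone()
print(duplicate_rejected, row)
```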

Foreign Key
A field in one table that uniquely identifies a row (record) of another table. It is used to establish and
enforce a link between two tables.

Department table:

DeptID  DeptName  DeptHOD     NumofFaculty
1       INFS      Dr. Zahran  12
2       ACCT      Dr. Fatma   10
3       MRKT      Dr. Saif    9

Student table (DeptID is the foreign key):

StudentID  StudentName      StudentAddress  GPA  DeptID
1234       Ahmed Al-Harthi  Sur             3.2  1
1235       Amal Al-Hatmi    Rustaq          2.9  2
1236       Saif Al-Hashmi   Sohar           3.5  1
1237       Ahmed Al-Harthi  Muscat          2.8  3
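
The two tables above can be loaded into SQLite to show what the foreign key does. This is a sketch using Python's standard sqlite3 module. The lower-case table and column names are assumptions chosen for this example, and the rejected row with department 9 is invented. SQLite enforces foreign keys only after the PRAGMA shown below is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    dept_name TEXT,
    dept_hod TEXT,
    num_faculty INTEGER
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    student_name TEXT,
    address TEXT,
    gpa REAL,
    dept_id INTEGER REFERENCES department(dept_id)  -- foreign key
);
""")
conn.executemany("INSERT INTO department VALUES (?, ?, ?, ?)", [
    (1, "INFS", "Dr. Zahran", 12),
    (2, "ACCT", "Dr. Fatma", 10),
    (3, "MRKT", "Dr. Saif", 9),
])
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", [
    (1234, "Ahmed Al-Harthi", "Sur", 3.2, 1),
    (1235, "Amal Al-Hatmi", "Rustaq", 2.9, 2),
    (1236, "Saif Al-Hashmi", "Sohar", 3.5, 1),
    (1237, "Ahmed Al-Harthi", "Muscat", 2.8, 3),
])

# The foreign key links each student row to its department row.
rows = conn.execute(
    "SELECT s.student_name, d.dept_name FROM student s "
    "JOIN department d ON s.dept_id = d.dept_id ORDER BY s.student_id"
).fetchall()

# A student pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (1238, 'Student X', 'Nizwa', 3.0, 9)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows)
print(orphan_rejected)
```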

A foreign key is the primary key of another table that has a 1:M relationship with this table. In the example,
DeptID is the primary key of the department table and a foreign key in the student table.
For every M:M relationship, a new table has to be created.
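
A sketch of the new table required by an M:M relationship, again using SQLite through Python's sqlite3 module. The course table, the enrollment table, and all sample rows are hypothetical. Each row of the enrollment table holds one foreign key to each side, which turns the M:M relationship into two 1:M relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT);
-- New table for the M:M relationship: one row per (student, course) pair.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id TEXT REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1234, 'Ahmed'), (1235, 'Amal');
INSERT INTO course VALUES ('C1', 'Course One'), ('C2', 'Course Two');
INSERT INTO enrollment VALUES (1234, 'C1'), (1234, 'C2'), (1235, 'C1');
""")

# One student can take many courses.
courses_of_1234 = [r[0] for r in conn.execute(
    "SELECT course_id FROM enrollment WHERE student_id = 1234 ORDER BY course_id"
)]

# One course can have many students.
students_in_c1 = [r[0] for r in conn.execute(
    "SELECT student_id FROM enrollment WHERE course_id = 'C1' ORDER BY student_id"
)]

print(courses_of_1234, students_in_c1)
```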

See the Summary on page 140.

GOOD LUCK!
Pray for my success.
