EFFECTIVE DATA MANAGEMENT – A TOOL FOR SUSTAINING
NATION’S ECONOMY AND ERADICATING CORRUPTION
BY
PROFESSOR E. ROTIMI ADAGUNODO
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
OBAFEMI AWOLOWO UNIVERSITY, ILE-IFE, NIGERIA

A Paper presented at a 2-Day Seminar organized by the Institute of
Data Processing and Management (IDPM), Nigeria
Presentation Plan
 Pleasantries
 Introduction
 What is Data?/Related Terms & Concepts
 What Data Management is about
 Techniques/Tools of Data Management
 Evolution of Data
 Big Data Analytics/Types of Big Data Analytics/Survey of Benefits
 Trends in Data Processing
 Classes of Digital Computers
 Data Processing Modes
Presentation Plan Cont’d
 Classes/Categories of Digital Computers
 Trends in DP (Networks)
 Some Benefits of Cloud Computing
 Best Practices in Data Management
 Benefits of Effective Data Management
INTRODUCTION

 Data is a veritable resource to every individual or corporate body.
It usually represents the state of things: the presence or absence of
data about any event or situation says a great deal about that
situation. Good quality data can lead to good and useful decisions,
while bad data can lead to bad decisions. Effective data management is
therefore necessary to achieve good and desirable results.
 Effective data management involves the deployment of best
practices in all spheres of data processing and storage to achieve
the ultimate goal which is high quality data.
 Using best practices through all stages of working with data will
ensure the accessibility and longevity of data.
What is Data?

 Data is the plural form of the Latin word ‘datum’. It is a factual
entity (such as text, numbers, sounds, images or graphics
(multimedia)) presented in a form that can be processed by the
computer.
 In everyday usage, ‘data’ has come to represent both the singular
and the plural forms.
Some Associated or Related Terms

 Permit me to quickly mention a few related terms or concepts:
 Information – a collection of processed data: data that have been organized,
systematized and presented so that the underlying patterns or trends become
clear. It is useful for decision making. For example, the temperature,
humidity and other reports from hundreds of weather stations are data. A
computer simulation report that shows how these data predict a strong
possibility of tornadoes is information.
 Database – a collection of related information about a subject, organized in a
useful manner that provides a base or foundation for procedures such as
retrieving information, drawing conclusions and making decisions. The term
is also used loosely for the application (more precisely, the database
management system) that provides the tools for data retrieval, modification,
deletion and insertion: for example, Access, MySQL and Oracle. Such
applications can also create a database and produce reports.
Some Associated Terms Cont’d
 Information Systems (IS) – Information System is
defined as a purposefully designed system that
brings data, computers, procedures and people
together to manage the information that is
important to an organization’s or nation’s mission.
What is Data Management?

 Data Management is a group of activities relating to the
planning, development, implementation and administration of
systems for the acquisition, storage, security, retrieval,
dissemination, archiving and disposal of data. Such systems
are commonly digital, but the term applies equally to
paper-based systems, where the term records management is
commonly used. The term embraces all forms of data,
whether these datasets are simple paper forms, the contents of
relational databases, multimedia datasets such as images, or
scientific data such as seismic records of the Nigerian land
mass. The management of geographic data is in many ways.
Techniques or Tools of Data Management: Data Management Structures

 Flat File Database: This does not possess any elaborate
structure. It is a single file, without sub-directories or
grouping of files; everything is done in a single table. It is
not usually used for commercial or large applications.
 Relational Database Management (RDBM): In this
approach, data is stored in 2-dimensional tables of
columns and rows, which can be related if the tables
have a common column or field, for example column
‘E’ in the tables below. This approach is employed by
MS-Access and some other database management systems.
Tables
Table 1 (rows 1 to 7; cells are blank in the slide)
S/No   A   B   C   D   E
Table 2 (rows 1 to 7; cells are blank in the slide)
S/No   F   G   H   E
More on Techniques & Tools
 The term “relational” actually means the capability
of this type of database software to relate two or
more tables on the basis of a common field and to
construct a new, third table based on the relation,
e.g. a bookstore’s database showing names, order
numbers and subjects.
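The bookstore example above can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration: the table names, column names and sample rows are assumptions made up for the sketch, not part of the original presentation. Two tables share the common field customer_id, and the query relates them to produce a third, derived table of names, order numbers and subjects.

```python
import sqlite3

# In-memory database; all table names, column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    subject     TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bola');
INSERT INTO orders VALUES
    (101, 1, 'Databases'),
    (102, 2, 'Networks'),
    (103, 1, 'Statistics');
""")

# Relate the two tables on the common field to build a third, derived table.
rows = conn.execute("""
    SELECT c.name, o.order_no, o.subject
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_no
""").fetchall()
conn.close()

for name, order_no, subject in rows:
    print(name, order_no, subject)
```

Each customer's name is stored once in the customers table, however many orders that customer places; the join reconstructs the combined view only when it is needed.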
More on Techniques & Tools
 Relational Database Management System
(RDBMS) – A relational database application,
especially one that comes with all the necessary
support programs, programming tools and
documentation needed to create, install and
maintain custom database applications. This is the
most popular type of database system commercially.
There are others, such as Network and Distributed
Databases.
Evolution of Data
Year              Analog Content          Digital Content        Remarks
Pre 1986 - 1986   2.86 exabytes (99.8%)   0.02 exabytes (0.2%)   Total Global Data = 2.62 exabytes
1993              97%                     3%
2000              75%                     25%
2002              50%                     50%                    The Advent of Digital Age
2007              19 exabytes (6%)        280 exabytes (94%)
2015              2%                      98%
Big Data Analytics

 The possibilities of big data continue to evolve rapidly,
driven by innovation in the underlying technologies,
platforms and analytic capabilities for handling data, as
well as by the changing behavior of users as more and more
individuals live digital lives. The exponential growth of
data can be traced to the evolution from the information age
(information societies) to knowledge societies. The
contributing trends include growth in traditional
transactional databases, continued expansion of multimedia
content, increasing popularity of social media, and the
proliferation of sensor applications in the Internet of
Things, the Internet of Minds and wearable devices.
Definition of Big Data Analytics
 The Data Warehousing Institute (TDWI) defined big
data analytics as the application of advanced analytic
techniques to very big data sets.

 Big data analytics is a set of technologies and
techniques that require new forms of integration to
disclose large hidden values from datasets that are
different from the usual ones, more complex, and of
an enormous scale.
A Survey on Benefits of Big Data Analytics
 TDWI conducted a survey in 2011 to assess the gains that would arise if an
organization implemented some form of big data analytics. It drew 1,635 responses
from 325 respondents (about 5 responses per respondent). Top of the list was
better-targeted social-influencer marketing (61%), followed by customer-base
segmentation (41%) and recognition of sales and market opportunities (38%).
Because recent economic changes worldwide have changed consumer behaviours,
definition of churn and other customer behaviours ranked next (35%), as did
understanding of consumer behaviour from click streams (27%). Other benefits,
such as customer loyalty, service experience optimization, healthcare delivery
optimization, and supplier performance based on cost and quality, ranked low at 4%.
 Furthermore, big data analytics appeals to businesses by offering savings on
three essential resources of any business, namely time, money and people: a
reduction in data processing time translates into saving money and using
fewer resources to present the data for better decisions.
Some of the gains derived by organizations that implemented some form of big data analytics
Keys to the survey
 BTSI - better-targeted social-influencer marketing
 BI - accurate business insight
 CC - customer base segmentation
 SMO - sales and market opportunities
 RTP - automated decisions for real-time processes
 CB - customer behaviors
 DOF - detection of fraud
 ROI - return on investment for big data
 QR - quantification of risks
 MS - trending for market sentiments
 BC - understanding of business change
 BPF - better planning and forecasting
 RCC - identification of root causes of cost
 CBC - understanding customer behavior from click streams
 MYI - manufacturing yield improvements
Types of Big Data Analytics: Prescriptive Analytics

 This type of analytics helps to decide what actions
should be taken, or suggests what to do. It is very
valuable but not yet widely used. It focuses on
answering specific questions; in hospital management,
for example, analysis of cancer or diabetes patients
can determine where to focus treatment.
 Prescriptive analytics can identify optimal solutions,
often for the allocation of scarce resources, to give
the best course of action for any pre-specified
outcome.
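As a minimal sketch of "allocation of scarce resources", the Python fragment below tries every combination of candidate programmes and recommends the one with the highest expected benefit that still fits a fixed budget. The programme names, costs and benefit scores are invented for illustration, and exhaustive search is used only because the example is tiny; real prescriptive tools rely on proper optimization solvers.

```python
from itertools import combinations

# Hypothetical programmes: (name, cost, expected benefit). All figures are invented.
programmes = [
    ("screening", 4, 10),
    ("outreach", 3, 7),
    ("equipment", 5, 12),
    ("training", 2, 4),
]
budget = 9  # the scarce resource to be allocated


def best_allocation(options, limit):
    """Return the affordable subset of options with the highest total benefit."""
    best_value, best_names = 0, ()
    for size in range(1, len(options) + 1):
        for subset in combinations(options, size):
            cost = sum(c for _, c, _ in subset)
            value = sum(v for _, _, v in subset)
            if cost <= limit and value > best_value:
                best_value = value
                best_names = tuple(name for name, _, _ in subset)
    return best_names, best_value


chosen, total_benefit = best_allocation(programmes, budget)
print(chosen, total_benefit)
```

With these invented figures the recommendation is to fund screening and equipment, which uses the whole budget of 9 for a total benefit of 22.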
Predictive Analytics
 This type of analytics helps to predict what might happen in the
future. It uses a variety of statistical, modeling, data mining and
machine learning techniques to study recent and historical data,
thereby allowing analysts to make predictions about the future. As
Wu argued in InformationWeek, predictive analytics can only forecast
what might happen in the future based on the data at hand; its
results are probabilistic in nature. It also covers what-if
(if-then-else) scenarios and risk assessment. For example, some
companies use predictive analytics to take decisions on sales,
marketing and production, and Clinical Decision Support uses it to
determine which patients are at risk of developing certain
conditions such as diabetes, asthma and other lifelong illnesses.
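One of the simplest statistical techniques of this kind is a straight-line trend fitted by least squares and extended one step ahead. The sketch below uses only plain Python; the monthly sales figures are invented and deliberately regular, whereas real data would be noisy and the forecast correspondingly uncertain.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: sales recorded for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept  # extend the trend to the next month
print(forecast_month_6)
```

The forecast is only as good as the assumption that the past trend continues, which is why such results are treated as probabilities rather than certainties.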
Exploratory or discovery analytics
 It refers to finding relationships in big data that
were not previously known. It is sometimes treated as
another name for predictive analytics, in which big
data creates additional opportunities for insight, and
it is especially important for firms with voluminous
amounts of customer data.
Descriptive analytics
 It describes what has happened and uses that to
anticipate the near future. Descriptive analytics is
backward looking: it reveals what has occurred, which
can then inform predictions. An example is market
analysis to predict the demand for a particular
product or service.
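A small Python sketch of such a backward-looking summary is given below. The sales records are invented, and the choice of measures (count, total, mean, median, best-selling product) is only one reasonable assumption about what a market analyst would want to see.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sales records: (product, amount). All figures are invented.
sales = [("rice", 120), ("beans", 80), ("rice", 100), ("yam", 60), ("rice", 140)]

amounts = [amount for _, amount in sales]
summary = {
    "transactions": len(sales),
    "total": sum(amounts),
    "mean": mean(amounts),
    "median": median(amounts),
    # Product that appears in the largest number of transactions.
    "top_product": Counter(product for product, _ in sales).most_common(1)[0][0],
}
print(summary)
```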
DATA PROCESSING MODES
 The modes of processing data have developed
through batch processing mode to distributed
processing, then real-time processing mode, online
processing mode and network processing/Internet
mode.
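The difference between the batch and real-time modes can be illustrated with a short Python sketch. The function names and the sample readings are hypothetical: in batch mode all records are collected first and processed in one run, while in a real-time style the result is updated as each record arrives.

```python
def process_batch(readings):
    """Batch mode: process the complete set of records in a single run."""
    return sum(readings) / len(readings)


def process_stream(readings):
    """Real-time style: yield an updated average after every new record."""
    count, total = 0, 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


# Hypothetical sensor readings.
readings = [20.0, 22.0, 24.0]

batch_result = process_batch(readings)
running_results = list(process_stream(readings))
print(batch_result, running_results)
```

Both modes arrive at the same final average; what differs is when an answer becomes available.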
Classes/Categories of Digital Computers

S/NO  CLASS OF COMPUTER                  PHYSICAL DESCRIPTION/EXAMPLES
1     Super Computers                    Very Large; CDC
2     Mainframe Computers                Large; IBM 360/370
3     Mini Computers                     Fairly Large; VAX, ICL, PDP
4     Micro/Personal/Desktop Computers   Small/Mobile; IBM and Compatibles
5     Laptops/Palmtops                   Very Small/More Mobile; DELL, HP, TOSHIBA, LENOVO
6     Personal Digital Assistants        Very Very Small/Micro Devices; HUAWEI, NOKIA,
      (PDAs)/Cell Phones/Wrist           INFINIX, TECNO, SAMSUNG, etc.
      Watches/Embedded Systems
TRENDS IN DATA PROCESSING TECHNOLOGY (Digital Computers)
TRENDS IN DATA PROCESSING TECHNOLOGY (Computer Networks)
 The advent of computer networking and digital communications extended
the horizon of data processing into the realm of data communications:
within a small geographical radius (LAN), a city (MAN) or a region (WAN),
limited to a particular establishment or institution (INTRANET), or across
the globe (INTERNET). It has since moved into CLOUD and GRID computing.

 The word “cloud” originates from the Virtual Private Network (VPN)
services for data communications. The National Institute of Standards and
Technology (NIST) explains that “cloud computing is a model for enabling
ubiquitous, convenient, on-demand network access to a shared pool of
configurable computing resources (e.g., networks, servers, storage,
applications, and services) that can be rapidly provisioned and released with
minimal management effort or service provider interaction”.
 GRID Computing pools the resources of many networked computers to
deliver massive computing power.
Some Benefits of Cloud Computing

 Large enterprise data no longer needs to be stored on desktops and
portable systems; it can be stored on servers and accessed through the
Internet.
 Organizations do not have to spend heavily on acquiring elaborate
hardware and software infrastructure for data processing.
 Cloud computing provides better utilization of a shared pool of
configurable, distributed computing resources (e.g. networks, servers,
storage devices, applications and services) that can be rapidly
provisioned and released with minimal management effort or service
provider interaction, and accessed remotely through the Internet.
 Organizations with a presence on the Cloud can leverage data from
other sources available through the Cloud for decision making.
Grid Computing
Plugging Your Device into the Cloud!
The Entire Cloud at Your Fingertips!
BEST PRACTICES IN DATA MANAGEMENT

1. Data Management Plan (DMP) – Developing a data
management plan (DMP) at the beginning of a new
project will inform good practice throughout the
data cycle. The following practices are
fundamental to effective data management and can
be applied to all disciplines.
2. Data Storage – Data must be stored on an
appropriate storage medium (hardware), in a correct
format that follows an open standard rather than a
proprietary one.
Best Practices Cont’d
3. Data Documentation – Record the date of collection
and the collection methods, and use established
metadata appropriate to the field or subject of the
data.
4. Ethical Issues – Address issues such as data privacy,
data redaction before depositing in a public archive
or repository, data on human subjects, responsible
conduct of research, and ownership of Intellectual
Property Rights.
5. Sharing Data – Store metadata in readme files for
others to access and use.
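Practices 3 and 5 can be combined in a small Python sketch that writes a plain-text readme file next to a dataset. The metadata fields, the dataset title, the contact address and the function name write_readme are illustrative assumptions; in practice the fields should follow the metadata standard of the relevant discipline. A temporary folder is used so that the example leaves no files behind.

```python
from datetime import date
from pathlib import Path
from tempfile import TemporaryDirectory


def write_readme(folder, metadata):
    """Write one 'key: value' line per metadata field to README.txt in folder."""
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    path = Path(folder) / "README.txt"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Hypothetical metadata for an imaginary dataset.
metadata = {
    "title": "Household survey (hypothetical dataset)",
    "date_collected": date(2016, 8, 25).isoformat(),
    "collection_method": "structured questionnaire",
    "file_format": "CSV (open standard)",
    "contact": "data.manager@example.org",
}

with TemporaryDirectory() as folder:
    readme_text = write_readme(folder, metadata).read_text(encoding="utf-8")

print(readme_text)
```

A plain-text readme is itself in an open format, so it stays readable even if the software that produced the data is no longer available.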
 Data Governance Paradigm – Data quality
management has shifted from occasional data
cleansing to an ongoing cycle of data quality,
created by incorporating governance plans: a
continuous improvement process embraced at all
levels of the organization.
 Data governance is everybody’s business.
BENEFITS OF EFFECTIVE DATA MANAGEMENT

 Maintaining clean and comprehensive data in all organizations and sectors of the
nation will lead to a trusted and reliable system.
 When every citizen of the country is able to use the computer, it enhances the quality of
living, boosts the national economy and makes life easier. UNESCO has redefined
literacy as the ability of a person to read, write and compute (use computers). Computer
literacy will enhance the skill and knowledge of the citizenry about good quality data,
about keeping good records such as family data, and about how to carry out basic data
processing (word processing, statistical analysis) and some advanced data processing
such as forecasting, simulation and modeling.
 Well-designed databases reduce data duplication and inconsistencies.
 Organizations (SSEs, SMEs, LSEs, Educational, Social & Religious) should
embrace and cultivate a technology culture by integrating computers and
computer-related devices into the generation and management of data in their
establishments, for example through information systems on employees, students,
membership and customers, management systems, and simulation and modeling systems.
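How a database guards against duplication can be sketched with Python's standard sqlite3 module. The staff table, its columns and the sample records are hypothetical. Declaring the staff number as the primary key makes the database itself reject a second record with the same number, instead of relying on every clerk to notice the duplicate.

```python
import sqlite3

# In-memory database; the table, columns and records are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO staff VALUES ('S001', 'Ada Obi')")

try:
    # Second record with the same staff number, e.g. entered twice by mistake.
    conn.execute("INSERT INTO staff VALUES ('S001', 'A. Obi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

staff_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
conn.close()

print(duplicate_rejected, staff_count)
```

The same mechanism applies to any record that must be unique, such as an employee on a payroll or a student on a register.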
Benefits of Effective Data Management Cont’d
 At Governance Levels, the three tiers of government (Local, State and
Federal) can incorporate modern data management tools and technologies,
such as computers and digital networks, to facilitate the work of
governance. Examples include databases on the resources available and
produced in each geographic location, staff databases, computerized
payroll systems and computerized infrastructure systems.
 National and Sectoral deployment of Digital Communication
Networks (Cloud, Grid, IoT and Big Data Analytics): Digital
communication networks for the management of different sectors of the
economy, to be coordinated by a National Digital Backbone for security
and surveillance, education, production, agriculture, music and
entertainment, and transportation, to mention a few.

 Effective Data Management achieves Engineering Economy, which is the
trend in national economy, rather than a resource-based economy. Effective
Data Management is Knowledge.
Conclusion
 Without doubt, high quality data is imperative for
sustaining the national economy and eradicating corruption
at different levels. In this presentation, we have
examined some fundamental concepts about data, its
evolution, and the different processing tools, techniques
and modes. We have also discussed the nature and benefits
of Big Data Analytics. The role of data communications
was highlighted, leading to a discussion of Cloud
Computing. Best practices in data management and the
benefits of effective data management, among others,
were also discussed.
REFERENCES/RESOURCES

 Boston College Libraries (2016), Data Management: Best Practices in Data Management. Trustees of
Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467. Obtained from libguides.bc.edu
on 25 August 2016.

 Inter-Governmental Group on Geographic Information (IGGI) (2005), The Principles of Good Data
Management, 2e. Office of the Deputy Prime Minister: London.

 QuinStreet Enterprise (2016), 5 Best Practices for Data Quality Management. Obtained from
enterpriseappstoday.com/data-management

 Richard Brittain (2015), Effective Data Management, Research Computing Workshop, Dartmouth
College. Obtained from Dartmouth.edu on 30 August 2016.

 Veerle Van den Eynden, Louise Corti, Matthew Woollard, Libby Bishop and Laurence Horton (2011),
Managing and Sharing Data: Best Practice for Researchers. UK Data Archive, University of
Essex. ISBN: 1-904059-78-3.

 World Health Organization (WHO) (2015), Guidance on Good Data and Record Management Practices.
Working Document QAS/15.624.
THANK YOU FOR YOUR
ATTENTION &
GOD BLESS YOU
