
TYPES OF DIGITAL DATA

OUTLINE

 1. Introduction
 2. Structured Data
 3. Unstructured Data
 4. Semi-Structured Data
 5. Difference between semi-structured and structured data

TYPES OF DIGITAL DATA- Dr.J.Shanthini


Introduction:

 Data growth has accelerated exponentially since the advent of the computer and the internet.
 Definition: digital data is data stored in digital format, for example as a picture, document or video. It is not physical but is held in digital form.
 Digital data can be classified into three forms:
 1. Unstructured Data
 2. Semi-Structured Data
 3. Structured Data

Sources of structured data
• Databases, e.g. Access
• Spreadsheets
• SQL
• OLTP systems

Characteristics of structured data
• Conforms to a data model
• Data is stored in the form of rows and columns
• Similar entities are grouped
• Attributes in the group are the same
• Data resides in fixed fields within a record or a file
• Definition, format and meaning of data are explicitly known
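
The characteristics above can be illustrated with a small relational table. This is only a sketch using Python's built-in sqlite3 module; the table name, column names and sample rows are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# The schema acts as the data model: every row has the same typed attributes.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL NOT NULL
    )
    """
)

# Similar entities (employees) are grouped in one table, one row per entity.
rows = [
    (1, "Asha", "Sales", 52000.0),
    (2, "Ravi", "Finance", 61000.0),
    (3, "Meena", "Sales", 58000.0),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)

# Column names and declared types are explicitly known from the schema.
columns = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(employee)")]
print(columns)

# A record that leaves out required fields does not fit the model and is rejected.
try:
    conn.execute("INSERT INTO employee (emp_id, name) VALUES (4, 'Kiran')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("non-conforming row rejected:", rejected)
```

The rejected insert shows what "conforms to a data model" means in practice: the structure is enforced when data is written, not inferred afterwards.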

Ease with structured data
• Storage
• Scalability
• Security
• Update and delete
Sources of structured data

Structured data comes from databases such as Access, from OLTP systems and SQL, and from spreadsheets such as Excel; all of these hold data in a structured format.
To summarize, structured data:
 Consists of fully described data sets.
 Has clearly defined categories and sub-categories.
 Is placed neatly in rows and columns.
 Goes into records, so the database is regulated by a well-defined structure.
 Can be indexed easily, either by the DBMS itself or manually.

Advantages of structured data (Easy to work with structured data)
It is easy to work with structured data. The advantages are:
Storage: Both predefined and user-defined data types help with the storage of structured data.
Scalability: Scalability is generally not an issue as the volume of data increases.
Security: Ensuring security is easy.
Update and Delete: Updating, deleting, etc. is easy because of the structured form.
Transaction Properties: ACID (atomicity, consistency, isolation, durability).
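
As a rough illustration of the transaction properties, the sketch below shows atomicity with Python's sqlite3 module: a transfer either applies both updates or neither. The account table, the balances and the transfer helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was rolled back too.
        return False


ok = transfer(conn, 1, 2, 30)       # valid transfer
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

The second transfer credits account 2 before the debit fails, yet the final balances show no trace of that credit, which is the atomicity part of ACID.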

Hassle free structured data
Ease with structured data:
• Retrieving information
• Indexing and searching
• Mining data
• BI operations

Hassle Free Retrieval

• Retrieval of structured data is hassle free. The features are as follows:
• Retrieving information: A well-defined structure helps in easy retrieval of data.
• Indexing and searching: Data can be indexed based not only on a text string but also on other attributes. This enables streamlined search.
• Mining data: Structured data can be easily mined and knowledge can be extracted from it.
• BI operations: BI (business intelligence) works extremely well with structured data. Hence data mining, warehousing, etc. can be easily undertaken.
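
The points above can be sketched with sqlite3: an index on a non-key attribute, a lookup by that attribute, and a simple aggregate of the kind a BI report might use. The sale table, its columns and the figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale ("
    "sale_id INTEGER PRIMARY KEY, region TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sale (region, product, amount) VALUES (?, ?, ?)",
    [
        ("North", "pen", 10.0),
        ("South", "pen", 12.5),
        ("North", "book", 40.0),
        ("South", "book", 35.0),
        ("North", "pen", 7.5),
    ],
)

# Indexing: the index is built on an attribute (region), not on free text.
conn.execute("CREATE INDEX idx_sale_region ON sale (region)")

# Retrieving information: the known structure makes the query straightforward.
north = conn.execute(
    "SELECT product, amount FROM sale WHERE region = ? ORDER BY sale_id",
    ("North",),
).fetchall()

# BI-style operation: total sales per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()

print(north)
print(totals)
```

Because every row has the same named attributes, filtering and aggregation need no parsing step; this is the main reason BI tools work well with structured data.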

UNSTRUCTURED DATA

 Unstructured data cannot be stored in the form of rows and columns as in a database and does not conform to any data model, i.e. it is difficult to determine the meaning of the data.

 It does not follow any rules and can be of any type, and thus it is unpredictable.

CHARACTERISTICS OF UNSTRUCTURED DATA

SOURCES OF UNSTRUCTURED DATA
Web pages, Memos, Videos (MPEG, etc.), Images (JPEG,
GIF, etc.), body of an email, Word document, PowerPoint
presentation, Chats, Reports, White papers, Surveys etc.

Where does unstructured data come from?

Anything in a non-database form is unstructured data. It can be divided into two broad categories:
 Bitmap objects: e.g. image, video or audio files.
 Textual objects: e.g. Microsoft Word documents, emails or MS Excel.
 A lot of unstructured data is also noisy text such as chats, emails and SMS texts.
MANAGING UNSTRUCTURED DATA

 INDEXING : Data is indexed to enable faster search and retrieval. An index is an identifier, defined on the basis of some value in the data, that represents a large record in the data set.
 Indexing unstructured data is difficult: text can be indexed on a text string, but for non-text files, e.g. audio/video, indexing depends on file names.
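
A minimal sketch of this idea is an inverted index: text documents are indexed on the words they contain, while a non-text file can only be indexed on its file name. The file names and contents below are hypothetical, and the tokenizer is deliberately naive.

```python
import re
from collections import defaultdict

# Hypothetical files: two text documents and one video with no extractable text.
documents = {
    "memo_01.txt": "Quarterly sales report for the north region",
    "chat_17.txt": "pls send the sales report asap",
    "clip_03.mp4": None,
}


def build_index(docs):
    """Map each lower-cased token to the set of files it appears in."""
    index = defaultdict(set)
    for name, text in docs.items():
        # For non-text files, only the file name is available for indexing.
        source = text if text is not None else name
        for token in re.findall(r"[a-z0-9]+", source.lower()):
            index[token].add(name)
    return index


index = build_index(documents)
hits_report = sorted(index["report"])  # found through the document text
hits_clip = sorted(index["clip"])      # found only through the file name
print(hits_report)
print(hits_clip)
```

The video is reachable only if its name happens to be descriptive, which is why indexing non-text data is considered difficult.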
 TAGS/METADATA : Using metadata, data in a document can be tagged. But in unstructured
 CLASSIFICATION/TAXONOMY : Taxonomy is classifying data on the basis of the relationships that exist between data. Data can be grouped and placed in hierarchies based on the taxonomy prevalent in a firm.
 But in the absence of any structure or metadata, identifying relationships between data is difficult: the data is unstructured and naming standards are not consistent across the firm, which makes it difficult to classify data.
 CAS (Content Addressable Storage) : It stores data based on their metadata. It assigns a unique name to every object stored in it.
 The object is retrieved based on its content and not its location.
 It is used to store emails etc.
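
The retrieval-by-content idea can be sketched as a toy in-memory store. Deriving the unique name from a SHA-256 digest of the content is an assumption made for this illustration; the slides do not say how the name is produced, and the ContentStore class is hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is derived from the content."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assumption: the unique name is the SHA-256 digest of the content.
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def __len__(self) -> int:
        return len(self._objects)


store = ContentStore()
key_a = store.put(b"Subject: minutes of meeting")
key_b = store.put(b"Subject: minutes of meeting")  # identical content
key_c = store.put(b"Subject: revised minutes")

print(key_a == key_b)  # same content gives the same name
print(len(store))      # the duplicate is stored only once
```

Since the name depends only on the content, an object is found without knowing where it is stored, and identical objects collapse into a single entry.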
CHALLENGES FACED WHILE STORING UNSTRUCTURED DATA


Semi Structured Data

E.g. data from XML, JSON objects

Does not conform to any data model
Has tags to classify the data
Tags impose a hierarchy
Data belonging to the same class may not have the same set of attributes or the same order of attributes
Getting to know semi-structured data
 Only about 10% of data in any organization is semi-structured.
 Still, it is important to understand, manage, and analyze this semi-structured data coming from heterogeneous sources.
 Semi-structured data does not conform to any data model. Also, this data cannot be stored in rows and columns as in a database.
 Semi-structured data has tags and markers which help group the data and describe how the data is stored. But they are not sufficient for management and automation of data.
 Similar entities are grouped and organized in a hierarchy. The attributes or the properties within a group may or may not be the same.
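
A short XML fragment illustrates these points: tags group the data and impose a hierarchy, yet two entities in the same group need not carry the same attributes. The sketch uses Python's xml.etree.ElementTree; the library/book document and its values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both entries are books, but their fields differ.
xml_text = """
<library>
  <book id="b1">
    <title>Data Basics</title>
    <author>A. Kumar</author>
    <year>2019</year>
  </book>
  <book id="b2">
    <title>Working with Logs</title>
    <publisher>Hypothetical Press</publisher>
  </book>
</library>
""".strip()

root = ET.fromstring(xml_text)

records = []
for book in root.findall("book"):
    # Child tags describe the data; nothing forces every book to have them all.
    record = {child.tag: child.text for child in book}
    record["id"] = book.get("id")
    records.append(record)

attribute_sets = [sorted(record) for record in records]
print(attribute_sets)
```

The tags say what each value is, but there is no schema stating which fields must exist or what type they hold, so the second book can simply omit author and year.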

Semi structured data
• Does not conform to a data model but contains tags and elements
• Cannot be stored in the rows and columns as in a database
• The tags and elements describe how the data is stored
• Not sufficient metadata
• Attributes in a group may not be the same
• Similar entities are grouped

 Email Standard format:


 To : <NAME>
 From : <NAME>
 Subject : <TEXT>
 CC : <NAME>
 Body : <TEXT,GRAPHICS,IMAGES,ETC>
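
The email format above mixes tagged fields with free-form content, which can be seen by parsing a message with Python's standard email package. The addresses and text below are hypothetical.

```python
from email import message_from_string

# Hypothetical message: named header fields, then a blank line, then the body.
raw = (
    "To: recipient@example.com\n"
    "From: sender@example.com\n"
    "Subject: Project update\n"
    "Cc: manager@example.com\n"
    "\n"
    "Hi, the draft is attached. Let me know what you think.\n"
)

msg = message_from_string(raw)

# The header fields are tagged, so they can be read by name.
headers = {name: msg[name] for name in ("To", "From", "Subject", "Cc")}

# The body carries no internal structure: it is returned as plain text.
body = msg.get_payload()

print(headers)
print(body)
```

The headers behave like attributes of a record, while the body remains unstructured text, which is why email is usually classed as semi-structured.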

Where does semi-structured data come from?
• Email
• XML
• TCP/IP packets
• Zipped files
• Binary executables
• Mark-up languages
• Integration of data from heterogeneous sources
 Characteristics of semi-structured data are summarized below:
 It is organized into semantic entities.
 Similar entities are grouped together.
 Entities in the same group may not have the same attributes.
 The order of attributes is not necessarily important.
 Not all attributes are always required.
 Size of the same attributes in a group may differ.
 Type of the same attributes in a group may differ.

(Semantic – relating to “meaning”, or arising from distinctions between the meanings of different words)
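
These characteristics can be checked on a small set of JSON objects. The contact records below are hypothetical; the sketch only inspects which attributes each entity has and what type they take.

```python
import json

# Hypothetical contacts: one group of entities with differing attributes,
# differing attribute order, and a differing type for "phone".
raw = """
[
  {"name": "Asha", "phone": "555-0101", "email": "asha@example.com"},
  {"email": "ravi@example.com", "name": "Ravi"},
  {"name": "Meena", "phone": 5550103, "address": {"city": "Chennai"}}
]
"""

contacts = json.loads(raw)

# Union of attributes seen across the group.
all_attributes = sorted({key for contact in contacts for key in contact})

# Not all attributes are required: some contacts have no phone.
missing_phone = [c["name"] for c in contacts if "phone" not in c]

# The type of the same attribute may differ between entities.
phone_types = sorted({type(c["phone"]).__name__ for c in contacts if "phone" in c})

print(all_attributes)
print(missing_phone)
print(phone_types)
```

Any program consuming such data has to check for missing attributes and varying types itself, because no schema guarantees them.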
