10^21 bytes = 1 zettabyte
That is, one billion terabytes form a zettabyte.
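The conversion can be checked with a short calculation. This is a minimal sketch assuming decimal (SI) units, where 1 terabyte = 10^12 bytes; the constant names are illustrative and not from the slides.

```python
# Assumption: decimal (SI) units, i.e. 1 TB = 10**12 bytes.
ZETTABYTE = 10**21  # bytes in one zettabyte
TERABYTE = 10**12   # bytes in one terabyte

# Integer division is exact here because both values are powers of ten.
terabytes_per_zettabyte = ZETTABYTE // TERABYTE

print(terabytes_per_zettabyte)  # number of terabytes in one zettabyte
```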
Unstructured data is data that does not conform to a data model or is not in
a form that a computer program can use easily. About 80 to 90% of an
organization's data is in this format; for example, memos, chat room
transcripts, PowerPoint presentations, images, videos, letters, research
reports, white papers, the body of an email, etc.
Unstructured data
Unstructured data comes from…
Store Unstructured data
Conti…
Extract information from Unstructured data
Conti…
Unstructured Data: Example
The output returned by ‘Google Search’
Types of Big Data: Semi-structured
Semi-structured data can contain both forms of data.
It can look structured, but its structure is not actually
defined by, e.g., a table definition in a relational DBMS.
An example of semi-structured data is data represented
in an XML file.
Conti…
Semi-structured data pertains to data containing both
of the formats mentioned above, that is, structured
and unstructured data.
To be precise, it refers to data that has not been
classified under a particular repository (database),
yet contains vital information or tags that segregate
individual elements within the data.
Semi-structured data
Semi-structured data come from…
Manage Semi-structured Data
Store Semi-structured Data
Conti…
Extract Information from Semi-structured data
Conti…
Semi-structured Data example
Personal data stored in an XML file
<rec><name>Anil Dubey</name><sex>Male</sex><age>33</age></rec>
<rec><name>Rohit Rastogi</name><sex>Male</sex><age>43</age></rec>
<rec><name>Sikha</name><sex>Female</sex><age>31</age></rec>
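The tags in these records are what make the data semi-structured: a program can pick out each field by name even though no table definition exists. The sketch below shows one possible way to read the records with Python's standard `xml.etree.ElementTree` module. The wrapping `<people>` root element and the `parse_records` helper are assumptions added for illustration, since the slide shows the records without an enclosing root and an XML parser needs one.

```python
import xml.etree.ElementTree as ET

# The three <rec> elements from the slide, copied verbatim.
records_xml = """
<rec><name>Anil Dubey</name><sex>Male</sex><age>33</age></rec>
<rec><name>Rohit Rastogi</name><sex>Male</sex><age>43</age></rec>
<rec><name>Sikha</name><sex>Female</sex><age>31</age></rec>
"""


def parse_records(fragment):
    """Return one dict per <rec>, keyed by the child tag names."""
    # Assumption: wrap the fragment in a hypothetical <people> root,
    # because well-formed XML requires a single top-level element.
    root = ET.fromstring("<people>" + fragment + "</people>")
    return [{child.tag: child.text for child in rec}
            for rec in root.findall("rec")]


people = parse_records(records_xml)
for person in people:
    # Values arrive as text; the age is converted explicitly.
    print(person["name"], person["sex"], int(person["age"]))
```

Nothing here depends on a fixed schema: a record with an extra or missing tag would still parse, and the resulting dictionary would simply have a different set of keys.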
THANK
YOU