
Practical No. 10
Aim: Case Study
Case Study Topic: Structured Data vs. Unstructured Data.

Theory:
ABSTRACT:
Structured data consists of clearly defined data types whose
regular pattern makes them easy to search. Unstructured data,
which is “everything else”, consists of data that is usually
harder to search, including formats such as audio, video, and
social media postings.
The phrase “unstructured data vs. structured data” does not
denote any real conflict between the two. Customers choose one or
the other based not on the data structure itself but on the
applications that use the data: relational databases for
structured data, and almost any other type of application for
unstructured data.
However, there is a growing tension between the ease of analyzing
structured data and the greater difficulty of analyzing
unstructured data. Structured data analytics is a mature process
and technology. Unstructured data analytics is a nascent industry
that attracts substantial new R&D investment but is not yet a
mature technology. For corporations, the structured vs.
unstructured data question is whether to invest in analytics for
unstructured data, and whether the two can be aggregated into
better business intelligence.
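
One simple way to picture aggregating the two kinds of data is to pull
pattern-based fields out of free text so they can sit beside structured
records. The sketch below is only an illustration: the sample sentence,
the two regular expressions, and the names DATE_PATTERN and PHONE_PATTERN
are assumptions made for this example, and real date and phone formats
vary widely.

```python
import re

# Hypothetical, deliberately narrow patterns (assumptions for this sketch).
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")    # e.g. 2024-03-15
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")   # e.g. 555-010-1234

# Unstructured input: a free-text note with no predefined fields.
text = (
    "Customer called on 2024-03-15 from 555-010-1234 "
    "to complain about a late delivery."
)

# Structured output: named fields that a database column could hold.
record = {
    "dates": DATE_PATTERN.findall(text),
    "phones": PHONE_PATTERN.findall(text),
}

print(record)
```

The complaint itself ("late delivery") is still free text after this step;
only the values that follow a fixed pattern become searchable fields.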
What is Unstructured Data?

Unstructured data is essentially everything else. It has internal
structure but is not organized by a predefined data model or
schema. It may be textual or non-textual, and human- or
machine-generated. It may also be stored in a non-relational
(NoSQL) database.
Typical human-generated unstructured data includes:
• Text files: Word processing, spreadsheets, presentations,
  email, logs.
• Email: Email has some internal structure thanks to its
  metadata, and we sometimes refer to it as semi-structured.
  However, its message field is unstructured and traditional
  analytics tools cannot parse it.
• Social Media: Data from Facebook, Twitter, LinkedIn.
• Website: YouTube, Instagram, photo sharing sites.
• Mobile data: Text messages, locations.
• Communications: Chat, IM, phone recordings, collaboration
  software.
• Media: MP3, digital photos, audio and video files.
• Business applications: MS Office documents, productivity
  applications.
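
The point about email being semi-structured can be sketched with Python's
standard email package: the headers parse into named fields, while the
message body comes back as a single free-text string. The message below is
invented for illustration, and the variable names are assumptions of this
sketch.

```python
from email import message_from_string

# A made-up message: structured headers, a blank line, then a free-text body.
raw = """From: alice@example.com
To: bob@example.com
Subject: Delivery delay

Hi Bob, the shipment is late again and the customer is unhappy.
"""

msg = message_from_string(raw)

# Metadata: each header is a named field that can be filtered or sorted on.
sender = msg["From"]
subject = msg["Subject"]

# Body: one block of text with no fields; its meaning must be inferred.
body = msg.get_payload()

print(sender, "|", subject)
print(body)
```

Filtering on sender or subject is straightforward, but deciding that the
body expresses a complaint needs text analytics rather than a field lookup.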

Typical machine-generated unstructured data includes:
• Satellite imagery: Weather data, land forms, military movements.
• Scientific data: Oil and gas exploration, space exploration, seismic
  imagery, atmospheric data.
• Digital surveillance: Surveillance photos and video.
• Sensor data: Traffic, weather, oceanographic sensors.

Structured vs. Unstructured Data: What’s the Difference?
Besides the obvious difference between storing in a relational
database and storing outside of one, the biggest difference is the
ease of analyzing structured data vs. unstructured data. Mature
analytics tools exist for structured data, but analytics tools for
mining unstructured data are nascent and developing.
Users can run simple content searches across textual unstructured
data. But its lack of orderly internal structure defeats
traditional data mining tools, and the enterprise gets little
value from potentially valuable data sources such as rich media,
network or web logs, customer interactions, and social media
data. Even though unstructured data analytics tools are in the
marketplace, no single vendor or toolset is a clear winner, and
many customers are reluctant to invest in analytics tools with
uncertain development roadmaps.
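
The contrast between querying structured data and searching unstructured
text can be sketched as follows. This is a minimal illustration using
Python's built-in sqlite3 module; the orders table, the customer names,
and the notes are all hypothetical.

```python
import sqlite3

# Structured: rows follow a predefined schema, so a query can target a
# typed column and aggregate over it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Asha", 120.0), ("Ravi", 80.5), ("Asha", 40.0)],
)
total_for_asha = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("Asha",)
).fetchone()[0]

# Unstructured: free text has no columns, so the simplest option is a
# content search for a keyword.
notes = [
    "Asha called to say the second parcel arrived damaged.",
    "Ravi asked about a refund for his last order.",
]
mentions_refund = [note for note in notes if "refund" in note.lower()]

print(total_for_asha)
print(mentions_refund)
conn.close()
```

The keyword search finds the note that contains the word "refund", but it
cannot tell that the first note is also a complaint, which is the kind of
gap that unstructured data analytics tools try to close.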
On top of this, there is simply much more unstructured data than
structured data. Unstructured data makes up 80% or more of
enterprise data and is growing at a rate of 55% to 65% per year.
Without the tools to analyze this mass of data, organizations are
leaving vast amounts of valuable information on the business
intelligence table.
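
To get a feel for the growth figures quoted above (55% and 65% per year),
the sketch below works out how quickly a data volume would double and how
large it would be after three years. It assumes the rate stays constant
and compounds annually, which the source does not state, and the starting
volume of 100 units is a hypothetical placeholder.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_volume(start: float, annual_growth_rate: float, years: int) -> float:
    """Volume after the given number of years of compound growth."""
    return start * (1 + annual_growth_rate) ** years


for rate in (0.55, 0.65):
    print(
        f"rate={rate:.0%}",
        f"doubling time={doubling_time_years(rate):.2f} years",
        f"volume after 3 years={projected_volume(100.0, rate, 3):.1f}",
    )
```

Under these assumptions the volume doubles in well under two years at
either rate.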

Structured Data                  Unstructured Data
-------------------------------  ----------------------------------
Predefined data model            No predefined data model
Usually text only                May be text, image, sound, video
Easy to search                   Difficult to search

Resides in:
Data warehouses                  NoSQL databases

Generated by:
Humans and machines              Humans and machines

Typical applications:
Airline reservation system       Word processing
Inventory control                Presentation software
CRM system                       Email client
ERP system                       Tools for viewing or editing media

Examples:
Dates                            Text files
Phone numbers                    Reports
Social Security numbers          Email messages
Credit card numbers              Audio files
Addresses                        Numbers

Conclusion:
Hence, I have studied the case study on structured data vs.
unstructured data.

Date of Performance    Date of Assessment    Signature    Remark
