Types of Data
Big Data Types
Structured Data
• Structured data comprises clearly defined data types whose pattern
makes them easily searchable.
• Structured data is the kind that is easy to define, store, and analyze.
• Structured data analytics is a mature process and technology.
• Structured data usually resides in relational databases (RDBMS).
• Fields store length-delineated data such as phone numbers, Social Security
numbers, or ZIP codes.
• Even text strings of variable length, like names, are contained in records,
making them simple to search.
• Data may be human- or machine-generated as long as the data is created
within an RDBMS structure.
• This format is eminently searchable, both with human-generated queries
and via algorithms, using field names and data types such as
alphabetic, numeric, currency, or date.
• Common relational database applications with structured data include
airline reservation systems, inventory control, sales transactions, and ATM
activity. Structured Query Language (SQL) enables queries on this type of
structured data within relational databases.
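As a minimal sketch of what querying structured data looks like, the example below uses Python's built-in sqlite3 module with an in-memory database. The `sales` table, its column names, the sample rows, and the `sales_in_zip` helper are all assumptions made for illustration; they do not come from any particular system.

```python
import sqlite3

# Hypothetical schema for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        id        INTEGER PRIMARY KEY,
        customer  TEXT NOT NULL,   -- variable-length text string
        zip_code  TEXT NOT NULL,   -- length-delineated field stored as text
        amount    REAL NOT NULL,   -- currency value
        sold_on   TEXT NOT NULL    -- date in ISO 8601 form
    )
    """
)

# Made-up sample records.
rows = [
    ("Alice", "10001", 19.99, "2024-01-05"),
    ("Bob", "94105", 250.00, "2024-01-07"),
    ("Carol", "10001", 75.50, "2024-02-11"),
]
conn.executemany(
    "INSERT INTO sales (customer, zip_code, amount, sold_on) VALUES (?, ?, ?, ?)",
    rows,
)


def sales_in_zip(connection, zip_code):
    """Return (customer, amount) pairs for one ZIP code, ordered by amount."""
    cursor = connection.execute(
        "SELECT customer, amount FROM sales WHERE zip_code = ? ORDER BY amount",
        (zip_code,),
    )
    return cursor.fetchall()


matches = sales_in_zip(conn, "10001")
print(matches)
```

Because every record has the same named, typed fields, the search reduces to a single declarative query on `zip_code`.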
Unstructured Data
• Unstructured data is essentially everything else. It has internal
structure but is not organized by a predefined data model or
schema.
• It may be textual or non-textual, and human- or machine-generated. It
may also be stored in a non-relational (NoSQL) database.
• On top of this, there is simply much more unstructured data than
structured. Unstructured data makes up 80% or more of enterprise data
and is growing at a rate of 55% to 65% per year. Without the tools
to analyze this massive volume, organizations are leaving vast amounts of
valuable data on the business intelligence table.
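To put the quoted growth rate in perspective, the short calculation below converts an annual growth rate into a doubling time, assuming steady compound growth (an assumption; the slides do not state the growth model). The function name `doubling_time_years` is hypothetical.

```python
import math


def doubling_time_years(annual_growth_rate):
    """Years needed to double under steady compound growth.

    Solves (1 + r) ** t == 2 for t, where r is the annual growth rate.
    """
    return math.log(2) / math.log(1 + annual_growth_rate)


low = doubling_time_years(0.55)   # 55% per year
high = doubling_time_years(0.65)  # 65% per year
print(f"55% per year doubles in about {low:.1f} years")
print(f"65% per year doubles in about {high:.1f} years")
```

Under that assumption, growth of 55% to 65% per year corresponds to doubling roughly every 1.4 to 1.6 years.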
• Unstructured data is the kind that tends to defy easy
definition, takes up lots of storage capacity, and is
typically more difficult to analyze.
• Unstructured data is information that does not have a
predefined data model, does not fit well into a
relational database, or both.
• Unstructured information is typically text-heavy, but may
contain data such as dates, numbers, and facts as well.
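The point that free text can still carry dates and numbers can be illustrated with a small sketch that pulls them out using regular expressions. The sample sentence, the patterns, and the `extract_facts` helper are assumptions for illustration only; real text analytics needs far more robust parsing.

```python
import re

# Made-up example of text-heavy, unstructured content.
note = "Meeting on 2024-03-15: budget raised to 12500 dollars, 3 vendors shortlisted."

# Simple patterns: ISO-style dates and plain integers or decimals.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
NUMBER_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\b")


def extract_facts(text):
    """Return the dates and standalone numbers found in a piece of free text."""
    dates = DATE_PATTERN.findall(text)
    # Blank out the dates first so their digits are not counted as numbers.
    remainder = DATE_PATTERN.sub(" ", text)
    numbers = [float(n) for n in NUMBER_PATTERN.findall(remainder)]
    return {"dates": dates, "numbers": numbers}


facts = extract_facts(note)
print(facts)
```

Unlike the structured case, nothing in the text declares which token is a date or an amount; the structure has to be inferred by the analysis step.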
Here are the main takeaways:
• The amount of data (all data, everywhere) is doubling every two years.
• Our world is becoming more transparent. We, in turn, are beginning to accept
this as we become more comfortable with parting with data that we used to
consider sacred and private.
• Most new data is unstructured. Specifically, unstructured data represents almost
95 percent of new data, while structured data represents only 5 percent.
• Unstructured data tends to grow exponentially, unlike structured data, which
tends to grow in a more linear fashion.
• Unstructured data is vastly underutilized. Imagine huge deposits of oil or other
natural resources that are just sitting there, waiting to be used. That's the
state of unstructured data today. Tomorrow will be a different
Semi-Structured Data