MODULE V
INTRODUCTION
Unstructured data is data that does not conform to a data model and has no easily identifiable structure.
It is not organized in a predefined manner and does not have a predefined data model. Common examples include:
Web pages
Images (JPEG, GIF, PNG, etc.)
Videos
Memos
Reports
Word documents and PowerPoint presentations
Surveys
PROBLEMS FACED IN STORING UNSTRUCTURED DATA
NoSQL databases (aka "not only SQL") are non-tabular databases and store data differently than
relational tables.
They provide flexible schemas and scale easily with large amounts of data and high user loads.
TYPES OF NOSQL DATABASES
Document databases store data in documents similar to JSON (JavaScript Object Notation) objects.
Each document contains pairs of fields and values. The values can typically be a variety of types
including things like strings, numbers, booleans, arrays, or objects.
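A minimal sketch of what such a document might look like, using a plain Python dictionary and the standard `json` module rather than a real document database. The `customer` record and all of its field names are hypothetical and chosen only to show one value of each type mentioned above.

```python
import json

# Hypothetical "customer" document; every field name and value is illustrative.
customer = {
    "name": "Asha",                                  # string
    "age": 31,                                       # number
    "is_member": True,                               # boolean
    "tags": ["retail", "vip"],                       # array
    "address": {"city": "Kochi", "pin": "682001"},   # nested object
}

# Serialize to JSON text, roughly as a document store might persist it,
# then parse it back into Python objects.
encoded = json.dumps(customer)
decoded = json.loads(encoded)

# Record the Python type that each field value maps to.
field_types = {field: type(value).__name__ for field, value in decoded.items()}
print(field_types)
```

Note that no schema is declared anywhere: a second document in the same collection could carry different fields, which is the flexibility the slides refer to.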
Key-value databases are a simpler type of database where each item contains keys and values.
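The key-value idea can be sketched with an in-memory dictionary. The `KeyValueStore` class and its `put`, `get`, and `delete` methods are hypothetical names used for illustration; they are not the API of any particular product.

```python
class KeyValueStore:
    """Toy in-memory key-value store (illustrative only, not a real database)."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # Store or overwrite the value held under this key.
        self._items[key] = value

    def get(self, key, default=None):
        # Look up by key only; the store never inspects the value itself.
        return self._items.get(key, default)

    def delete(self, key):
        # Remove the key if present; a missing key is silently ignored.
        self._items.pop(key, None)


store = KeyValueStore()
store.put("session:42", "user=asha")   # hypothetical key and value
found = store.get("session:42")
store.delete("session:42")
missing = store.get("session:42")
```

The only way to reach a value is through its key, which is what keeps this model simpler than the document model.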
Graph databases store data in nodes and edges. Nodes typically store information about people,
places, and things, while edges store information about the relationships between the nodes.
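As a rough illustration of nodes and edges, the sketch below keeps nodes in a dictionary and edges in a list of (source, relationship, target) tuples. The people, the place, the relationship labels, and the `neighbors` helper are all made up for this example; real graph databases provide their own query languages.

```python
# Nodes: information about people, places, and things (hypothetical data).
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "paris": {"kind": "place"},
}

# Edges: relationships between nodes, stored as (source, relationship, target).
edges = [
    ("alice", "KNOWS", "bob"),
    ("alice", "LIVES_IN", "paris"),
    ("bob", "VISITED", "paris"),
]


def neighbors(node_id, relation=None):
    """Return targets reachable from node_id, optionally filtered by relationship."""
    return [
        target
        for source, rel, target in edges
        if source == node_id and (relation is None or rel == relation)
    ]


print(neighbors("alice"))            # every node alice points to
print(neighbors("alice", "KNOWS"))   # only the people alice knows
```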
BIG DATA
• Big Data is a collection of data that is huge in volume and still growing exponentially with time.
• Its size and complexity are so large that no traditional data management tool can store or process it efficiently.
Big data is still data, only at a huge scale.
• The first organizations to work with big data include Google, eBay, Facebook, and LinkedIn.
CONTD..
Volume is the base Big Data is built on. Each day a gigantic amount of data is produced by all sorts of sources. TechJury claims
that 2.5 quintillion bytes of data are created worldwide every day. That’s a lot, though most of this data will never be processed.
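To put the quoted figure in more familiar units, here is a small arithmetic sketch. It takes the 2.5 quintillion bytes per day claim at face value and assumes decimal (SI) units, where 1 terabyte is 10^12 bytes and 1 exabyte is 10^18 bytes.

```python
# Quoted claim: 2.5 quintillion bytes per day (1 quintillion = 10**18).
bytes_per_day = 25 * 10**17

# Decimal (SI) unit sizes; these are assumptions about how the figure is counted.
TERABYTE = 10**12
EXABYTE = 10**18
SECONDS_PER_DAY = 24 * 60 * 60

exabytes_per_day = bytes_per_day / EXABYTE          # daily volume in exabytes
terabytes_per_day = bytes_per_day // TERABYTE       # daily volume in terabytes
terabytes_per_second = bytes_per_day / SECONDS_PER_DAY / TERABYTE

print(exabytes_per_day, terabytes_per_day, round(terabytes_per_second, 1))
```

Under these assumptions the claim works out to 2.5 exabytes per day, or a little under 29 terabytes every second.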
Variety relates to the diversity of data types and sources. Data comes from web pages, search engines, social media, data sensor
systems, and it’s all raw, semi-structured or unstructured. In many ways it is a struggle for enterprises to turn this data mess into a
Velocity refers to the enormous speed at which data is generated, analyzed, and reprocessed. Nowadays data spawns in a blink of
Analytics is the process of discovering, interpreting, and communicating significant patterns in data.
Quite simply, analytics helps us see insights and meaningful data that we might not otherwise detect.
Analytics uses data and math to answer business questions, discover relationships, predict unknown outcomes and
automate decisions.
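As a small illustration of "discovering relationships" with data and math, the sketch below computes a Pearson correlation by hand. The `ad_spend` and `sales` figures are invented for the example and the `pearson` helper is a hypothetical name; a real analysis would use actual business data and usually a statistics library.

```python
import math

# Invented figures: monthly advertising spend and sales, in arbitrary units.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(ad_spend, sales)
print(round(r, 3))
```

A coefficient close to 1 indicates that the two series rise together; it does not by itself show that one causes the other.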
INTRODUCTION TO BIG DATA ANALYTICS
Big Data analytics is a process used to extract meaningful insights, such as hidden patterns, unknown
Big Data analytics provides various advantages—it can be used for better decision making, preventing fraudulent