Lesson 2

Big Data
In our last lesson, we
discovered how data analytics
helps us transform
information into valuable
insights.
But what if we encounter data
that is not just large but
incredibly massive and
complex???
This is where Big Data comes into
play.
Big Data represents an exponential increase
in data volume, velocity, variety, and
complexity. It challenges conventional data
processing methods and opens up new
horizons in analytics.
In Lesson 2, we'll explore why Big Data
matters and how it extends our data
analytics journey.
Big Data refers to extremely large and
complex sets of digital information that are
too vast to be processed and analyzed by
traditional methods. It encompasses
enormous volumes of data from various
sources and formats, and it is characterized
by its challenges in storage, processing, and
analysis.
Applying in Real Life
Imagine this: every time you post
on social media, shop online, use
your phone, or even walk by a
sensor, you're adding drops to this
data ocean. And you're not alone
—billions of people and countless
machines do the same every day.
Big Data is like a vast ocean of digital
information. It's not just any data; it's an
incredibly large and diverse collection of
facts, numbers, words, images, and more.
What makes it "big" is not just its size but
also how quickly it's generated and how
different it can be.
This data universe is characterized by
four key dimensions, often referred to as
the Four V's of Big Data.
Volume
This refers to the sheer size of the
data, often measured in terabytes,
petabytes, or more. Big Data
involves massive volumes that
exceed the capacity of
conventional databases and
storage systems.
Velocity
Velocity relates to the speed at
which data is generated, collected,
and processed. In the era of Big
Data, information flows in
rapidly, sometimes in real-time.
Examples include social media
updates, sensor data, and financial
transactions.
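To make velocity concrete, here is a minimal sketch of counting how many events arrived in the last few seconds of a stream. The class name `SlidingWindowCounter`, the 10-second window, and the sample timestamps are all invented for illustration, and the sketch assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts the events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp):
        """Record one event and return how many events are still in the window."""
        self.timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps)


# Hypothetical event times in seconds (assumed to be sorted).
counter = SlidingWindowCounter(window_seconds=10)
counts = [counter.add(t) for t in [0, 3, 9, 12, 25]]
print(counts)
```

Only the recent events are kept in memory, which is one common way to keep up with data that never stops arriving.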
Variety
Variety highlights the diverse types of
data within Big Data. It includes
structured data (like databases and
spreadsheets), unstructured data (such as
text and images), and semi-structured
data (like XML or JSON files). Big Data
encompasses all these forms, making it
complex to manage and analyze.
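The three forms of data described above can be sketched with Python's standard library. The sample values and the function names below are made up for illustration; real sources would be files, databases, or APIs.

```python
import csv
import io
import json

# Hypothetical sample inputs, invented for illustration only.
structured_csv = "customer_id,amount\n1,19.99\n2,5.00\n"
semi_structured_json = '{"customer_id": 1, "tags": ["new", "mobile"], "address": {"city": "Manila"}}'
unstructured_text = "Great product, arrived quickly. Would buy again!"


def load_structured(text):
    """Structured data: fixed columns, every row has the same fields."""
    return list(csv.DictReader(io.StringIO(text)))


def load_semi_structured(text):
    """Semi-structured data: labeled keys, but nesting and optional fields."""
    return json.loads(text)


def load_unstructured(text):
    """Unstructured data: no schema, so we can only derive features such as words."""
    return text.lower().split()


rows = load_structured(structured_csv)
record = load_semi_structured(semi_structured_json)
words = load_unstructured(unstructured_text)

print(rows[0]["amount"])          # a column value, read as text
print(record["address"]["city"])  # a nested field
print(len(words))                 # a simple derived feature
```

Each form needs its own loading logic, which is part of why variety makes Big Data harder to manage.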
Veracity
Veracity addresses the quality and
reliability of data. In Big Data, there
can be issues with accuracy,
consistency, and trustworthiness due
to the diverse sources and large
volumes. Data cleansing and
validation processes are essential to
ensure the data's veracity.
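A minimal sketch of a validation check is shown below. The field names (`sensor_id`, `temperature_c`) and the accepted range of -50 to 60 degrees are assumptions chosen for the example, not rules from this lesson.

```python
def validate_readings(readings, low=-50, high=60):
    """Split readings into trusted and rejected lists using simple rules.

    The rules are assumed for illustration: a reading needs a sensor_id
    and a numeric temperature within [low, high].
    """
    trusted, rejected = [], []
    for reading in readings:
        temp = reading.get("temperature_c")
        has_id = reading.get("sensor_id") is not None
        in_range = isinstance(temp, (int, float)) and low <= temp <= high
        if has_id and in_range:
            trusted.append(reading)
        else:
            rejected.append(reading)
    return trusted, rejected


# Hypothetical sensor readings, invented for illustration only.
raw = [
    {"sensor_id": "a", "temperature_c": 21.5},
    {"sensor_id": "b", "temperature_c": 999},    # implausible value
    {"sensor_id": "c", "temperature_c": None},   # missing value
    {"sensor_id": None, "temperature_c": 19.0},  # unknown source
]

trusted, rejected = validate_readings(raw)
quality = len(trusted) / len(raw)
print(f"{len(trusted)} trusted, {len(rejected)} rejected, quality = {quality:.0%}")
```

Keeping the rejected readings, rather than silently dropping them, makes it possible to report how trustworthy a data source is.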
Big Data holds amazing secrets and
answers to all sorts of questions. It can
tell us what people like to buy, how
traffic moves in a city, or even help
doctors find the best treatments for
patients.
But because Big Data is so huge and
complex, we need special tools and
techniques to explore and understand it.
Steps in Big Data Analytics:
1. Data Collection
Imagine Big Data as a treasure trove of information. The first step is to gather data from
various sources like sensors, social media, websites, or customer records. It's like collecting
pieces of a puzzle.
2. Data Cleaning
Just like sorting through treasures, we need to clean the data. This means removing errors,
duplicates, or anything that doesn't fit. It's like cleaning up the puzzle pieces before putting
them together.
3. Data Storage
We need a safe place to store all these data treasures. This is often done using special
databases and technologies that can handle the huge volume of data. Think of it as a massive
library for data.
4. Data Processing
Now, it's time to make sense of the data. We use powerful computers to process and organize
the information. It's like arranging puzzle pieces to form a clear picture.
5. Data Analysis
This is where the magic happens. We use different tools and techniques to analyze the data and
find patterns or insights. It's like solving the puzzle and discovering a hidden message.
6. Data Visualization
Sometimes, it's easier to understand data when we can see it. We create charts, graphs, and
visual representations to make the insights clear, like turning the puzzle into a beautiful picture.
7. Interpretation
Once we have insights, we need to understand what they mean. It's like reading the message
hidden in the puzzle and figuring out its importance.
8. Decision-Making
Finally, we use these insights to make informed decisions. It's like using the message from the
puzzle to guide our actions and choices.
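The eight steps can be sketched as a tiny pipeline. Everything here is a toy stand-in under stated assumptions: the purchase events are invented, an in-memory list plays the role of the data store, and the restock threshold of 2 is arbitrary.

```python
from collections import Counter


def collect():
    """Step 1: hypothetical purchase events standing in for collected data."""
    return [
        {"user": "ana", "item": "book"},
        {"user": "ben", "item": "pen"},
        {"user": "ana", "item": "book"},  # exact duplicate
        {"user": "cy", "item": None},     # incomplete record
        {"user": "dee", "item": "book"},
    ]


def clean(events):
    """Step 2: drop incomplete records and exact duplicates."""
    seen, kept = set(), []
    for event in events:
        key = (event["user"], event["item"])
        if event["item"] is None or key in seen:
            continue
        seen.add(key)
        kept.append(event)
    return kept


def store(events, storage):
    """Step 3: an in-memory list stands in for a real data store."""
    storage.extend(events)
    return storage


def process(storage):
    """Step 4: organize the records into a count per item."""
    return Counter(event["item"] for event in storage)


def analyze(counts):
    """Step 5: find the most frequently purchased item."""
    return counts.most_common(1)[0]


def visualize(counts):
    """Step 6: draw a plain-text bar chart."""
    return "\n".join(f"{item:<5} {'#' * n}" for item, n in sorted(counts.items()))


def decide(top_item, top_count, threshold=2):
    """Steps 7 and 8: interpret the result and turn it into an action."""
    return f"restock {top_item}" if top_count >= threshold else "no action"


storage = store(clean(collect()), [])
counts = process(storage)
top_item, top_count = analyze(counts)
chart = visualize(counts)
decision = decide(top_item, top_count)
print(chart)
print(decision)
```

Each function maps to one step, so in a real system any single stage could be swapped for a specialized tool without changing the overall flow.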
To make Big Data Analytics work
smoothly, we need special tools and
frameworks. These tools help us bring
data from various places into the Big
Data Analytics system.
Specialized Tools and Frameworks
Specialized tools and frameworks are like
powerful assistants in data analytics. Tools
are like specialized gadgets for specific tasks,
while frameworks provide structured plans to
solve complex problems efficiently. They
help analysts work faster and smarter in
handling and making sense of data.
Example
Imagine you're tasked with creating a digital art gallery, like an online museum. You
have thousands of artworks to organize, display, and make accessible to visitors.
Here's where specialized tools and frameworks come into play:
Specialized Tools: Think of these as your digital art-handling tools. For instance, you
might use a tool to automatically adjust the lighting and colors of each artwork to
make them look their best online. This tool is like a magic paintbrush that enhances
the pictures without you having to do it manually for each one.
Frameworks: Now, consider frameworks as your gallery-building plans. They
provide step-by-step instructions for designing the layout, setting up navigation, and
ensuring a smooth visitor experience. With a framework, you don't have to figure out
how to create a user-friendly gallery from scratch; it's like having a blueprint to
follow.
So, in this scenario, specialized tools help you
enhance the artworks, while frameworks guide you
in building an organized and user-friendly digital
art gallery. Together, they make the process
efficient and ensure that visitors can enjoy the art
without any hassle. Just as artists use different
brushes and follow a design plan, data analysts use
specialized tools and frameworks to craft
meaningful data solutions.
In short
In short, specialized tools are specific
software or instruments designed to perform
particular tasks efficiently, often automating
or simplifying complex processes within
data analytics or other fields.
In short
In short, frameworks are structured plans or
pre-established guidelines that provide a
systematic approach for solving complex
problems or tasks in data analytics or other
areas, streamlining the process and ensuring
consistency.
When are Specialized Tools and Frameworks
required?
1. the volume of data involved is so large that it is difficult
to store, process and analyze data on a single machine;
2. the velocity of data is very high and the data needs to be
analyzed in real-time;
3. there is variety of data involved, which can be
structured, unstructured or semi-structured, and is
collected from multiple data sources.
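When data is too large for one machine, a common idea is to split it into parts, process each part independently, and then combine the partial results. The sketch below only imitates that idea on a single machine; the function names and the click data are hypothetical, and it does not represent the API of any particular framework.

```python
from collections import Counter
from functools import reduce


def partition(items, n_parts):
    """Split the data into n_parts chunks, one per hypothetical worker."""
    return [items[i::n_parts] for i in range(n_parts)]


def count_partition(chunk):
    """Work done independently on each chunk: count page visits."""
    return Counter(chunk)


def combine(left, right):
    """Merge two partial results into one."""
    return left + right


# Hypothetical click-stream page names, invented for illustration only.
clicks = ["home", "cart", "home", "search", "home", "cart"]

partials = [count_partition(chunk) for chunk in partition(clicks, 3)]
total = reduce(combine, partials, Counter())
print(total)
```

Because each chunk is counted without looking at the others, the per-chunk work could run on separate machines, which is the kind of coordination that specialized frameworks take care of.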
Examples of Big Data
• Click-stream data generated by web applications such as e-commerce to analyze user behavior
• Machine sensor data collected from sensors embedded in industrial and energy systems
for monitoring their health and detecting failures
• Healthcare data collected in electronic health record (EHR) systems
• Logs generated by web applications
• Stock market data
• Transactional data generated by banking and financial
Any Questions?
Be ready for our Quiz
Questions
1. Which "V" of Big Data refers to the enormous quantity of data involved?
2. Gathering data from various sources.
3. Storing data in specialized systems.
4. Among the Four V's, which one deals with the diverse types of data, including
structured, unstructured, and semi-structured data?
5. The quality and reliability of data.
6. Removing errors and duplicates from the data.
7. The speed at which data is generated and processed.
8. Analyzing and organizing data.
9. Using insights for informed decisions.
10. Extracting insights and patterns.
Answers
1. Which "V" of Big Data refers to the enormous quantity of data involved? Answer: Volume
2. Gathering data from various sources. Answer: Data Collection
3. Storing data in specialized systems. Answer: Data Storage
4. Among the Four V's, which one deals with the diverse types of data, including
structured, unstructured, and semi-structured data? Answer: Variety
5. The quality and reliability of data. Answer: Veracity
6. Removing errors and duplicates from the data. Answer: Data Cleaning
7. The speed at which data is generated and processed. Answer: Velocity
8. Analyzing and organizing data. Answer: Data Processing
9. Using insights for informed decisions. Answer: Decision-Making
10. Extracting insights and patterns. Answer: Data Analysis
Thanks!

CREDITS: This presentation template was created by Slidesgo,
including icons by Flaticon and
infographics & images by Freepik
