Professional Documents
Culture Documents
BIG DATA
CORRESPONDS TO THE TOPIC
PREPARED BY:
ESTHER LYN TAGOON ANGELES, DM
WHAT IS BIG DATA?
Big data is a combination of structured, semi
structured and unstructured data collected by
organizations that can be mined for information
and used in machine learning projects,
predictive modeling and other advanced
INSERT PHOTO HERE analytics applications.
Veracity refers to the degree of accuracy in data sets and how trustworthy they
are. Raw data collected from various sources can cause data quality issues that
may be difficult to pinpoint. If they aren't fixed through data cleansing
processes, bad data leads to analysis errors that can undermine the value of
business analytics initiatives. Data management and analytics teams also need
to ensure that they have enough accurate data available to produce valid result s
More characteristics of big data
Some data scientists and consultants also add value to the list of big data's
characteristics. Not all the data that's collected has real business value or
benefits. As a result, organizations need to confirm that data relates to relevant
business issues before it's used in big data analytics projects.
Variability also often applies to sets of big data, which may have multiple
meanings or be formatted differently in separate data sources -- factors that
further complicate big data management and analytics.
The history of big data
the origins of large data sets go back to the 1960s and ‘70s when
the world of data was just getting started with the first data
centers and the development of the relational database.
Around 2005, people began to realize just how much data users
generated through Facebook, YouTube, and other online
services.
With the advent of the Internet of Things (IoT), more objects and
devices are connected to the internet, gathering data on customer
usage patterns and product performance. The emergence of
machine learning has produced still more data.
Importance of big data
Companies use big data in their systems to
• improve operations,
Big data is also used by medical researchers to identify disease signs and risk
factors and by doctors to help diagnose illnesses and medical conditions in
patients. In addition, a combination of data from electronic health records, social
media sites, the web and other sources gives healthcare organizations and
government agencies up-to-date information on infectious disease threats or
outbreaks.
Here are some more examples of how big data is used by organizations:
In the energy industry, big data helps oil and gas companies identify potential
drilling locations and monitor pipeline operations; likewise, utilities use it to
track electrical grids.
Financial services firms use big data systems for risk management and
real-time analysis of market data.
Manufacturers and transportation companies rely on big data to manage their
supply chains and optimize delivery routes.
Other government uses include emergency response, crime prevention and
smart city initiatives.
Examples of big data
Big data comes from myriad sources -- some examples are transaction processing
systems, customer databases, documents, emails, medical records, internet
clickstream logs, mobile apps and social networks. It also includes machine-
generated data, such as network and server log files and data from sensors on
manufacturing machines, industrial equipment and internet of things devices.
In addition to data from internal systems, big data environments often incorporate
external data on consumers, financial markets, weather and traffic conditions,
geographic information, scientific research and more. Images, videos and audio
files are forms of big data, too, and many big data applications involve streaming
data that is processed and collected on a continual basis.
Big data benefits:
But it’s not enough to just store the data. Data must be used to be valuable and
that depends on curation. Clean data, or data that’s relevant to the client and
organized in a way that enables meaningful analysis, requires a lot of work.
Data scientists spend 50 to 80 percent of their time curating and preparing data
before it can actually be used.
It requires new strategies and technologies to analyze big data sets at terabyte, or
even petabyte, scale.
During integration, you need to bring in the data, process it, and make sure it’s
formatted and available in a form that your business analysts can get started with.
2. Manage
Big data requires storage. Your storage solution can be in the cloud, on premises,
or both.
Many people choose their storage solution according to where their data is
currently residing.
Your investment in big data pays off when you analyze and act on your data. Get
new clarity with a visual analysis of your varied data sets.
Explore the data further to make new discoveries. Share your findings with others.
Build data models with machine learning and artificial intelligence. Put your data
to work.