Not long ago, storage and large-scale data processing solutions were very expensive, proprietary,
and barely scalable
With Big Data, we now use cheap, highly redundant, scalable, and easy-to-operate servers
Hadoop cluster of SurfSara at the Science Park in Amsterdam, with 170 nodes, totalling 1370 cores and 2.3 PB of storage
A Little history (2/3) - Software
Before, we used software that was proprietary, hard to adapt, expensive, and hard to maintain
A Little history (2/3) - Software
With Big Data, we use software which is highly scalable, open source, free, and highly specific
to each industry
A Little history (3/3) – Databases
Before Big Data, database solutions relied on fields defined in advance
with a fixed size and format, which limited both their scalability and the format of the
data that could be stored in them
A Little history (3/3) – Databases
With Big Data, we now have the so-called “NoSQL” databases. These store data in structures of
variable size, so they can accept data of any size and format
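As a rough illustration of what "structures of variable size" means, the sketch below stores records with different fields side by side in one collection. It uses plain Python dictionaries rather than any particular NoSQL product; the device names, field names, and the `insert` and `find` helpers are hypothetical and exist only for this example.

```python
import json

# A stand-in for a schema-less collection: a plain list of documents.
collection = []

def insert(document):
    """Store a document as-is; no field list, size, or format is enforced."""
    # A JSON round trip gives an independent copy of the nested data.
    collection.append(json.loads(json.dumps(document)))

def find(field):
    """Return every stored document that happens to contain the field."""
    return [doc for doc in collection if field in doc]

# Hypothetical records: each one has its own shape.
insert({"device": "meter-1", "kwh": 12.5})
insert({"device": "cam-7", "tags": ["street", "night"], "frame": {"w": 640, "h": 480}})
insert({"device": "meter-2", "kwh": 3.1, "note": "replaced battery"})

print(len(find("kwh")))
```

In a fixed-schema table, the second record would need new columns before it could be stored; here it is simply accepted, and queries only look at the documents that carry the field of interest.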
End result = Lots of Data
The end result of all that is that the amount of data grew exponentially, confirming the
need to store and process massive amounts of data quickly and inexpensively
The original problem, however, still remains…
However, the original problem remains: data arrive with “Volume”,
“Velocity”, and “Variety”, but still need “Veracity” (the so-called “4 Vs”)
Big Data: central part of a 3-level pyramid
In our view, Big Data is really the central part of a three-level pyramid: Data
Capture (IOT), Storage and Processing (Big Data), and Data Analysis (AI)
IOT (Internet of Things) – Data Capture
The base of the pyramid is IOT, or the “Internet of Things”. It is made up of thousands
of devices, sensors, etc., which capture data and transmit them over the internet
or other networks. Thus, IOT generates the data which will be processed
Big Data – Storage and Processing
In the middle we find Big Data, which is responsible for storing and processing
the data obtained by IOT, so that they can then be analyzed by AI
AI – Predicting the future
Finally, at the top is AI (Artificial Intelligence), which is about finding patterns in the data captured by IOT
and stored/processed by Big Data, to be able to predict the future and thus react accordingly
Hari Seldon, fictional character created by Isaac Asimov, who could predict the future with mathematical formulas
Impact – Greater competition
But the most striking effect of Big Data is that, just as the internet allowed small
enterprises to compete with large ones in foreign markets, Big Data will allow them to know their
processes and customers as well as large ones do and, again, compete effectively.
Need: Multidisciplinary team
Again, just as the internet showed the need for a multidisciplinary team which included
Marketing, Management, etc., the same happens with Big Data, which generally requires:
1 – SysAdmin and SysOps
2 – Programmers
3 – Mathematicians and Statisticians
4 – System Architects
5 – Marketing Managers
Examples Big Data projects (1/4) – Power use prediction
FACTS: The entire power grid in Spain must have smart meters before December 31,
2018, as required by the government
EXAMPLE: Power company Endesa is leading the change in Spain with over 3.5 million
meters installed: over 30% of its client base. Similarly, Swedish company Sweco
processed 5 billion lines of data from 200 thousand clients in 3 years
BENEFITS: Allows grid usage to be followed in real time, redirecting
power as needed. It also allows increases in power demand to be predicted, so that
the equipment required to handle the future load can be added proactively
Examples Big Data projects (2/4) – Corporate image
FACTS: In 2010, 1.5 million people saw a video promoted by Greenpeace on how Nestlé’s Kit
Kat chocolate bar was in fact killing orangutans. The company only reacted after receiving
over 200 thousand protest emails, and tried to erase the video and the comments on
YouTube.
EXAMPLE: Today Nestlé uses Sentiment Analysis, which allowed it to move from the 16th to
the 12th position in the “Most Respected Company” index in the world. Similarly, Exxon
developed with IHS a system which analyzed over 20 thousand tweets to gauge the public’s opinion
on fracking
BENEFITS: Improvement in the company’s corporate image and thus in the number of customers
Examples Big Data projects (3/4) – Fleet Management
Typical implementation: geopositional analysis of the fleet
FACTS: Corporate car fleets are frequently victims of unauthorized use, which causes delays in
deliveries, customer support, etc., higher fuel costs, and a decrease in the vehicles’ value
EXAMPLE: The city of Boston was able to eliminate potholes in its streets thanks to an app
which reads the accelerometer of the user’s mobile phone and thus identifies each time their car hits a
pothole
BENEFITS: Reduction in fuel use, increase in the city’s fleet time, decrease in the amount of
fraud and fines, and an improvement in the city’s image among tourists
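The slides do not describe how the Boston app works internally. As a minimal sketch of the general idea, assuming the phone reports vertical acceleration as a list of readings in m/s², a jolt can be flagged whenever two consecutive readings differ by more than a threshold. The function name `detect_potholes`, the sample readings, and the threshold of 4.0 are illustrative assumptions, not values taken from the app.

```python
def detect_potholes(samples, threshold=4.0):
    """Return the indices where a new jolt starts.

    samples: vertical acceleration readings in m/s^2 (assumed format).
    threshold: minimum change between consecutive readings that counts
               as a jolt; 4.0 is an illustrative, uncalibrated value.
    """
    hits = []
    last_jolt = None  # index of the most recent reading that exceeded the threshold
    for i in range(1, len(samples)):
        if abs(samples[i] - samples[i - 1]) >= threshold:
            # Consecutive jolts belong to the same bump, so only the
            # first reading of each run is reported.
            if last_jolt is None or i - last_jolt > 1:
                hits.append(i)
            last_jolt = i
    return hits

# Hypothetical trip: steady driving (about 9.8 m/s^2) with two bumps.
readings = [9.8, 9.7, 9.9, 3.2, 15.1, 9.8, 9.8, 9.6, 2.9, 9.7]
print(detect_potholes(readings))
```

A real system would also need the GPS position of each jolt and reports from many users at the same spot before treating it as a pothole, which is where the storage and processing layer comes in.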
Examples Big Data projects (4/4) – Churn Control
FACTS: Being unable to predict when a customer plans to leave causes a loss of
revenue
CHALLENGE: Any prediction system requires a large amount of pre-collected data from
several diverse sources
EXAMPLE: T-Mobile developed a system which calculates the customer lifetime value based
on 3 variables: billing analysis, dropped-call analysis, and sentiment analysis. The end result is
that churn decreased by 50% in only 3 months
BENEFITS: Reduction in the number of customers lost, increase in market share, and
increase in customer satisfaction
Reality of Big Data today (1/4)
Sadly, however, most Big Data implementations today simply gather
and store data without any clear need or focus
Episode of the TV show “South Park”, where gnomes collect underwear to make money off of them, but without knowing how
Reality of Big Data today (2/4)
In fact, a December 2013 study by Teradata found that half of the companies do not
know whether they actually got any benefit from Big Data
And that hasn’t changed: a later study from November 2015 by PWC showed that only 4% of
companies actually obtain benefits from Big Data
Reality of Big Data today (3/4)
The huge growth in the number of technologies available created a shortage of
qualified professionals and a large increase in their salaries
Reality of Big Data today (4/4)
Most companies still program for the 1st generation, when we are already at the 4th
Conclusions
1 – Big Data is not just about storing data, but about finding trends
that allow more revenue, lower costs, and better service
5 – Thus, Big Data is not just a fad reserved for a chosen few,
but a new technology which in a few years will be available to
everyone, just as happened with the internet
Thank you for your time
Synthetic Data
Email: info@syntheticdata.eu
Web: http://www.syntheticdata.eu