
BITS College

Graduate Program

Business Intelligence and Analytics


Learning should be fun, not mundane
Business Intelligence
 Why do some organizations outperform others?
 Why do so many companies go out of business?
 Do we follow the normal when giving out loans to customers?
 How can organizations potentially increase their chance of survival?
 Why is it difficult for companies to demonstrate sustainable high
performance?
 What can be done to deal with this challenging environment and
prepare for the future?
 Why do executives perceive their environment as getting more and more
complex?
 How effective are organizations in executing their strategy?

• There is a growing influence of data in most sectors and most industries.


Group Discussions
 "One-stage": how do we classify a decision as one-stage?

 A payoff table is built from the alternatives and the states of nature, but
how do we populate or calculate its entries?

 In the "elements of decision analysis", the 4th step talks about
criteria, but in the given "CBE Birr" example, what criteria are
used to make the decision?
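
One possible way to approach the payoff table and criteria questions above is sketched below. It is only an illustration under stated assumptions: the stocking scenario, the prices, and the probabilities are all hypothetical and are not taken from the "CBE Birr" example. Each cell is computed from a payoff function of (alternative, state of nature), and three common criteria are then applied to the table.

```python
# Hypothetical scenario: a shop chooses how many units to stock (alternative)
# before demand (state of nature) is known. All numbers are illustrative.
UNIT_COST = 6.0
UNIT_PRICE = 10.0

alternatives = [100, 200, 300]                      # units stocked
states = {"low": 100, "medium": 200, "high": 300}   # units demanded
probabilities = {"low": 0.3, "medium": 0.5, "high": 0.2}  # assumed beliefs


def payoff(stocked, demand):
    """Profit for one cell: revenue on units sold minus cost of units stocked."""
    sold = min(stocked, demand)
    return UNIT_PRICE * sold - UNIT_COST * stocked


# Populate the table: one row per alternative, one column per state.
payoff_table = {
    a: {s: payoff(a, d) for s, d in states.items()} for a in alternatives
}


def maximin(table):
    """Pessimistic criterion: best of the worst-case payoffs."""
    return max(table, key=lambda a: min(table[a].values()))


def maximax(table):
    """Optimistic criterion: best of the best-case payoffs."""
    return max(table, key=lambda a: max(table[a].values()))


def expected_value(table, probs):
    """Choose the alternative with the highest probability-weighted payoff."""
    return max(table, key=lambda a: sum(probs[s] * v for s, v in table[a].items()))


for a, row in payoff_table.items():
    print(a, row)
print("maximin:", maximin(payoff_table))
print("maximax:", maximax(payoff_table))
print("expected value:", expected_value(payoff_table, probabilities))
```

Different criteria can pick different alternatives from the same table, which is why the criterion has to be stated before the decision is made.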
What data can we possibly generate, and
what data do we have in what format?
 Financial Sector
 Insurance
 Health
 Police
 Education
 Technology
 Public Service
 Telecom
 Disaster Prevention
 Other…..
Overview of Business Intelligence and Data Science
Data Revolution
• Data is created constantly, and at an ever-increasing rate
• Massive amounts of data about many aspects of our lives
• Shopping, communicating, reading news, listening to music, searching for
information, expressing our opinions
• Websites track every user’s every click.
• Smartphones are building up a record of our location
• Smart cars collect driving habits,
• Smart homes collect living habits, and
• Smart marketers collect purchasing habits.
• This is happening in finance, the medical industry, pharmaceuticals,
bioinformatics, social welfare, government, education, retail, and the list goes on.

Data Revolution…
• There is a growing influence of data in most sectors and most
industries.

• Culturally saturated feedback loop where


“our behavior changes the product and the product changes our
behavior”

• Technology makes this possible:


• infrastructure for large-scale data processing,
• increased memory, and bandwidth,
• cultural acceptance of technology in the fabric of our lives.

Big Data - a tsunami that is hitting us
 We are witnessing a tsunami of data:
 Huge volumes
 Data of different types and formats
 Impacting the business at new and ever increasing speeds

 The challenges:
 Capturing, transporting, and moving the data
 Managing - the data, the hardware involved, and the software
(open source and not)
 Processing - provide insight into the data

 Storing - safeguarding and securing


 “Big Data refers to non-conventional strategies and innovative technologies
used by businesses and organizations to capture, manage, process, and
make sense of a large volume of data”
Data has an intrinsic property…it grows and grows

• 90% of the world’s data was created in the last two years
• 80% of the world’s data today is unstructured
• 20% of available data can be processed by traditional systems
• 1 in 2 business leaders don’t have access to the data they need
• 83% of CIOs cited BI and analytics as part of their visionary plan
• 5.4X: top performers are 5.4 times more likely to use business analytics
Growing interconnected & instrumented world
Data Revolution
 eBay captures a terabyte of data per minute
 Every mouse click on a web site is captured in Web log files
 Machines (smart meters, Sensors, GPS, etc)
 Social media sites

Some examples of Big Data
• Science
– Astronomy
– Atmospheric science
– Genomics
– Biogeochemical
– Biological
• Social networks
– Person to person
– Person to world (P2W, C2W): Twitter, Facebook, LinkedIn
• Medical records
• Internet text and documents
• Internet search indexing
• Call detail records (CDR)
• Photographic archives
• Video / audio archives
• Large scale ecommerce
• Regular government business and commerce needs
• Military and homeland security surveillance
• Financial transactions
• Flight data
What data do we have in what format?
 Financial Sector
 Insurance
 Health
 Police
 Education
 Technology
 Public Service
 Telecom
 Disaster Prevention
 Other…..
Characteristics of Big Data

March 3, 2017

16th Annual Accounting Educators Seminar -


University of Missouri - Kansas City
Volume
 Many factors contribute to the increase in data volume
– Transaction-based data stored through the years.
– Unstructured data streaming in from social media.
– Increasing amounts of sensor and machine-to-machine data
 Excessive data volume is no longer mainly a storage issue
– With decreasing storage costs, other issues emerge:
– how to determine relevance within large data volumes
– how to use analytics to create value from relevant data
 Data Volume
– Growth of 40% per year
– Data volume is increasing exponentially
– 90% of the data was created in the past two years
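
To make "exponential" growth concrete, the short sketch below works out what a constant 40% annual growth rate implies. The constant-rate assumption and the function names are illustrative only; the slide does not say how the 40% figure was measured.

```python
import math


def doubling_time(annual_growth):
    """Years needed for the volume to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def growth_multiple(annual_growth, years):
    """How many times larger the volume is after the given number of years."""
    return (1 + annual_growth) ** years


rate = 0.40  # 40% per year, as quoted on the slide
print(f"doubling time: {doubling_time(rate):.2f} years")
print(f"volume after 5 years: {growth_multiple(rate, 5):.2f}x")
```

Under this assumption the volume doubles roughly every two years and is more than five times larger after five years.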
Variety
 Data today comes in all types of formats.
 Structured
– Relational database, Social Network, Semantic Web (RDF)
 Semi-structured Data
– XML
 Unstructured
– text documents, email, video, audio, stock ticker data and financial transactions.
– Streaming Data
 Different Sources:
 Product reviews from different provider websites
 Customer history from various sources
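
The difference between these formats can be seen by holding the same piece of information in three forms. The sketch below is an assumption-laden illustration: the product review, its field names, and the helper functions are all hypothetical. Structured and semi-structured data can be parsed into identical records, while unstructured text only supports looser operations such as keyword search.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The same hypothetical product review in three formats.
structured = "customer_id,product,rating\n17,kettle,2\n"
semi_structured = (
    "<review customer_id='17'>"
    "<product>kettle</product><rating>2</rating>"
    "</review>"
)
unstructured = "Customer 17 says: the kettle stopped working after a week. Two stars."


def from_csv(text):
    """Structured: fixed columns, so every field has a known position and type."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {
        "customer_id": int(row["customer_id"]),
        "product": row["product"],
        "rating": int(row["rating"]),
    }


def from_xml(text):
    """Semi-structured: tags describe the data, but the layout can vary."""
    root = ET.fromstring(text)
    return {
        "customer_id": int(root.get("customer_id")),
        "product": root.findtext("product"),
        "rating": int(root.findtext("rating")),
    }


def mentions(text, keyword):
    """Unstructured: no schema, so only text-level operations are available."""
    return keyword.lower() in text.lower()


print(from_csv(structured))
print(from_xml(semi_structured))
print(mentions(unstructured, "kettle"))
```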
Velocity
 Data is streaming in at unprecedented speed and must be dealt with in a
timely manner
 Data is being generated fast and needs to be processed fast
 Late decisions = missed opportunities
 Examples
– E-promotions: based on your current location, your purchase history, and
what you like, send promotions right now for the store next to you
– Healthcare monitoring: sensors monitor your activities and body, and
any abnormal measurement requires immediate reaction
– Disaster management and response
– User comments from social networking sites must be dealt with in a timely manner
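
The healthcare monitoring example can be sketched as a stream processor that reacts to each reading as it arrives instead of waiting for a batch. This is a minimal sketch under assumptions: the rolling-window rule, the window size, the threshold, and the sample heart-rate values are hypothetical choices, not a clinical method.

```python
from collections import deque
from statistics import mean, stdev


def flag_abnormal(readings, window=5, threshold=3.0):
    """Yield (index, value) as soon as a reading deviates strongly from
    the rolling mean of the most recent normal readings."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)


# Hypothetical heart-rate stream (beats per minute).
heart_rate = [72, 74, 71, 73, 75, 72, 140, 74, 73]
for index, value in flag_abnormal(heart_rate):
    print(f"reading {index}: {value} bpm needs immediate attention")
```

Because the function is a generator, each alert is produced the moment the abnormal reading is seen, which is the point of the velocity dimension.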
Veracity
 Refers to the biases, noise and abnormality in data
 When we talk about big data, we typically mean its quantity:
 What capacity must a system provide to cope with the sheer size of the
data?
 Is a query feasible on big data within our available resources?
 How can we make our queries tractable on big data?
 Can we trust the answers to our queries?
 Dirty data routinely leads to misleading financial reports and strategic business
planning decisions, resulting in loss of revenue, credibility and customers, and
other disastrous consequences
 The study of data quality is as important as the study of data quantity
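
A basic data quality check can be sketched in a few lines. The example below is illustrative only: the loan records, the field names, and the valid age range are hypothetical, and real data quality work covers many more kinds of error than the three counted here.

```python
def quality_report(records, required, valid_ranges):
    """Count exact duplicates, records with missing required fields,
    and values outside their allowed range."""
    seen = set()
    report = {"missing": 0, "duplicates": 0, "out_of_range": 0}
    for rec in records:
        key = tuple(sorted(rec.items()))  # fingerprint of the whole record
        if key in seen:
            report["duplicates"] += 1
            continue
        seen.add(key)
        if any(rec.get(field) in (None, "") for field in required):
            report["missing"] += 1
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if value not in (None, "") and not (low <= value <= high):
                report["out_of_range"] += 1
    return report


# Hypothetical loan applications containing typical dirty-data problems.
loans = [
    {"id": 1, "amount": 5000, "age": 34},
    {"id": 2, "amount": None, "age": 41},   # missing amount
    {"id": 1, "amount": 5000, "age": 34},   # exact duplicate
    {"id": 3, "amount": 7000, "age": 230},  # implausible age
]
print(quality_report(loans, required=["id", "amount", "age"],
                     valid_ranges={"age": (18, 100)}))
```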
Demand for Data Science
 McKinsey Global Institute on Big Data Jobs (2011)
http://www.mckinsey.com/mgi/publications/big_data/index.asp
 Estimated shortage of 140,000 - 190,000 people with data analytics skills by 2018
 UK Big Data skills report 2014
 6400 UK organisations with 100+ staff will have implemented
Big Data Analytics by 2020
 Increase of Big Data jobs from 21,400 (2013) to 56,000 (2017)
 European Data Market (2015)
 Number of data workers 6.1 million (2014)
– increase 5.7% from 2013
 Average number of data workers per company 9.5 - increase 4.4%
 Gap between demand and supply 509,000 (2014) or 7.5%
Demand for Data Science
 High Level Expert Group on the European Open Science Cloud
identified the need for data experts and data stewards
 Estimate: more than 500,000 data stewards needed
(1 per every 20 scientists, or 5% of funding)

 Challenges and observations:
 Shortage of data experts
 Data literacy gap
 Requirements: core data experts and domain specialists need to be trained and
their career perspective significantly improved
 Support: professional data management and long-term data stewardship

Need Data Science?
• Do the above problems need a new science?
• In the 20th century, the data on which novel theories were based was buried in the
notebooks of researchers
• We wish for a time when data will be permanently stored
• Is the problem beyond the existing disciplines?
• What then is Data Science?
• A data scientist is someone who knows more statistics than a computer scientist and more
computer science than a statistician
• Can you define data science by what data scientists do? Who gets to define the field,
anyway?
• There’s lots of buzz and hype—does the media get to define it, or should we rely
on the practitioners, the self-appointed data scientists? Or is there some actual
authority?

Definition of Data Science
• What is data science?
• Is it new, or is it just statistics or analytics rebranded?

• Data science is the civil engineering of data.


• Big Data = Organic Data
• Survey Data = Designed Data

• Data Science provides ways to deal with and benefit from Big
Data:
• To see patterns
• To discover relationships
• To make sense of stunningly varied images and information.

Definition of Data Science
 Data science is the extraction of actionable knowledge directly
from data through a process of discovery, or hypothesis
formulation and hypothesis testing.

 Data science incorporates principles, techniques, and methods
from many disciplines and domains, including data cleansing,
data management, analytics, visualization, and engineering, and, in
the context of Big Data, now also includes Big Data Engineering.
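
The "hypothesis formulation and hypothesis testing" part of this definition can be illustrated with a small permutation test. This is a sketch, not a prescribed method: the choice of a permutation test, the function name, and the sample waiting times are assumptions made for the example. The hypothesis is that two groups have the same mean, and the test estimates how often a random regrouping of the data gives a difference at least as large as the one observed.

```python
import random


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means
    by repeatedly shuffling the pooled observations."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        group_a, group_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical waiting times (minutes) at two branches.
branch_a = [10, 11, 12, 13, 14, 15]
branch_b = [20, 21, 22, 23, 24, 25]
print("p-value:", permutation_test(branch_a, branch_b))
```

A small p-value suggests the observed difference is unlikely to arise from random grouping alone, which turns a pattern noticed in the data into a tested claim.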
Definition of Data Science
• Data science is a methodology by which a data scientist takes
unspecified portions of statistics, scientific rigor and systemic
capabilities to ensure that an answer to a data question is
accurate.

• Data science combines various technologies, techniques, and theories
from various fields, mostly related to computer science and statistics, to
obtain actionable knowledge from data.

Definition of Data Science
• Data science (DS) deals with both unstructured and structured data. It is a field that
comprises everything related to data cleansing, preparation,
and analysis.

• DS is the combination of statistics, mathematics, programming,
problem solving (capturing data in ingenious ways), the ability to look
at things differently, and the activity of cleansing, preparing, and
aligning the data.

• In simple terms, it is the umbrella of techniques used when trying to
extract insights and information from data.
Data Science
