Agenda
1. Motivation
2. What is Big Data?
3. Data Science
4. KDD process
5. Some (small) examples
6. Final words
Motivation
Papyrus of Oxyrhynchus (ca. 100 AD)
Papyrus of Euclid’s geometry with diagram (Oxyrhynchus, Egypt, ca. 100 AD, now at the University of Pennsylvania)
Punch cards (1937)
Floppy disks (1960)
Prince of Persia (1989)
Platform: Apple II
Prince of Persia: The Forgotten Sands (2010)
Platform: PS3
What happens on the Internet?
60 seconds on the Internet
The republic of Facebook
Structured vs Unstructured data
IoT: Smart things
https://www.cisco.com/c/dam/en_us/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf
What is Big Data?
What do the dictionaries say? (1)
What do the dictionaries say? (2)
[Wikipedia] Big data is a term for data sets that are so large or complex
that traditional data processing application software is inadequate to deal
with them. Big data challenges include capturing data, data storage, data
analysis, search, sharing, transfer, visualization, querying, updating and
information privacy.
Big (Rich) Data
Big Data and the four Vs
Value + Vulnerability
Turning Big Data into value
Big Data vs Small Data
Big Data landscape
https://mattturck.com/data2019/
Methods of the past vs current methods (1)
Methods of the past vs current methods (2)
Big Data trends
Data Science
What do the dictionaries say? (1)
What do the dictionaries say? (2)
Pillars of Data Science
• Business domain
• Statistics and probability
• Mathematical thinking
• Computer science and software programming
• Written and verbal communication
Data Scientist skills
http://upxacademy.com/data-scientist/
Data Scientist: the sexiest job
KDD process
1 Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. (1996). From data mining to knowledge discovery: an overview. Advances in Knowledge Discovery and Data Mining, pp. 1-34.
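As a rough illustration, the five KDD steps described by Fayyad et al. (selection, preprocessing, transformation, data mining, interpretation/evaluation) can be sketched as a small Python pipeline. The toy records, the function names, and the 15 °C threshold are assumptions made for this example only; they do not come from the slides or from the cited paper.

```python
from statistics import mean

# Hypothetical toy records (assumption): (city, temperature in °C, or None if missing)
RAW = [
    ("Lima", 19.0), ("Lima", 21.0), ("Lima", None),
    ("Cusco", 11.0), ("Cusco", 13.0), ("Piura", 29.0),
]

def select(records, cities):
    """Selection: keep only the target subset of the data."""
    return [r for r in records if r[0] in cities]

def preprocess(records):
    """Preprocessing: drop records with missing values."""
    return [r for r in records if r[1] is not None]

def transform(records):
    """Transformation: group temperatures by city."""
    grouped = {}
    for city, temp in records:
        grouped.setdefault(city, []).append(temp)
    return grouped

def mine(grouped):
    """Data mining: a deliberately trivial pattern, the mean temperature per city."""
    return {city: mean(temps) for city, temps in grouped.items()}

def interpret(patterns, threshold=15.0):
    """Interpretation/evaluation: turn the patterns into a readable label."""
    return {city: ("warm" if avg >= threshold else "cold")
            for city, avg in patterns.items()}

def kdd(records, cities):
    """Chain the five steps in order."""
    return interpret(mine(transform(preprocess(select(records, cities)))))

result = kdd(RAW, {"Lima", "Cusco"})
print(result)
```

In practice each step is iterative and may send the analyst back to an earlier one; the linear chain here is only the simplest reading of the process.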
6-step Knowledge Discovery Process
2 Pal, N.R., Jain, L.C., Eds. (2005). Advanced Techniques in Knowledge Discovery and Data Mining, Springer
Knowledge Discovery Process
3 Cios, K.J., Pedrycz, W., Swiniarski, R.J. (2007). The Knowledge Discovery Process. Springer
CRISP-DM
Data-driven process for Big data
https://www.researchgate.net/figure/Big-picture-of-the-data-driven-process-for-crowd-management-and-control_fig1_332775847
Some (small) examples
Data summarization
Event summarization using tweets - Work developed in collaboration with Arturo Oncevay - PUCP
Text Mining
Study of the perception of citizen insecurity - Work developed in collaboration with Juandiego Morzan - UP
Analysis of a meteorological phenomenon
Analysis of epidemiological data
Epidemiological pattern visualization - Work developed in collaboration with Agustín Guevara - PUCP
Studying resilience in Peru (1)
Studying resilience in Peru (2)
Inhabitants' mobility in Lima
Final words
Hugo Alatrista Salas, Ph.D.
Escuela de Posgrado Newman
https://simbig.org/alatrista-salas/
E-mail: hugo.alatrista@epnewman.edu.pe