30 Open Technet
2012.07.26
R,
What is R?
R is a language and environment for statistical computing and graphics.
• An environment
• A community
- contributors, 2
R
Source: Revolution Analytics
Excellence in Application & Data Governance
R
• tool
R
• Google, Facebook, R
Vendor R
• Oracle, IBM Netezza, SAP HANA, Teradata: in-memory, in-database R
Vendor R
• SAS, SPSS, R
SAS/JMP
SPSS with R
• SPSS R ?
- R
SPSS, R
SPSS, TEST
Tobit, SPSS
Source: RUserConf2011 - SPSS R, SPSS Korea, 2011.10.28
Big Data R
Source: Revolution Analytics
R
• In-Memory Computing
- H/W
• Object-oriented programming
- object
- class & method
• Package
- Help, Examples
• Visualization
- Chart, Plot, MotionChart, Map
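The class and method idea in the object-oriented bullet can be sketched with base R's S3 system. This is only a minimal illustration and does not come from the slides: the class name `tollgate`, the constructor `new_tollgate`, and the sample values are all hypothetical.

```r
# Hypothetical S3 class: the constructor returns a list tagged with a class name
new_tollgate <- function(name, lat, lon) {
  structure(list(name = name, lat = lat, lon = lon), class = "tollgate")
}

# A method is an ordinary function named <generic>.<class>
format.tollgate <- function(x, ...) {
  sprintf("%s (%.2f, %.2f)", x$name, x$lat, x$lon)
}

print.tollgate <- function(x, ...) {
  cat("<tollgate>", format(x), "\n")
  invisible(x)
}

gate <- new_tollgate("Example", 36, 128)
label <- format(gate)   # dispatches to format.tollgate
print(gate)             # dispatches to print.tollgate
```

Calling the generic (`format`, `print`) rather than the method directly is what lets the same call work for objects of any class.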
History of R
: (, 2011)
Korea: http://cran.nexr.com/
• R CRAN Site
• 87 mirrors in 39 countries
R Package
R Package (http://cran.r-project.org/web/packages/)
R
• IT
• Software Vendor, Version Up
R Package
Crantastic! (http://crantastic.org)
R Visualization
ggplot2
• ggplot2 is a flexible, colorful, and dynamic graphics package.
R Visualization
Word Cloud
• R text mining: Word Cloud
R Visualization
googleVis
• R interface to the Google Chart API
http://code.google.com/p/google-motion-charts-with-r/
R Visualization
RgoogleMaps
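# Load the RgoogleMaps package
library(RgoogleMaps)
# Read the tollgate data
tollgate_info <- read.csv(".csv")
# Map center as c(latitude, longitude)
map.center.loc <- c(36, 128)
# Zoom level
input.zoom <- 7
map_data <- tollgate_info
# Open a graphics window (win.graph() is Windows-only) and fetch the map
win.graph()
mymap <- GetMap(center = map.center.loc, zoom = input.zoom,
                maptype = "roadmap", format = "png32",
                destfile = "mymap.png")
# Plot each tollgate on the map (Y = latitude, X = longitude)
PlotOnStaticMap(mymap, lat = map_data$Y, lon = map_data$X,
                destfile = "mymap.point.png", cex = 1, pch = 20,
                col = "blue")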
: R ,
R Visualization
animation
• R Graph animation
# Build an HTML animation with the animation package
library(animation)
saveHTML({
  # Draw one map frame per unique name
  for (map.i in seq_along(unique.name)) {
    mymap <- GetMap(center = map.center.loc, zoom = input.zoom,
                    maptype = "roadmap", format = "png32",
                    destfile = "mymap.png")
    PlotOnStaticMap(mymap,
                    lat = map_data[map_data$ == unique.name[map.i], ]$Y,
                    lon = map_data[map_data$ == unique.name[map.i], ]$X,
                    destfile = "mymap.point.png", cex = 1, pch = 20,
                    col = "blue")
  }
}, img.name = "unif_plot", imgdir = "unif_dir",
htmlfile = "random.html",
autobrowse = FALSE, title = "
",
description = c("RgoogleMaps
.\n\n"))
: R ,
Useful R IDE
RGUI
• RGui
Useful R IDE
RStudio (http://www.rstudio.org/)
Useful R IDE
Red-R (http://www.red-r.org/)
Useful R IDE
RExcel (http://rcom.univie.ac.at/)
Useful R IDE
R Commander
Useful R IDE
Revolution R (http://www.revolutionanalytics.com/)
Useful R IDE
StatET for R
R
• http://www.r-project.org
• http://www.r-project.kr (KRUG, Korean R User Group)
• http://www.r-bloggers.com
• http://stackoverflow.com
• http://stats.stackexchange.com
• http://www.inside-r.org/
• http://www.r-statistics.com/
• http://support.rstudio.org/
• http://quora.com
Learning R
Source: Revolution Analytics
R,
Advanced Analytics
• Advanced Analytics
Figure: Competitive advantage (vertical axis) against Degree of Intelligence (horizontal axis), from highest to lowest:
Analytics
- Optimization
- Predictive Modeling
- Forecasting/extrapolation
- Statistical Analysis
Access and Reporting
- Alerts
- Query/drill down
- Ad hoc Reports
- Standard Reports (What happened?)
Source: Davenport and Harris, Competing on Analytics, Harvard Business School Press (2007)
R
• Single Core
- Uses 1 CPU
- parallel package included since R 2.14
• In-Memory
Open Source
• snow, multicore, parallel, bigmemory: Multi-core
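The single-core limitation and the `parallel` package mentioned above can be illustrated with a small sketch. It assumes a machine that can start two local worker processes. The task function `slow_square_sum` and the input sizes are hypothetical and chosen only for illustration.

```r
library(parallel)  # ships with R since 2.14.0

# Hypothetical CPU-bound task: sum of squares from 1 to n
slow_square_sum <- function(n) sum(as.numeric(seq_len(n))^2)

inputs <- c(1e4, 2e4, 3e4, 4e4)

# Serial baseline: everything runs on one core
serial <- lapply(inputs, slow_square_sum)

# Start two local worker processes and spread the same work across them
cl <- makeCluster(2)
par_result <- parLapply(cl, inputs, slow_square_sum)
stopCluster(cl)

# Both approaches should return the same values
identical(serial, par_result)
```

For tasks this small the cost of starting workers outweighs any gain. The pattern pays off only when each task is expensive.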
RHIPE (http://www.datadr.org/index.html)
• RHIPE (R and Hadoop Integrated Programming Environment)
- Purdue Univ., Saptarshi Guha
• R, Hadoop MapReduce
• Revolution Analytics RHadoop
• Divide & Recombine (D&R), not parallel processing
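The Divide & Recombine idea can be sketched locally in base R, without Hadoop or the RHIPE API. The example below uses the built-in `mtcars` data. Dividing by `cyl`, fitting `mpg ~ wt`, and recombining by averaging the coefficients are assumptions made for illustration, not choices taken from the slides.

```r
# Divide: split the built-in mtcars data into subsets by cylinder count
subsets <- split(mtcars, mtcars$cyl)

# Apply: fit the same small model to every subset independently
fits <- lapply(subsets, function(d) coef(lm(mpg ~ wt, data = d)))

# Recombine: stack the per-subset coefficients and average them
coef_table <- do.call(rbind, fits)
recombined <- colMeans(coef_table)

coef_table
recombined
```

Because each subset is analyzed independently, the apply step is the part that a system such as RHIPE would distribute across a cluster.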
RHadoop (https://github.com/RevolutionAnalytics/RHadoop)
• Hadoop Map-Reduce R Code
Hadoop
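The map, shuffle, and reduce steps that RHadoop runs on a cluster can be imitated locally in base R. The sketch below is a word count on made-up input and does not use the RHadoop packages or any Hadoop API. The names `input_lines` and `map_fn` are hypothetical.

```r
# Toy input standing in for lines of a file stored on HDFS (made-up data)
input_lines <- c("r hadoop r", "hive hadoop", "r")

# Map: emit one (word, 1) pair for each word in a line
map_fn <- function(line) {
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  data.frame(key = words, value = 1L, stringsAsFactors = FALSE)
}
pairs <- do.call(rbind, lapply(input_lines, map_fn))

# Shuffle: group the emitted values by key
grouped <- split(pairs$value, pairs$key)

# Reduce: sum the values for each key
counts <- vapply(grouped, sum, integer(1))
counts
```

On a real cluster the map and reduce functions would run on different nodes, and the framework would handle the shuffle between them.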
Source: Revolution Analytics
: R ,
RHive (https://github.com/nexr/RHive)
• R Hive, R
Hive Hadoop
RHive Example
Source: http://www.nexr.co.kr/nexrcorp_en/products_and_services/rhive_useCase.php
RevoScaleR
• RevoScaleR, Single Core
• HPC Clusters/Cloud, R
High-Performance Computing
(http://cran.r-project.org/web/views/HighPerformanceComputing.html)
• Parallel computing: Explicit parallelism: Rmpi, snow, snowfall, foreach, doMPI
• Parallel computing: Implicit parallelism: fork, multicore
• Parallel computing: Grid computing
• Parallel computing: Hadoop
• Profiling tools: profr, proftools
RevoDeployR (Architecture)
• RevoDeployR, R Application
Source: Revolution Analytics
• ?
RDBMS (Scale-up)
Scale-out ?
• ?
SAS, SPSS ?
Efficiency
Scalability
Agility
Hadoop
: R ,
• Data Scientist
study, study, study
DG
Tel : 010-4801-6609
email : leewow@gtone.co.kr