Luis Figueroa
Solutions Architect
BlueGranite, Inc
MCITP SQL Server 2008
Microsoft V-TSP
Email: lfigueroa@blue-granite.com
Twitter: @luisefigueroa
LinkedIn: https://www.linkedin.com/in/luisefigueroa
WHY ARE WE HERE TODAY?
• The increasing amount and complexity of data make it more and more
difficult to draw insights with simple data exploration techniques. Advanced
statistics are required to learn more from our data.
Drew Conway
http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
The Data Science Process
http://benfry.com/phd/
Why R?
• R is free
• Azure ML supports R
1997 - The R core group is formed. This group governs the evolution of the R
language.
Download RStudio:
http://www.rstudio.com/products/rstudio/
Revolution R Open
http://mran.revolutionanalytics.com/download/
To manipulate data
dplyr - Essential shortcuts for subsetting, summarizing, rearranging, and joining together data sets. dplyr is
our go-to package for fast data manipulation.
tidyr - Tools for changing the layout of your data sets. Use the gather and spread functions to convert your data
into the tidy format, the layout R likes best.
stringr - Easy to learn tools for regular expressions and character strings.
lubridate - Tools that make working with dates and times easier.
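For readers coming from SQL, a short sketch may help connect these packages to familiar ideas. It is only an illustration, not part of the original project: it assumes dplyr and tidyr are installed, uses the built-in mtcars data set as a stand-in for a table, and the variable names (by_cyl, long) are made up for this example. The dplyr pipeline is roughly what a SELECT ... WHERE ... GROUP BY ... ORDER BY query would do, and the gather call reshapes the result from wide to long.

```r
# Assumes the packages are installed: install.packages(c("dplyr", "tidyr"))
library(dplyr)
library(tidyr)

# mtcars ships with R; here it plays the role of a SQL table.
# Roughly equivalent to:
#   SELECT cyl, COUNT(*) AS cars, AVG(mpg) AS avg_mpg
#   FROM mtcars WHERE hp > 100
#   GROUP BY cyl ORDER BY avg_mpg DESC
by_cyl <- mtcars %>%
  filter(hp > 100) %>%                           # WHERE
  group_by(cyl) %>%                              # GROUP BY
  summarise(cars = n(), avg_mpg = mean(mpg)) %>% # aggregates
  arrange(desc(avg_mpg))                         # ORDER BY ... DESC

print(by_cyl)

# tidyr::gather() converts the two measure columns into key/value rows
# (wide to long), one row per cylinder count and measure.
long <- gather(by_cyl, key = "measure", value = "value", cars, avg_mpg)
print(long)
```

Each step in the pipeline takes a data frame and returns a data frame, which is why the verbs can be chained with %>% in the same order the query would be reasoned about.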
TryR
http://tryr.codeschool.com
Swirl
http://swirlstats.com
R-Bloggers
http://www.r-bloggers.com
StackOverflow
http://stackoverflow.com/questions/tagged/r
http://www.image-net.org
Resources - Infographics
http://azure.microsoft.com/en-us/documentation/articles/machine-learning-algorithm-cheat-sheet/
Download today’s slides and project from:
https://github.com/luisefigueroa/Intro-to-R-for-the-SQL-Server-Pro