Introduction
• R is a free, open-source statistical programming language and environment,
based on the S language (developed at Bell Labs).
• R was created by Ross Ihaka and Robert Gentleman at the University of
Auckland, New Zealand.
• The language is well suited to writing programs for data analysis.
• Many statistical functions are already built in.
• Contributed packages extend the functionality, including cutting-edge research methods.
Getting Started
• Download R from www.r-project.org
• Download RStudio from https://www.rstudio.com/
R Operations
• Mathematical Operators in R
– +, -, *, / - Simple mathematical operations
– X^Y - X raised to the power Y
– sqrt(x) - Square root of x
– abs(x) - Absolute value of x
– factorial(x) - Factorial of x
– log(x) - Natural logarithm of x
– cos(x), sin(x), tan(x) - Trigonometric functions (x in radians)
• Logical Operators
– < less than
– <= less than or equal to
– > greater than
– >= greater than or equal to
– == exactly equal to
– != not equal to
– !x NOT x
– x | y x OR y
– x & y x AND y
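The operators listed above can be tried directly in the console. The following is a minimal sketch; the variable names (x, y, a, b and the result names) are illustrative assumptions and do not come from the slides.

```r
# Arithmetic operators and built-in math functions
x <- 7
y <- 2
sum_xy   <- x + y        # addition
diff_xy  <- x - y        # subtraction
prod_xy  <- x * y        # multiplication
quot_xy  <- x / y        # division
power_xy <- x^y          # x raised to the power y
root     <- sqrt(16)     # square root
absolute <- abs(-3.5)    # absolute value
fact     <- factorial(5) # 5 * 4 * 3 * 2 * 1
nat_log  <- log(exp(1))  # log() is the natural logarithm
cosine   <- cos(0)       # trigonometric functions take radians

# Comparison operators return TRUE or FALSE
is_greater <- x > y
is_equal   <- x == y
not_equal  <- x != y

# !, | and & work element by element on logical vectors
a <- c(TRUE, FALSE, TRUE)
b <- c(TRUE, TRUE, FALSE)
not_a   <- !a
a_or_b  <- a | b
a_and_b <- a & b
print(a_and_b)
```

Because | and & are element-wise, they return one result per vector position, which is what makes them useful for filtering data.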
Declaring Variables in R
• Two ways to assign values
I. Using the "=" symbol
>MyVar=10
II. Using the "<-" symbol
>MyVar<-10
• To print the variable
>print(MyVar)
[1] 10
• In the console, code is typed after the ">" prompt; the "[1]" at the start of the output is the index of the first value shown on that line.
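As a minimal sketch of both assignment forms, the script below omits the ">" prompt, since the prompt is shown by the console and is not typed. The names my_var and other_var are illustrative assumptions.

```r
# Two equivalent ways to assign a value
my_var <- 10      # "<-" is the conventional assignment operator in R
other_var = 10    # "=" also works at the top level

# print() shows the value; typing the name alone at the console does the same
print(my_var)

# Both variables hold the same value
same_value <- identical(my_var, other_var)
print(same_value)
```

Many R style guides prefer "<-" for assignment and reserve "=" for naming function arguments, although both forms give the same result here.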
Data types in R
• R organizes data in the following structures
– Scalar : Represents a single value; R stores it as a vector of length 1
– Vector : Represents a sequence of values of the same type (1 dimensional)
• Numeric (including integer) Vectors, Character Vectors and Factors
– Matrix : Represents a table-like format with all values of the same type (2 dimensional)
– Array : Generalizes the matrix to any number of dimensions (typically >2)
– List : General vector whose elements can be objects of different kinds
– Data Frame : Represents a table-like format whose columns can have different types (2 dimensional)
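To make the structures above concrete, here is a small sketch that creates one object of each kind with base R. All object names and values are made up for illustration.

```r
# A "scalar" is simply a vector of length 1
scalar_value <- 42

# Vectors hold values of a single type
numeric_vector <- c(1.5, 2.5, 3.5)
fruit_factor   <- factor(c("Apple", "Mango", "Apple"))  # categorical data

# Matrix: 2 dimensions, one type
score_matrix <- matrix(1:6, nrow = 2, ncol = 3)

# Array: any number of dimensions, one type
cube_array <- array(1:24, dim = c(2, 3, 4))

# List: elements may be of different kinds and lengths
mixed_list <- list(id = 1L, name = "Apple", scores = c(90, 85))

# Data frame: table whose columns may have different types
fruit_table <- data.frame(
  name  = c("Apple", "Orange", "Mango"),
  price = c(1.2, 0.8, 1.5),
  stringsAsFactors = FALSE
)

# Inspect the shape of each object
print(length(scalar_value))
print(dim(score_matrix))
print(dim(cube_array))
print(levels(fruit_factor))
str(fruit_table)
```

str() is a convenient way to check the type of each column in a data frame.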
Data types : Examples
• Scalar :
>var1 <- 1
• Vector :
• The "c" (combine/concatenate) function is used to create a vector
>Var2 <- c(1,2,3,4,5)
>Var2 <- c("Apple", "Orange", "Mango")
>Var3 <- c("Hello2", 20, "Hello4", 30)
• Mixing types, as in Var3, converts every element to character
• Colon Operator (:) for creating a sequence of numbers
>Var4 <- c(1:15)
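The following sketch checks the type of each vector built with c() and the colon operator. The variable names are illustrative and differ from the slide so that no value is overwritten.

```r
# Vectors of a single type
numbers <- c(1, 2, 3, 4, 5)
fruits  <- c("Apple", "Orange", "Mango")

# Mixing strings and numbers: every element is coerced to character
mixed <- c("Hello2", 20, "Hello4", 30)

numbers_class <- class(numbers)  # "numeric"
fruits_class  <- class(fruits)   # "character"
mixed_class   <- class(mixed)    # "character"
print(mixed)                     # the numbers now appear as "20" and "30"

# The colon operator creates an integer sequence; wrapping it in c() is optional
int_sequence   <- 1:15
sequence_class <- class(int_sequence)               # "integer"
same_sequence  <- identical(int_sequence, c(1:15))  # TRUE
print(length(int_sequence))
```

When a vector needs to hold values of different types without coercion, a list is the usual alternative.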
Data types : Examples
• Vector : To access elements in vectors (indexing starts at 1)
>Var2 <- c("Apple", "Orange", "Mango")
• Var2[1] = "Apple"; Var2[2] = "Orange"; Var2[3] = "Mango"
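Beyond single positions, square brackets accept several other kinds of index. This is a brief sketch with illustrative variable names.

```r
fruits <- c("Apple", "Orange", "Mango")

first_fruit  <- fruits[1]                   # single element; R indexing starts at 1
first_third  <- fruits[c(1, 3)]             # several positions at once
without_2nd  <- fruits[-2]                  # negative index drops that position
not_orange   <- fruits[fruits != "Orange"]  # logical index keeps matching elements
out_of_range <- fruits[4]                   # past the end gives NA, not an error

# Elements can be replaced in place by assigning to an index
fruits[2] <- "Banana"
print(fruits)
```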
• To write a data frame (here named file) to a CSV file
>write.csv(file, "File_Name.csv")
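A small round trip shows write.csv in use. This sketch assumes a made-up data frame (fruit_table) and writes to a temporary directory instead of the working directory; both choices are for illustration only.

```r
# Illustrative data frame to export
fruit_table <- data.frame(
  name  = c("Apple", "Orange", "Mango"),
  price = c(1.2, 0.8, 1.5)
)

# Write to a temporary location so the working directory stays clean
csv_path <- file.path(tempdir(), "File_Name.csv")

# row.names = FALSE avoids writing the row numbers as an extra first column
write.csv(fruit_table, csv_path, row.names = FALSE)

# Read the file back to confirm what was written
read_back <- read.csv(csv_path)
print(read_back)
```

Without row.names = FALSE, the file gains an unnamed first column of row labels, which then comes back as a column named X when the file is read again.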
Machine Learning in R
• kmeans (in the stats package) : K-means clustering
• e1071 : Support vector machines, bagged clustering, naive Bayes classifier
• rpart : Recursive partitioning and regression trees
• nnet : Feed-forward neural networks and multinomial log-linear models
• randomForest : Random forests for classification and regression
• caret : Classification And REgression Training
• glmnet : Lasso and elastic-net regularized generalized linear models
• gbm : Generalized boosted regression models
• arules : Mining association rules and frequent itemsets
• tree : Classification and regression trees
• ipred : Improved predictors
• mboost : Model-based boosting
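As one example from the list, K-means clustering needs no extra package because kmeans() ships with base R in the stats package. The sketch below uses the built-in iris data set; the choice of three clusters, the seed, and nstart = 20 are assumptions made for illustration.

```r
# Fix the random seed so the random starting centers are reproducible
set.seed(42)

# Use the four numeric measurement columns of the built-in iris data
features <- iris[, 1:4]

# Fit K-means with 3 clusters; nstart = 20 tries 20 random starts and keeps the best
fit <- kmeans(features, centers = 3, nstart = 20)

# Cluster centers: one row per cluster, one column per feature
print(fit$centers)

# Compare the cluster assignments with the known species labels
print(table(cluster = fit$cluster, species = iris$Species))
```

The species column is not used for fitting; it only serves to check afterwards how well the clusters line up with the known labels.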
R Packages
• To install or add new R packages
– install.packages("package_name")
• To load the package
– library(package_name)
• To list the packages available to load
– library()
• To see details of installed packages on R
– installed.packages()
• CRAN (Comprehensive R Archive Network)
– https://cran.r-project.org/
– At the time of writing, the CRAN package repository features 10480 available packages.
• You can create your own package
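A common pattern combines the install and load steps so that a package is installed only when it is missing. The helper name load_or_install is hypothetical and not part of R; the sketch calls it on "stats", which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is not already available
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  library(pkg, character.only = TRUE)  # load the package by its name as a string
}

# "stats" is part of every R installation, so this call only loads it
load_or_install("stats")

# installed.packages() returns a matrix with one row per installed package
installed <- rownames(installed.packages())
print(head(installed))
has_stats <- "stats" %in% installed
print(has_stats)
```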
Thank You