9 views

Uploaded by Suren Markosov

rguide

save

- Guide to Eng. Data Science Lecture (1) Ver2
- SM Programming Tutorial
- 8913A_08
- Classwork+5+Objects.docx
- Software Design Document csv2kml
- How To Use Keygen.txt
- 최종본 Youngsue Han(145-168)
- Commons Based Peer Production Datasets
- Exercises 1 Into R
- Classwork+5+Objects.docx
- 8943_y04_qp_5202
- Zpl Manual ZEBRA
- ckan.pdf
- Crash Course
- _Abstracts
- UA2Manual En
- RHHPC Installation Guide 20100524
- D73488GC11_ag.pdf
- refman-5.6-en.a4_MySQL.pdf
- Tf2.0 User Manual-Installation Guide
- Python for Econometrics, Statistics and Data Analysis
- WDM 4.8 Administrators Guide JUL2010
- IELTS for Teachers
- Change Log
- 1422C
- Wise Package Studio 7.0 SP3 Reference_V1.0
- beginnersguide_Iraf
- Front CoverCS2406 Lab Manual
- patil nishant finalresearchpaper
- Installation Manual 4.0
- T02 - Installment Sales & Consignment Sales.pdf
- berita 1.docx
- X7. Etapas.pdf
- sdfsdfsdf
- Chritical Dan Tanggapan Mengenai Laporan Pelaksanaan Rha Bencana Banjir Di Desa Tanjung Jaya Kota Bengkulu Pada Tahun 2013
- Lejanias
- Ejercicios Resueltos Del Tema 6. OCW Economia 2013 Definitiva
- CONTOH SURVEI PELANGGAN.docx
- 09_Arte Renacimiento Italiano
- concentracion eficaz
- 345742745 Manual de Taller Motor Perkins Serie 400C
- 2BL01183
- TALLER EVALUADO DE PORCENTAJES.docx
- Eds 2015 - Smp
- A Squire s Trial
- Arthur W. Pink - Gospel Preaching Commanded.pdf
- c6 Perceptron Multicapa Parte 3
- Kwitansi Imunisasi 2014.xlsx
- 339272846-Pedoman-Tata-Naskah-Puskesmas-Fix.doc
- lecture5.pdf
- El Cañoncito de Ramón Castilla
- Lampiran_Daya_Tampung_Unsoed_Program_Sarjana_Tahun_2018.pdf
- Proposal Babi
- Guía de Actividades y Rúbrica de Evaluación Unidad 3 Etapa 4 Estudios Epidemiológicos
- 145gfdsfs.docx
- Dr.mangalindung
- Eeeee
- Minerva Si Neptun
- Bronx English 2018_web
- june med school cafe
- Kx Intel Solution 1309
- Can a Dead Cat Bounce Make Money
- Handling Bugs
- Metaprogramming for Ocaml
- Kx Finance 1312
- EViews 5 Users Guide
- Quant research paper
- plt-book
- 43-4
- Does Price Matter
- Math Logic
- Cuckoo Search
- PairsTrading
- High Frequency Pairs Trading
- r Studio 101
- EViews7WhatsNew
- Jeff Ryan
- 2008_80
- r Studio Cheat Sheet
- STATISTICAL MODELING OF HIGH FREQUENCY FINANCIAL DATA:
- SSRN-id263745
- Readme
- Pairs Trading: A Cointegration Approach
- Greenwood-2008_Exercise and the Brain
- Forrester_SharePoint and BPM %E2%80%94 Finding the Sweet Spot_Derek Miers[1]
- Module 1 Intro to R and RStudio_1

You are on page 1of 6

Prof. Dr. Kurt Schmidheiny Universit¨ at Basel

A Short Guide to R with RStudio

2

1

Introduction

A Short Guide to R with RStudio

1 Introduction 2 Installing R and RStudio 3 The RStudio Environment 4 Additions to R: Packages 5 Where to get help 6 Opening and Saving Data 7 Importing Data 8 Data Manipulation 9 Descriptive Statistics 10 Graphs 11 OLS Regression 12 Important Functions and Operators

3 3 3

This guide introduces the basic commands of the statistical software R using the graphical interface RStudio. Examples are shown using Windows, the adaption to other platforms such as Mac OS and Linux is straightforward. This guide only introduces the basic commands for data management and estimation. More commands (on panel data, limited dependent variables, monte carlo experiments, etc.) will be described in the respective handouts. R commands are set in Courier.

2

3 5 5 5 7 9 9 11 12

Installing R and RStudio

R is a free open-source statistical software for various platforms such as Windows, Mac OS and Linux. R can be download from http://www.rproject.org/. RStudio is a convenient environment to run R. It is also a free, open-source, cross-platform software available at http://rstudio.org/. Install R ﬁrst, and then RStudio.

3

The RStudio Environment

When you start RStudio you will see a window with several panes: the Console pane where you type in your R commands and where results will be reported, the Workspace pane where open datasets and other objects are listed, the Plots pane where graphs are drawn, the History pane with a list of all your previously issued commands. More panes will be described below.

4

Additions to R: Packages

Many researchers provide their own R programs through the R project webpage. Many packages are already preinstalled in the basic R instal-

Version: 22-9-2013, 21:42

They can be directly activated from RStudio: go to the Packages pane and tick the corresponding package. where mydata is the name of the dataframe to be saved. they are activated by issuing a command in the Console. For example.RData or . Alternatively. 7 Importing Data Data from common statistical packages like SAS. file="example. You can view the dataframe by double-clicking on its name. mydata. software such as Stata and SAS. or > load("c:/my documents/mydir/example. in the Workspace pane. > help. is displayed in the Help pane by issuing the command > ?load The data is stored in an object called dataframe in the R workspace and appears in the Workspace pane. In the usual case where we are using variables from only one dataframe. STATA can be imported using the package foreign. the following com- . For example. load. e.3 Short Guides to Microeconometrics A Short Guide to R with RStudio 4 lation.search("read") Save data in R format: > save(mydata. you can search the R documentation. the help for a known command. For example.RData") 5 Where to get help The online help in R describes all basic R commands as well as commands in active packages. The working directory is speciﬁed depending on the operation system by > setwd("/Users/sck/mypath") > setwd("c:/my documents/mydir") > setwd("c:\\my documents\\mydir") # example Mac OS # example Windows # alternative Windows activates additional commands to import data from foreing.RData") You will need to activate the new package before using it. You can directly search the online help from the Help pane in RStudio. Alternatively.g. you can list the ﬁrst lines with the command > head(mydata) If you don’t know the exact expression for the command. > library("foreign") 6 Opening and Saving Data R will look for data or save data in the drive and working directory.rda) > load("example. SPSS. New packages can be installed by clicking on Install Packages in the Packages pane or with the command > install.g. Open an existing R dataﬁle (extension . i.RData") lists in the top-left pane all packages and commands that contain the word ”read”.e. other. it is convenient to apply the command > attach(mydata) or > help("load") which allows to access variables without specifying the dataframe.packages("foreign") Note: R cannot handle the usual windows path "c:\my documents\mydir". Alternatively. e.

“N/A”) to represent missing data. a comma (“.dta("c:/mypath/example.csv("c:/mypath/example. Categorical variables can only take a limited number of ordered or unordered values. where the option convert. 999 or -1).. • Excel uses a diﬀerent separator depending on the country settings: with English settings.csv("example. > mtcars$logmpg <. • Make sure that there are only dots (“.csv). If the spreadsheet to be imported contains missing value codes. then select Cells. Excel: • Make sure that missing data values are coded as empty cells or as numeric values (e. the following commands converts these into the R code for missing values: > mydata <. The ﬁle will be saved with a .” or ‘. .read. A numeric dummy variable is created by > mtcars$heavy = as.1. Numeric variables can be turned into factor variables by .strings="-999") A new variable in the dataframe mtcars is created by e. the command is simply > mtcars$logmpg <. Comma-separated ﬁles can be imported with the following command: > mydata <.g.csv". .”) in numbers. Logical variables are treated like a factor variable in estimation commands. the old one will be automatically replaced.log(mtcars$mpg) where logmpg is the name of the new variable and log(mpg) is an example of a mathematical function of existing variables.factors=FALSE) Under the File menu. convert. with other languages a semicolon (“.g. The example data is loaded and stored in dataframe mtcars as > data(mtcars) where example.”) is typically used. na.csv is the name of the comma-separated ﬁle and mydata is the name of the dataframe to be created in the R workspace.g..csv".g.log(mpg) There are diﬀerent ways of creating dummy variables in R: > mtcars$heavy = (mtcars$wt>=3) Comma-separated ﬁles can be generated with e... creates a logical variable which is TRUE if cars are heavier than 3. sep=". rather than empty cells. Then Save as type Comma Separated(.factors=FALSE asks R not to automatically convert all categorical variables into factor variables (see section 8).000 lb and FALSE otherwise. You can set the comma separator (“.. If you have already opened a dataframe with the same name in R.") 8 Data Manipulation All subsequent commands are shown using the example dataset mtcars. e.”) but no commas (“.dta". select Save As.g “-”.read. If the dataframe is attached.numeric(mtcars$wt>=3) or > mtcars$heavy = ifelse(mtcars$wt>=3. Do not use strings (e. See ?mtcars for a description of the data. You may need to change this under Format menu.”) used in the dataﬁle with the option sep.5 Short Guides to Microeconometrics A Short Guide to R with RStudio 6 mand imports data from STATA: > library(foreign) > read.csv extension. • Make sure that variable names are included only in the ﬁrst row of your spreadsheet.0) in both cases the resulting variable is 1 for heavy cars and 0 otherwise. Factor variables are categorical variables that can be either numeric or string variables.”) or semicolons (“.”) may be used. -999.

. The correlation between the two variables mpg and wt is computed by > cor(mpg.complete.obs" asks to use all pairs of observations without missing values. Draw a scatter plot of the variable mpg (y-axis) against wt (x-axis): .) of numeric variables mpg in dataframe mtcars varlist: > summary(mtcars$mpg) where the option useNA = "ifany" allows to include missing values. Display the correlation and covariance matrix.obs") > cov(mtcars. labels = c("light"..NA mtcars$hptype[mtcars$hp<100] <. For example. wt. wt>=3)) where the two-samples are by default not assumed to have equal variances. respectively."regular" mtcars$hptype[mtcars$hp>=200] <. am) 9 Descriptive Statistics Display univariate summary statistics (mean."economy" mtcars$hptype[mtcars$hp>=100 & mtcars$hp < 200] <.g. useNA="ifany") which also takes the two values 0 and 1. "heavy")) Note: Variables are automatically replaced if a new assignment is made."sport" mtcars$hptype = factor(mtcars$hptype) where the option useNA="ifany" requests to report missing values. or equivalently > summary(mpg[wt>=3]) 10 Graphs where only observations with weight wt heavier than 3000 lb are used.NULL A basic two-way table of frequency counts of two variables gear and am is produced by: > table(gear.factor(mtcars$heavy. use="pairwise. Variables are deleted from the dataframe by > mtcars$heavy <. Conditional assignments can be made by choosing a subset of observations with the [] operator. use="complete") creates a factor variable with 3 types of horsepower. . The package gmodels adds a command to produce a detailed twoway table with counts. > > > > > mtcars$hptype <. between all numeric variables in the dataframe mtcars > cor(mtcars. use="pairwise. simply > attach(mtcars) > summary(mpg) Perform a two-sample t-test of the hypothesis that mpg has the same mean for the two groups deﬁned by the dummy variable am > t.factor(mtcars$heavy) Report the frequency counts of variable am: > table(am.complete. am. > summary(subset(mpg.obs") The option use="pairwise.test(mpg~am) Statistics based on a subset of observations are given by e. String labels can be attached to the two values by > mtcars$heavyf <. row and column percentages along with Pearson’s chi-square statistic: > install. chisq=TRUE) Or if the dataframe mtcars is attached.packages("gmodels") > library(gmodels) > CrossTable(gear.complete.7 Short Guides to Microeconometrics A Short Guide to R with RStudio 8 > mtcars$heavyf <. median.

mpg.resid(fm) > mtcars$mpghat <. freq=FALSE) tests H0 : β1 = 0 and β2 = 0 against HA : β1 = 0 or β2 = 0. For example. and > waldtest(fm. vcov=sandwich) Draw a histogram of the variable mpg > hist(mpg.-wt-disp. subset=(wt>3)) where only cars heavier than 3000 lb are considered.lm(mpg~wt+disp. title and labels > plot(wt. See the handout on ”Functional Form in the linear Model” for more details. data=mtcars) The Eicker-Huber-White covariance is reported after estimation with > library(sandwich) . col="red") F -tests for one or more restrictions are calculated with the command waldtest which also uses the two libraries sandwich and lmtest > waldtest(fm. main="Mileage and weight". New variables with residuals and ﬁtted values are generated by > mtcars$uhat <.fitted(fm) 11 OLS Regression The multiple linear regression model mpgi = β0 + β1 wti + β2 dispi + ui is estimated by OLS with the lm function. data=mtcars) > summary(fm) R treats factor variables diﬀerent from numeric variables in estimation commands. ylab="mile per gallon") tests H0 : β1 = 0 against HA : β1 = 0 with Eicker-Huber-White. "wt". A constant is automatically added if not suppressed by > lm(mpg~wt+disp-1. Tranformations of variables are directly included with the I() function > lm(I(log(mpg))~wt+I(wt^2)+disp. . mpg) > library(lmtest) > coeftest(fm. vcov=sandwich) Draw a scatter with connected points.9 Short Guides to Microeconometrics A Short Guide to R with RStudio 10 > plot(wt.~. regresses the mileage of a car (mpg) on weight (wt) and displacement (disp). xlab="weight". data=mtcars. data=mtcars) Estimation based on a subsample is performed as > lm(mpg~wt+disp. type="l". vcov=sandwich) Add a red regression line after drawing the scatter plot > abline(lm(mpg~wt). > fm <.

digits=n) returns the absolute value of x. returns the square root of x if x>=0. returns the exponential function of x.11 Short Guides to Microeconometrics 12 Important Functions and Operators Some Mathematical Expressions abs(x) exp(x) log(x) log10(x) sqrt(x) floor(x) ceiling(x) trunc(x) round(x. returns the natural logarithm of x if x>0. largest integers not greater than x smallest integers not less than x integer by truncating x toward 0 rounds x to the nearest n decimal places Logical and Relational Operators > >= == & ! greater than greater or equal equal and not < <= != | less than smaller or equal not equal or Some Probability distributions and density functions pnorm(q) cumulative standard normal distribution dnorm(x) returns the standard normal density qnorm(p) inverse cumulative standard normal distribution . returns the log base 10 of x if x>0.

- Guide to Eng. Data Science Lecture (1) Ver2Uploaded byBugMyNuts
- SM Programming TutorialUploaded bysanzia
- 8913A_08Uploaded byzubairpam
- Classwork+5+Objects.docxUploaded bybalvinder singh
- Software Design Document csv2kmlUploaded byCharles Bulan
- How To Use Keygen.txtUploaded byLucas Chemello
- 최종본 Youngsue Han(145-168)Uploaded byYoung Sue Han
- Commons Based Peer Production DatasetsUploaded byP2Pvalue
- Exercises 1 Into RUploaded byWala' Saleh
- Classwork+5+Objects.docxUploaded bybalvinder singh
- 8943_y04_qp_5202Uploaded bymsaad4444
- Zpl Manual ZEBRAUploaded byForest Sxn
- ckan.pdfUploaded byLuis Huacho Lazo
- Crash CourseUploaded byHenrik Andersson
- _AbstractsUploaded byGerard Crotty
- UA2Manual EnUploaded byHeriton Rocha
- RHHPC Installation Guide 20100524Uploaded byEnrique Herrera Noya
- D73488GC11_ag.pdfUploaded bythieumaulennao
- refman-5.6-en.a4_MySQL.pdfUploaded byRomeo Romeo
- Tf2.0 User Manual-Installation GuideUploaded byNdomadu
- Python for Econometrics, Statistics and Data AnalysisUploaded byRenzo Gonzales Maldonado
- WDM 4.8 Administrators Guide JUL2010Uploaded byjodyg35950
- IELTS for TeachersUploaded byTalat
- Change LogUploaded byJdogzz
- 1422CUploaded byFrancesco Pilato
- Wise Package Studio 7.0 SP3 Reference_V1.0Uploaded byKuruba Chandrasekhar
- beginnersguide_IrafUploaded byKanthanakorn Noysena
- Front CoverCS2406 Lab ManualUploaded byLokeshwaran Kanagaraj
- patil nishant finalresearchpaperUploaded byapi-437250839
- Installation Manual 4.0Uploaded byTesla

- Kx Intel Solution 1309Uploaded bySuren Markosov
- Can a Dead Cat Bounce Make MoneyUploaded bySuren Markosov
- Handling BugsUploaded bySuren Markosov
- Metaprogramming for OcamlUploaded bySuren Markosov
- Kx Finance 1312Uploaded bySuren Markosov
- EViews 5 Users GuideUploaded bySuren Markosov
- Quant research paperUploaded bySuren Markosov
- plt-bookUploaded bySuren Markosov
- 43-4Uploaded bySuren Markosov
- Does Price MatterUploaded bySuren Markosov
- Math LogicUploaded bySuren Markosov
- Cuckoo SearchUploaded bySuren Markosov
- PairsTradingUploaded bySuren Markosov
- High Frequency Pairs TradingUploaded bySuren Markosov
- r Studio 101Uploaded bySuren Markosov
- EViews7WhatsNewUploaded bySuren Markosov
- Jeff RyanUploaded bySuren Markosov
- 2008_80Uploaded bySuren Markosov
- r Studio Cheat SheetUploaded bySuren Markosov
- STATISTICAL MODELING OF HIGH FREQUENCY FINANCIAL DATA:Uploaded bySuren Markosov
- SSRN-id263745Uploaded bySuren Markosov
- ReadmeUploaded bySuren Markosov
- Pairs Trading: A Cointegration ApproachUploaded bySuren Markosov
- Greenwood-2008_Exercise and the BrainUploaded bySuren Markosov
- Forrester_SharePoint and BPM %E2%80%94 Finding the Sweet Spot_Derek Miers[1]Uploaded bySuren Markosov
- Module 1 Intro to R and RStudio_1Uploaded bySuren Markosov