Eric Zivot
Department of Economics, University of Washington
October 7, 2008
Preliminary and Incomplete
Importing Comma Separated Value (.csv) Data into R
When you download asset price data from finance.yahoo.com, it gets saved in a comma separated value
(.csv) file. This is a text file where each value is separated (delimited) by a comma. This type of file
is easily read into both Excel and R. Excel opens .csv files directly. The easiest way to import data in .csv
files into R is to use the R function read.csv().
To illustrate, consider the monthly adjusted closing price data on Starbucks (SBUX) and Microsoft (MSFT)
in the files sbuxPrices.csv and msftPrices.csv. These files are available on the class
homework page. The first 5 rows of the sbuxPrices.csv file are
Notice that the first row contains the names of the columns, the date information is in the first column,
and the adjusted closing price (close price adjusted for stock splits and dividends) is in the second
column. Assume that this file is located in the directory C:\classes\econ424\fall2008. To read
the data into R use
> sbux.df = read.csv("C:/classes/econ424/fall2008/sbuxPrices.csv",
+ header = TRUE, stringsAsFactors = FALSE)
Now do the same for the Microsoft data.
Remarks:
1. Note how the directory structure is specified using forward slashes (/).
2. The argument header = TRUE indicates that the column names are in the first row of the file.
3. The argument stringsAsFactors = FALSE tells the function to treat the date
information as character data and not to convert it to a factor variable.
The SBUX data is imported into sbux.df, which is an object of class data.frame
> class(sbux.df)
[1] "data.frame"
A data.frame object is a rectangular data object with the data in columns. The column names are
> colnames(sbux.df)
[1] "Date" "Adj.Close"
And the first 6 rows are
> head(sbux.df)
       Date Adj.Close
1 3/31/1993      1.19
2  4/1/1993      1.21
3  5/3/1993      1.50
4  6/1/1993      1.53
5  7/1/1993      1.48
6  8/2/1993      1.52
The data in the columns can be of different types. The Date column contains the date information as
character data and the Adj.Close column contains the adjusted price data as numeric data. Notice
that the dates are not all monthly closing dates but that the adjusted closing prices are for the last
trading day of the month.
> class(sbux.df$Date)
[1] "character"
> class(sbux.df$Adj.Close)
[1] "numeric"
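As a self-contained sketch of the import steps above, the following writes a three-row file to a temporary location and reads it back. The rows are copied from the head() listing; the header text "Adj Close" and the names tmp.file, demo.df and demo.dates are assumptions made for illustration. The last step shows one way the character dates could be converted to class Date.

```r
# Write a tiny csv file to a temporary location (a hypothetical stand-in
# for sbuxPrices.csv; the header text is an assumption)
tmp.file = tempfile(fileext = ".csv")
writeLines(c("Date,Adj Close",
             "3/31/1993,1.19",
             "4/1/1993,1.21",
             "5/3/1993,1.50"), tmp.file)

# Read the file, keeping the dates as character data
demo.df = read.csv(tmp.file, header = TRUE, stringsAsFactors = FALSE)
colnames(demo.df)        # read.csv() turns "Adj Close" into "Adj.Close"
class(demo.df$Date)      # "character"

# Convert the month/day/year strings to Date objects
demo.dates = as.Date(demo.df$Date, format = "%m/%d/%Y")
class(demo.dates)        # "Date"
```

The format string "%m/%d/%Y" matches the month/day/four-digit-year layout of the file; a different layout would need a different format string.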
Representing time series data in a data.frame object has the disadvantage that the date index
information cannot be efficiently used. You cannot subset observations based on the date index. You
must subset by observation number. For example, to extract the prices between March, 1994 and
March, 1995 you must use
> which(sbux.df$Date == "3/1/1994")
[1] 13
> which(sbux.df$Date == "3/1/1995")
[1] 25
> sbux.df[13:25,]
       Date Adj.Close
13 3/1/1994      1.52
14 4/4/1994      1.86
25 3/1/1995      1.50
In addition, the default plot method for data.frame objects does not utilize the date information for
the x-axis. For example, the following call to plot() creates an error
Regularly spaced time series data, data that are separated by a fixed interval of time, may be
represented as objects of class ts. Such data are typically observed monthly, quarterly or annually. ts
objects are created using the ts() constructor function (base R). For example,
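A minimal sketch of such a constructor call, assuming monthly data that start in March 1993, is given below. It uses only the first six prices from the head() listing, and the names prices and demo.ts are hypothetical.

```r
# First six monthly SBUX prices from the head() listing
prices = c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52)

# frequency = 12 marks the data as monthly; start = c(year, month)
demo.ts = ts(data = prices, start = c(1993, 3), frequency = 12)
class(demo.ts)       # "ts"
start(demo.ts)       # 1993 3
end(demo.ts)         # 1993 8
```

The same pattern with frequency = 4 would describe quarterly data, with start given as c(year, quarter).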
The frequency per period and time interval between observations of a ts object may be extracted using
> frequency(sbux.ts)
[1] 12
> deltat(sbux.ts)
[1] 0.08333333
However, subsetting a ts object produces a numeric object
> tmp = sbux.ts[1:5]
> class(tmp)
[1] "numeric"
> tmp
[1] 1.19 1.21 1.50 1.53 1.48
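To keep the time series attributes when extracting a sub-sample, the base R function window() can be used in place of bracket subsetting. The sketch below is an illustration only: demo.ts and sub.ts are hypothetical names, and the series holds just the first six prices.

```r
demo.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52),
             start = c(1993, 3), frequency = 12)

# window() selects observations by time and returns a ts object
sub.ts = window(demo.ts, start = c(1993, 4), end = c(1993, 6))
class(sub.ts)          # "ts"
as.numeric(sub.ts)     # 1.21 1.50 1.53
```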
Merging ts objects
To combine the ts objects sbux.ts and msft.ts into a single object use the cbind() function
> sbuxmsft.ts = cbind(sbux.ts, msft.ts)
> class(sbuxmsft.ts)
[1] "mts" "ts"
Figure 1 Plot created with plot.ts(). Title: Monthly closing price of SBUX; x-axis: Time; y-axis: Adjusted close.
For ts objects with multiple columns (mts objects), two types of plots can be created. The first type,
illustrated in Figure 2, puts each series in a separate panel
> plot(sbuxmsft.ts)
Figure 2 Multiple time series plot
The second type, shown in Figure 3, puts all series on the same plot
> plot(sbuxmsft.ts, plot.type="single",
+ main="Monthly closing prices on SBUX and MSFT",
+ ylab="Adjusted close price",
+ col=c("blue", "red"), lty=1:2)
> legend(1995, 45, legend=c("SBUX", "MSFT"), col=c("blue", "red"),
+ lty=1:2)
Figure 2 panels: sbux.ts and msft.ts; x-axis: Time; title: sbuxmsft.ts
Figure 3 Multiple time series plot
Manipulating ts objects and computing returns
Some common manipulations of time series data involve lags and differences using the functions
lag() and diff(). For example, to lag the price data in sbux.ts by one time period use
> lag(sbux.ts)
To lag the price data by 12 periods use
> lag(sbux.ts, k=12)
Notice what happens when you combine a ts object with its lag
> cbind(sbux.ts, lag(sbux.ts))
         sbux.ts lag(sbux.ts)
Feb 1993      NA         1.19
Mar 1993    1.19         1.21
Apr 1993    1.21         1.50
May 1993    1.50         1.53
Jun 1993    1.53         1.48
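In base R, lag() with a positive k shifts the time base back, so the column labelled lag(sbux.ts) in the output above holds the next month's price. The previous month's price is obtained with k = -1. As a hedged sketch of how returns might then be computed (demo.ts, prev.ts, ret.simple and ret.cc are hypothetical names, and only the first six prices are used):

```r
demo.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52),
             start = c(1993, 3), frequency = 12)

# Previous month's price, aligned with the current month
prev.ts = lag(demo.ts, k = -1)

# Simple monthly returns: (P_t - P_{t-1}) / P_{t-1}
# Arithmetic on two ts objects uses only their common time range
ret.simple = diff(demo.ts) / prev.ts

# Continuously compounded returns: ln(P_t) - ln(P_{t-1})
ret.cc = diff(log(demo.ts))

start(ret.simple)    # 1993 4
```

The first return is dated April 1993 because the March 1993 price has no predecessor in the sample.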
The ts class is rather limited, especially for representing financial data that is not regularly spaced. For
example, the ts class cannot be used to represent daily financial data because such data are only
observed on business days. That is, a business day time clock generally runs from Monday to Friday,
skipping the weekends. So data are equally spaced in time within the week but the spacing between
Friday and Monday is different. This type of irregular spacing cannot be represented using the ts class.
A very flexible time series class is zoo (Zeileis ordered observations), created by Achim Zeileis and
Gabor Grothendieck and available in the package zoo on CRAN. The zoo class was designed to handle
time series data with an arbitrary ordered time index. This index could be a regularly spaced sequence of
dates, an irregularly spaced sequence of dates, or a numeric index. A zoo object essentially attaches
date information to data.
Install and load the package zoo into R before completing the examples in the next sections.
> library(zoo)
Creating a time index
There are several ways to represent a time index or sequence of dates in R. Table 1 summarizes the
main time index classes available in R
Table 1 Date index classes in R
Class    Package  Description
Date     Base     Represents calendar dates as the number of days since 1970-01-01.
POSIXct  Base     Represents calendar dates as the (signed) number of seconds since the beginning of 1970, stored as a numeric vector. Supports various time zone specifications (e.g. GMT, PST, EST).
yearmon  zoo      Represents monthly data. Internally it holds the data as year plus 0 for January, 1/12 for February, 2/12 for March and so on, so that its internal representation is the same as the ts class with frequency = 12.
yearqtr  zoo      Represents quarterly data. Internally it holds the data as year plus 0 for Quarter 1, 1/4 for Quarter 2 and so on, so that its internal representation is the same as the ts class with frequency = 4.
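The internal representations described in Table 1 can be inspected with as.numeric(). The following is a small sketch, assuming the zoo package is installed; the object names ym and yq are illustrative only.

```r
library(zoo)

# Date: number of days since 1970-01-01
as.numeric(as.Date("1970-01-02"))     # 1

# yearmon: year + (month - 1)/12
ym = as.yearmon("1993-03")
as.numeric(ym)                        # 1993 + 2/12

# yearqtr: year + (quarter - 1)/4
yq = as.yearqtr("1993 Q2")
as.numeric(yq)                        # 1993.25
```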
The Date class (Base R)
To create a time index of class Date starting in March, 1993 and ending in March, 2003 use
> td = seq(as.Date("1993/3/1"), as.Date("2003/3/1"), "months")
> class(td)
[1] "Date"
> head(td)
[1] "1993-03-01" "1993-04-01" "1993-05-01" "1993-06-01" "1993-07-01"
[6] "1993-08-01"
Date objects internally represent dates as the number of days since January 1, 1970. For example,
March 1st, 1993 is 8460 days from January 1, 1970:
> as.numeric(td[1])
[1] 8460
Having a numeric representation for dates allows for some simple date arithmetic. For example, there
are 31 days between April 1, 1993 and March 1, 1993
> td[2] - td[1]
Time difference of 31 days
The POSIXct class (Base R)
To be completed.
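As a brief sketch of what this class offers (an illustration under stated assumptions, not the author's text), a POSIXct object can be created with as.POSIXct() and its underlying count of seconds inspected with as.numeric(). The time zone "UTC" is chosen here as an assumption so that the result does not depend on the local machine, and t1 is a hypothetical name.

```r
# Midnight on March 1, 1993 in UTC
t1 = as.POSIXct("1993-03-01 00:00:00", tz = "UTC")
class(t1)              # "POSIXct" "POSIXt"

# Seconds since 1970-01-01 00:00:00 UTC: 8460 days * 86400 seconds
as.numeric(t1)         # 730944000

# Adding 3600 seconds moves the clock forward one hour
format(t1 + 3600, "%Y-%m-%d %H:%M:%S")   # "1993-03-01 01:00:00"
```

The count of seconds is the day count 8460 reported for the Date class multiplied by the 86400 seconds in a day.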
The yearmon class (Package zoo)
To be completed.
The yearqtr class (Package zoo)
To be completed.
The timeDate class (Package fCalendar)
To be completed.
Creating a zoo object
To create a zoo object one needs a time index and data. The time index must have the same number of
rows as the data object and can be any vector containing ordered observations. Typically, the time index
is an object of class Date, POSIXct, yearmon, yearqtr or timeDate.
Consider creating zoo objects from the monthly information in the data.frame objects sbux.df
and msft.df. First, create a time index of class Date starting in March, 1993 and ending in March,
2003
> td = seq(as.Date("1993/3/1"), as.Date("2003/3/1"), "months")
> class(td)
[1] "Date"
> head(td)
[1] "1993-03-01" "1993-04-01" "1993-05-01" "1993-06-01" "1993-07-01"
[6] "1993-08-01"
Now that we have a time index, we can create the zoo object by combining the time index with
numeric data
> sbux.z = zoo(x=sbux.df$Adj.Close, order.by=td)
> msft.z = zoo(x=msft.df$Adj.Close, order.by=td)
> class(sbux.z)
[1] "zoo"
> str(sbux.z)
zoo series from 1993-03-01 to 2003-03-01
  Data: num [1:121] 1.19 1.21 1.5 1.53 1.48 1.52 1.71 1.67 1.39 1.39 ...
  Index: Class 'Date' num [1:121] 8460 8491 8521 8552 8582 ...
> head(sbux.z)
1993-03-01 1993-04-01 1993-05-01 1993-06-01 1993-07-01 1993-08-01
      1.19       1.21       1.50       1.53       1.48       1.52
An advantage of zoo objects is that subsetting can be done with the time index. For example, to extract
the data for March 2003 and March 2004 use
> sbux.z[as.Date(c("2003/3/1", "2004/3/1"))]
2003-03-01 2004-03-01
     12.88      18.93
2003-10-01 2003-11-01 2003-12-01 2004-01-01 2004-02-01 2004-03-01
     15.80      16.08      16.58      18.31      18.70      18.93
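A range of dates can also be extracted with the window() method for zoo objects. The sketch below builds a short hypothetical series demo.z (the first six prices, with first-of-month dates as in td) so that it runs on its own; the names demo.td, sub.z and same.z are likewise illustrative.

```r
library(zoo)

demo.td = seq(as.Date("1993/3/1"), by = "months", length.out = 6)
demo.z = zoo(x = c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52), order.by = demo.td)

# Keep observations dated from April 1 through June 1, 1993
sub.z = window(demo.z, start = as.Date("1993/4/1"), end = as.Date("1993/6/1"))
sub.z

# Logical subsetting on the time index selects the same observations
same.z = demo.z[index(demo.z) >= as.Date("1993/4/1") &
                index(demo.z) <= as.Date("1993/6/1")]
```

Both forms return a zoo object, so the time index stays attached to the extracted prices.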
Creating lags and differences works the same way for zoo objects as it does for ts objects.
Merging zoo objects
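As a sketch of how two zoo series might be combined (assuming the zoo package is loaded), merge() aligns the series on their time index. The toy series a.z and b.z below carry placeholder values, not actual prices, and all object names are hypothetical. An object such as sbuxmsft.z, used in the plotting commands that follow, could presumably be built the same way from sbux.z and msft.z.

```r
library(zoo)

td1 = seq(as.Date("1993/3/1"), by = "months", length.out = 4)  # Mar to Jun
td2 = seq(as.Date("1993/4/1"), by = "months", length.out = 4)  # Apr to Jul

# Toy series with placeholder values (not actual prices)
a.z = zoo(x = 1:4, order.by = td1)
b.z = zoo(x = 11:14, order.by = td2)

# Default: union of the dates, with NA where a series has no observation
ab.z = merge(a.z, b.z)

# all = FALSE keeps only the dates present in both series
ab.common.z = merge(a.z, b.z, all = FALSE)
dim(ab.common.z)     # 3 2
```

When both series share exactly the same index, as sbux.z and msft.z do with td, the two forms give the same result.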
The plot() function can be used to plot zoo objects, and follows a syntax similar to the plot()
function used for plotting ts objects. The following commands produce the plot illustrated in Figure 4
> # plot one series at a time and add a legend
> plot(sbux.z, col="blue", lty=1, lwd=2, ylim=c(0,50))
> lines(msft.z, col="red", lty=2, lwd=2)
> legend(x="topleft", legend=c("SBUX", "MSFT"), col=c("blue", "red"),
+ lty=1:2)
> # plot multiple series at once
> plot(sbuxmsft.z, plot.type="single", col=c("blue", "red"), lty=1:2,
+ lwd=2)
> legend(x="topleft", legend=c("SBUX", "MSFT"), col=c("blue", "red"),
+ lty=1:2)
Converting a ts object to a zoo object
To be completed.
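One possibility, offered here as a sketch and not as the author's intended content, is the as.zoo() function from the zoo package, which converts a ts object while keeping its data and time information. The series demo.ts is a hypothetical six-observation example.

```r
library(zoo)

demo.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52),
             start = c(1993, 3), frequency = 12)

# Convert to zoo; the data values are unchanged
demo.z = as.zoo(demo.ts)
class(demo.z)        # includes "zoo"
coredata(demo.z)     # the six prices
index(demo.z)        # the time index carried over from demo.ts
```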
Figure 4: x-axis: Index; y-axis: sbuxmsft.z; legend: SBUX, MSFT
Importing data into a zoo object
To be completed.
The function read.zoo() can read data from a text file stored on disk and create a zoo object. This
function is based on the Base R function read.table() and has a similar syntax. For example, to read
the date and price information in the text file sbux.csv and create the zoo object sbux.z use
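A minimal sketch of such a call is given below, assuming a comma separated file with a header row and dates stored as month/day/year. The temporary file and the name demo.z are hypothetical stand-ins for sbux.csv and sbux.z, and the header text "Adj Close" is an assumption.

```r
library(zoo)

# Hypothetical stand-in for sbux.csv, written to a temporary file
tmp.file = tempfile(fileext = ".csv")
writeLines(c("Date,Adj Close",
             "3/31/1993,1.19",
             "4/1/1993,1.21",
             "5/3/1993,1.50"), tmp.file)

# The first column is taken as the time index; format tells read.zoo()
# how to parse the date strings into class Date
demo.z = read.zoo(tmp.file, header = TRUE, sep = ",", format = "%m/%d/%Y")
class(demo.z)          # "zoo"
class(index(demo.z))   # "Date"
```

The header and sep arguments are passed on to read.table(), which is why the call resembles the read.csv() import shown earlier.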
Importing Data Directly from Yahoo!