
Working with Time Series Data in R

Eric Zivot
Department of Economics, University of Washington
October 7, 2008
Preliminary and Incomplete

Importing Comma Separated Value (.csv) Data into R

When you download asset price data from finance.yahoo.com, it gets saved in a comma separated value
(.csv) file. This is a text file in which each value is separated (delimited) by a comma. This type of
file is easily read into both Excel and R. Excel opens .csv files directly. The easiest way to import
data in .csv files into R is to use the R function read.csv().

To illustrate, consider the monthly adjusted closing price data on Starbucks (SBUX) and Microsoft (MSFT)
in the files sbuxPrices.csv and msftPrices.csv. These files are available on the class
homework page. The first 5 rows of the sbuxPrices.csv file are

Date,Adj Close
3/31/1993,1.19
4/1/1993,1.21
5/3/1993,1.5
6/1/1993,1.53

Notice that the first row contains the names of the columns, the date information is in the first column
and the adjusted closing price (close price adjusted for stock splits and dividends) is in the second
column. Assume that this file is located in the directory C:\classes\econ424\fall2008. To read
the data into R use

> sbux.df = read.csv("C:/classes/econ424/fall2008/sbuxPrices.csv",
+                    header = TRUE, stringsAsFactors = FALSE)

Now do the same for the Microsoft data.

Remarks:
1. Note how the directory structure is specified using forward slashes (/).
2. The argument header = TRUE indicates that the column names are in the first row of the file.
3. The argument stringsAsFactors = FALSE tells the function to treat the date
   information as character data and not to convert it to a factor variable.
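
As a small self-contained check of these remarks, the sketch below writes a few rows in the same layout
as sbuxPrices.csv to a temporary file and reads them back. The temporary file and the object name
demo.df are assumptions made for illustration; they are not part of the class data.

```r
# Write a few rows in the layout of sbuxPrices.csv to a temporary file
# (the temporary file is a stand-in for the class data file)
csv.file = tempfile(fileext = ".csv")
writeLines(c("Date,Adj Close",
             "3/31/1993,1.19",
             "4/1/1993,1.21",
             "5/3/1993,1.5",
             "6/1/1993,1.53"), csv.file)

# header = TRUE takes the column names from the first row;
# stringsAsFactors = FALSE keeps the dates as character strings
demo.df = read.csv(csv.file, header = TRUE, stringsAsFactors = FALSE)

colnames(demo.df)         # "Adj Close" is converted to the valid R name Adj.Close
class(demo.df$Date)
class(demo.df$Adj.Close)
```

The column name changes from "Adj Close" to Adj.Close because read.csv() converts names that are not
valid R identifiers by default.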
The SBUX data is imported into sbux.df, which is an object of class data.frame

> class(sbux.df)
[1] "data.frame"

A data.frame object is a rectangular data object with the data in columns. The column names are

> colnames(sbux.df)
[1] "Date"      "Adj.Close"

And the first 6 rows are

> head(sbux.df)
       Date Adj.Close
1 3/31/1993      1.19
2  4/1/1993      1.21
3  5/3/1993      1.50
4  6/1/1993      1.53
5  7/1/1993      1.48
6  8/2/1993      1.52

The data in the columns can be of different types. The Date column contains the date information as
character data and the Adj.Close column contains the adjusted price data as numeric data. Notice
that the dates are not all monthly closing dates but that the adjusted closing prices are for the last
trading day of the month.

> class(sbux.df$Date)
[1] "character"
> class(sbux.df$Adj.Close)
[1] "numeric"

Representing time series data in a data.frame object has the disadvantage that the date index
information cannot be used efficiently. You cannot subset observations based on the date index; you
must subset by observation number. For example, to extract the prices between March 1994 and
March 1995 you must use

> which(sbux.df$Date == "3/1/1994")
[1] 13
> which(sbux.df$Date == "3/1/1995")
[1] 25
> sbux.df[13:25,]
       Date Adj.Close
13 3/1/1994      1.52
14 4/4/1994      1.86
...
25 3/1/1995      1.50

In addition, the default plot method for data.frame objects does not use the date information for
the x axis. For example, the following call to plot() creates an error

> plot(sbux.df$Date, sbux.df$Adj.Close, type="l")



Representing Data as Time Series Objects

Regularly spaced time series data, data that are separated by a fixed interval of time, may be
represented as objects of class ts. Such data are typically observed monthly, quarterly or annually. ts
objects are created using the ts() constructor function (base R). For example,

> sbux.ts = ts(data=sbux.df$Adj.Close, frequency = 12,
+              start=c(1993,3), end=c(2008,3))
> class(sbux.ts)
[1] "ts"

> msft.ts = ts(data=msft.df$Adj.Close, frequency = 12,
+              start=c(1993,3), end=c(2008,3))

The argument frequency = 12 specifies that prices are sampled monthly. The starting and
ending months are specified as a two element vector with the first element giving the year and the
second element giving the month. When printed, ts objects show the dates associated with the
observations.

> sbux.ts
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov
1993           1.19 1.21 1.50 1.53 1.48 1.52 1.71 1.67 1.39

The functions start() and end() show the first and last dates associated with the data

> start(sbux.ts)
[1] 1993    3
> end(sbux.ts)
[1] 2008    3

The time() function extracts the time index as a ts object

> time(sbux.ts)
          Jan      Feb      Mar      Apr      May      Jun
1993                   1993.167 1993.250 1993.333 1993.417

The frequency per period and time interval between observations of a ts object may be extracted using

> frequency(sbux.ts)
[1] 12

> deltat(sbux.ts)
[1] 0.08333333
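
The same accessor functions can be tried on a short series without the class data files. The sketch
below uses the first six SBUX prices shown earlier; the object name p.ts is an assumption made for
illustration.

```r
# Six monthly prices starting in March 1993 (values from the SBUX example)
p.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52),
          frequency = 12, start = c(1993, 3))

start(p.ts)    # first (year, month) pair
end(p.ts)      # last (year, month) pair, implied by start and length
time(p.ts)     # time index stored as year + (month - 1)/12
deltat(p.ts)   # 1/frequency
```

Because only start is supplied, ts() works out the end date from the number of observations.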

Subsetting a ts object with [ ], however, produces a numeric object and drops the date information

> tmp = sbux.ts[1:5]
> class(tmp)
[1] "numeric"
> tmp
[1] 1.19 1.21 1.50 1.53 1.48

To subset a ts object and preserve the date information use the window() function

> tmp = window(sbux.ts, start=c(1993, 3), end=c(1993,8))
> class(tmp)
[1] "ts"

> tmp
      Mar  Apr  May  Jun  Jul  Aug
1993 1.19 1.21 1.50 1.53 1.48 1.52

The arguments start=c(1993, 3) and end=c(1993,8) specify the beginning and ending dates
of the window.
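
The difference between the two kinds of subsetting can be seen side by side in the following sketch.
It uses a short illustrative series (p.ts, by.position and by.date are assumed names) and selects the
same three observations both ways.

```r
p.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52),
          frequency = 12, start = c(1993, 3))

# Observations 2 to 4 by position: the result is a plain numeric vector
by.position = p.ts[2:4]

# The same observations by date: the result is still a ts object
by.date = window(p.ts, start = c(1993, 4), end = c(1993, 6))

class(by.position)
class(by.date)
start(by.date)
```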

Merging ts objects

To combine the ts objects sbux.ts and msft.ts into a single object use the cbind() function

> sbuxmsft.ts = cbind(sbux.ts, msft.ts)

> class(sbuxmsft.ts)
[1] "mts" "ts"

Since sbuxmsft.ts contains two ts objects it is assigned the additional class mts (multiple time
series). The first five rows are

> window(sbuxmsft.ts, start=c(1993, 3), end=c(1993,7))
         sbux.ts msft.ts
Mar 1993    1.19    2.43
Apr 1993    1.21    2.25
May 1993    1.50    2.44
Jun 1993    1.53    2.32
Jul 1993    1.48    1.95
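
When the series being combined do not cover the same dates, cbind() lines them up on the time index.
The sketch below illustrates this with two short series whose start dates differ by one month; the
names a.ts, b.ts and ab.ts are assumptions made for illustration.

```r
a.ts = ts(c(1.19, 1.21, 1.50), frequency = 12, start = c(1993, 3))
b.ts = ts(c(2.25, 2.44, 2.32), frequency = 12, start = c(1993, 4))  # starts a month later

# Rows cover the union of the two date ranges
ab.ts = cbind(a.ts, b.ts)

inherits(ab.ts, "mts")
ab.ts            # NA marks months in which one series has no observation
```

Using inherits() rather than comparing class() output keeps the check independent of the R version,
since the full class vector of a multiple time series differs across versions.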

Plotting ts objects

ts objects have their own plot method (plot.ts)

> plot(sbux.ts, col="blue", lwd=2, ylab="Adjusted close",
+      main="Monthly closing price of SBUX")

which produces the plot in Figure 1. To plot a subset of the data use the window() function inside of
plot()

> plot(window(sbux.ts, start=c(2000,3), end=c(2008,3)),
+      ylab="Adjusted close", col="blue", lwd=2,
+      main="Monthly closing price of SBUX")

Figure 1 Plot created with plot.ts()
[Plot: "Monthly closing price of SBUX"; x axis: Time (1995 to 2005); y axis: Adjusted close (0 to 30)]

For ts objects with multiple columns (mts objects), two types of plots can be created. The first type,
illustrated in Figure 2, puts each series in a separate panel

> plot(sbuxmsft.ts)

Figure 2 Multiple time series plot

The second type, shown in Figure 3, puts all series on the same plot

> plot(sbuxmsft.ts, plot.type="single",
+      main="Monthly closing prices on SBUX and MSFT",
+      ylab="Adjusted close price",
+      col=c("blue", "red"), lty=1:2)

> legend(1995, 45, legend=c("SBUX","MSFT"), col=c("blue", "red"),
+        lty=1:2)

[Plot for Figure 2: "sbuxmsft.ts"; two stacked panels, sbux.ts (0 to 30) and msft.ts (10 to 50); x axis: Time (1995 to 2005)]

Figure 3 Multiple time series plot

Manipulating ts objects and computing returns

Some common manipulations of time series data involve lags and differences using the functions
lag() and diff(). For example, to lag the price data in sbux.ts by one time period use

> lag(sbux.ts)

To lag the price data by 12 periods use

> lag(sbux.ts, k=12)

Notice what happens when you combine a ts object with its lag

> cbind(sbux.ts, lag(sbux.ts))
         sbux.ts lag(sbux.ts)
Feb 1993      NA         1.19
Mar 1993    1.19         1.21
Apr 1993    1.21         1.50
May 1993    1.50         1.53
Jun 1993    1.53         1.48

[Plot for Figure 3: "Monthly closing prices on SBUX and MSFT"; x axis: Time (1995 to 2005); y axis: Adjusted close price (0 to 50); legend: SBUX, MSFT]

Notice that the lag() function shifts the time index back by an amount k. To shift the time index
forward set k to a negative number

> lag(sbux.ts, k=-1)
> lag(sbux.ts, k=-12)
> cbind(sbux.ts, lag(sbux.ts, k=-1))
         sbux.ts lag(sbux.ts, k = -1)
Mar 1993    1.19                   NA
Apr 1993    1.21                 1.19
May 1993    1.50                 1.21
Jun 1993    1.53                 1.50
Jul 1993    1.48                 1.53

To compute the first difference in prices use

> diff(sbux.ts)

Notice that application of diff() is equivalent to

> sbux.ts - lag(sbux.ts, k=-1)

To compute a 12 lag difference (annual difference for monthly data) use

> diff(sbux.ts, lag=12)

which is equivalent to using

> sbux.ts - lag(sbux.ts, k=-12)

Notice what happens when you combine a ts object with its first difference

> cbind(sbux.ts, diff(sbux.ts))
         sbux.ts diff(sbux.ts)
Mar 1993    1.19            NA
Apr 1993    1.21          0.02
May 1993    1.50          0.29
Jun 1993    1.53          0.03
Jul 1993    1.48         -0.05
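
The equivalence between diff() and subtracting a lagged copy can be checked numerically. The sketch
below does so on the first five SBUX prices; p.ts, d1 and d2 are names assumed for illustration.

```r
p.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48), frequency = 12, start = c(1993, 3))

d1 = diff(p.ts)                 # first difference
d2 = p.ts - lag(p.ts, k = -1)   # price minus the previous month's price

# Compare the values of the two results
all.equal(as.numeric(d1), as.numeric(d2))

start(d1)                       # the first observation is lost
length(d1)
```

Arithmetic on two ts objects is carried out only over the dates they have in common, which is why the
subtraction drops the months where the lagged series has no matching observation.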

You can use the diff() and lag() functions together to compute the simple one period return

> sbuxRetSimple.ts = diff(sbux.ts)/lag(sbux.ts, k=-1)
> msftRetSimple.ts = diff(msft.ts)/lag(msft.ts, k=-1)
> window(cbind(sbuxRetSimple.ts, msftRetSimple.ts),
+        start=c(1993,4), end=c(1993,7))
         sbuxRetSimple.ts msftRetSimple.ts
Apr 1993       0.01680672      -0.07407407
May 1993       0.23966942       0.08444444
Jun 1993       0.02000000      -0.04918033
Jul 1993      -0.03267974      -0.15948276

Similarly, to compute the 12 period simple return use

> diff(sbux.ts, lag=12)/lag(sbux.ts, k=-12)

You can use the log() and diff() functions together to compute continuously compounded returns

> sbuxRet.ts = diff(log(sbux.ts))

> msftRet.ts = diff(log(msft.ts))

> window(cbind(sbuxRet.ts, msftRet.ts), start=c(1993,4),
+        end=c(1993,7))
         sbuxRet.ts  msftRet.ts
Apr 1993 0.01666705 -0.07696104
May 1993 0.21484475  0.08106782
Jun 1993 0.01980263 -0.05043085
Jul 1993 -0.03322565 -0.17373781
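
The two return definitions are linked: if R is the simple return and r the continuously compounded
return over the same period, then r = log(1 + R). The sketch below checks this on the first five SBUX
prices; the names p.ts, R.simple and r.cc are assumptions made for illustration.

```r
p.ts = ts(c(1.19, 1.21, 1.50, 1.53, 1.48), frequency = 12, start = c(1993, 3))

R.simple = diff(p.ts) / lag(p.ts, k = -1)   # simple one period return
r.cc = diff(log(p.ts))                      # continuously compounded return

# The two are linked by r = log(1 + R)
all.equal(as.numeric(r.cc), as.numeric(log(1 + R.simple)))
```

For small returns the two measures are close, as the April and June 1993 values in the tables above
suggest; they diverge as returns grow in magnitude.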

To compute the 12 period continuously compounded return use

> diff(log(sbux.ts), lag=12)

Representing Time Series Data as zoo objects

The ts class is rather limited, especially for representing financial data that is not regularly spaced. For
example, the ts class cannot be used to represent daily financial data because such data are only
observed on business days. That is, a business day time clock generally runs from Monday to Friday,
skipping the weekends. So data are equally spaced in time within the week but the spacing between
Friday and Monday is different. This type of irregular spacing cannot be represented using the ts class.

A very flexible time series class is zoo (Zeileis' ordered observations) created by Achim Zeileis and
Gabor Grothendieck and available in the package zoo on CRAN. The zoo class was designed to handle
time series data with an arbitrary ordered time index. This index could be a regularly spaced sequence of
dates, an irregularly spaced sequence of dates, or a numeric index. A zoo object essentially attaches
date information to data.

Install and load the package zoo into R before completing the examples in the next sections.

> library(zoo)

Creating a time index

There are several ways to represent a time index or sequence of dates in R. Table 1 summarizes the
main time index classes available in R.

Table 1 Date index classes in R

Class    Package  Description
Date     Base     Represent calendar dates as the number of days since 1970-01-01.
POSIXct  Base     Represent calendar dates as the (signed) number of seconds since the
                  beginning of 1970 as a numeric vector. Supports various time zone
                  specifications (e.g. GMT, PST, EST etc.)
yearmon  zoo      Represent monthly data. Internally it holds the data as year plus 0 for
                  January, 1/12 for February, 2/12 for March and so on in order that its
                  internal representation is the same as ts class with frequency = 12.
yearqtr  zoo      Represent quarterly data. Internally it holds the data as year plus 0 for
                  Quarter 1, 1/4 for Quarter 2 and so on in order that its internal
                  representation is the same as ts class with frequency = 4.
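
The internal representations listed in Table 1 can be inspected directly for the two base R classes.
The sketch below is illustrative only; the UTC time zone is an assumption chosen so that the POSIXct
value is not shifted by a local time zone offset.

```r
# Date: number of days since 1970-01-01
as.numeric(as.Date("1970-01-02"))

# POSIXct: number of seconds since the beginning of 1970
# (tz = "UTC" is assumed here to avoid local time zone offsets)
as.numeric(as.POSIXct("1970-01-02 00:00:00", tz = "UTC"))

# A monthly ts index uses year + (month - 1)/12, the convention that
# Table 1 describes for yearmon
time(ts(1:3, frequency = 12, start = c(1993, 1)))
```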


The Date class (Base R)

To create a time index of class Date starting in March, 1993 and ending in March, 2003 use

> td = seq(as.Date("1993/3/1"), as.Date("2003/3/1"), "months")

> class(td)
[1] "Date"

> head(td)
[1] "1993-03-01" "1993-04-01" "1993-05-01" "1993-06-01" "1993-07-01"
[6] "1993-08-01"

Date objects internally represent dates as the number of days since January 1, 1970. For example,
March 1st, 1993 is 8460 days from January 1, 1970:

> as.numeric(td[1])
[1] 8460

Having a numeric representation for dates allows for some simple date arithmetic. For example, there
are 31 days between April 1, 1993 and March 1, 1993

> td[2] - td[1]
Time difference of 31 days
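
A few more operations on Date objects are sketched below, using only base R. The sequence is the
same one created above; the extra calls are added for illustration.

```r
td = seq(as.Date("1993-03-01"), as.Date("2003-03-01"), by = "month")

length(td)                  # one date per month, both endpoints included
as.numeric(td[1])           # days since 1970-01-01
td[2] - td[1]               # a difftime object
td[1] + 30                  # adding a number moves the date by that many days
format(td[1], "%m/%d/%Y")   # back to a character string in another layout
```

The format string "%m/%d/%Y" matches the month/day/year layout used in the .csv files, so the same
string can be passed to as.Date() to convert those character dates.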

The POSIXct class (Base R)
To be completed.

The yearmon class (Package zoo)
To be completed.

The yearqtr class (Package zoo)
To be completed.

The timeDate class (Package fCalendar)
To be completed.

Creating a zoo object

To create a zoo object one needs a time index and data. The time index must have the same number of
rows as the data object and can be any vector containing ordered observations. Typically, the time index
is an object of class Date, POSIXct, yearmon, yearqtr or timeDate.

Consider creating zoo objects from the monthly information in the data.frame objects sbux.df
and msft.df. First, create a time index of class Date starting in March, 1993 and ending in March,
2003

> td = seq(as.Date("1993/3/1"), as.Date("2003/3/1"), "months")

> class(td)
[1] "Date"

> head(td)
[1] "1993-03-01" "1993-04-01" "1993-05-01" "1993-06-01" "1993-07-01"
[6] "1993-08-01"

Now that we have a time index, we can create the zoo object by combining the time index with
numeric data

> sbux.z = zoo(x=sbux.df$Adj.Close, order.by=td)
> msft.z = zoo(x=msft.df$Adj.Close, order.by=td)

> class(sbux.z)
[1] "zoo"

> str(sbux.z)
zoo series from 1993-03-01 to 2003-03-01
  Data: num [1:121] 1.19 1.21 1.5 1.53 1.48 1.52 1.71 1.67 1.39 1.39 ...
  Index: Class 'Date' num [1:121] 8460 8491 8521 8552 8582 ...

> head(sbux.z)
1993-03-01 1993-04-01 1993-05-01 1993-06-01 1993-07-01 1993-08-01
      1.19       1.21       1.50       1.53       1.48       1.52

The time index and data can be extracted using the index() and coredata() functions

> index(sbux.z)
[1] "1993-03-01" "1993-04-01" "1993-05-01" "1993-06-01" "1993-07-01"

> coredata(sbux.z)
[1] 1.19 1.21 1.50 1.53 1.48 1.52 1.71 1.67 1.39 1.39

The start() and end() functions also work for zoo objects

> start(sbux.z)
[1] "1993-03-01"

> end(sbux.z)
[1] "2003-03-01"
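
Because a zoo index only needs to be ordered, the irregular dates in the .csv file can also be used
as the index. The sketch below is a minimal illustration that assumes the zoo package is installed; it
builds a zoo object from the first four rows of sbuxPrices.csv shown earlier, and the names dates and
p.z are assumptions made for illustration.

```r
library(zoo)   # assumes the zoo package is installed

# Irregularly spaced dates taken from the first rows of sbuxPrices.csv
dates = as.Date(c("3/31/1993", "4/1/1993", "5/3/1993", "6/1/1993"),
                format = "%m/%d/%Y")
p.z = zoo(x = c(1.19, 1.21, 1.50, 1.53), order.by = dates)

index(p.z)      # the Date index
coredata(p.z)   # the numeric data without the index
start(p.z)
end(p.z)
```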


An advantage of zoo objects is that subsetting can be done with the time index. For example, to extract
the data for March 2003 and March 2004 use

> sbux.z[as.Date(c("2003/3/1", "2004/3/1"))]
2003-03-01 2004-03-01
     12.88      18.93

The window() function also works with zoo objects

> window(sbux.z, start=as.Date("2003/3/1"), end=as.Date("2004/3/1"))
2003-03-01 2003-04-01 2003-05-01 2003-06-01 2003-07-01 2003-08-01

2003-10-01 2003-11-01 2003-12-01 2004-01-01 2004-02-01 2004-03-01
     15.80      16.08      16.58      18.31      18.70      18.93
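
Both forms of date based subsetting can be tried on a short series. The following sketch assumes the
zoo package is installed and uses six illustrative monthly values; td and p.z are local to the example.

```r
library(zoo)   # assumes the zoo package is installed

td = seq(as.Date("1993-03-01"), as.Date("1993-08-01"), by = "month")
p.z = zoo(c(1.19, 1.21, 1.50, 1.53, 1.48, 1.52), order.by = td)

# Pick individual dates
p.z[as.Date(c("1993-04-01", "1993-07-01"))]

# Pick a range of dates (both endpoints included)
window(p.z, start = as.Date("1993-05-01"), end = as.Date("1993-07-01"))
```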

Creating lags and differences works the same way for zoo objects as it does for ts objects.

Merging zoo objects

To combine the zoo objects sbux.z and msft.z into a single object use either the cbind() or the
merge() functions

> sbuxmsft.z = cbind(sbux.z, msft.z)

> class(sbuxmsft.z)
[1] "zoo"

> head(sbuxmsft.z)
           sbux.z msft.z
1993-03-01   1.19   2.43
1993-04-01   1.21   2.25
1993-05-01   1.50   2.44
1993-06-01   1.53   2.32
1993-07-01   1.48   1.95
1993-08-01   1.52   1.98

Use cbind() when combining zoo objects that have the same time index, and use merge() when
the objects have different time indices. Note, you can only combine zoo objects for which the time
index is of a common class (e.g. all time indices are Date objects).
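
A minimal sketch of merge() on two zoo objects with different time indices follows. It assumes the zoo
package is installed; the objects a.z and b.z and their values are illustrative only.

```r
library(zoo)   # assumes the zoo package is installed

a.z = zoo(c(1.19, 1.21, 1.50),
          order.by = as.Date(c("1993-03-01", "1993-04-01", "1993-05-01")))
b.z = zoo(c(2.25, 2.44, 2.32),
          order.by = as.Date(c("1993-04-01", "1993-05-01", "1993-06-01")))

merge(a.z, b.z)                # union of the dates, NA where a series is missing
merge(a.z, b.z, all = FALSE)   # only the dates common to both series
```

The all argument controls whether the result keeps every date (the default) or only the dates present
in both series.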

Plotting zoo objects

The plot() function can be used to plot zoo objects, and follows a syntax similar to the plot()
function used for plotting ts objects. The following commands produce the plot illustrated in Figure 4

> # plot one series at a time and add a legend
> plot(sbux.z, col="blue", lty=1, lwd=2, ylim=c(0,50))
> lines(msft.z, col="red", lty=2, lwd=2)
> legend(x="topleft", legend=c("SBUX","MSFT"), col=c("blue","red"),
+        lty=1:2)

> # plot multiple series at once
> plot(sbuxmsft.z, plot.type="single", col=c("blue","red"), lty=1:2,
+      lwd=2)
> legend(x="topleft", legend=c("SBUX","MSFT"), col=c("blue","red"),
+        lty=1:2)

Converting a ts object to a zoo object
To be completed.

[Plot for Figure 4: SBUX and MSFT series on one plot; x axis: Index (1995 to 2005); y axis: sbuxmsft.z (0 to 50); legend: SBUX, MSFT]

Importing data into a zoo object
To be completed.

The function read.zoo() can read data from a text file stored on disk and create a zoo object. This
function is based on the Base R function read.table() and has a similar syntax. For example, to read
the date and price information in the text file sbux.csv and create the zoo object sbux.z use
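
A minimal sketch of such a call is given below. It is an illustration under stated assumptions rather
than the call intended for the class data: the zoo package is assumed to be installed, a temporary
file with four sample rows stands in for the .csv file, and the date layout is assumed to be
month/day/year as in the sbuxPrices.csv rows shown earlier. The names csv.file and demo.z are
hypothetical.

```r
library(zoo)   # assumes the zoo package is installed

# A temporary file stands in for the csv file named in the text
csv.file = tempfile(fileext = ".csv")
writeLines(c("Date,Adj Close",
             "3/31/1993,1.19",
             "4/1/1993,1.21",
             "5/3/1993,1.5",
             "6/1/1993,1.53"), csv.file)

# header and sep are passed on to read.table(); format describes how the
# dates in the first column are written
demo.z = read.zoo(csv.file, header = TRUE, sep = ",", format = "%m/%d/%Y")

class(demo.z)
class(index(demo.z))
start(demo.z)
```

Supplying format lets read.zoo() convert the first column to Date objects, so the character to Date
conversion and the construction of the zoo object happen in one step.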

Importing Data Directly from Yahoo!

The function get.hist.quote() in the package tseries can be used to directly import data on a single
ticker symbol from finance.yahoo.com into a zoo object (multiple symbols are not supported).
To download daily adjusted closing price data on SBUX over the period March 1, 1993 through March 1,
2008 use (make sure the tseries package has been installed)

> library(tseries)
> SBUX.z = get.hist.quote(instrument="sbux", start="1993-03-01",
+                         end="2008-03-01", quote="AdjClose",
+                         provider="yahoo", origin="1970-01-01",
+                         compression="d", retclass="zoo")
trying URL
'http://chart.yahoo.com/table.csv?s=sbux&a=2&b=01&c=1993&d=2&e=01&f=2008&g=d&q=q&y=0&z=sbux&x=.csv'
Content type 'text/csv' length unknown
opened URL
downloaded 179 Kb

time series ends   2008-02-29

The optional argument origin="1970-01-01" sets the origin date for the internal numeric
representation of the date index, and the argument compression="d" indicates that daily data
should be downloaded. The object SBUX.z is of class zoo and the date index is of class Date

> class(SBUX.z)
[1] "zoo"

> class(index(SBUX.z))
[1] "Date"

> start(SBUX.z)
[1] "1993-03-01"

> end(SBUX.z)
[1] "2008-02-29"
