Introduction To R For Finance

Software Required to be Installed for R
• R software (version 3.6.0): https://cran.r-project.org/bin/windows/base/
• RStudio: https://www.rstudio.com/products/rstudio/download/
Scope
• The Basics
(Arithmetic in R, Assignment & Variables, Data Type Exploration)
• Vectors and Matrices
(Combine, Coerce, Vector names, Visualise the vector, Matrix subsetting, Correlation)
• Data Frames
(Data frame structure, Accessing and subsetting data frames, Present value of cash flows)
• Factors
(Create a factor, Factor levels, Strings as factors, Bucketing a numeric variable into a factor)
• Lists
(Create a list, Named lists, Removing from a list, Split it, Split-Apply-Combine)
The Basics
• Arithmetic in R
# Addition
3+5
## [1] 8
# Subtraction
6-4
## [1] 2
# Multiplication
3*4
## [1] 12
# Division
4/2
## [1] 2
# Exponentiation
2^4
## [1] 16
# Modulo (returns the remainder of dividing the number on the left by the number on the right)
7 %% 3
## [1] 1
# Assignment and variables
Use <- to assign a value to a variable.
Ex.1
# Assign 200 to savings

savings <- 200
# Print the value of savings to the console

savings
## [1] 200
Ex.2
# Assign 100 to my_money
my_money <- 100
# Assign 200 to dans_money

dans_money <- 200
# Add my_money and dans_money

my_money + dans_money
## [1] 300
# Financial returns
multiplier = 1 + (return / 100)
# Variables for starting_cash and 5% return during January

starting_cash <- 200
jan_ret <- 5 # 5% interest rate
jan_mult <- 1 + (jan_ret / 100)
# How much money do you have at the end of January?

post_jan_cash <- starting_cash * jan_mult
# Print post_jan_cash
post_jan_cash
## [1] 210
# January 10% return multiplier

jan_ret_10 <- 10
jan_mult_10 <- 1 + (jan_ret_10 / 100)
# How much money do you have at the end of January now?
post_jan_cash_10 <- starting_cash * jan_mult_10
# Print post_jan_cash_10
post_jan_cash_10
## [1] 220
# Starting cash and returns

starting_cash <- 200
jan_ret <- 4 # 4% interest rate
feb_ret <- 5
# Multipliers
jan_mult <- 1 + (jan_ret / 100)
feb_mult <- 1 + (feb_ret / 100)
# Total cash at the end of the two months

total_cash <- starting_cash * jan_mult * feb_mult
# Print total_cash
total_cash
## [1] 218.4
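One point the two-month example implies but does not spell out is that returns compound rather than add. The short sketch below reuses the same 4% and 5% figures; the variable name total_ret is introduced here purely for illustration.

```r
starting_cash <- 200
jan_mult <- 1 + (4 / 100)
feb_mult <- 1 + (5 / 100)

# Compounded two-month return in percent (slightly more than 4 + 5 = 9)
total_ret <- (jan_mult * feb_mult - 1) * 100
total_ret
## [1] 9.2

# Applying the compounded return in one step gives the same ending cash
starting_cash * (1 + total_ret / 100)
## [1] 218.4
```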
# Data Type Exploration
Numerics are decimal numbers like 4.5. A special type of numeric is an integer, which is a number without a decimal part. Integers must be specified with an L suffix, like 4L.
Logicals are the boolean values TRUE and FALSE. Capitalization matters here; true and false are not valid.
Characters are text values like "hello world".
# Apple's stock price is a numeric

apple_stock <- 150.45
# Bond credit ratings are characters

credit_rating <- "AAA"
# You like the stock market. TRUE or FALSE?

my_answer <- TRUE
# Print my_answer
my_answer
## [1] TRUE
# What’s that data type?
To find the data type of a variable, use class(my_var).
a <- TRUE
class(a)
## [1] "logical"
b <- 5.5
class(b)
## [1] "numeric"
c <- "Hello World"

class(c)
## [1] "character"
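The integer type mentioned earlier can be checked with class() in the same way. A minimal sketch, assuming nothing beyond base R:

```r
# A number typed without the L suffix is stored as a numeric
class(4)
## [1] "numeric"

# The L suffix makes it an integer
class(4L)
## [1] "integer"

# Integers still count as numeric values
is.numeric(4L)
## [1] TRUE
```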
Vectors and Matrices
# c()ombine
# Another numeric vector

ibm_stock <- c(159.82, 160.02, 159.84)
# Another character vector

finance <- c("stocks", "bonds", "investments")
# A logical vector
logic <- c(TRUE, FALSE, TRUE)
# Coerce it
A vector can only be composed of one data type.
This means that you cannot have both a numeric and a character in the same vector.
If you attempt to do this, the lower ranking type will be coerced into the higher ranking type.
For example, c(1.5, "hello") results in c("1.5", "hello"), where the numeric 1.5 has been coerced into the character data type.
The hierarchy for coercion is:

logical < integer < numeric < character
Logicals are coerced differently depending on the highest data type present. c(TRUE, 1.5) returns c(1, 1.5), where TRUE is coerced to the numeric 1 (FALSE would be converted to 0).
On the other hand, c(TRUE, "this_char") is converted to c("TRUE", "this_char").
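The coercion rules can be confirmed by checking the class of each combined vector. The variable names mixed_chr and mixed_num below are illustrative only.

```r
# Numeric + character: everything becomes character
mixed_chr <- c(1.5, "hello")
class(mixed_chr)
## [1] "character"

# Logical + numeric: TRUE becomes 1
mixed_num <- c(TRUE, 1.5)
class(mixed_num)
## [1] "numeric"

# Because logicals coerce to 0 and 1, sum() counts the TRUE values
sum(c(TRUE, FALSE, TRUE))
## [1] 2
```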

# Vectors of 12 months of returns, and month names
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
# Add names to ret

names(ret) <- months
# Print out ret to see the new names!

ret
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 5 2 3 7 8 3 5 9 1 4 6 3
# Visualise your vector
# Look at the data

apple_stock <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82,
115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
# Plot the data points

plot(apple_stock) # The default is "p" for points
# Plot the data as a line graph

plot(apple_stock, type = "l")
Weighted average
The weighted average allows you to calculate your portfolio return over a time period. Consider the following
example:
Assume you have 20% of your cash in Microsoft stock and 80% of your cash in Sony stock. If, in January, Microsoft earned 7% and Sony earned 9%, what was your total portfolio return?
# Weights and returns

micr_ret <- 7
sony_ret <- 9
micr_weight <- .2
sony_weight <- .8
# Portfolio return
portf_ret <- micr_ret * micr_weight + sony_ret * sony_weight
R does arithmetic with vectors! Take advantage of this fact to calculate the portfolio return more efficiently.
# Weights, returns, and company names
ret <- c(7, 9)
weight <- c(.2, .8)
companies <- c("Microsoft", "Sony")
# Assign company names to your vectors
names(ret) <- companies
names(weight) <- companies
# Multiply the returns and weights together

ret_X_weight <- ret * weight
# Print ret_X_weight
ret_X_weight
## Microsoft Sony
## 1.4 7.2
# Sum to get the total portfolio return

portf_ret <- sum(ret_X_weight)
# Print portf_ret
portf_ret
## [1] 8.6
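As a cross-check, base R's weighted.mean() should give the same figure. This is a sketch rather than part of the original exercise; the name portf_ret_wm is introduced here for illustration.

```r
ret <- c(Microsoft = 7, Sony = 9)
weight <- c(Microsoft = 0.2, Sony = 0.8)

# weighted.mean() computes sum(ret * weight) / sum(weight)
portf_ret_wm <- weighted.mean(ret, weight)
portf_ret_wm
## [1] 8.6
```

Because weighted.mean() divides by the sum of the weights, it also works when the weights do not add up to exactly 1.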
Vector Subsetting
What if you only wanted some of the returns from the vector of 12 months of returns, such as the first six months?
# Define ret
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
names(ret) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
ret
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 5 2 3 7 8 3 5 9 1 4 6 3
# First 6 months of returns
ret[1:6]
## Jan Feb Mar Apr May Jun
## 5 2 3 7 8 3
# Just March and May

ret[c("Mar", "May")]
## Mar May
## 3 8
# Omit the first month of returns
ret[-1]
## Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2 3 7 8 3 5 9 1 4 6 3
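Besides positions and names, a vector can also be subset with a logical condition. The sketch below is an addition to the examples above; the variable name high is illustrative only.

```r
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
names(ret) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Logical vector: TRUE where the monthly return is above 5
high <- ret > 5

# Keep only the months where the condition holds
ret[high]
## Apr May Aug Nov
##   7   8   9   6
```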
# Create a Matrix
Matrices are similar to vectors, except they are in 2 dimensions!
# A vector of 9 numbers
my_vector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
# 3x3 matrix
my_matrix <- matrix(data = my_vector, nrow = 3, ncol = 3)
# Print my_matrix
my_matrix
# Filling across using byrow = TRUE

matrix(data = c(2, 3, 4, 5), nrow = 2, ncol = 2, byrow = TRUE)
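To see the difference between the two fill orders, compare the first row of each matrix. The names by_col and by_row are chosen here only for illustration.

```r
my_vector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Default: values fill down the columns first
by_col <- matrix(data = my_vector, nrow = 3, ncol = 3)
by_col[1, ]
## [1] 1 4 7

# byrow = TRUE: values fill across the rows first
by_row <- matrix(data = my_vector, nrow = 3, ncol = 3, byrow = TRUE)
by_row[1, ]
## [1] 1 2 3
```

In this example, transposing one matrix with t() gives the other.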
# Matrix <- bind vectors
Create matrices from multiple vectors by binding them together as columns with cbind() or as rows with rbind().
# Define vectors
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64,
116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79, 165.36, 166.52, 165.50, 168.29, 168.51, 168.02, 166.73, 166.68,
167.60, 167.33, 167.06, 166.71, 167.14, 166.19, 166.60, 165.99)
micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58, 62.30, 63.62, 63.54, 63.54, 63.55,
63.24, 63.28, 62.99, 62.90, 62.14)
# cbind the vectors together

cbind_stocks <- cbind(apple, ibm, micr)
# Print cbind_stocks
cbind_stocks
# rbind the vectors together
rbind_stocks <- rbind(apple, ibm, micr)
# Print rbind_stocks
rbind_stocks
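The difference between the two results is easiest to see from their dimensions. The sketch below uses only the first four prices of each stock to keep it short; the names by_column and by_row are illustrative.

```r
apple <- c(109.49, 109.90, 109.11, 109.95)
ibm <- c(159.82, 160.02, 159.84, 160.35)
micr <- c(59.20, 59.25, 60.22, 59.95)

# cbind(): each vector becomes a column (4 rows, 3 columns)
by_column <- cbind(apple, ibm, micr)
dim(by_column)
## [1] 4 3

# rbind(): each vector becomes a row (3 rows, 4 columns)
by_row <- rbind(apple, ibm, micr)
dim(by_row)
## [1] 3 4
```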
Visualize your matrix

# Define matrix
apple_micr_matrix <- cbind(apple, micr)
# View the data

apple_micr_matrix
# Scatter plot of Microsoft vs Apple

plot(apple_micr_matrix)
cor()relation
Correlation is a measure of association between two things, here, stock prices, and is represented by a number from -1
to 1.
• 1 represents perfect positive correlation,
• -1 represents perfect negative correlation, and
• 0 means that there is no linear relationship between the movements of the two stocks.
The cor() function will calculate the correlation between two vectors, or will create a correlation matrix when given
a matrix.
# Correlation of Apple and IBM

cor(apple, ibm)
# stock matrix
stocks <- cbind(apple, micr, ibm)
# cor() of all three

cor(stocks)
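To make the number less of a black box, the default (Pearson) correlation can be rebuilt from the covariance and the standard deviations. This is a sketch using only the first five prices of each stock; manual_cor and cor_matrix are illustrative names.

```r
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)

# Pearson correlation: covariance scaled by both standard deviations
manual_cor <- cov(apple, ibm) / (sd(apple) * sd(ibm))
all.equal(manual_cor, cor(apple, ibm))
## [1] TRUE

# A correlation matrix is symmetric, with 1s on the diagonal
cor_matrix <- cor(cbind(apple, ibm))
isSymmetric(cor_matrix)
## [1] TRUE
```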
Matrix Subsetting
The basic structure is:

my_matrix[row, col]
• To select the first row and first column of stocks from the last example: stocks[1,1]
• To select the entire first row, leave the col empty: stocks[1, ]
• To select the first two rows: stocks[1:2, ] or stocks[c(1,2), ]
• To select an entire column, leave the row empty: stocks[, 1] or stocks[, "apple"]
# Third row
stocks[3, ]
# Fourth and fifth row of the ibm column

stocks[4:5, "ibm"]
# apple and micr columns

stocks[, c("apple", "micr")]
Data Frames
The data frame combines the structure of a matrix with the flexibility of having different types of data in each column. The data frame is a powerful tool and the most important data structure in R.
Create your first data.frame()
A data frame is like a table. Create a data frame of your business's future cash flows using the data.frame() function.
# Variables
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
# Data frame
cash <- data.frame(company, cash_flow, year)
# Print cash
cash
Making head()s and tail()s of your data with some str()ucture
A few very useful functions:

• head() - Returns the first few rows of a data frame. By default, 6. To change this, use head(cash, n = ___)
• tail() - Returns the last few rows of a data frame. By default, 6. To change this, use tail(cash, n = ___)
• str() - Check the structure of an object. This fantastic function will show you the data type of the object you pass in
(here, data.frame), and will list each column variable along with its data type.
# Call head() for the first 4 rows

head(cash, n = 4)
# Call tail() for the last 3 rows

tail(cash, n = 3)
# Call str()
str(cash)
Naming your columns / rows
Change your column names with the colnames() function and row names with the rownames() function.
# Fix your column names

colnames(cash) <- c("company", "cash_flow", "year")
# Print out the column names of cash

colnames(cash)
Accessing and subsetting data frames

# Third row, second column
cash[3, 2]
# Fifth row of the "year" column

cash[5, "year"]
# Select the year column

cash$year
# Select the cash_flow column and multiply by 2
cash$cash_flow * 2
# Delete the company column

cash$company <- NULL
# Print cash again

cash
What if you are only interested in the cash flows from company A? For more flexibility, try
subset(cash, company == "A")
• The first argument is the name of your data frame, cash.
• You shouldn’t put company in quotes!
• The == is the equality operator. It tests to find where two things are equal, and returns a logical vector.
# Restore cash
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
cash <- data.frame(company, cash_flow, year)
# Rows about company B

subset(cash, company == "B")
# Rows with cash flows due in 1 year

subset(cash, year == 1)
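The same selection can be written with a logical vector inside [row, col], which makes the role of == more explicit. This is a sketch of an alternative, not a replacement for subset(); the names is_B and late_B are illustrative.

```r
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
cash <- data.frame(company, cash_flow, year)

# Logical vector: TRUE for the rows that belong to company B
is_B <- cash$company == "B"

# Use it in the row position; leave the column position empty
cash[is_B, ]

# Combine conditions with &: company B cash flows due after year 2
late_B <- cash[cash$company == "B" & cash$year > 2, ]
late_B$cash_flow
## [1]  750 6000
```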
Adding new columns

Create a new column in your data frame using data_frame$new_column
# Quarter cash flow scenario

cash$quarter_cash <- cash$cash_flow * 0.25
cash
# Double year scenario

cash$double_year <- cash$year * 2
cash
Present value of projected cash flows
Calculate the present value of $100 to be received 1 year from now at a 5% interest rate.
present_value <- cash_flow * (1 + interest / 100) ^ -year
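Plugging the $100 example into the formula gives the following. The variable names cf_100, interest and n_years are chosen here for illustration so that they do not overwrite the cash_flow and year vectors used elsewhere.

```r
cf_100 <- 100    # cash flow to be received
interest <- 5    # interest rate in percent
n_years <- 1     # years until the cash flow arrives

# Discount the cash flow back to today
pv_100 <- cf_100 * (1 + interest / 100) ^ -n_years
round(pv_100, 2)
## [1] 95.24
```

In other words, $95.24 invested today at 5% grows to $100 in one year.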
# Restore cash
cash$quarter_cash <- NULL
cash$double_year <- NULL
# Present value of $4000, in 3 years, at 5%

present_value_4k <- 4000 * (1+0.05)^(-3)
# Present value of all cash flows

cash$present_value <- cash$cash_flow * (1+0.05)^(-cash$year)
# Print out cash

cash
Calculate how much company A and company B individually contribute to the total present value
# Total present value of cash

total_pv <- sum(cash$present_value)
total_pv
# Company B information
cash_B <- subset(cash, company == "B")
cash_B
# Total present value of cash_B

total_pv_B <- sum(cash_B$present_value)
total_pv_B
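To get the contribution of both companies at once, one option is tapply(), which applies a function to each group. This is a sketch and not part of the original walkthrough; pv_by_company and pv_share are illustrative names.

```r
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
cash <- data.frame(company, cash_flow, year)
cash$present_value <- cash$cash_flow * (1 + 0.05) ^ (-cash$year)

# Sum the present values within each company
pv_by_company <- tapply(cash$present_value, cash$company, sum)
pv_by_company

# Share of the total present value contributed by each company
pv_share <- pv_by_company / sum(cash$present_value)
pv_share
```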
Factors
Are you a male or female? On a scale of 1-10, how are you feeling? These are questions with answers that fall into a
limited number of categories. These types of data can be classified as factors. In this chapter, you will use bond credit
ratings to learn all about creating, ordering, and subsetting factors!
Create a factor
Create a factor by using the factor() function
# credit_rating character vector
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")
# Create a factor from credit_rating

credit_factor <- factor(credit_rating)
# Print out your new factor

credit_factor
# Call str() on credit_rating

str(credit_rating)
# Call str() on credit_factor
str(credit_factor)
Access the unique levels of your factor by using the levels() function
# Identify unique levels

levels(credit_factor)
# Rename the levels of credit_factor

levels(credit_factor) <- c("2A", "3A", "1B", "2B", "3C")
# Print credit_factor
credit_factor
Factor summary
Present a table of the counts of each bond credit rating by using the summary() function.
# Restore credit_factor
levels(credit_factor) <- c("AA", "AAA", "B", "BB", "CCC")
# Summarize the character vector, credit_rating

summary(credit_rating)
# Summarize the factor, credit_factor

summary(credit_factor)
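If you only have the character vector, table() gives the same counts without creating a factor first. A minimal sketch; rating_counts is an illustrative name.

```r
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")

# Count how often each rating appears
rating_counts <- table(credit_rating)
rating_counts
## credit_rating
##  AA AAA   B  BB CCC 
##   2   2   1   2   1 
```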
Visualize your factor

Visualize the counts in a factor by using the plot() function, which draws a bar chart.
# Visualize your factor!

plot(credit_factor)
Bucketing a numeric variable into a factor
Create a factor from a numeric vector by using the cut() function.
# Define AAA_rank.
AAA_rank <- c(31, 48, 100, 53, 85, 73, 62, 74, 42, 38, 97, 61, 48, 86, 44, 9, 43, 18, 62, 38, 23, 37, 54, 80, 78, 93, 47, 100,
22, 22, 18, 26, 81, 17, 98, 4, 83, 5, 6, 52, 29, 44, 50, 2, 25, 19, 15, 42, 30, 27)
# Create 4 buckets for AAA_rank using cut()

AAA_factor <- cut(x = AAA_rank, breaks = c(0, 25, 50, 75, 100))
# Rename the levels

levels(AAA_factor)
levels(AAA_factor) <- c("low", "medium", "high", "very_high")
# Print AAA_factor
AAA_factor
# Plot AAA_factor
plot(AAA_factor)
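One detail worth knowing is how cut() treats values that fall exactly on a break. By default each bucket excludes its lower bound and includes its upper bound, so a value equal to the lowest break gets no bucket. The sketch below uses a small made-up vector x (not the AAA_rank data) to show this.

```r
x <- c(0, 25, 26, 100)
breaks <- c(0, 25, 50, 75, 100)

# Default buckets are (0,25], (25,50], (50,75], (75,100]
default_cut <- cut(x, breaks = breaks)
levels(default_cut)
is.na(default_cut)    # the value 0 falls outside (0,25] and becomes NA
## [1]  TRUE FALSE FALSE FALSE

# include.lowest = TRUE closes the first bucket, giving [0,25]
closed_cut <- cut(x, breaks = breaks, include.lowest = TRUE)
is.na(closed_cut)
## [1] FALSE FALSE FALSE FALSE
```

AAA_rank has no value of 0, so the example above is unaffected.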
Create an ordered factor
To order your factor, there are two options.
1. When creating a factor:
   • credit_rating <- c("AAA", "AA", "A", "BBB", "AA", "BBB", "A")
   • credit_factor_ordered <- factor(credit_rating, ordered = TRUE, levels = c("AAA", "AA", "A", "BBB"))
2. For an existing unordered factor:
   • ordered(credit_factor, levels = c("AAA", "AA", "A", "BBB"))
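# Restore credit_rating from the "Create a factor" example
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")
# Use unique() to find the unique values
unique(credit_rating)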
# Create an ordered factor

credit_factor_ordered <- factor(credit_rating, ordered = TRUE, levels = c("AAA", "AA", "BB", "B", "CCC"))
# Plot credit_factor_ordered
plot(credit_factor_ordered)
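Ordering matters because it makes comparisons between levels meaningful. The sketch below uses a small made-up factor, ratings, whose levels are listed from lowest to highest; that ordering is an assumption of this example.

```r
ratings <- factor(c("AAA", "AA", "A", "BBB"), ordered = TRUE,
                  levels = c("BBB", "A", "AA", "AAA"))

# Is the first bond rated higher than the second?
ratings[1] > ratings[2]
## [1] TRUE

# Which bonds are rated AA or better?
ratings >= "AA"
## [1]  TRUE  TRUE FALSE FALSE

# The highest rating present
as.character(max(ratings))
## [1] "AAA"
```

The same comparisons on an unordered factor return NA with a warning.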
Subsetting a factor
Removing all bonds with a given rating from credit_factor doesn't remove that rating's level. To remove the level entirely, add drop = TRUE when subsetting, or call droplevels() on the result.
# Define credit_factor
credit_factor <- factor(c("AAA", "AA", "A", "BBB", "AA", "BBB", "A"), ordered = TRUE, levels = c("BBB", "A", "AA",
"AAA"))
# Remove the A bonds at positions 3 and 7. Don't drop the A level.

keep_level <- credit_factor[-c(3, 7)]
# Plot keep_level
plot(keep_level)
# Remove the A bonds at positions 3 and 7. Drop the A level

drop_level <- droplevels(keep_level)
# Plot drop_level
plot(drop_level)
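The drop = TRUE form goes inside the square brackets. A minimal sketch using the same factor; drop_in_subset is an illustrative name.

```r
credit_factor <- factor(c("AAA", "AA", "A", "BBB", "AA", "BBB", "A"),
                        ordered = TRUE, levels = c("BBB", "A", "AA", "AAA"))

# Plain subsetting keeps the unused A level
keep_level <- credit_factor[-c(3, 7)]
levels(keep_level)
## [1] "BBB" "A"   "AA"  "AAA"

# drop = TRUE removes levels that no longer occur
drop_in_subset <- credit_factor[-c(3, 7), drop = TRUE]
levels(drop_in_subset)
## [1] "BBB" "AA"  "AAA"
```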
stringsAsFactors
In R versions before 4.0.0, including the 3.6.0 release used here, data.frame() converts all character columns into factors by default. You can turn off this behavior by adding stringsAsFactors = FALSE. From R 4.0.0 onward, the default is already FALSE.
# Variables
credit_rating <- c("AAA", "A", "BB")
bond_owners <- c("Dan", "Tom", "Joe")
# Create the data frame of character vectors, bonds

bonds <- data.frame(credit_rating, bond_owners, stringsAsFactors = FALSE)
bonds
# Use str() on bonds

str(bonds)
# Create a factor column in bonds called credit_factor from credit_rating

bonds$credit_factor <- factor(bonds$credit_rating, ordered = TRUE, levels = c("AAA", "A", "BB"))
# Use str() on bonds again

str(bonds)
Lists
Create a list by using the list() function
# List components
name <- "Apple and IBM"

apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)
cor_matrix <- cor(cbind(apple, ibm))
# Create a list
portfolio <- list(name, apple, ibm, cor_matrix)
# View your first list

portfolio
Named lists
# Add names to your portfolio

names(portfolio) <- c("portfolio_name", "apple", "ibm", "correlation")
# Print portfolio
portfolio
Access elements in a list

• To access the elements in the list, use [ ].
• To pull out the data inside each element of your list, use [[ ]].
• If your list is named, you can use the $ operator: my_list$my_words. This is the same as using [[ ]] to return the inner
data
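The difference between [ ] and [[ ]] shows up in the class of what comes back. The sketch below uses a shortened, made-up version of the portfolio list.

```r
portfolio <- list(portfolio_name = "Apple and IBM",
                  apple = c(109.49, 109.90, 109.11))

# [ ] returns a smaller list that still wraps the element
class(portfolio["apple"])
## [1] "list"

# [[ ]] returns the data inside the element
class(portfolio[["apple"]])
## [1] "numeric"

# $ is shorthand for [[ ]] with a name
identical(portfolio$apple, portfolio[["apple"]])
## [1] TRUE
```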
# Second and third elements of portfolio
portfolio[c(2,3)]
# Use $ to get the correlation data

portfolio$correlation
# Third item of the second element of portfolio

portfolio[[c(2,3)]]
Adding to a list
Add new elements to an existing list by using existingList$newElement or c(existingList, newElement)
# Add weight: 20% Apple, 80% IBM

portfolio$weight <- c(apple = 0.2, ibm = 0.8)
# Print portfolio
portfolio
# Change the weight variable: 30% Apple, 70% IBM

portfolio$weight <- c(apple = 0.3, ibm = 0.7)
# Print portfolio to see the changes
portfolio
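The c() route is not shown in the example, so here is a minimal sketch of it using a shortened, made-up portfolio. The new element is wrapped in list() so that it is added as a single named element.

```r
portfolio <- list(portfolio_name = "Apple and IBM",
                  apple = c(109.49, 109.90))

# Append one new element called weight
portfolio <- c(portfolio, list(weight = c(apple = 0.2, ibm = 0.8)))
names(portfolio)
## [1] "portfolio_name" "apple"          "weight"        
```

Without the list() wrapper, c() would append each value of the weight vector as its own list element.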
Removing from a list

Remove elements from a list by assigning NULL to them, using $, [ ], or [[ ]].
# Define portfolio
portfolio_name <- "Apple and IBM"
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)
microsoft <- c(150.0, 152.0, 154.0, 154.5)
correlation <- cor(cbind(apple, ibm))
portfolio <- list(portfolio_name = portfolio_name, apple = apple, ibm = ibm, microsoft = microsoft, correlation =
correlation)
# Take a look at portfolio

portfolio
# Remove the microsoft stock prices from your portfolio
portfolio$microsoft <- NULL
# Print portfolio
portfolio
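The [ ] and [[ ]] forms work the same way as $. The sketch below uses a shortened, made-up portfolio and also shows a negative index, which returns a copy without the element instead of modifying the list.

```r
portfolio <- list(apple = c(109.49, 109.90),
                  ibm = c(159.82, 160.02),
                  microsoft = c(150.0, 152.0))

# Assign NULL through [[ ]]
by_double <- portfolio
by_double[["microsoft"]] <- NULL

# Assign NULL through [ ]
by_single <- portfolio
by_single["microsoft"] <- NULL

# Negative index: everything except the third element
by_negative <- portfolio[-3]

names(by_double)
## [1] "apple" "ibm"  
```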
Split it
Split a data frame by group using split(), and get your original data frame back by using unsplit().
# Define cash
cash$present_value <- NULL
# Define grouping from year

grouping <- cash$year
# Split cash on your new grouping

split_cash <- split(cash, grouping)
# Look at your split_cash list

split_cash
str(split_cash)
# Unsplit split_cash to get the original data back
original_cash <- unsplit(split_cash, grouping)
# Print original_cash
original_cash
Split-Apply-Combine
A common data science problem is to split your data frame by a grouping, apply some transformation to each group, and then recombine those pieces back into one data frame. This is such a common class of problems in R that it has been given the name split-apply-combine.
# Define split_cash and grouping

split_cash <- split(cash, company)
grouping <- company
# Print split_cash
split_cash
# Print the cash_flow column of B in split_cash

split_cash$B$cash_flow
# Set the cash_flow column of company A in split_cash to 0
split_cash$A$cash_flow <- 0
# Use the grouping to unsplit split_cash

cash_no_A <- unsplit(split_cash, grouping)
# Print cash_no_A
cash_no_A
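The apply step can also be a summary rather than an edit. As a sketch, assuming you want the total cash flow per company, sapply() runs a function over each piece of the split and combines the results into a named vector. The names cash_flow_by_company and total_by_company are illustrative.

```r
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
cash <- data.frame(company, cash_flow, year)

# Split: one vector of cash flows per company
cash_flow_by_company <- split(cash$cash_flow, cash$company)

# Apply and combine: sum each piece and collect the totals
total_by_company <- sapply(cash_flow_by_company, sum)
total_by_company
##    A    B 
## 5550 9350 
```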
Attributes
Return a list of attributes about the object you pass in by using attributes(). Access a specific attribute by using attr().
# my_matrix and my_factor

my_matrix <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
rownames(my_matrix) <- c("Row1", "Row2")
colnames(my_matrix) <- c("Col1", "Col2", "Col3")
my_factor <- factor(c("A", "A", "B"), ordered = TRUE, levels = c("A", "B"))
# attributes of my_matrix
attributes(my_matrix)
# Just the dim attribute of my_matrix

attr(my_matrix, which = "dim")
# attributes of my_factor
attributes(my_factor)
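Many familiar helper functions are shortcuts for reading a particular attribute. A minimal sketch using the same matrix and factor:

```r
my_matrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
my_factor <- factor(c("A", "A", "B"), ordered = TRUE, levels = c("A", "B"))

# dim() reads the dim attribute of a matrix
identical(attr(my_matrix, which = "dim"), dim(my_matrix))
## [1] TRUE

# levels() reads the levels attribute of a factor
attr(my_factor, which = "levels")
## [1] "A" "B"
identical(attr(my_factor, which = "levels"), levels(my_factor))
## [1] TRUE
```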
