Question 1.
Question 2.
a)
b)
Question 3.
Boxplot
Question 4.
Boxplot
ANOVA Results
An ANOVA test was performed to determine whether hourly earnings vary by
province/region. The results show that there are statistically significant differences in hourly
earnings by province/region (F(4, 24081) = 74.32, p < 0.05). This indicates that mean hourly
earnings differ for at least one province/region.
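As a quick check on the reported statistic, the degrees of freedom and the p-value can be recomputed in base R. This is a minimal sketch that uses only the values reported above; the variable names (k, n_total, p_value) are illustrative and are not part of the original analysis.

```r
# Values reported in the ANOVA results above
f_stat     <- 74.32
df_between <- 4      # number of groups minus 1
df_within  <- 24081  # number of observations minus number of groups

# Implied number of regions and of observations used in the test
k       <- df_between + 1
n_total <- df_within + k

# The upper tail of the F distribution gives the p-value
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

k
n_total
p_value < 0.05
```

The implied five groups match the five regions produced by the PROV2 recode (Atlantic, Quebec, Ontario, Prairies, British Columbia).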
Question 5.
Scatterplot
Regression analysis
A regression analysis was performed to examine whether there is a significant relationship
between hours of work per week and hourly earnings across workers. The results show that
there is a statistically significant relationship between hours of work per week and hourly
earnings across workers (F(1, 24084) = 723.9, p < 0.05).
Using the estimated coefficients, the regression model is
HRLYEARN = 0.231431 × UTOTHRS + 19.619471
Comparing this with the general form
y = bx + α
the slope (b) is 0.231431 and the intercept (α) is 19.619471. The r² value from the
regression output is 0.02918. Because the slope is positive, r is the positive square root:
r = sqrt(0.02918) = 0.1708215
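The figures above can be cross-checked with a few lines of base R. The sketch below uses only the reported coefficients, r² and residual degrees of freedom. The 40-hour week is a hypothetical input chosen for illustration, and the variable names are not from the original script.

```r
# Coefficients and fit statistics reported above
b         <- 0.231431    # slope: change in hourly earnings per extra weekly hour
alpha     <- 19.619471   # intercept
r_squared <- 0.02918
df_resid  <- 24084

# Correlation coefficient: square root of r-squared, with the sign of the slope
r <- sign(b) * sqrt(r_squared)

# For simple regression, F = (r^2 / (1 - r^2)) * residual df;
# this should be close to the reported F of 723.9
f_implied <- r_squared / (1 - r_squared) * df_resid

# Predicted hourly earnings for a hypothetical 40-hour week
predicted_40 <- b * 40 + alpha

round(r, 4)
round(f_implied, 1)
round(predicted_40, 2)
```

An r² of 0.02918 means that weekly hours account for roughly 2.9% of the variation in hourly earnings, so the relationship is statistically significant but weak.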
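# Set the working directory to the folder that holds the dataset
# (adjust this path for your own machine)
setwd("C:/Users/HP/Desktop")
getwd()
# Install the required packages only if they are missing
for (pkg in c("readxl", "dplyr", "ggplot2")) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
}
library(readxl)
library(dplyr)
library(ggplot2)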
data <- read_excel("July_2017_21_Labour_Force_Survey_dataset.xlsx")
# Open a data viewer in RStudio
View(data)
# Recode PERMTEMP into PERMTEMP2: code 1 is treated as permanent,
# every other code as temporary
data <- data %>%
  mutate(PERMTEMP2 = ifelse(PERMTEMP == 1, "Permanent", "Temporary"))
# Recode PROV into PROV2
data <- data %>%
  mutate(PROV2 = case_when(
    PROV %in% c(10, 11, 12, 13) ~ "Atlantic",
    PROV == 24 ~ "Quebec",
    PROV == 35 ~ "Ontario",
    PROV %in% c(46, 47, 48) ~ "Prairies",
    PROV == 59 ~ "British Columbia",
    TRUE ~ as.character(PROV) # Keep other values as is
  ))
# Boxplot of hourly earnings by job permanency
ggplot(data, aes(x = PERMTEMP2, y = HRLYEARN)) +
  geom_boxplot() +
  labs(title = "Relationship between Job Permanency and Hourly Earnings",
       x = "Job Permanency",
       y = "Hourly Earnings")
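# Hypothesis testing: two-sample t-test of hourly earnings by job permanency
# (t.test() uses the Welch unequal-variance test by default)
t_test_result <- t.test(HRLYEARN ~ PERMTEMP2, data = data)
t_test_result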
# Boxplot of hourly earnings by region
ggplot(data, aes(x = PROV2, y = HRLYEARN)) +
  geom_boxplot() +
  labs(title = "Hourly Earnings Across Regions",
       x = "Region",
       y = "Hourly Earnings")
# ANOVA test
anova_result <- aov(HRLYEARN ~ PROV2, data = data)