Answer dim(mtcars)
Data Structure
Q.2 According to R, what type of variable is am?
Answer factor
Levels
Q.3 # Look at the levels of the variable am
Answer levels(mtcars$am)
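A short check, assuming the stock datasets::mtcars rather than the course's own copy: there `am` is stored as numeric, so `levels()` returns NULL until the column is converted to a factor. The labels below follow the coding documented in `?mtcars` (0 = automatic, 1 = manual); `am_factor` is a name introduced here purely for illustration.

```r
# In the stock mtcars data set, am is a numeric 0/1 column
class(mtcars$am)

# Convert to a factor; labels follow ?mtcars (0 = automatic, 1 = manual)
am_factor <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("automatic", "manual"))

# Now the variable is a factor and has levels
class(am_factor)
levels(am_factor)
```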
Recoding Variables
Q #Assign the value of mtcars to the new variable mtcars2
mtcars2 <- mtcars
Q #Assign the label "high" to mpgcategory where mpg is greater than or equal to 20
mtcars2$mpgcategory[mtcars2$mpg >= 20] <- "high"
Q #Assign the label "low" to mpgcategory where mpg is less than 20
mtcars2$mpgcategory[mtcars2$mpg < 20] <- "low"
Q #Assign mpgcategory as factor to mpgfactor
mtcars2$mpgfactor <- as.factor(mtcars2$mpgcategory)
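The same recoding can probably be done in one step with `ifelse()`. This is a sketch of an alternative, not the course's expected solution; it builds the factor directly without the intermediate character column.

```r
# Work on a copy so the original data set stays untouched
mtcars2 <- mtcars

# Label each car "high" or "low" and convert to a factor in one step
mtcars2$mpgfactor <- factor(ifelse(mtcars2$mpg >= 20, "high", "low"))

# Inspect how many cars fall in each category
table(mtcars2$mpgfactor)
```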
Examining Frequencies
Q #How many of the cars have a manual transmission?
13
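One way to arrive at this count, assuming the coding documented in `?mtcars` (0 = automatic, 1 = manual); `manual_count` is an illustrative name.

```r
# Frequency table of transmission type (0 = automatic, 1 = manual)
table(mtcars$am)

# Count only the manual cars
manual_count <- sum(mtcars$am == 1)
manual_count
```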
Cumulative Frequency
Q # What percentage of cars have 3 or 5 gears?
62.5
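A sketch of how the percentage can be derived from a relative frequency table; the variable names are illustrative.

```r
# Absolute frequencies of each gear count
gear_freq <- table(mtcars$gear)

# Relative frequencies, expressed as percentages
gear_pct <- prop.table(gear_freq) * 100

# Cumulative percentages across 3, 4 and 5 gears
cumsum(gear_pct)

# Share of cars with either 3 or 5 gears
pct_3_or_5 <- sum(gear_pct[c("3", "5")])
pct_3_or_5
```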
Possible Answers
Because transmission is categorical, and carb is continuous
Distributions
Possible Answers
Graph 1 is left skewed, graph 2 is normally distributed, graph 3 is right skewed.
Mode
# Produce a sorted frequency table of `carb` from `mtcars`
sort(table(mtcars$carb), decreasing = TRUE)
Range
# Minimum value
x <- min(mtcars$mpg)
# Maximum value
y <- max(mtcars$mpg)
# Calculate the range of mpg using x and y
y - x
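As a cross-check, the same figure can be obtained with `range()` and `diff()`; note that `range()` on its own returns the minimum and maximum, not their difference.

```r
# range() returns c(min, max)
mpg_limits <- range(mtcars$mpg)

# diff() subtracts the minimum from the maximum
mpg_range <- diff(mpg_limits)
mpg_range
```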
Quartiles
Q # What is the value of the second quartile?
17.7100
Q # What is the value of the first quartile?
16.8925
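The questions do not name the variable. The two values above match the quartiles of `qsec` in the stock mtcars data, so the sketch below assumes that column.

```r
# Assumption: the exercise refers to qsec (quarter-mile time)
qsec_quartiles <- quantile(mtcars$qsec)
qsec_quartiles

# First and second quartile (the second quartile is the median)
qsec_quartiles["25%"]
qsec_quartiles["50%"]
```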
IQR outliers
Q # What is the threshold value for an outlier below the first quartile?
13.88125
Q # What is the threshold value for an outlier above the third quartile?
21.91125
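These thresholds follow the usual 1.5 x IQR rule. The sketch below again assumes the variable is `qsec`, which reproduces the two values above; the variable names are illustrative.

```r
# Assumption: the exercise refers to qsec
q1 <- quantile(mtcars$qsec, 0.25)
q3 <- quantile(mtcars$qsec, 0.75)
iqr <- IQR(mtcars$qsec)

# Values beyond these fences are flagged as outliers
lower_threshold <- q1 - 1.5 * iqr
upper_threshold <- q3 + 1.5 * iqr

lower_threshold
upper_threshold

# List any observations outside the fences
mtcars$qsec[mtcars$qsec < lower_threshold | mtcars$qsec > upper_threshold]
```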
Standard Deviation
Q # Find the IQR of horsepower
IQR(mtcars$hp)
Q # Find the standard deviation of horsepower
sd(mtcars$hp)
Q # Find the IQR of miles per gallon
IQR(mtcars$mpg)
Q # Find the standard deviation of miles per gallon
sd(mtcars$mpg)
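To show what these two functions compute, here is a sketch that rebuilds both by hand for horsepower. `sd()` uses the sample formula with n - 1 in the denominator, and `IQR()` is the distance between the third and first quartiles.

```r
hp <- mtcars$hp
n <- length(hp)

# Sample standard deviation: square root of the summed squared
# deviations from the mean, divided by n - 1
manual_sd <- sqrt(sum((hp - mean(hp))^2) / (n - 1))
all.equal(manual_sd, sd(hp))

# Interquartile range: third quartile minus first quartile
manual_iqr <- unname(diff(quantile(hp, c(0.25, 0.75))))
all.equal(manual_iqr, IQR(hp))
```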
Mean, median and mode
Mean, median and mode are all measures of central tendency. In a perfectly normal
distribution the mean, median and mode are identical, but when the data are
skewed they diverge. In the graph on the right, which of the following
statements is most accurate?
The mode is higher than the mean. It makes most sense to use the
median to measure central tendency.
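A small illustration with made-up, left-skewed data. The `scores` vector and the `stat_mode()` helper are hypothetical and not part of the exercise. Base R has no function for the statistical mode (`mode()` reports the storage type), so the helper picks the most frequent value from a frequency table.

```r
# Hypothetical left-skewed sample: a few low values pull the mean down
scores <- c(2, 5, 7, 8, 8, 9, 10, 10, 10)

# Helper: most frequent value (first one if there is a tie)
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

mean(scores)       # pulled towards the low tail
median(scores)     # less affected by the extreme values
stat_mode(scores)  # sits at the peak of the distribution
```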
Calculating Z-scores
# Calculate the z-scores of mpg
(mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)
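The built-in `scale()` function should give the same result as the manual formula. The sketch below compares the two and checks that the z-scores have mean 0 and standard deviation 1.

```r
# Manual z-scores: distance from the mean in units of standard deviation
z_manual <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# scale() centres and scales by default; it returns a one-column matrix
z_scaled <- as.numeric(scale(mtcars$mpg))

all.equal(z_manual, z_scaled)

# Standardised values have mean 0 and standard deviation 1
mean(z_manual)
sd(z_manual)
```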
Distributions And Z-scores
In the distribution shown on the right, what percentage of data will fall between
the z-scores of -2 and 2?
95%
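Assuming the distribution in question is a standard normal, the share can be checked with `pnorm()`, the normal cumulative distribution function. The exact value is slightly above 95 percent.

```r
# Probability mass between z = -2 and z = 2 under a standard normal
share_within_2 <- pnorm(2) - pnorm(-2)

# Express as a percentage (approximately 95.45)
share_within_2 * 100
```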
Z-score Outliers
-3 and 3
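Reading this answer as the conventional cutoffs for flagging outliers (an assumption, since the question text is missing), a sketch that applies them to `mpg` could look like this.

```r
# z-scores of miles per gallon
z <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# Rows whose z-score lies below -3 or above 3
outlier_rows <- which(abs(z) > 3)

# Names of the flagged cars, if any
rownames(mtcars)[outlier_rows]
```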