MATH 1280 - AY2021-T2 Final Exam (Days 1 - 4) Review Quiz
Information
An automobile producing company operates car dealerships that sell its cars. In order to assess the service customers get at the dealerships, a team
from the Customer Relationship Department was assembled. The team was sent to 20 randomly selected dealerships. Each dealership was visited once
by the team for an entire working day. During a visit the team interviewed all the customers that arrived at that dealership.
One of the questions that each customer was asked is: "Do you currently possess a car made by our company?" The answers were marked
down as "Yes", "No", or "Refuse to Answer", depending on the customer's response.
Select one:
The correct answer is: Customers that arrive at dealerships of the company.
Select one:
d. Customers that arrive at the selected dealerships during the day of visit.
The correct answer is: Customers that arrive at the selected dealerships during the day of visit.
Select one:
b. Customers that arrive at the selected dealerships during the day of visit.
The correct answer is: The percentage of customers that possess a car made by the company.
A statistic that may be used to summarize the outcome of the survey is:
Select one:
a. Chi-Square
b. Percentage
c. T-Test
d. Anova
Information
The Customer Service Center of a large bank receives calls from customers. The number of incoming calls between 8:00 AM and 8:10 AM on consecutive
days were recorded. The number of incoming calls during the working days of the month of September were:
The number of incoming calls during the working days of the month of February were:
9, 11, 14, 6, 4, 6, 7, 3, 3, 2, 5, 6, 6, 5, 6, 7, 5, 4, 3, 5.
Create two R objects, one by the name "Sep", and the other by the name "Feb". The first object should contain the first data and the second object
should contain the second data. Produce a frequency table (with the function "table") for each of the objects and a bar plot (with the combination of the
function "plot" and the function "table"). For comparison, bar plots like the ones you should obtain are presented in Figure 1.6 and Figure 1.7.
In light of the description given above, and based on the tables and/or the plots, select the correct answer in each of the following 6 questions:
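As a minimal sketch of the steps described above, the February counts listed in the text can be entered and tabulated as follows. Only "Feb" is built here, since the September counts are not reproduced in this text; "Sep" would be created the same way. The name "Feb.table" is an assumption made for illustration.

```r
# February counts of incoming calls, copied from the text above
Feb <- c(9, 11, 14, 6, 4, 6, 7, 3, 3, 2, 5, 6, 6, 5, 6, 7, 5, 4, 3, 5)

# Frequency table: how many days had each number of calls
Feb.table <- table(Feb)
print(Feb.table)

# Bar plot of the frequencies (one vertical bar per distinct value)
plot(Feb.table)

# Average number of calls in February
mean(Feb)
```

In the resulting table the bars are ordered by the number of calls, so the fourth bar from the left belongs to the value 5.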
Select one:
a. A population.
b. A sample.
c. A parameter.
d. A statistic.
The average number of calls that arrived between 8:00 AM and 8:10 AM during the working days of all months is:
Select one:
a. A population.
b. A sample.
c. A parameter.
d. A statistic.
The average number of calls recorded in the object "Feb" is 5.85. The average in the object "Sep" is 8.5. The difference between these two numbers is:
Select one:
a. A population.
b. A sample.
c. A statistic.
In which of the months does the number of incoming calls between 8:00 AM and 8:10 AM tend to be smaller?
Select one:
a. September.
b. February.
The height of the fourth bar from the left of the bar plot for the month of February represents the fact that:
Select one:
The correct answer is: In 4 of the days of the month of February there were 5 incoming calls.
The location of the highest bar of the bar plot for the month of September represents the fact that?
Select one:
The correct answer is: 10 incoming calls came during 6 days of the month of September.
Information
The next two questions refer to the following relative frequency table on hurricanes that have made direct hits on the U.S. between 1851 and 2004.
Hurricanes are given a strength category rating based on the minimum wind speed generated by the storm. (http://www.nhc.noaa.gov/gifs/table5.gif )
What is the relative frequency of direct hits that were category 2 hurricanes?
Select one:
a. 0.2637
b. 0.7363
c. 0.2601
The total of all relative frequencies is 1.000. Denote by p the relative frequency of category 2 hurricanes. Observe that
0.3993 + p + 0.2601 + 0.0659 + 0.0110 = 1.0000
Consequently,
p = 1 - (0.3993 + 0.2601 + 0.0659 + 0.0110) = 1 - 0.7363 = 0.2637
The relative frequency of direct hits that were AT LEAST a category 3 storm is . (The numerical answer that you provide should be of the
The relative frequency of category 3 or more is the sum of the relative frequencies of categories 3, 4 , and 5:
0.2601 + 0.0659 + 0.0110 = 0.3370
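The two computations above can be reproduced with a short sketch. It uses only the relative frequencies quoted in the explanations; the object names "known", "p.cat2" and "p.at.least.3" are assumptions made for illustration.

```r
# Relative frequencies quoted in the explanations above (categories 1, 3, 4, 5)
known <- c(cat1 = 0.3993, cat3 = 0.2601, cat4 = 0.0659, cat5 = 0.0110)

# All relative frequencies sum to 1, so category 2 is the remainder
p.cat2 <- 1 - sum(known)
p.cat2

# "At least category 3" adds up categories 3, 4 and 5
p.at.least.3 <- sum(known[c("cat3", "cat4", "cat5")])
p.at.least.3
```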
Sixty adults with gum disease were asked the number of times per week they used to floss before their diagnoses. The (incomplete) results are shown
below:
Answer:
It is given that the total number of adults with gum disease is 60. There are 3 such adults that flossed 6 times per week. Therefore, the relative frequency is 3/60 = 0.0500
Answer:
The relative frequency of adults that do not floss at all is 0.4500. All other adults floss at least once a week. Their relative frequency is 1.0000 - 0.4500 = 0.5500.
Information
The number of malfunctioning products per production series was recorded for several production series. The data was entered into an R object by the
name "malfunction". The next 3 questions refer to the following R code:
Select one:
a. 8
b. 9
c. 72
The object "freq" contains the frequency table of the production series, divided according to the number of malfunctioning products that they had. The cumulative
frequency of all the production series that had 8 malfunctioning products or less, which includes all production series, is reported under the number "8" in the output
of the expression "cumsum(freq)". This number is 72.
The frequency of production series where there are 4 malfunctioning products is:
Select one:
a. 57
b. 9
c. 16
The cumulative frequency of production series that have 4 malfunctioning products or less is 57. The cumulative frequency of production series that have 3
malfunctioning products or less is 41. The frequency of production series that have exactly 4 malfunctioning products is the difference between these two numbers:
57 - 41 = 16.
The frequency of production series where there are less than 7 malfunctioning products is:
Select one:
a. 70
b. 71
c. 72
Having less than 7 malfunctioning products corresponds to having 6 malfunctioning products or less. The cumulative frequency of production series with 6
malfunctioning products or less is 70.
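The relationship between frequencies and cumulative frequencies used in the three explanations above can be sketched as follows. The original R code and its output are not reproduced in this text, so the sketch relies only on the cumulative values quoted in the explanations; the name "cum.freq" is an assumption made for illustration.

```r
# Cumulative frequencies quoted in the explanations, indexed by the
# number of malfunctioning products (the other entries are not given here)
cum.freq <- c("3" = 41, "4" = 57, "6" = 70, "8" = 72)

# Frequency of exactly 4 = cumulative up to 4 minus cumulative up to 3
freq.4 <- unname(cum.freq["4"] - cum.freq["3"])
freq.4

# "Less than 7" is the same as "6 or less"
less.than.7 <- unname(cum.freq["6"])
less.than.7

# Total number of production series = last cumulative frequency
total <- unname(cum.freq["8"])
total
```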
Information
The file "ex2.csv" contains information on the blood pressure of a group of healthy individuals. The file is located at
http://pluto.huji.ac.il/~msby/StatThink/Datasets/ex2.csv. Read the data into R and answer the following 4 questions:
Answer:
After saving the file "ex2.csv" in the working directory one can use the code
> ex2 <- read.csv("ex2.csv")
in order to read the file into a data frame by the name "ex2". Writing the content of the object to the screen will produce:
> ex2
...
Select one:
a. numeric
b. factor
All the values are numbers. Technically, R treats this variable as a numeric sequence. However, one would typically not use this variable for statistical inference.
Usually, it serves as a key, a unique identifier, in data set management.
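A hedged sketch of how the type of each variable can be checked. The three rows below are made-up stand-ins that only mimic some of the column names (the actual content of "ex2.csv" is not shown in this text), and "csv.text" and "demo" are hypothetical names. Since R 4.0.0, "read.csv" keeps text columns as character unless "stringsAsFactors = TRUE" is passed, so that argument is set explicitly here.

```r
# Hypothetical rows that only mimic column names of the real file
csv.text <- "id,sex,age
1234567,FEMALE,34
2345678,MALE,41
3456789,FEMALE,29"

# Ask read.csv to convert text columns to factors
demo <- read.csv(text = csv.text, stringsAsFactors = TRUE)

# Class of each variable
sapply(demo, class)

# Whole-number columns are stored as "integer", which still counts as numeric
is.numeric(demo$id)
is.factor(demo$sex)
```

With the real file, the same "sapply" call applied to "ex2" would list the class of every column at once.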
Select one:
a. numeric
b. factor
Select one:
a. numeric
b. factor
Information
A study was done to determine the age, type of activity, number of times per week and the duration (amount of time) of resident use of a local park in
San Jose. The first house in the neighborhood around the park was selected randomly and then every 8th house in the neighborhood around the park
was interviewed. Answer the following 2 questions:
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Information
Identify the type of data that would be used to describe a response for each of the items below:
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
In Figure A you will find box plots for three sets of data. In Figure B are the histograms for the same sets of data, but in a different order. Associate each
box plot with its corresponding histogram.
Figure A:
Figure B:
Observe the range of the distribution of each data set: [10,40] in Histogram a, [-40,30] in Histogram b and [14,28] in Histogram c. (You may want to enlarge the plot. That can be
done on many browsers with Control plus the "+" key. Or you may download the figure and open it with a graphical application.)
The correct answer is: Box plot 1 → Histogram b, Box plot 2 → Histogram a, Box plot 3 → Histogram c
Consider the box plots in Figure A. Which of the data has a smaller inter-quartile range (IQR)?
Figure A
Select one:
a. Box plot 1
b. Box plot 2
c. Box plot 3
The height of the central box in Box plot 3 is the smallest of the three.
Information
Select one:
a. 9.1
b. 13.3
c. 14.8
Run the code:
> x <- c(11.9,11.0,12.4,16.9,16.3,13.3,9.1,17.0,11.0,9.3,25.3,17.4,17.4)
> median(x)
[1] 13.3
Select one:
a. True
b. False
Run the code:
> x <- c(11.9,11.0,12.4,16.9,16.3,13.3,9.1,17.0,11.0,9.3,25.3,17.4,17.4)
> boxplot(x)
Observe, in the box plot that is created, that there are no outliers.
Select one:
a. True
b. False
Run the code:
> x <- c(11.9,11.0,12.4,16.9,16.3,13.3,9.1,17.0,11.0,9.3,25.3,17.4,17.4)
> boxplot(x)
Observe, in the box plot that is created, that there are no outliers.
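Reading outliers off a plot can be error-prone, so a numeric check may help. The sketch below uses "boxplot.stats", the function that "boxplot" relies on for its default outlier rule; the name "out" is an assumption made for illustration.

```r
x <- c(11.9, 11.0, 12.4, 16.9, 16.3, 13.3, 9.1, 17.0, 11.0, 9.3, 25.3, 17.4, 17.4)

# boxplot.stats() applies the default rule of boxplot():
# points more than 1.5 box lengths away from the box are returned in $out
out <- boxplot.stats(x)$out

# A length of zero means that no point is flagged as an outlier
length(out)
```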
Create an R data frame with the name "ex.2" that contains the data in the file "ex2.csv" (Select the file name to download it).
Compute the standard deviation of each of the numeric variables. Among the following, the variable with the largest standard deviation is:
Select one:
a. age
b. bmi
c. systolic
d. diastolic
> sd(ex.2$age)
[1] 3.805571
> sd(ex.2$bmi)
[1] 3.881489
> sd(ex.2$systolic)
[1] 11.27262
> sd(ex.2$diastolic)
[1] 11.56522
Twenty-one randomly selected students were asked the number of pairs of sneakers they owned. The number of pairs of sneakers owned by each
student was recorded in an R object by the name "x". The frequency table of the data "x" is:
> table(x)
x
1 2 3 4 5 6
4 7 3 3 2 2
Answer:
Run the code:
> x.val <- c(1,2,3,4,5,6)
> freq <- c(4,7,3,3,2,2)
> rel.freq <- freq/sum(freq)
> x.bar <- sum(x.val*rel.freq)
> x.bar
[1] 2.904762
Answer:
Answer:
Observe that more than 25% of the distribution has accumulated at value "2" but less than that at value "1".
The median is
Answer:
Observe that more than 50% of the distribution has accumulated at value "2" but less than that at value "1".
Answer:
Observe that more than 75% of the distribution has accumulated at value "4" but less than that at value "3".
Answer:
Observe that the frequency of the values "1" and "2" is more than the frequency of the values "5" and "6".
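The "accumulated" percentages mentioned in the explanations above are cumulative relative frequencies. A minimal sketch of that reasoning, assuming a hypothetical helper named "first.reaching" that returns the first value at which the cumulative relative frequency reaches a given level:

```r
x.val <- c(1, 2, 3, 4, 5, 6)
freq  <- c(4, 7, 3, 3, 2, 2)

# Cumulative relative frequency at each value
cum.rel <- cumsum(freq) / sum(freq)
round(cum.rel, 3)

# First value at which the cumulative relative frequency reaches a level
# (hypothetical helper, introduced only for this illustration)
first.reaching <- function(level) x.val[which(cum.rel >= level)[1]]

q1  <- first.reaching(0.25)   # first quartile
med <- first.reaching(0.50)   # median
q3  <- first.reaching(0.75)   # third quartile
c(q1, med, q3)
```

For this table the three values are 2, 2 and 4, in line with the explanations above.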
The relative frequency of the students that owned more than one but less than 5 pairs of sneakers is
Answer:
Information
Following are the possible weights (in pounds) of some football team members.
232, 251, 257, 268, 238, 222, 265, 263, 252, 246, 253, 248, 256, 248, 230, 219, 224, 267, 259, 254, 254, 261, 248, 221, 252, 269, 269, 273, 273, 259, 251,
222, 248, 224
Answer:
Answer:
Answer:
The median is
Answer:
Answer:
The USC quarterback Matt Barkley weighed 220 pounds in the spring of 2010. How many standard deviations above or below the mean was he in
comparison to the data given above? (Give the answer in the format x.xxx, without the plus/minus sign)
Answer:
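The answers to this group of questions are not reproduced in this text. The sketch below shows one way the summary statistics and the standardized distance could be computed from the weights listed above; the names "w", "w.bar", "w.sd" and "z.220" are assumptions made for illustration, and no outputs are quoted.

```r
# Weights (in pounds) copied from the text above
w <- c(232, 251, 257, 268, 238, 222, 265, 263, 252, 246, 253, 248, 256, 248,
       230, 219, 224, 267, 259, 254, 254, 261, 248, 221, 252, 269, 269, 273,
       273, 259, 251, 222, 248, 224)

w.bar <- mean(w)   # sample mean
w.sd  <- sd(w)     # sample standard deviation (divides by n - 1)

# Minimum, quartiles, median, mean and maximum in one call
summary(w)

# Signed number of standard deviations between 220 pounds and the mean
z.220 <- (220 - w.bar) / w.sd
z.220

# The same distance without the sign, as the question requests
abs(z.220)
```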
The following frequency table shows the lengths of 42 international phone calls using a $5 prepaid calling card. The data was stored in an object by the
name "x":
x
 4 14 24 34 44 54
 2  6 13 13  6  2
Using the data, and without computing the mean and the median, determine which ONE of the answers is correct:
Select one:
Observe that the distribution is symmetric. The values are equally spaced and the frequencies are symmetric about the center (2, 6, 13, 13, 6, 2).
The correct answer is: The mean and the median are equal.
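Although the question asks for an argument without computation, the symmetry claim can be checked numerically. A small sketch, with "x.val" and "freq" as assumed names for the two rows of the table:

```r
x.val <- c(4, 14, 24, 34, 44, 54)
freq  <- c(2, 6, 13, 13, 6, 2)

# Symmetric: the frequencies read the same from the left and from the right
all(freq == rev(freq))

# Expand the table into the 42 raw observations and compare the two summaries
x <- rep(x.val, times = freq)
mean(x)
median(x)
```

Both summaries equal 29, the midpoint between the two central values 24 and 34.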
Consider the following data set: 4, 6, 6, 12, 18, 18, 18, 200. What value is (approximately) 0.75 standard deviations below the mean?
Select one:
b. Approximately -15
c. Approximately 4
d. Approximately 34.5
Consider the code:
> x <- c(4, 6, 6, 12, 18, 18, 18, 200)
> mean(x) - 0.75*sd(x)
[1] -14.87231
In Chapter 3 the data frame "ex.2", which contained information associated to the blood pressure of a sample of 150 men and women, was introduced.
Let us assume that this sample was taken from an imaginary population of size 100,000 and let the information for all the members of this population
be stored in a CSV file by the name "pop2.csv".
Read the content of the population file into a data frame under the name "pop.2". Applying the function "summary" to this data frame produces:
> summary(pop.2)
group
HIGH :28126
LOW : 4215
NORMAL:67659
The variables in this data frame are the same variables that were included in the data frame "ex.2" of Chapter 3. The variables that are
included in this data frame are:
id:
A numerical variable. A 7-digit number that serves as a unique identifier of the subject.
sex:
A factor variable. The sex of each subject. The values are either "MALE" or "FEMALE".
age:
A numerical variable. The age of each subject.
bmi:
A numerical variable. The body mass index of each subject.
systolic:
A numerical variable. The systolic blood pressure of each subject.
diastolic:
A numerical variable. The diastolic blood pressure of each subject.
group:
A factor variable. The blood pressure category of each subject. The value is "NORMAL" if both the systolic blood pressure is within its normal range
(between 90 and 139) and the diastolic blood pressure is within its normal range (between 60 and 89). The value is "HIGH" if either measurement of
blood pressure is above its normal upper limit and it is "LOW" if either measurement is below its normal lower limit.
The next 6 questions correspond to a person that is sampled at random from this population. The answers below may be rounded up to two decimal
places.
Answer:
Run the code:
> pop.2 <- read.csv("pop2.csv")
> median(pop.2$age)
[1] 35
Answer:
Run the code:
> pop.2 <- read.csv("pop2.csv")
> var(pop.2$diastolic)
[1] 171.6469
The standard deviation of the difference between the systolic and diastolic blood pressures was: (Hint: Observe that the difference
" pop.2$systolic - pop.2$diastolic " produces the difference between the two types of blood pressure for all the members in the population.)
Run the code:
> pop.2 <- read.csv("pop2.csv")
> sd(pop.2$systolic - pop.2$diastolic)
[1] 3.950757
Answer:
The probability that someone sampled from this data will have normal blood pressure is:
Answer:
Run the code:
> pop.2 <- read.csv("pop2.csv")
> summary(pop.2$group)
  HIGH    LOW NORMAL
 28126   4215  67659
There are 67,659 individuals that are classified as "NORMAL" among the total population of 100,000. Hence, the probability is 0.67659.
Mark the following statement as either TRUE or FALSE: The standard deviation of the difference between the systolic and diastolic blood pressures is
equal to the difference between the standard deviation of systolic blood pressure and the standard deviation of diastolic blood pressure.
The Distribution of Y
Value  Probability
1.5    0.15
4
5.5    0.10
6      0.23
7.5    0.11
10     0.05
Complete the probabilities of the random variable Y in the table above and compute:
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. The event in question involves the values 1.5 and 4. Hence P(Y < 5) = 0.15 + 0.36 = 0.51.
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. The event in question involves the values 1.5, 5.5, and 7.5. Hence P(not an integer) =
0.15 + 0.10 + 0.11 = 0.36.
E(Y) equals
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. Run the code:
> Y.val <- c(1.5,4,5.5,6,7.5,10)
> P.val <- c(0.15,0.36,0.1,0.23,0.11,0.05)
> sum(Y.val*P.val)
[1] 4.92
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. Run the code:
> Y.val <- c(1.5,4,5.5,6,7.5,10)
> P.val <- c(0.15,0.36,0.1,0.23,0.11,0.05)
> E <- sum(Y.val*P.val)
> Var <- sum((Y.val-E)^2*P.val)
> sqrt(Var)
[1] 2.093705
One selects an integer between 1 and 9 (including 1 and 9) at random. Let X be a random variable that obtains as a value the integer that was selected.
The following questions correspond to this random variable.
The Distribution of X
Value  Probability
Complete the table of the distribution of X and compute:
E(X) is equal to
Answer:
The values of the random variable are the integers between 1 and 9 and the probability of each value is 1/9. In order to compute the expectation run the code:
> X.val <- c(1,2,3,4,5,6,7,8,9)
> P.val <- c(1,1,1,1,1,1,1,1,1)/9
> sum(X.val*P.val)
[1] 5
Answer:
The values of the random variable are the integers between 1 and 9 and the probability of each value is 1/9. In order to compute the variance run the code:
> X.val <- c(1,2,3,4,5,6,7,8,9)
> P.val <- c(1,1,1,1,1,1,1,1,1)/9
> E <- sum(X.val*P.val)
> sum((X.val-E)^2*P.val)
[1] 6.666667
Answer:
The values of the random variable are the integers between 1 and 9 and the probability of each value is 1/9. In order to compute the standard deviation run the code:
> X.val <- c(1,2,3,4,5,6,7,8,9)
> P.val <- c(1,1,1,1,1,1,1,1,1)/9
> E <- sum(X.val*P.val)
> Var <- sum((X.val-E)^2*P.val)
> sqrt(Var)
[1] 2.581989
Information
Suppose that you are offered the following "deal." An impartial person selects an integer between 1 and 9 at random and you try to guess beforehand
which number will be selected. If you guess wrong then you pay $1 and if you guess right then you win $10. Call the outcome of the game your "gain."
(Note that if you pay money then your gain is negative.)
Select one:
d. Unknown.
You select a number. The probability that the specific number that you have selected will turn out is 1/9. The probability that you miss is 8/9.
Answer:
You select a number. The probability that the specific number that you have selected will turn out is 1/9. The probability that you miss is 8/9. Let X be the gain from the
game. The gain is 10 if you win (with probability 1/9) and -1 if you lose (with probability 8/9). The expectation is E(X) = 10/9 + (-1)*8/9 = 2/9 = 0.2222
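The expectation above follows the same pattern as the earlier computations of E(Y) and E(X): multiply each value by its probability and add up. A minimal sketch, with "gain", "prob" and "E.gain" as assumed names:

```r
# Possible gains and their probabilities, as described above
gain <- c(win = 10, lose = -1)
prob <- c(win = 1/9, lose = 8/9)

# The probabilities of all outcomes should add up to 1
sum(prob)

# Expected gain per game: sum of value times probability
E.gain <- sum(gain * prob)
E.gain
```

A positive expectation means that, on average over many games, the deal favors the player.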
Information
Approximately 85% of statistics students do their homework in time for it to be collected and graded. Let X be the number of students that submit their
homework in time out of a statistics class of 70 students. The following 4 questions refer to this X. (The answer may be rounded up to 3 decimal places
of the actual value.)
Select one:
The possible outcome of X, the number of students out of 70 that submit their homework, is an integer and the range of values starts at 0 (no one submits) and ends
in 70 (all submit).
The correct answer is: The integers between (and including) 0 and 70.
The probability that less than 60 of the 70 students will do their homework on time is:
Answer:
Less than 60 means 59 or less. The probability P(X ≤ 59) can be computed with the code:
> pbinom(59,70,0.85)
[1] 0.4842268
Answer:
Answer:
The variance of X is n * p * (1-p) = 70 * 0.85 * 0.15 = 8.925. The standard deviation is the square root of the variance, namely 2.987474. The correct answer is:
2.987474.
Information
In Chapter 1 it was claimed that when tossing a fair coin 4 times it is quite likely to not obtain 2 heads and 2 tails. However, when tossing a fair coin
4,000 times one should expect to obtain a number of tails in the range between 1940 and 2060. Let us compare the situation for 2 versus 2,000 coins.
Let X be the number of heads when tossing a fair coin 2 times and let Y be the number of heads when tossing a fair coin 2,000 times. (The answer may
be rounded up to 3 decimal places of the actual value).
P( X = 1) is equal to
Answer:
The distribution of X is Binomial(2,0.5). The probability P(X = 1) can be computed with the code
> dbinom(1,2,0.5)
[1] 0.5
Answer:
If P(X = 1) = 0.5 then the complementary probability (X is not equal to 1) is equal to 1 - P(X = 1) = 1 - 0.5 = 0.5
Answer:
If P(Y = 1000) = 0.01783901 then the complementary probability (Y is not equal to 1000) is equal to 1 - P(Y = 1000) = 1 - 0.01783901 = 0.982161
Answer:
The distribution of Y is Binomial(2000,0.5). The probability P(Y = 1000) can be computed with the code
> dbinom(1000,2000,0.5)
[1] 0.01783901
Answer:
The probability P(940 ≤ Y ≤ 1,060) is equal to the difference between P(Y ≤ 1,060) and the probability P(Y < 940). The latter probability is equal to P(Y ≤ 939). This
difference can be computed with the code:
> pbinom(1060,2000,0.5) - pbinom(939,2000,0.5)
[1] 0.9931974
E(X) is equal to
Answer:
Answer:
The variance of X is equal to n * p * (1-p) = 2 * 0.5 * 0.5 = 0.5. The standard deviation is the square root of 0.5, which is equal to 0.7071068
E(Y) is equal to
Answer:
Answer:
The variance of Y is equal to n * p * (1-p) = 2000 * 0.5 * 0.5 = 500. The standard deviation is the square root of 500, which is equal to 22.36068
Meiosis is the process in which a diploid cell that contains two copies of the genetic material produces a haploid cell with only one copy (sperms and
eggs). The resulting molecule of genetic material is a linear molecule that is composed of consecutive segments: a segment that originated from one of
the two copies followed by a segment from the other copy and vice versa. The border points between segments are called points of crossover. The
Haldane model for crossovers states that the number of crossovers between two loci on the genome has a Poisson(λ) distribution. Assume that the
expected number of crossovers between two loci in a fixed period of time is 2.25. The next 3 questions refer to this model for crossovers. (The answer
may be rounded up to 3 decimal places of the actual value.)
Answer:
The number of crossovers has a Poisson distribution with parameter λ = 2.25. The probability of exactly 4 crossovers can be computed with the code:
> dpois(4,2.25) [1] 0.1125528
Answer:
The number of crossovers has a Poisson distribution with parameter λ = 2.25. The probability of at least 4 crossovers can be computed as the difference between 1
and the probability of 3 or fewer crossovers. The computation can be conducted with the code:
> 1 - ppois(3,2.25)
[1] 0.1905669
A recombination between two loci occurs if the number of crossovers is odd. The probability of recombination between the two loci is, approximately,
equal to (Compute the probability of recombination approximately using the function "dpois". Ignore odd values larger than 9)
Values (not larger than 9) that lead to recombination are 1, 3, 5, 7, and 9. The probability of these values for the Poisson distribution with parameter λ = 2.25
can be computed with the code:
> sum(dpois(c(1,3,5,7,9),2.25))
[1] 0.4944251
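To see how little is lost by ignoring odd values larger than 9, the truncated sum can be compared with the closed-form probability that a Poisson(λ) count is odd, (1 - exp(-2λ))/2. This comparison is not part of the original quiz; the names "p.approx" and "p.exact" are assumptions made for illustration.

```r
lambda <- 2.25

# Truncated sum over the odd counts 1, 3, 5, 7, 9, as in the explanation above
p.approx <- sum(dpois(c(1, 3, 5, 7, 9), lambda))

# Closed form for P(odd) under Poisson(lambda): (1 - exp(-2 * lambda)) / 2
p.exact <- (1 - exp(-2 * lambda)) / 2

c(approx = p.approx, exact = p.exact)

# Probability mass carried by the ignored odd values 11, 13, ...
p.exact - p.approx
```

The two numbers agree to about four decimal places, so the truncation does not affect an answer rounded to 3 decimal places.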
Information
The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between 0 and 17 minutes, inclusive. The next 3 questions
refer to this waiting time. (The answer may be rounded up to 3 decimal places of the actual value.)
Answer:
Let X be the length of time the person waits. The distribution of X is Uniform(0,17). The probability P(X ≤ 12.5) can be computed with the code
> punif(12.5,0,17)
[1] 0.7352941
Answer:
Let X be the length of time the person waits. The distribution of X is Uniform(0,17). The expectation of X is equal to (a+b)/2 = (0+17)/2 = 8.5
Answer:
Let X be the length of time the person waits. The distribution of X is Uniform(0,17). The variance of X is equal to (b-a)^2/12 = (17-0)^2/12 = 24.08333. The standard
deviation is the square root of the variance and is equal to 4.907477
Information
Let X be the amount of time (in minutes) a postal clerk spends with his/her customer. Assume that X has an Exponential(λ) distribution and that E(X) = 7
minutes. The next 3 questions refer to this waiting time. (The answer may be rounded up to 3 decimal places of the actual value.)
Answer:
The expectation in the Exponential distribution is the reciprocal of the parameter λ. Consequently, the parameter λ is equal to the inverse of the expectation: λ = 1/E(X).
The expectation E(X) = 7, hence λ = 1/7 = 0.1428571
The probability that a clerk spends between four and five minutes with a randomly selected customer is
Answer:
Let the distribution of X be Exponential(1/7). The probability P(4 ≤ X ≤ 5) is equal to the difference between P(X ≤ 5) and the probability P(X < 4). The latter probability is
equal to P(X ≤ 4), since the distribution is continuous. This difference can be computed with the code:
> pexp(5,1/7)-pexp(4,1/7)
[1] 0.07517646
The probability that a clerk spends more than 10 minutes with a customer is
Answer:
Let the distribution of X be Exponential(1/7). The probability P(10 < X) is equal to the difference between 1 and the probability P(X ≤ 10). This difference can be
computed with the code:
> 1-pexp(10,1/7)
[1] 0.2396510
According to some study, the height of Northern European adult males is normally distributed with an average of 181 centimeters and a standard
deviation of 7.3 centimeters. Suppose such an adult male is randomly chosen. Let X be the height of that person. The next 3 questions correspond to this
information. The answer may be rounded up to 3 decimal places of the actual value.
The probability that the person is between 160 and 170 centimeters is
Answer:
The distribution of X is Normal(181, (7.3)^2). The probability of the interval is equal to the difference between P(X ≤ 170) and P(X ≤ 160)
> pnorm(170,181,7.3) - pnorm(160,181,7.3)
[1] 0.06391543
Answer:
The distribution of X is Normal(181, (7.3)^2). The probability of being larger than 190 is equal to the difference between 1 and P(X ≤ 190)
> 1 - pnorm(190,181,7.3)
[1] 0.1088109
Select one:
a. (171.6, 190.4)
b. (173.4, 188.6)
c. (174.9, 187.1)
d. (176.1, 185.7)
The distribution of X is Normal(181, (7.3)^2). The central region that contains 60% of the distribution is the region between the 0.2-percentile and the 0.8-percentile
> qnorm(0.2,181,7.3)
[1] 174.8562
> qnorm(0.8,181,7.3)
[1] 187.1438
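The same reasoning works for any central region: split the excluded probability evenly between the two tails and take the matching percentiles. A minimal sketch, with "central", "tail.prob" and "interval" as assumed names:

```r
mu      <- 181    # mean height (centimeters)
sigma   <- 7.3    # standard deviation (centimeters)
central <- 0.60   # probability that the central region should contain

# Probability left out in each tail
tail.prob <- (1 - central) / 2

# Lower and upper end points of the central region
interval <- qnorm(c(tail.prob, 1 - tail.prob), mean = mu, sd = sigma)
round(interval, 1)

# Check: the interval should contain the requested probability
diff(pnorm(interval, mean = mu, sd = sigma))
```

Changing "central" to 0.90 gives the kind of middle-90% interval used in the lap-time questions that follow.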
Information
Terri Vogel, an amateur motorcycle racer, averages 129.71 seconds per 2.5 mile lap (in a 7 lap race) with a standard deviation of 2.28 seconds . The
distribution of her race times is normally distributed. We are interested in one of her randomly selected laps. The next 4 questions correspond to this
information. The answer may be rounded up to 3 decimal places of the actual value:
The proportion (a number between 0 and 1) of her laps that are completed in less than 125 seconds is
Answer:
The distribution of a random lap is Normal(129.71, (2.28)^2). The probability of being less than or equal to 125 is
> pnorm(125,129.71,2.28)
[1] 0.01942418
The fastest 10% of her laps are completed under how many seconds?
Answer:
The distribution of a random lap is Normal(129.71, (2.28)^2). The fastest 10% of the laps are completed under the 0.1-percentile of the given distribution. This percentile is
> qnorm(0.1,129.71,2.28)
[1] 126.7881
The middle 90% of her laps are from a seconds to b seconds. Then a is equal to
Answer:
The distribution of a random lap is Normal(129.71, (2.28)^2). The central region that contains 90% is [a,b], where a is equal to the 0.05-percentile of the distribution
> qnorm(0.05,129.71,2.28)
[1] 125.9597
The middle 90% of her laps are from a seconds to b seconds. Then b is equal to
Answer:
The distribution of a random lap is Normal(129.71, (2.28)^2). The central region that contains 90% is [a,b], where b is equal to the 0.95-percentile of the distribution
> qnorm(0.95,129.71,2.28)
[1] 133.4603
Information
Suppose that Ricardo and Anita attend the same college. Ricardo's GPA is better than 30% of his school mates but worse than the other 70%. Anita's
GPA is 0.60 standard deviations below her school average. All the students that were at least one standard deviation above the mean obtained an "A"
or an "A+" score, which corresponded to about 16% of the students. Assume GPAs are Normally distributed. For each of the following sentences mark
one of the options:
Select one:
a. Always True
c. Always False
Ricardo's GPA corresponds to the 0.3-percentile of the students. This score is 0.5244005 standard deviations below the average since, if we consider standardized z-scores, we get
> qnorm(0.3)
[1] -0.5244005
According to the information in the question Anita's GPA is 0.6 standard deviations below the average. Consequently,
her z-score is -0.6, which is lower than Ricardo's.
Select one:
a. Always True
c. Always False
Ricardo's GPA corresponds to the 0.3-percentile of the students. This score is 0.5244005 standard deviations below the average since, if we consider standardized z-scores, we get
> qnorm(0.3)
[1] -0.5244005
The resulting z-score is negative.
Select one:
a. Always True
c. Always False
There is no information in the question that tells us what the standard deviation of the GPA is. The number 0.16 in the question refers to a probability, not to the
standard deviation.
Information
Some measurement on a population has a Normal distribution with expectation of 1000 and standard deviation of 150. We call the difference
between the 0.75-percentile and the 0.25-percentile the Interquartile Range. Identify a measurement value as an outlier if it is larger than the 0.75-percentile
plus 1.5 times the interquartile range or it is smaller than the 0.25-percentile minus 1.5 times the interquartile range. The next 5
questions correspond to this information. The answer may be rounded up to 3 decimal places of the actual value:
Answer:
The distribution of the measurement is Normal(1000, (150)^2). The 0.25-percentile of the measurement is
> qnorm(0.25,1000,150)
[1] 898.8265
Answer:
The distribution of the measurement is Normal(1000, (150)^2). The 0.75-percentile of the measurement is
> qnorm(0.75,1000,150)
[1] 1101.173
Answer:
The inter-quartile range is the difference between the 0.75-percentile (Q3) and the 0.25-percentile (Q1). That is, 1101.173 - 898.8265 = 202.3465
Answer:
The upper threshold for the identification of an outlier is Q3 + 1.5*(Q3 - Q1). The probability of being above this threshold is equal to the probability of being below the
lower threshold. The probability of identifying an outlier is twice the probability of being above the upper threshold:
> u <- 1101.173 + 1.5*202.3465
> 2*(1-pnorm(u,1000,150))
[1] 0.006976757
This is the same probability that was found in Subsection 6.2.4.
Denote the probability of an outlier that was computed in the previous question by p. Consider a different Normal measurement that has a different
mean and a different standard deviation than the measurement that was considered before. In the context of the current measurement, the probability
of a random measurement to be identified as an outlier is:
Select one:
If the mean or the standard deviation changes then the quartiles and the inter-quartile range change accordingly. In terms of the z-scores, the computation reduces to the
computation that was presented in Subsection 6.2.4.
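The claim that the outlier probability does not depend on the mean or the standard deviation can be checked numerically. The sketch below wraps the computation in a hypothetical helper named "p.outlier" (not part of the original quiz) and evaluates it for several Normal distributions.

```r
# Probability that a Normal(mu, sigma^2) measurement is flagged as an outlier
# by the 1.5 * IQR rule (hypothetical helper, introduced for illustration)
p.outlier <- function(mu, sigma) {
  q1  <- qnorm(0.25, mu, sigma)
  q3  <- qnorm(0.75, mu, sigma)
  iqr <- q3 - q1
  lower <- q1 - 1.5 * iqr
  upper <- q3 + 1.5 * iqr
  # Below the lower threshold or above the upper threshold
  pnorm(lower, mu, sigma) + (1 - pnorm(upper, mu, sigma))
}

p.outlier(1000, 150)   # the measurement considered above
p.outlier(5.3, 2.1)    # a different mean and standard deviation
p.outlier(0, 1)        # the standard Normal distribution
```

All three calls return the same value, roughly 0.007, so the answer to the question is p regardless of the parameters.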
Information
The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days. The
next 4 questions correspond to this information.
What is the inter-quartile range of the recovery time? (Choose the closest possibility.)
Select one:
a. 2.8
b. 5.3
c. 7.4
d. 2.1
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The inter-quartile range is the difference between the 0.75-percentile (Q3) and the
0.25-percentile (Q1).
> qnorm(0.75,5.3,2.1) - qnorm(0.25,5.3,2.1)
[1] 2.832857
What is the z-score for a patient who takes 6 days to recover? (Choose the closest possibility.)
Select one:
a. 1.5
b. 0.3
c. 2.2
d. 7.3
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The z-score of the value 6 is equal to (6 - 5.3)/2.1 = 0.3333333
What is the probability of spending less than 2 days in recovery? (Choose the closest possibility.)
Select one:
a. 0.0580
b. 0.8447
c. 0.0553
d. 0.9420
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The probability of being below 2 is
> pnorm(2,5.3,2.1)
[1] 0.05804157
Select one:
a. 8.89
b. 7.07
c. 7.99
d. 4.32
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The 32nd percentile for recovery times is
> qnorm(0.32,5.3,2.1)
[1] 4.317833
Information
Let the distribution of X be Binomial(150,0.8). The next 4 questions correspond to this information. The answer may be rounded up to 3 decimal places
of the actual value:
Answer:
The probability P(121 < X ≤ 129) is equal to the difference P(X ≤ 129) - P(X ≤ 121). The difference is equal to
> pbinom(129,150,0.8) - pbinom(121,150,0.8)
[1] 0.3645757
The Normal approximation (without continuity correction) of the probability P(121 < X ≤ 129) is equal to
Answer:
The probability P(121 < X ≤ 129) is equal to the difference P(X ≤ 129) - P(X ≤ 121). In the Normal approximation we substitute the Binomial distribution with the Normal
distribution that has the same expectation and variance. The expectation of the Binomial is 150*0.8 and the variance is 150*0.8*0.2. The approximation produces
> mu <- 150*0.8
> sig <- sqrt(150*0.8*0.2)
> pnorm(129,mu,sig) - pnorm(121,mu,sig)
[1] 0.3860320
The Normal approximation (with continuity correction) of the probability P(121< X ≤ 129) is equal to
Answer:
The probability P(121 < X ≤ 129) involves the values {122, 123, ..., 129}. In the Normal approximation we substitute the Binomial distribution with the Normal
distribution that has the same expectation and variance. The expectation of the Binomial is 150*0.8 and the variance is 150*0.8*0.2. The continuity correction involves
the consideration of intervals of radius 0.5 about each value. The difference to be considered is P(X ≤ 129.5) - P(X ≤ 121.5). The approximation produces
> mu <- 150*0.8
> sig <- sqrt(150*0.8*0.2)
> pnorm(129.5,mu,sig) - pnorm(121.5,mu,sig)
[1] 0.3534917
Answer:
The probability P(121 < X ≤ 129) is equal to the difference P(X ≤ 129) - P(X ≤ 121). In the Poisson approximation we substitute the Binomial distribution with the Poisson
distribution that has the same expectation. The expectation of the Binomial is 150*0.8. The approximation produces
> lam <- 150*0.8
> ppois(129,lam) - ppois(121,lam)
[1] 0.2478270
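The four computations for Binomial(150,0.8) can be collected side by side to see how close each approximation comes to the exact value. This is only a sketch; the function name approx_compare and its argument names are assumptions introduced here, not part of the quiz.

```r
# Exact Binomial probability P(a < X <= b) next to three approximations.
# The function name approx_compare is hypothetical.
approx_compare <- function(a, b, n, p) {
  mu  <- n * p                   # expectation of the Binomial
  sig <- sqrt(n * p * (1 - p))   # standard deviation of the Binomial
  c(exact     = pbinom(b, n, p) - pbinom(a, n, p),
    normal    = pnorm(b, mu, sig) - pnorm(a, mu, sig),
    # continuity correction: shift both end points by 0.5
    normal_cc = pnorm(b + 0.5, mu, sig) - pnorm(a + 0.5, mu, sig),
    # Poisson with the same expectation as the Binomial
    poisson   = ppois(b, mu) - ppois(a, mu))
}

res <- approx_compare(121, 129, 150, 0.8)
print(round(res, 4))
```

Calling approx_compare(13, 16, 15, 0.8) should give the corresponding comparison for the Binomial(15,0.8) questions.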
Information
Let the distribution of X be Binomial(15,0.8). The next 4 questions correspond to this information. The answer may be rounded up to 3 decimal places
of the actual value:
Answer:
The probability P(13 < X ≤ 16) is equal to the difference P(X ≤ 16) - P(X ≤ 13). The difference is equal to
> pbinom(16,15,0.8) - pbinom(13,15,0.8)
[1] 0.1671258
The Normal approximation (without continuity correction) of the probability P(13 < X ≤ 16) is equal to
Answer:
The probability P(13 < X ≤ 16) is equal to the difference P(X ≤ 16) - P(X ≤ 13). In the Normal approximation we substitute the Binomial distribution with the Normal
distribution that has the same expectation and variance. The expectation of the Binomial is 15*0.8 and the variance is 15*0.8*0.2. The approximation produces
> mu <- 15*0.8
> sig <- sqrt(15*0.8*0.2)
> pnorm(16,mu,sig) - pnorm(13,mu,sig)
[1] 0.2543909
The Normal approximation (with continuity correction) of the probability P(13 < X ≤ 16) is equal to
Answer:
The probability P(13 < X ≤ 16) involves the values {14, 15, 16}. In the Normal approximation we substitute the Binomial distribution with the Normal distribution that
has the same expectation and variance. The expectation of the Binomial is 15*0.8 and the variance is 15*0.8*0.2. The continuity correction involves the consideration
of intervals of radius 0.5 about each value. The difference to be considered is P(X ≤ 16.5) - P(X ≤ 13.5). The approximation produces
> mu <- 15*0.8
> sig <- sqrt(15*0.8*0.2)
> pnorm(16.5,mu,sig) - pnorm(13.5,mu,sig)
[1] 0.164623
Answer:
The probability P(13 < X ≤ 16) is equal to the difference P(X ≤ 16) - P(X ≤ 13). In the Poisson approximation we substitute the Binomial distribution with the Poisson
distribution that has the same expectation. The expectation of the Binomial is 15*0.8. The approximation produces
> lam <- 15*0.8
> ppois(16,lam) - ppois(13,lam)
[1] 0.2171734
Information
Recall that the population average of the heights in the file "pop1.csv" is μ = 170.035. Using simulation it can be shown that the probability of the
sample average of the height falling within 2 centimeters of the population average is approximately equal to 0.925. From the simulations we also got
that the standard deviation of the sample average is (approximately) equal to 1.122. In the next 3 questions you are asked to apply the Normal
approximation to the distribution of the sample average using this information. The answer may be rounded up to 3 decimal places of the actual value:
Using the Normal approximation, the probability that the sample average of the heights falls within 2 centimeters of the population average is
Answer:
Using the Normal approximation, the computation of a probability associated with the random variable is conducted with the functions of the Normal distribution for
the same expectation and standard deviation as the original distribution. The expectation is μ = 170.035. The standard deviation is σ = 1.122. The event corresponds to
the interval [μ - 2, μ + 2]. Therefore, the approximated probability is
> mu <- 170.035
> sig <- 1.122
> pnorm(mu+2,mu,sig) - pnorm(mu-2,mu,sig)
[1] 0.9253374
Using the Normal approximation we get that the central region that contains 90% of the distribution of the sample average is of the form 170.035 ± z ·
1.122. The value of z is
Answer:
The structure of the central region that contains 90% of the Normal distribution is μ ± qnorm(0.95) · σ. Here, μ = 170.035 and σ = 1.122. Therefore, z =
qnorm(0.95) = 1.644854.
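The reason the 0.95-percentile is used for a 90% central region is that 5% must be left in each tail. A short sketch of that reasoning follows; the variable names level, region and coverage are illustrative assumptions.

```r
# z multiplier for a central region holding a given share of a Normal distribution.
level <- 0.90                      # coverage of the central region
z <- qnorm(1 - (1 - level) / 2)    # leaves (1 - level)/2 in each tail

mu  <- 170.035
sig <- 1.122
region <- c(lower = mu - z * sig, upper = mu + z * sig)

# Check: the region should contain `level` of the Normal(mu, sig) distribution.
coverage <- pnorm(region[["upper"]], mu, sig) - pnorm(region[["lower"]], mu, sig)
print(z)
print(region)
print(coverage)
```

Changing level to another value, such as 0.70, gives the multiplier for a different central region.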
Using the Normal approximation, the probability that sample average of the heights is less than 169 is
Answer:
Using the Normal approximation, the computation of a probability associated with the random variable is conducted with the functions of the Normal distribution for
the same expectation and standard deviation as the original distribution. The expectation is μ = 170.035. The standard deviation is σ = 1.122. The event corresponds to
the values less than 169. Therefore, the approximated probability is
> mu <- 170.035
> sig <- 1.122
> pnorm(169,mu,sig)
[1] 0.1781444
According to the Internal Revenue Service, the average length of time for an individual to complete (record keep, learn, prepare, copy, assemble and
send) IRS Form 1040 is 10.53 hours (without any attached schedules). The distribution is unknown. Let us assume that the standard deviation is 2
hours. Suppose we randomly sample 36 taxpayers and compute their average time to completing the forms. Then the probability that the average is
less than 10 hours is approximately equal to (The answer may be rounded up to 3 decimal places of the actual value)
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (10.53) and the standard deviation is equal to the standard
deviation of a single measurement (2), divided by the square root of the number of observations (36). Consequently, the approximate probability is
> pnorm(10,10.53,2/sqrt(36))
[1] 0.0559174
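The same Central Limit Theorem recipe applies to any question of this type: keep the expectation, divide the standard deviation by the square root of the sample size, and evaluate the Normal distribution function. A minimal sketch, where the helper name clt_prob_below is an assumption made for illustration:

```r
# Normal approximation for P(sample average < x), following the
# Central Limit Theorem. clt_prob_below is a hypothetical helper name.
clt_prob_below <- function(x, mu, sigma, n) {
  se <- sigma / sqrt(n)   # standard deviation of the sample average
  pnorm(x, mu, se)
}

# IRS Form 1040 example: mean 10.53 hours, sd 2 hours, 36 taxpayers.
p_irs <- clt_prob_below(10, 10.53, 2, 36)
print(p_irs)
```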
Information
Suppose that a category of world class runners are known to run a marathon (26 miles) in an expectation of 145 minutes with a standard deviation of
14 minutes. Consider 49 random races. In the next 3 questions you are asked to apply the Normal approximation to the distribution of the sample
average using this information. The answer may be rounded up to 3 decimal places of the actual value:
The probability that the runner will average between 143 and 145 minutes in these 49 marathons is
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (145) and the standard deviation is equal to the standard
deviation of a single measurement (14), divided by the square root of the number of observations (49). Consequently, the approximate probability is
> pnorm(145,145,14/sqrt(49)) - pnorm(143,145,14/sqrt(49))
[1] 0.3413447
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (145) and the standard deviation is equal to the standard
deviation of a single measurement (14), divided by the square root of the number of observations (49). Consequently, the approximate 0.9-percentile is
> qnorm(0.9,145,14/sqrt(49))
[1] 147.5631
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (145) and the standard deviation is equal to the standard
deviation of a single measurement (14), divided by the square root of the number of observations (49). Consequently, the approximate 0.25-percentile (Q1) is the
0.25-percentile of the appropriate Normal distribution. The same holds for the 0.75-percentile (Q3). The inter-quartile range is the difference between these two
numbers:
> qnorm(0.75,145,14/sqrt(49)) - qnorm(0.25,145,14/sqrt(49))
[1] 2.697959
The time to wait for a particular rural bus is distributed uniformly from 0 to 25 minutes. 25 riders are randomly sampled and their waiting times
measured. The 90th percentile of the average waiting time (in minutes) computed for the sample is (approximately):
Select one:
a. 210.0
b. 26.9
c. 14.3
d. 13.2
The expectation of the Uniform(0,25) distribution is 25/2 = 12.5 and the variance is 25^2/12 = 52.08333. Applying the Central Limit Theorem we carry out the
computation using the Normal distribution with the same expectation and standard deviation as the sample average. The expectation of the sample average is equal
to the expectation of a single measurement (12.5) and the standard deviation is equal to the standard deviation of a single measurement (7.216878 = the square root
of 52.08333), divided by the square root of the number of observations (25). Consequently, the approximate 0.90-percentile is
> qnorm(0.90,12.5,7.216878/sqrt(25))
[1] 14.34976
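The intermediate numbers need not be typed by hand. The sketch below derives them from the end points of the Uniform distribution, using the standard formulas (a + b)/2 for the expectation and (b - a)^2/12 for the variance; the variable names are illustrative assumptions.

```r
# 90th percentile of the average of n Uniform(a, b) waiting times,
# using the Normal approximation.
a <- 0
b <- 25
n <- 25

mu    <- (a + b) / 2            # expectation of a single waiting time
sigma <- sqrt((b - a)^2 / 12)   # standard deviation of a single waiting time
q90   <- qnorm(0.90, mu, sigma / sqrt(n))
print(q90)
```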
Information
A switching board receives a random number of phone calls. The expected number of calls is 7.4 per minute. Assume that distribution of the number of
calls is Poisson. The average number of calls per minute is recorded by counting the total number of calls received in one hour, divided by 60, the
number of minutes in an hour. In the next 4 questions you are asked to apply the Normal approximation to the distribution of the sample average
using this information. The answer may be rounded up to 3 decimal places of the actual value:
Answer:
The expectation of the sample average is equal to the expectation of a single measurement, which is 7.4 in this example.
Answer:
The variance in the Poisson distribution is equal to the expectation. The standard deviation of the sample average is equal to the standard deviation of a single
measurement (2.720294 = the square root of 7.4), divided by the square root of the number of observations (60). The resulting standard deviation is 0.3511885.
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to 7.4 and the standard deviation is equal to 0.3511885. Consequently, the approximate probability is
> pnorm(7,7.4,0.3511885)
[1] 0.1273538
The probability that the number of calls in a random minute is less than 7 is: (Note, the question is with respect to a random minute, and not
the average.)
The distribution of the number of calls in a random minute is Poisson(7.4). The event of less than 7 calls corresponds to the events of 6 calls or less. Consequently,
> ppois(6,7.4)
[1] 0.3919617
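The contrast between the hourly average and a single minute can be seen in one short script. This is a sketch only. The last line is an addition that is not part of the quiz: it relies on the fact that a sum of independent Poisson counts is again Poisson, so the Normal approximation for the average can be compared with an exact value.

```r
lam <- 7.4   # expected number of calls per minute
n   <- 60    # minutes in the hour that is averaged

se       <- sqrt(lam / n)      # sd of the average: sqrt(lam) / sqrt(n)
p_avg    <- pnorm(7, lam, se)  # Normal approximation, P(average < 7)
p_single <- ppois(6, lam)      # exact, P(calls in one minute < 7)

# Exact counterpart for the average (assumes independent minutes):
# average < 7 means the hourly total is at most 419, and the total is Poisson(n * lam).
p_avg_exact <- ppois(419, n * lam)

print(c(se = se, p_avg = p_avg, p_single = p_single, p_avg_exact = p_avg_exact))
```

The average is far less variable than a single minute, which is why the two probabilities of falling below 7 differ so much.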
Information
It is claimed that the expected length of time some computer part may work before requiring a reboot is 3.5 months. In order to examine this claim 70
identical parts are set to work. Assume that the distribution of the length of time the part can work (in months) is Exponential. In the next 4 questions
you are asked to apply the Normal approximation to the distribution of the average of the 70 parts that are examined. The answer may be rounded up
to 3 decimal places of the actual value:
Answer:
The expectation of the sample average is equal to the expectation of a single measurement, which is 3.5 in this example.
Answer:
The variance in the Exponential(λ) distribution is equal to 1/λ^2. Since the expectation is 1/λ, we get that the variance is equal to the square of the expectation, which is
3.5^2. The standard deviation of the sample average is equal to the standard deviation of a single measurement (3.5 = the square root of 3.5^2), divided by the square
root of the number of observations (70). The resulting standard deviation is 0.41833.
The central region that contains 90% of the distribution of the average is of the form E(X) ± c, where E(X) is the expectation of the sample average. The
value of c is
Answer:
The structure of the central region that contains 90% of the Normal distribution is μ ± qnorm(0.95) · σ. Here, μ = E(X), σ = 0.41833 and qnorm(0.95) = 1.644854.
Consequently, c = qnorm(0.95) · σ = 1.644854 · 0.41833 = 0.6880918
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to 3.5 and the standard deviation is equal to 0.41833. The probability of being more than 4 is equal to the
difference between one and the probability of being less than or equal to 4. Consequently, the approximate probability is
> 1 - pnorm(4,3.5,0.41833)
[1] 0.1159989
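The chain of computations for the Exponential example can be reproduced in a few lines. This is a sketch under the quiz's assumptions (expectation 3.5 months, 70 parts); the variable names are introduced here for illustration.

```r
mean_single <- 3.5   # claimed expectation of a single part (months)
n <- 70              # number of parts examined

# For the Exponential distribution the standard deviation equals the expectation.
se <- mean_single / sqrt(n)               # sd of the sample average

c_val     <- qnorm(0.95) * se             # half-width of the 90% central region
p_above_4 <- 1 - pnorm(4, mean_single, se)
print(c(se = se, c = c_val, p_above_4 = p_above_4))
```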
Consider the following study: We want to know the average amount of money first year college students spend at NDD College on accessories. We
randomly survey 22 first year students at the college. Four out of the 22 students spent $33, $110, $180, and $197, respectively. The target parameter
of this survey is:
Select one:
a. The average amount of money spent by the 22 first year college students that participated in the survey.
d. The average amount of money spent by all first year students of the college.
The correct answer is: The average amount of money spent by all first year students of the college.
Consider the following relative frequency table on hurricanes that have made direct hits on the U.S. between 1851 and 2004. Hurricanes are given a
strength category rating based on the minimum wind speed generated by the storm. (http://www.nhc.noaa.gov/gifs/table5.gif ) (ALTERNATE
DOWNLOAD LINK)
Answer:
The sample variance is equal to (the answer may be rounded up to 3 decimal places of the actual value):
Answer:
Distribution of Y
Value Probability
0 0.23
4 0.14
12
16 0.16
20 0.20
Complete the probabilities of the random variable Y in the above table. The expectation of Y is equal to
Answer:
Meiosis is the process in which a diploid cell that contains two copies of the genetic material produces a haploid cell with only one copy (sperms and
eggs). The resulting molecule of genetic material is a linear molecule that is composed of consecutive segments: a segment that originated from one of
the two copies followed by a segment from the other copy and vice versa. The border points between segments are called points of crossover. The
Haldane model for crossovers states that the number of crossovers between two loci on the genome has a Poisson(λ) distribution. Assume that the
expected number of crossovers between two loci is 0.66. The probability of obtaining at most 1 crossover between the two loci is (The answer may be
rounded up to 3 decimal places of the actual value.)
Answer:
The patient recovery time from a particular surgical procedure is Normally distributed with a mean of 9.3 days and a standard deviation of 3.5 days. The
probability of spending between 5 to 10 days in recovery is: (The answer may be rounded up to 3 decimal places of the actual value.)
Answer:
It is claimed that the expected length of time some computer part may work before requiring a reboot is 26 days. In order to examine this claim 80
identical parts are set to work. Assume that the distribution of the length of time the part can work (in days) is Exponential. The central region that
contains 70% of the distribution of the average is of the form E(X) ± c, where E(X) is the expectation of the sample average. The value of c is (You are
asked to apply the Normal approximation to the distribution of the average of the 80 parts that are examined. The answer may be rounded up to 3
decimal places of the actual value.)
Answer:
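No worked solutions are given for the last three questions above. One possible way to set them up in R, following the same methods used earlier in this review, is sketched below. The variable names are assumptions, and the code is offered as a sketch rather than as the official answer key.

```r
# Haldane model: number of crossovers is Poisson with expectation 0.66.
p_cross <- ppois(1, 0.66)                         # P(at most 1 crossover)

# Recovery time: Normal with mean 9.3 days and sd 3.5 days.
p_recov <- pnorm(10, 9.3, 3.5) - pnorm(5, 9.3, 3.5)

# Exponential parts: expectation 26 days, so sd is also 26; 80 parts averaged.
# A 70% central region leaves 15% in each tail, hence qnorm(0.85).
c_70 <- qnorm(0.85) * 26 / sqrt(80)

print(c(p_cross = p_cross, p_recov = p_recov, c_70 = c_70))
```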
Information
An automobile producing company operates car dealerships that sell its cars. In order to assess the service customers get at the dealerships a team
from the Customer Relationship Department was assembled. The team was sent to 20 randomly selected dealerships. Each dealership was visited once
by the team for an entire working day. During a visit the team interviewed all the customers that arrived at that dealership.
One of the questions that each of the customers was asked is: "Do you currently possess a car made by our company?" The answers were marked
down as "Yes", "No", or "Refuse to Answer", depending on the customer's response.
Select one:
The correct answer is: Customers that arrive at dealerships of the company.
Select one:
d. Customers that arrive at the selected dealerships during the day of visit.
The correct answer is: Customers that arrive at the selected dealerships during the day of visit.
Select one:
b. Customers that arrive at the selected dealerships during the day of visit.
The correct answer is: The percentage of customers that possess a car made by the company.
A statistic that may be used to summarize the outcome of the survey is:
Select one:
a. Chi-Square
b. Percentage
c. T-Test
d. Anova
Information
The Customer Service Center of a large bank receives calls from customers. The number of incoming calls between 8:00 AM and 8:10 AM on consecutive
days was recorded. The number of incoming calls during the working days of the month of September were:
The number of incoming calls during the working days of the month of February were:
9, 11, 14, 6, 4, 6, 7, 3, 3, 2, 5, 6, 6, 5, 6, 7, 5, 4, 3, 5.
Create two R objects, one by the name "Sep", and the other by the name "Feb". The first object should contain the first data and the second object
should contain the second data. Produce a frequency table (with the function "table") for each of the objects and a bar plot (with the combination of the
function "plot" and the function "table"). For comparison, bar plots like the one you should obtain are presented in Figure 1.6 and Figure 1.7.
In light of the description given above, and based on the tables and/or the plots, select the correct answer in each of the following 6 questions:
Select one:
a. A population.
b. A sample.
c. A parameter.
d. A statistic.
The average number of calls that arrived between 8:00 AM and 8:10 AM during the working days of all months is:
Select one:
a. A population.
b. A sample.
c. A parameter.
d. A statistic.
The average number of calls recorded in the object "Feb" is 5.85. The average in the object "Sep" is 8.5. The difference between these two numbers is:
Select one:
a. A population.
b. A sample.
c. A statistic.
In which of the months does the number of incoming calls between 8:00 AM and 8:10 AM tend to be smaller?
Select one:
a. September.
b. February.
The height of the fourth bar from the left of the bar plot for the month of February represents the fact that:
Select one:
The correct answer is: In 4 of the days of the month of February there were 5 incoming calls.
The location of the highest bar of the bar plot for the month of September represents the fact that?
Select one:
The correct answer is: 10 incoming calls came during 6 days of the month of September.
Information
The next two questions refer to the following relative frequency table on hurricanes that have made direct hits on the U.S. between 1851 and 2004.
Hurricanes are given a strength category rating based on the minimum wind speed generated by the storm. (http://www.nhc.noaa.gov/gifs/table5.gif )
(ALTERNATE DOWNLOAD LINK)
What is the relative frequency of direct hits that were category 2 hurricanes?
Select one:
a. 0.2637
b. 0.7363
c. 0.2601
The total of all relative frequencies is 1.000. Denote by p the relative frequency of category 2 hurricanes. Observe that
0.3993 + p + 0.2601 + 0.0659 + 0.0110 = 1.0000
Consequently,
p = 1 - (0.3993 + 0.2601 + 0.0659 + 0.0110) = 1 - 0.7363 = 0.2637
The relative frequency of direct hits that were AT LEAST a category 3 storm is . (The numerical answer that you provide should be of the
The relative frequency of category 3 or more is the sum of the relative frequencies of categories 3, 4 , and 5:
0.2601 + 0.0659 + 0.0110 = 0.3370
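Both hurricane computations can be verified with a short script. The labels cat1, cat3, cat4 and cat5 are assumptions that follow the order in which the values appear in the equation above; the original table is not reproduced in this review.

```r
# Known relative frequencies, in the order they appear in the equation above.
known <- c(cat1 = 0.3993, cat3 = 0.2601, cat4 = 0.0659, cat5 = 0.0110)

# Relative frequencies sum to 1, so category 2 takes what is left.
p_cat2 <- 1 - sum(known)

# "At least category 3" adds up categories 3, 4 and 5.
at_least_3 <- sum(known[c("cat3", "cat4", "cat5")])

print(c(p_cat2 = p_cat2, at_least_3 = at_least_3))
```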
Sixty adults with gum disease were asked the number of times per week they used to floss before their diagnoses. The (incomplete) results are shown
below:
Answer:
It is given that the total number of adults with gum disease is 60. There are 3 such adults that flossed 6 times per week. Therefore, the relative frequency is 3/60 =
0.0500
Answer:
The relative frequency of adults that do not floss at all is 0.4500. All other adults floss at least once a week. Their relative frequency is 1.0000 - 0.4500 = 0.5500.
Information
The number of malfunctioning products per production series was recorded for several production series. The data was entered into an R object by the
name "malfunction". The next 3 questions refer to the following R code:
Select one:
a. 8
b. 9
c. 72
The object "freq" contains the table of frequency of the production series, divided according to the number of malfunctioning products that they had. The cumulative
frequency of all the production series that had 8 malfunctioning products or less, which includes all production series, is reported under the number "8" in the output
of the expression "cumsum(freq)". This number is 72.
The frequency of production series where there are 4 malfunctioning products is:
Select one:
a. 57
b. 9
c. 16
The cumulative frequency of production series that have 4 malfunctioning products or less is 57. The cumulative frequency of production series that have 3
malfunctioning products or less is 41. The frequency of production series that have exactly 4 malfunctioning products is the difference between these two numbers:
57 - 41 = 16.
The frequency of production series where there are less than 7 malfunctioning products is:
Select one:
a. 70
b. 71
c. 72
Having fewer than 7 malfunctioning products corresponds to having 6 malfunctioning products or less. The cumulative frequency of production series with 6
malfunctioning products or less is 70.
Information
The file "ex2.csv" contains information on the blood pressure of a group of healthy individuals. The file is located in
http://pluto.huji.ac.il/~msby/StatThink/Datasets/ex2.csv. (ALTERNATE DOWNLOAD LINK) Read the data into R and answer the following 4 questions:
Answer:
After saving the file "ex2.csv" in the working directory one can use the code
> ex2 <- read.csv("ex2.csv")
in order to read the file into a data frame by the name "ex2". Writing the content of the object to the screen will produce:
> ex2
...
Select one:
a. numeric
b. factor
All the values are numbers. Technically, R treats this variable as a numeric sequence. However, one would typically not use this variable for statistical inference.
Usually, it serves as a key, a unique identifier, in data set management.
Select one:
a. numeric
b. factor
Select one:
a. numeric
b. factor
Information
A study was done to determine the age, type of activity, number of times per week and the duration (amount of time) of resident use of a local park in
San Jose. The first house in the neighborhood around the park was selected randomly and then every 8th house in the neighborhood around the park
was interviewed. Answer the following 2 questions:
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Information
Identify the type of data that would be used to describe a response for each of the items below:
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
Select one:
a. qualitative/factor
b. quantitative/numeric
In Figure A you will find box plots for three sets of data. In Figure B are the histograms for the same sets of data, but in a different order. Associate each
box plot with its relative histogram.
Figure A:
Figure B:
Observe the range of the distribution of each data set: [10,40] in Histogram a, [-40,30] in Histogram b and [14,28] in Histogram c. (You may want to enlarge the plot. That can be
done on many browsers with Control plus the "+" key. Or you may download the figure and open it with a graphical application.)
The correct answer is: Box plot 1 → Histogram b, Box plot 2 → Histogram a, Box plot 3 → Histogram c
Consider the box plots in Figure A. Which of the data has a smaller inter-quartile range (IQR)?
Figure A
Select one:
a. Box plot 1
b. Box plot 2
c. Box plot 3
The height of the central box in Box plot 3 is the least of the three.
Information
Select one:
a. 9.1
b. 13.3
c. 14.8
Run the code:
> x <- c(11.9,11.0,12.4,16.9,16.3,13.3,9.1,17.0,11.0,9.3,25.3,17.4,17.4)
> median(x)
[1] 13.3
Select one:
a. True
b. False
Run the code:
> x <- c(11.9,11.0,12.4,16.9,16.3,13.3,9.1,17.0,11.0,9.3,25.3,17.4,17.4)
> boxplot(x)
Observe, in the box plot that is created, that there are no outliers.
Select one:
a. True
b. False
Run the code:
> x <- c(11.9,11.0,12.4,16.9,16.3,13.3,9.1,17.0,11.0,9.3,25.3,17.4,17.4)
> boxplot(x)
Observe, in the box plot that is created, that there are no outliers.
Create an R data frame with the name "ex.2" that contains the data in the file "ex2.csv" (Select the file name to download it).
Compute the standard deviation of each of the numeric variables. Among the following, the variable with the largest standard deviation is:
Select one:
a. age
b. bmi
c. systolic
d. diastolic
> sd(ex.2$age)
[1] 3.805571
> sd(ex.2$bmi)
[1] 3.881489
> sd(ex.2$systolic)
[1] 11.27262
> sd(ex.2$diastolic)
[1] 11.56522
Twenty-one randomly selected students were asked the number of pairs of sneakers they owned. The number of pairs of sneakers owned by each
student was recorded in an R object by the name "x". The frequency table of the data "x" is:
> table(x)
x
1 2 3 4 5 6
4 7 3 3 2 2
Answer:
Run the code:
> x.val <- c(1,2,3,4,5,6)
> freq <- c(4,7,3,3,2,2)
> rel.freq <- freq/sum(freq)
> x.bar <- sum(x.val*rel.freq)
> x.bar
[1] 2.904762
Answer:
Answer:
Observe that more than 25% of the distribution has accumulated at value "2" but less than that at value "1".
The median is
Answer:
Observe that more than 50% of the distribution has accumulated at value "2" but less than that at value "1".
Answer:
Observe that more than 75% of the distribution has accumulated at value "4" but less than that at value "3".
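The accumulation argument used for the quartiles and the median can be made explicit with cumulative relative frequencies. A minimal sketch follows; the helper name quant_from_table is an assumption, and the rule it implements (the first value at which the cumulative relative frequency reaches the target) is the one described in the explanations above.

```r
x.val <- c(1, 2, 3, 4, 5, 6)
freq  <- c(4, 7, 3, 3, 2, 2)

# Cumulative relative frequency at each value.
cum.rel <- cumsum(freq) / sum(freq)
print(round(cum.rel, 3))

# First value at which the accumulated share reaches `prob`.
# quant_from_table is a hypothetical helper name.
quant_from_table <- function(prob) x.val[which(cum.rel >= prob)[1]]

q1  <- quant_from_table(0.25)
med <- quant_from_table(0.50)
q3  <- quant_from_table(0.75)
print(c(Q1 = q1, median = med, Q3 = q3))

# For comparison, the same quantities from the expanded data.
print(quantile(rep(x.val, freq), c(0.25, 0.50, 0.75)))
```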
Answer:
Observe that the frequency of the values "1" and "2" is more than the frequency of the values "5" and "6".
The relative frequency of the students that owned more than one but less than 5 sneakers is
Answer:
Information
Following are the possible weights (in pounds) of some football team members.
232, 251, 257, 268, 238, 222, 265, 263, 252, 246, 253, 248, 256, 248, 230, 219, 224, 267, 259, 254, 254, 261, 248, 221, 252, 269, 269, 273, 273, 259, 251,
222, 248, 224
Answer:
Answer:
Answer:
The median is
Answer:
Answer:
The USC quarterback Matt Barkley weighed 220 pounds in the spring of 2010. How many standard deviations above or below the mean was he in
comparison to the data given above? (Give the answer in the format x.xxx, without the plus/minus sign)
Answer:
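One way to set up this computation in R is sketched below. It assumes that "standard deviation" refers to the sample standard deviation returned by the function "sd"; the object names weights and z are introduced here for illustration.

```r
# Weights (in pounds) of the football team members listed above.
weights <- c(232, 251, 257, 268, 238, 222, 265, 263, 252, 246, 253, 248,
             256, 248, 230, 219, 224, 267, 259, 254, 254, 261, 248, 221,
             252, 269, 269, 273, 273, 259, 251, 222, 248, 224)

# Signed distance of 220 pounds from the mean, in units of standard deviations.
z <- (220 - mean(weights)) / sd(weights)

# The question asks for the magnitude without the plus/minus sign.
print(round(abs(z), 3))
```

The sign of z indicates whether the value lies above or below the mean.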
The following frequency table shows the lengths of 42 international phone calls using a $5 prepaid calling card. The data was stored in an object by the
name "x":
x
 4 14 24 34 44 54
 2  6 13 13  6  2
Using the data, and without computing the mean and the median, determine which ONE of the answers is correct:
Select one:
Observe that the distribution is symmetric. The values are equally spaced and the frequencies are symmetrically distributed around the center.
The correct answer is: The mean and the median are equal.
Consider the following data set: 4, 6, 6, 12, 18, 18, 18, 200. What value is (approximately) 0.75 standard deviations below the mean?
Select one:
b. Approximately -15
c. Approximately 4
d. Approximately 34.5
Consider the code:
> x <- c(4, 6, 6, 12, 18, 18, 18, 200)
> mean(x) - 0.75*sd(x)
[1] -14.87231
In Chapter 3 the data frame "ex.2", which contained information associated to the blood pressure of a sample of 150 men and women, was introduced.
Let us assume that this sample was taken from an imaginary population of size 100,000 and let the information for all the members of this population
be stored in a CSV file by the name "pop2.csv".
Read the content of the population file into a data frame under the name "pop.2". Applying the function "summary" to this data frame produces:
> summary(pop.2)
group
HIGH :28126
LOW : 4215
NORMAL:67659
The variables in this data frame are the same variables that were included in the data frame "ex.2" of Chapter 3. The variables that are
included in this data frame are:
id:
A numerical variable. A 7-digit number that serves as a unique identifier of the subject.
sex:
A factor variable. The sex of each subject. The values are either "MALE" or "FEMALE".
age:
A numerical variable. The age of each subject.
bmi:
A numerical variable. The body mass index of each subject.
systolic:
A numerical variable. The systolic blood pressure of each subject.
diastolic:
A numerical variable. The diastolic blood pressure of each subject.
group:
A factor variable. The blood pressure category of each subject. The value is "NORMAL" if both the systolic blood pressure is within its normal range
(between 90 and 139) and the diastolic blood pressure is within its normal range (between 60 and 89). The value is "HIGH" if either measurement of
blood pressure is above its normal upper limit and it is "LOW" if either measurement is below its normal lower limit.
The next 6 questions correspond to a person that is sampled at random from this population. The answers below may be rounded up to two decimal
places.
Answer:
Run the code:
> pop.2 <- read.csv("pop2.csv")
> median(pop.2$age)
[1] 35
Answer:
Run the code:
> pop.2 <- read.csv("pop2.csv")
> var(pop.2$diastolic)
[1] 171.6469
The standard deviation of the difference between the systolic and diastolic blood pressures was: (Hint: Observe that the difference
" pop.2$systolic - pop.2$diastolic " produces the difference between the two types of blood pressure for all the members in the population.)
Run the code:
> pop.2 <- read.csv("pop2.csv")
> sd(pop.2$systolic - pop.2$diastolic)
[1] 3.950757
Answer:
The probability that someone sampled from this data will have normal blood pressure is:
Answer:
Run the code:
> pop.2 <- read.csv("pop2.csv")
> summary(pop.2$group)
  HIGH    LOW NORMAL
 28126   4215  67659
There are 67,659 individuals that are classified as "NORMAL" among the total population of 100,000. Hence, the probability is 0.67659.
Mark the following statement as either TRUE or FALSE: The standard deviation of the difference between the systolic and diastolic blood pressures is
equal to the difference between the standard deviation of systolic blood pressure and the standard deviation of diastolic blood pressure.
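A small numerical check can help with statements of this kind. The sketch below uses four made-up readings (the vectors `systolic` and `diastolic` are illustrative values, not taken from "pop2.csv") and compares the standard deviation of the differences with the difference of the standard deviations.

```r
# Made-up blood pressure readings for four subjects (illustration only)
systolic  <- c(120, 130, 140, 150)
diastolic <- c(80, 85, 70, 95)

sd_of_diff <- sd(systolic - diastolic)       # sd of the differences
diff_of_sd <- sd(systolic) - sd(diastolic)   # difference of the sds

# The variance of a difference also involves the covariance
var_identity <- var(systolic) + var(diastolic) - 2 * cov(systolic, diastolic)

c(sd_of_diff = sd_of_diff, diff_of_sd = diff_of_sd)
all.equal(var(systolic - diastolic), var_identity)
```

For these values the two quantities differ, which is enough to show that the two operations cannot be interchanged in general.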
The Distribution of Y
Value Probability
1.5 0.15
4
5.5 0.10
6 0.23
7.5 0.11
10 0.05
Complete the probabilities of the random variable Y in the above table and compute:
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. The event in question involves the values 1.5 and 4. Hence P(Y < 5) = 0.15 + 0.36 = 0.51.
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. The event in question involves the values 1.5, 5.5, and 7.5. Hence P(not an integer) =
0.15 + 0.10 + 0.11 = 0.36.
E(Y) equals
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. Run the code:
> Y.val <- c(1.5,4,5.5,6,7.5,10)
> P.val <- c(0.15,0.36,0.1,0.23,0.11,0.05)
> sum(Y.val*P.val)
[1] 4.92
Answer:
The missing probability is equal to 1 - (0.15 + 0.1 + 0.23 + 0.11 + 0.05) = 0.36. Run the code:
> Y.val <- c(1.5,4,5.5,6,7.5,10)
> P.val <- c(0.15,0.36,0.1,0.23,0.11,0.05)
> E <- sum(Y.val*P.val)
> Var <- sum((Y.val-E)^2*P.val)
> sqrt(Var)
[1] 2.093705
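The expectation, variance and standard deviation above follow one pattern, so they can be wrapped in a small function. The following is only a sketch: the name `discrete_summary` is hypothetical and is not part of R or of the course material.

```r
# Hypothetical helper: summary of a discrete distribution given as values and probabilities
discrete_summary <- function(values, probs) {
  stopifnot(length(values) == length(probs),
            isTRUE(all.equal(sum(probs), 1)))   # probabilities must sum to 1
  e <- sum(values * probs)                      # expectation
  v <- sum((values - e)^2 * probs)              # variance
  c(expectation = e, variance = v, sd = sqrt(v))
}

Y.val <- c(1.5, 4, 5.5, 6, 7.5, 10)
P.val <- c(0.15, 0.36, 0.1, 0.23, 0.11, 0.05)
res <- discrete_summary(Y.val, P.val)
res
```

The check on the sum of the probabilities catches a table in which the missing probability was not filled in correctly.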
One selects an integer between 1 and 9 (including 1 and 9) at random. Let X be a random variable that obtains as a value the integer that was selected.
The following questions correspond to this random variable.
The Distribution of X
Value Probability
Complete table of distribution of X and compute:
E(X) is equal to
Answer:
The values of the random variable are the integers between 1 and 9 and the probability of each value is 1/9. In order to compute the expectation run the code:
> X.val <- c(1,2,3,4,5,6,7,8,9)
> P.val <- c(1,1,1,1,1,1,1,1,1)/9
> sum(X.val*P.val)
[1] 5
Answer:
The values of the random variable are the integers between 1 and 9 and the probability of each value is 1/9. In order to compute the variance run the code:
> X.val <- c(1,2,3,4,5,6,7,8,9)
> P.val <- c(1,1,1,1,1,1,1,1,1)/9
> E <- sum(X.val*P.val)
> sum((X.val-E)^2*P.val)
[1] 6.666667
Answer:
The values of the random variable are the integers between 1 and 9 and the probability of each value is 1/9. In order to compute the standard deviation run the code:
> X.val <- c(1,2,3,4,5,6,7,8,9)
> P.val <- c(1,1,1,1,1,1,1,1,1)/9
> E <- sum(X.val*P.val)
> Var <- sum((X.val-E)^2*P.val)
> sqrt(Var)
[1] 2.581989
Information
Suppose that you are offered the following "deal." An impartial person selects an integer between 1 and 9 at random and you try to guess beforehand
which number will be selected. If you guess wrong then you pay $1 and if you guess right then you win $10. Call the outcome of the game your "gain."
(Note that if you pay money then your gain is negative.)
Select one:
d. Unknown.
You select a number. The probability that the specific number that you have selected will turn out is 1/9. The probability that you miss is 8/9.
Answer:
You select a number. The probability that the specific number that you have selected will turn out is 1/9. The probability that you miss is 8/9. Let X be the gain from the
game. The gain is 10 if you win (with probability 1/9) and -1 if you lose (with probability 8/9). The expectation is E(X) = 10/9 + (-1)*8/9 = 2/9 = 0.2222.
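The expected gain can also be checked in R, both exactly and by simulation. This is a sketch only: the seed, the number of simulated rounds, and the choice to always guess the number 5 are arbitrary assumptions made for the illustration.

```r
# Exact expected gain: win $10 with probability 1/9, lose $1 with probability 8/9
gain <- c(10, -1)
prob <- c(1/9, 8/9)
expected_gain <- sum(gain * prob)

# Simulation check (seed and number of rounds are arbitrary choices)
set.seed(1)
picked <- sample(1:9, 100000, replace = TRUE)   # number selected in each round
sim_gain <- ifelse(picked == 5, 10, -1)         # suppose we always guess 5
c(exact = expected_gain, simulated = mean(sim_gain))
```

The simulated average should be close to 2/9, but it will not match it exactly because it is based on a finite number of rounds.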
Information
Approximately 85% of statistics students do their homework in time for it to be collected and graded. Let X be the number of students that submit their
homework in time out of a statistics class of 70 students. The following 4 questions refer to this X. (The answer may be rounded up to 3 decimal places
of the actual value.)
Select one:
The possible outcome of X, the number of students out of 70 that submit their homework, is an integer and the range of values starts at 0 (no one submits) and ends
in 70 (all submit).
The correct answer is: The integers between (and including) 0 and 70.
The probability that less than 60 of the 70 students will do their homework on time is:
Answer:
Less than 60 means 59 or less. The probability P(X ≤ 59) can be computed with the code:
> pbinom(59,70,0.85)
[1] 0.4842268
Answer:
Answer:
The variance of X is n * p * (1-p) = 70 * 0.85 * 0.15 = 8.925. The standard deviation is the square root of the variance, namely 2.987474. The correct answer is:
2.987474.
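The formulas n * p and n * p * (1-p) can be cross-checked against the definition of expectation and variance by summing over all 71 possible values of X. A minimal sketch (the object names are chosen here for illustration):

```r
n <- 70
p <- 0.85
x <- 0:n                      # all possible values of X
px <- dbinom(x, n, p)         # their Binomial probabilities

e_x <- sum(x * px)                      # should agree with n * p
sd_x <- sqrt(sum((x - e_x)^2 * px))     # should agree with sqrt(n * p * (1 - p))
c(expectation = e_x, sd = sd_x)
```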
Information
In Chapter 1 it was claimed that when tossing a fair coin 4 times it is quite likely to not obtain 2 heads and 2 tails. However, when tossing a fair coin
4,000 times one should expect to obtain a number of tails in the range between 1940 and 2060. Let us compare the situation for 2 versus 2,000 coins.
Let X be the number of heads when tossing a fair coin 2 times and let Y be the number of heads when tossing a fair coin 2,000 times. (The answer may
be rounded up to 3 decimal places of the actual value).
P( X = 1) is equal to
Answer:
The distribution of X is Binomial(2,0.5). The probability P(X = 1) can be computed with the code:
> dbinom(1,2,0.5)
[1] 0.5
Answer:
If P(X = 1) = 0.5 then the complementary probability (X is not equal to 1) is equal to 1 - P(X = 1) = 1 - 0.5 = 0.5.
Answer:
If P(Y = 1000) = 0.01783901 then the complementary probability (Y is not equal to 1000) is equal to 1 - P(Y = 1000) = 1 - 0.01783901 = 0.982161.
Answer:
The distribution of Y is Binomial(2000,0.5). The probability P(Y = 1000) can be computed with the code:
> dbinom(1000,2000,0.5)
[1] 0.01783901
Answer:
The probability P(940 ≤ Y ≤ 1,060) is equal to the difference between P(Y ≤ 1,060) and the probability P(Y < 940). The latter probability is equal to P(Y ≤ 939). This
difference can be computed with the code:
> pbinom(1060,2000,0.5) - pbinom(939,2000,0.5)
[1] 0.9931974
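Using 939 rather than 940 in the second term is the step that is easiest to get wrong. One way to confirm it, sketched below, is to add up the probabilities of the individual values 940, ..., 1060 and compare the result with the difference of the two cumulative probabilities.

```r
n <- 2000
p <- 0.5

# Difference of two cumulative probabilities, as in the answer above
by_cdf <- pbinom(1060, n, p) - pbinom(939, n, p)

# Direct sum of P(Y = k) for k = 940, ..., 1060
by_sum <- sum(dbinom(940:1060, n, p))

all.equal(by_cdf, by_sum)
```

Replacing 939 by 940 in `by_cdf` would drop the term P(Y = 940) and the two results would no longer agree.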
E(X) is equal to
Answer:
Answer:
The variance of X is equal to n * p * (1-p) = 2 * 0.5 * 0.5 = 0.5. The standard deviation is the square root of 0.5, which is equal to 0.7071068
E(Y) is equal to
Answer:
Answer:
The variance of Y is equal to n * p * (1-p) = 2000 * 0.5 * 0.5 = 500. The standard deviation is the square root of 500, which is equal to 22.36068
Meiosis is the process in which a diploid cell that contains two copies of the genetic material produces a haploid cell with only one copy (sperms and
eggs). The resulting molecule of genetic material is a linear molecule that is composed of consecutive segments: a segment that originated from one of
the two copies followed by a segment from the other copy and vice versa. The border points between segments are called points of crossover. The
Haldane model for crossovers states that the number of crossovers between two loci on the genome has a Poisson(λ) distribution. Assume that the
expected number of crossovers between two loci in a fixed period of time is 2.25. The next 3 questions refer to this model for crossovers. (The answer
may be rounded up to 3 decimal places of the actual value.)
Answer:
The number of crossovers has a Poisson distribution with parameter λ = 2.25. The probability of exactly 4 crossovers can be computed with the code:
> dpois(4,2.25)
[1] 0.1125528
Answer:
The number of crossovers has a Poisson distribution with parameter λ = 2.25. The probability of at least 4 crossovers can be computed as the difference between 1
and the probability of 3 or fewer crossovers. The computation can be conducted with the code:
> 1 - ppois(3,2.25)
[1] 0.1905669
A recombination between two loci occurs if the number of crossovers is odd. The probability of recombination between the two loci is, approximately,
equal to (Compute the probability of recombination approximately using the function "dpois". Ignore odd values larger than 9.)
Values (not larger than 9) that lead to recombination are 1, 3, 5, 7, and 9. The probability of these values for the Poisson distribution with parameter λ = 2.25
can be computed with the code:
> sum(dpois(c(1,3,5,7,9),2.25))
[1] 0.4944251
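To see how little is lost by ignoring odd values above 9, the truncated sum can be compared with a closed form. The closed form is not part of the quiz: it is the standard result that, for a Poisson(λ) count, the odd terms of the exponential series add up to P(odd) = (1 - exp(-2λ))/2. A short sketch:

```r
lambda <- 2.25

# Truncated sum used in the answer above (odd values up to 9)
approx_prob <- sum(dpois(c(1, 3, 5, 7, 9), lambda))

# Closed form for P(N is odd) when N ~ Poisson(lambda)
exact_prob <- (1 - exp(-2 * lambda)) / 2

c(truncated = approx_prob, closed_form = exact_prob)
```

The truncated sum is slightly smaller than the closed form, because the omitted odd values 11, 13, ... carry a small positive probability.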
Information
The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between 0 and 17 minutes, inclusive. The next 3 questions
refer to this waiting time. (The answer may be rounded up to 3 decimal places of the actual value.)
Answer:
Let X be the length of time the person waits. The distribution of X is Uniform(0,17). The probability P(X ≤ 12.5) can be computed with the code:
> punif(12.5,0,17)
[1] 0.7352941
Answer:
Let X be the length of time the person waits. The distribution of X is Uniform(0,17). The expectation of X is equal to (a+b)/2 = (0+17)/2 = 8.5.
Answer:
Let X be the length of time the person waits. The distribution of X is Uniform(0,17). The variance of X is equal to (b-a)^2/12 = (17-0)^2/12 = 24.08333. The standard
deviation is the square root of the variance and is equal to 4.907477.
Information
Let X be the amount of time (in minutes) a postal clerk spends with his/her customer. Assume that X has an Exponential(λ) distribution and that E(X) = 7
minutes. The next 3 questions refer to this waiting time. (The answer may be rounded up to 3 decimal places of the actual value.)
Answer:
The expectation in the Exponential distribution is the reciprocal of the parameter λ. Consequently, the parameter λ is equal to the inverse of the expectation: λ = 1/E(X).
The expectation E(X) = 7, hence λ = 1/7 = 0.1428571.
The probability that a clerk spends between four and five minutes with a randomly selected customer is
Answer:
Let the distribution of X be Exponential(1/7). The probability P(4 ≤ X ≤ 5) is equal to the difference between P(X ≤ 5) and the probability P(X < 4). The latter probability is
equal to P(X ≤ 4), since the distribution is continuous. This difference can be computed with the code:
> pexp(5,1/7)-pexp(4,1/7)
[1] 0.07517646
The probability that a clerk spends more than 10 minutes with a customer is
Answer:
Let the distribution of X be Exponential(1/7). The probability P(10 < X) is equal to the difference between 1 and the probability P(X ≤ 10). This difference can be
computed with the code:
> 1-pexp(10,1/7)
[1] 0.2396510
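Both Exponential probabilities above can be verified by hand, since the Exponential distribution has the closed-form tail probability P(X > t) = exp(-λt). The sketch below defines a small function for it (the name `surv` is chosen here for illustration) and compares it with "pexp".

```r
rate <- 1 / 7   # lambda = 1 / E(X)

# Tail probability of the Exponential distribution: P(X > t) = exp(-rate * t)
surv <- function(t) exp(-rate * t)

p_more_than_10 <- surv(10)
p_between_4_5 <- surv(4) - surv(5)

all.equal(p_more_than_10, 1 - pexp(10, rate))
all.equal(p_between_4_5, pexp(5, rate) - pexp(4, rate))
```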
According to some study, the height for Northern European adult males is normally distributed with an average of 181 centimeters and a standard
deviation of 7.3 centimeters. Suppose such an adult male is randomly chosen. Let X be the height of that person. The next 3 questions correspond to this
information. The answer may be rounded up to 3 decimal places of the actual value.
The probability that the person is between 160 and 170 centimeters is
Answer:
The distribution of X is Normal(181,(7.3)^2). The probability of the interval is equal to the difference between P(X ≤ 170) and P(X ≤ 160):
> pnorm(170,181,7.3) - pnorm(160,181,7.3)
[1] 0.06391543
Answer:
The distribution of X is Normal(181,(7.3)^2). The probability of being larger than 190 is equal to the difference between 1 and P(X ≤ 190):
> 1 - pnorm(190,181,7.3)
[1] 0.1088109
Select one:
a. (171.6, 190.4)
b. (173.4, 188.6)
c. (174.9, 187.1)
d. (176.1, 185.7)
The distribution of X is Normal(181,(7.3)^2). The central region that contains 60% of the distribution is the region between the 0.2-percentile and the 0.8-percentile:
> qnorm(0.2,181,7.3)
[1] 174.8562
> qnorm(0.8,181,7.3)
[1] 187.1438
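The same steps work for any central region: cut an equal share of the remaining probability from each tail and read off the two percentiles. A sketch for the 60% region (the names `coverage`, `tail_prob` and `bounds` are introduced here for illustration):

```r
mu <- 181
sig <- 7.3
coverage <- 0.60

# Cut off (1 - coverage) / 2 of the probability in each tail
tail_prob <- (1 - coverage) / 2
bounds <- qnorm(c(tail_prob, 1 - tail_prob), mu, sig)
bounds

# Check: the probability between the bounds should equal the coverage
pnorm(bounds[2], mu, sig) - pnorm(bounds[1], mu, sig)
```

Changing `coverage` to 0.90 gives the 0.05- and 0.95-percentiles in the same way.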
Information
Terri Vogel, an amateur motorcycle racer, averages 129.71 seconds per 2.5 mile lap (in a 7 lap race) with a standard deviation of 2.28 seconds. Her
lap times are normally distributed. We are interested in one of her randomly selected laps. The next 4 questions correspond to this
information. The answer may be rounded up to 3 decimal places of the actual value:
The proportion (a number between 0 and 1) of her laps that are completed in less than 125 seconds is
Answer:
The distribution of a random lap is Normal(129.71,(2.28)^2). The probability of being less than or equal to 125 is
> pnorm(125,129.71,2.28)
[1] 0.01942418
The fastest 10% of her laps are completed under how many seconds?
Answer:
The distribution of a random lap is Normal(129.71,(2.28)^2). The fastest 10% of the laps are completed under the 0.1-percentile of the given distribution. This percentile
is
> qnorm(0.1,129.71,2.28)
[1] 126.7881
The middle 90% of her laps are from a seconds to b seconds. Then a is equal to
Answer:
The distribution of a random lap is Normal(129.71,(2.28)^2). The central region that contains 90% is [a,b], where a is equal to the 0.05-percentile of the distribution:
> qnorm(0.05,129.71,2.28)
[1] 125.9597
The middle 90% of her laps are from a seconds to b seconds. Then b is equal to
Answer:
The distribution of a random lap is Normal(129.71,(2.28)^2). The central region that contains 90% is [a,b], where b is equal to the 0.95-percentile of the distribution:
> qnorm(0.95,129.71,2.28)
[1] 133.4603
Information
Suppose that Ricardo and Anita attend the same college. Ricardo's GPA is better than 30% of his school mates but worse than the other 70%. Anita's
GPA is 0.60 standard deviations below her school average. All the students that were at least one standard deviation above the mean obtained an "A"
or an "A+" score, which corresponded to about 16% of the students. Assume GPA are Normally distributed. For each of the following sentences mark
one of the options:
Select one:
a. Always True
c. Always False
Ricardo's GPA corresponds to the 0.3-percentile of the students. This score is 0.5244005 standard deviations below the average since, if we consider standardized
z-scores, we get
> qnorm(0.3)
[1] -0.5244005
According to the information in the question, Anita's GPA is 0.6 standard deviations below the average. Consequently,
her z-score is -0.6, which is lower than Ricardo's.
Select one:
a. Always True
c. Always False
Ricardo's GPA corresponds to the 0.3-percentile of the students. This score is 0.5244005 standard deviations below the average since, if we consider standardized
z-scores, we get
> qnorm(0.3)
[1] -0.5244005
The resulting z-score is negative.
Select one:
a. Always True
c. Always False
There is no information in the question that tells us what the standard deviation of the GPA is. The number 0.16 in the question refers to a probability, not to the
standard deviation.
Information
Some measurement on a population has a Normal distribution with expectation of 1000 and standard deviation of 150. We call the difference
between the 0.75-percentile and the 0.25-percentile the Interquartile Range. Identify a measurement value as an outlier if it is larger than the
0.75-percentile plus 1.5 times the interquartile range or it is smaller than the 0.25-percentile minus 1.5 times the interquartile range. The next 5
questions correspond to this information. The answer may be rounded up to 3 decimal places of the actual value:
Answer:
The distribution of the measurement is Normal(1000,(150)^2). The 0.25-percentile of the measurement is
> qnorm(0.25,1000,150)
[1] 898.8265
Answer:
The distribution of the measurement is Normal(1000,(150)^2). The 0.75-percentile of the measurement is
> qnorm(0.75,1000,150)
[1] 1101.173
Answer:
The inter-quartile range is the difference between the 0.75-percentile (Q3) and the 0.25-percentile (Q1). That is, 1101.173 - 898.8265 = 202.3465.
Answer:
The upper threshold for the identification of an outlier is Q3 + 1.5*(Q3 - Q1). The probability of being above this threshold is equal to the probability of being below the
lower threshold. The probability of identifying an outlier is therefore twice the probability of being above the upper threshold:
> u <- 1101.173 + 1.5*202.3465
> 2*(1-pnorm(u,1000,150))
[1] 0.006976757
This is the same probability that we found in Subsection 6.2.4.
Denote the probability of an outlier that was computed in the previous question by p. Consider a different Normal measurement that has a different
mean and a different standard deviation than the measurement that was considered before. In the context of the current measurement, the probability
of a random measurement to be identified as an outlier is:
Select one:
If the mean or the standard deviation changes then the quartiles and the inter-quartile range change accordingly. In terms of the z-scores, the computation reduces to the
computation that was presented in Subsection 6.2.4.
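The claim that the outlier probability does not depend on the mean or the standard deviation can be checked numerically. The function below is a sketch written for this purpose (the name `outlier_prob` is hypothetical); it recomputes the quartiles and thresholds for whatever Normal distribution it is given.

```r
# Hypothetical helper: probability that a Normal(mu, sig^2) measurement is flagged as an outlier
outlier_prob <- function(mu, sig) {
  q1 <- qnorm(0.25, mu, sig)
  q3 <- qnorm(0.75, mu, sig)
  iqr <- q3 - q1
  upper <- q3 + 1.5 * iqr
  lower <- q1 - 1.5 * iqr
  pnorm(lower, mu, sig) + (1 - pnorm(upper, mu, sig))
}

outlier_prob(1000, 150)   # the measurement in this question
outlier_prob(0, 1)        # standard Normal
outlier_prob(5.3, 2.1)    # another mean and standard deviation
```

All three calls should return the same value, about 0.007. Small differences from the figure printed above can arise because that computation used rounded quartiles.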
Information
The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days. The
next 4 questions correspond to this information.
What is the inter-quartile range of the recovery time? (Choose the closest possibility.)
Select one:
a. 2.8
b. 5.3
c. 7.4
d. 2.1
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The inter-quartile range is the difference between the 0.75-percentile (Q3) and the
0.25-percentile (Q1).
> qnorm(0.75,5.3,2.1) - qnorm(0.25,5.3,2.1)
[1] 2.832857
What is the z-score for a patient who takes 6 days to recover? (Choose the closest possibility.)
Select one:
a. 1.5
b. 0.3
c. 2.2
d. 7.3
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The z-score of the value 6 is equal to (6 - 5.3)/2.1 = 0.3333333.
What is the probability of spending less than 2 days in recovery? (Choose the closest possibility.)
Select one:
a. 0.0580
b. 0.8447
c. 0.0553
d. 0.9420
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The probability of being below 2 is
> pnorm(2,5.3,2.1)
[1] 0.05804157
Select one:
a. 8.89
b. 7.07
c. 7.99
d. 4.32
The distribution is Normal with expectation of 5.3 and standard deviation of 2.1. The 32nd percentile for recovery times is
> qnorm(0.32,5.3,2.1)
[1] 4.317833
Information
Let the distribution of X be Binomial(150,0.8). The next 4 questions correspond to this information. The answer may be rounded up to 3 decimal places
of the actual value:
Answer:
The probability P(121 < X ≤ 129) is equal to the difference P(X ≤ 129) - P(X ≤ 121). The difference is equal to
> pbinom(129,150,0.8) - pbinom(121,150,0.8)
[1] 0.3645757
The Normal approximation (without continuity correction) of the probability P(121 < X ≤ 129) is equal to
Answer:
The probability P(121 < X ≤ 129) is equal to the difference P(X ≤ 129) - P(X ≤ 121). In the Normal approximation we substitute the Binomial distribution with the Normal
distribution that has the same expectation and variance. The expectation of the Binomial is 150*0.8 and the variance is 150*0.8*0.2. The approximation produces
> mu <- 150*0.8
> sig <- sqrt(150*0.8*0.2)
> pnorm(129,mu,sig) - pnorm(121,mu,sig)
[1] 0.3860320
The Normal approximation (with continuity correction) of the probability P(121 < X ≤ 129) is equal to
Answer:
The probability P(121 < X ≤ 129) involves the values {122, 123, ..., 129}. In the Normal approximation we substitute the Binomial distribution with the Normal
distribution that has the same expectation and variance. The expectation of the Binomial is 150*0.8 and the variance is 150*0.8*0.2. The continuity correction involves
the consideration of intervals of radius 0.5 about each value. The difference to be considered is P(X ≤ 129.5) - P(X ≤ 121.5). The approximation produces
> mu <- 150*0.8
> sig <- sqrt(150*0.8*0.2)
> pnorm(129.5,mu,sig) - pnorm(121.5,mu,sig)
[1] 0.3534917
Answer:
The probability P(121 < X ≤ 129) is equal to the difference P(X ≤ 129) - P(X ≤ 121). In the Poisson approximation we substitute the Binomial distribution with the Poisson
distribution that has the same expectation. The expectation of the Binomial is 150*0.8. The approximation produces
> lam <- 150*0.8
> ppois(129,lam) - ppois(121,lam)
[1] 0.2478270
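It can be instructive to put the exact probability and the three approximations side by side. The sketch below repeats the four computations for Binomial(150,0.8) and reports the absolute error of each approximation (the object names are chosen here for illustration).

```r
n <- 150
p <- 0.8
mu <- n * p
sig <- sqrt(n * p * (1 - p))

exact     <- pbinom(129, n, p) - pbinom(121, n, p)
normal    <- pnorm(129, mu, sig) - pnorm(121, mu, sig)
normal_cc <- pnorm(129.5, mu, sig) - pnorm(121.5, mu, sig)
poisson   <- ppois(129, mu) - ppois(121, mu)

# Absolute error of each approximation relative to the exact value
errors <- abs(c(normal = normal, normal_cc = normal_cc, poisson = poisson) - exact)
errors
```

In this example the continuity correction brings the Normal approximation closer to the exact value, while the Poisson approximation is far off. That is to be expected here, since the Poisson approximation is usually recommended only when p is small.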
Information
Let the distribution of X be Binomial(15,0.8). The next 4 questions correspond to this information. The answer may be rounded up to 3 decimal places
of the actual value:
Answer:
The probability P(13 < X ≤ 16) is equal to the difference P(X ≤ 16) - P(X ≤ 13). The difference is equal to
> pbinom(16,15,0.8) - pbinom(13,15,0.8)
[1] 0.1671258
The Normal approximation (without continuity correction) of the probability P(13 < X ≤ 16) is equal to
Answer:
The probability P(13 < X ≤ 16) is equal to the difference P(X ≤ 16) - P(X ≤ 13). In the Normal approximation we substitute the Binomial distribution with the Normal
distribution that has the same expectation and variance. The expectation of the Binomial is 15*0.8 and the variance is 15*0.8*0.2. The approximation produces
> mu <- 15*0.8
> sig <- sqrt(15*0.8*0.2)
> pnorm(16,mu,sig) - pnorm(13,mu,sig)
[1] 0.2543909
The Normal approximation (with continuity correction) of the probability P(13 < X ≤ 16) is equal to
Answer:
The probability P(13 < X ≤ 16) involves the values {14, 15, 16} (of which only 14 and 15 have positive probability, since X cannot exceed 15). In the Normal approximation
we substitute the Binomial distribution with the Normal distribution that has the same expectation and variance. The expectation of the Binomial is 15*0.8 and the
variance is 15*0.8*0.2. The continuity correction involves the consideration of intervals of radius 0.5 about each value. The difference to be considered is
P(X ≤ 16.5) - P(X ≤ 13.5). The approximation produces
> mu <- 15*0.8
> sig <- sqrt(15*0.8*0.2)
> pnorm(16.5,mu,sig) - pnorm(13.5,mu,sig)
[1] 0.164623
Answer:
The probability P(13 < X ≤ 16) is equal to the difference P(X ≤ 16) - P(X ≤ 13). In the Poisson approximation we substitute the Binomial distribution with the Poisson
distribution that has the same expectation. The expectation of the Binomial is 15*0.8. The approximation produces
> lam <- 15*0.8
> ppois(16,lam) - ppois(13,lam)
[1] 0.2171734
Information
Recall that the population average of the heights in the file "pop1.csv" is μ = 170.035. Using simulation it can be shown that the probability of the
sample average of the height falling within 2 centimeters of the population average is approximately equal to 0.925. From the simulations we also got
that the standard deviation of the sample average is (approximately) equal to 1.122. In the next 3 questions you are asked to apply the Normal
approximation to the distribution of the sample average using this information. The answer may be rounded up to 3 decimal places of the actual value:
Using the Normal approximation, the probability that the sample average of the heights falls within 2 centimeters of the population average is
Answer:
Using the Normal approximation, the computation of a probability associated with the random variable is conducted with the functions of the Normal distribution for
the same expectation and standard deviation as the original distribution. The expectation is μ = 170.035. The standard deviation is σ = 1.122. The event corresponds to
the interval [μ - 2, μ + 2]. Therefore, the approximated probability is
> mu <- 170.035
> sig <- 1.122
> pnorm(mu+2,mu,sig) - pnorm(mu-2,mu,sig)
[1] 0.9253374
Using the Normal approximation we get that the central region that contains 90% of the distribution of the sample average is of the form 170.035 ± z ·
1.122. The value of z is
Answer:
The structure of the central region that contains 90% of the Normal distribution is μ ± qnorm(0.95) · σ. Here μ = 170.035 and σ = 1.122. Therefore, z =
qnorm(0.95) = 1.644854.
Using the Normal approximation, the probability that the sample average of the heights is less than 169 is
Answer:
Using the Normal approximation, the computation of a probability associated with the random variable is conducted with the functions of the Normal distribution for
the same expectation and standard deviation as the original distribution. The expectation is μ = 170.035. The standard deviation is σ = 1.122. The event corresponds to
the values less than 169. Therefore, the approximated probability is
> mu <- 170.035
> sig <- 1.122
> pnorm(169,mu,sig)
[1] 0.1781444
According to the Internal Revenue Service, the average length of time for an individual to complete (record keep, learn, prepare, copy, assemble and
send) IRS Form 1040 is 10.53 hours (without any attached schedules). The distribution is unknown. Let us assume that the standard deviation is 2
hours. Suppose we randomly sample 36 taxpayers and compute their average time to complete the forms. Then the probability that the average is
less than 10 hours is approximately equal to (The answer may be rounded up to 3 decimal places of the actual value)
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (10.53) and the standard deviation is equal to the standard
deviation of a single measurement (2), divided by the square root of the number of observations (36). Consequently, the approximate probability is
> pnorm(10,10.53,2/sqrt(36))
[1] 0.0559174
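Several of the questions that follow repeat this computation with different numbers, so it may help to see it written once as a function. This is only a sketch: the name `clt_prob_below` and its argument names are hypothetical. The second part repeats the computation through the z-score of the sample average.

```r
# Hypothetical helper: Normal approximation of P(sample average < x)
# mu and sigma describe a single measurement; n is the sample size
clt_prob_below <- function(x, mu, sigma, n) {
  se <- sigma / sqrt(n)     # standard deviation of the sample average
  pnorm(x, mu, se)
}

p_fun <- clt_prob_below(10, mu = 10.53, sigma = 2, n = 36)

# Same computation through the z-score of the sample average
z <- (10 - 10.53) / (2 / sqrt(36))
p_z <- pnorm(z)

c(by_function = p_fun, by_z_score = p_z)
```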
Information
Suppose that a category of world class runners are known to run a marathon (26 miles) in an expectation of 145 minutes with a standard deviation of
14 minutes. Consider 49 random races. In the next 3 questions you are asked to apply the Normal approximation to the distribution of the sample
average using this information. The answer may be rounded up to 3 decimal places of the actual value:
The probability that the runner will average between 143 and 145 minutes in these 49 marathons is
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (145) and the standard deviation is equal to the standard
deviation of a single measurement (14), divided by the square root of the number of observations (49). Consequently, the approximate probability is
> pnorm(145,145,14/sqrt(49)) - pnorm(143,145,14/sqrt(49))
[1] 0.3413447
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (145) and the standard deviation is equal to the standard
deviation of a single measurement (14), divided by the square root of the number of observations (49). Consequently, the approximate 0.9-percentile is
> qnorm(0.9,145,14/sqrt(49))
[1] 147.5631
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to the expectation of a single measurement (145) and the standard deviation is equal to the standard
deviation of a single measurement (14), divided by the square root of the number of observations (49). Consequently, the approximate 0.25-percentile (Q1) is the
0.25-percentile of the appropriate Normal distribution. The same holds for the 0.75-percentile (Q3). The inter-quartile range is the difference between these two
numbers:
> qnorm(0.75,145,14/sqrt(49)) - qnorm(0.25,145,14/sqrt(49))
[1] 2.697959
The time to wait for a particular rural bus is distributed uniformly from 0 to 25 minutes. 25 riders are randomly sampled and their waiting times
measured. The 90th percentile of the average waiting time (in minutes) computed for the sample is (approximately):
Select one:
a. 210.0
b. 26.9
c. 14.3
d. 13.2
The expectation of the Uniform(0,25) distribution is 25/2 = 12.5 and the variance is 25^2/12 = 52.08333. Applying the Central Limit Theorem we carry out the
computation using the Normal distribution with the same expectation and standard deviation as the sample average. The expectation of the sample average is equal
to the expectation of a single measurement (12.5) and the standard deviation is equal to the standard deviation of a single measurement (7.216878 = the square root
of 52.08333), divided by the square root of the number of observations (25). Consequently, the approximate 0.90-percentile is
> qnorm(0.90,12.5,7.216878/sqrt(25))
[1] 14.34976
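A simulation gives a rough check on how well the Normal approximation works for an average of 25 Uniform waiting times. The sketch below is an illustration only: the seed and the number of simulated samples are arbitrary choices, and the simulated percentile will vary slightly with them.

```r
set.seed(123)                 # arbitrary seed, for reproducibility only
n_riders <- 25
n_samples <- 20000            # number of simulated samples (arbitrary)

# Each row is one simulated sample of 25 waiting times from Uniform(0, 25)
waits <- matrix(runif(n_samples * n_riders, 0, 25), ncol = n_riders)
avg_wait <- rowMeans(waits)

sim_q90 <- quantile(avg_wait, 0.90)
clt_q90 <- qnorm(0.90, 12.5, sqrt(25^2 / 12) / sqrt(n_riders))
c(simulated = unname(sim_q90), clt = clt_q90)
```

The two numbers should be close to each other, which supports choosing the option nearest to 14.3.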
Information
A switching board receives a random number of phone calls. The expected number of calls is 7.4 per minute. Assume that the distribution of the number of
calls is Poisson. The average number of calls per minute is recorded by counting the total number of calls received in one hour, divided by 60, the
number of minutes in an hour. In the next 4 questions you are asked to apply the Normal approximation to the distribution of the sample average
using this information. The answer may be rounded up to 3 decimal places of the actual value:
Answer:
The expectation of the sample average is equal to the expectation of a single measurement, which is 7.4 in this example.
Answer:
The variance in the Poisson distribution is equal to the expectation. The standard deviation of the sample average is equal to the standard deviation of a single
measurement (2.720294 = the square root of 7.4), divided by the square root of the number of observations (60). The resulting standard deviation is 0.3511885.
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to 7.4 and the standard deviation is equal to 0.3511885. Consequently, the approximate probability is
> pnorm(7,7.4,0.3511885)
[1] 0.1273538
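For this particular model the Normal approximation can be compared with an exact computation. The comparison is not part of the quiz; it relies on the standard fact that a sum of independent Poisson counts is again Poisson, and it assumes (as the Central Limit Theorem argument does) that the 60 minutes are independent.

```r
lambda <- 7.4      # expected calls per minute
minutes <- 60

# The total over 60 independent minutes is Poisson(60 * 7.4);
# "average < 7" is the same event as "total <= 419"
exact <- ppois(7 * minutes - 1, lambda * minutes)

# Normal approximation used in the answer above
approx <- pnorm(7, lambda, sqrt(lambda / minutes))

c(exact = exact, normal_approx = approx)
```

The two values should be of similar size. The quiz asks for the Normal approximation, so that is the value to report.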
The probability that the number of calls in a random minute is less than 7 is: (Note, the question is with respect to a random minute, and not
the average.)
The distribution of the number of calls in a random minute is Poisson(7.4). The event of less than 7 calls corresponds to the event of 6 calls or less. Consequently,
> ppois(6,7.4)
[1] 0.3919617
Information
It is claimed that the expected length of time some computer part may work before requiring a reboot is 3.5 months. In order to examine this claim 70
identical parts are set to work. Assume that the distribution of the length of time the part can work (in months) is Exponential. In the next 4 questions
you are asked to apply the Normal approximation to the distribution of the average of the 70 parts that are examined. The answer may be rounded up
to 3 decimal places of the actual value:
Answer:
The expectation of the sample average is equal to the expectation of a single measurement, which is 3.5 in this example.
Answer:
The variance in the Exponential(λ) distribution is equal to 1/λ^2. Since the expectation is 1/λ, we get that the variance is equal to the square of the expectation, which is
3.5^2. The standard deviation of the sample average is equal to the standard deviation of a single measurement (3.5 = the square root of 3.5^2), divided by the square
root of the number of observations (70). The resulting standard deviation is 0.41833.
The central region that contains 90% of the distribution of the average is of the form E(X) ± c, where E(X) is the expectation of the sample average. The
value of c is
Answer:
The structure of the central region that contains 90% of the Normal distribution is μ ± qnorm(0.95) · σ. Here μ = E(X), σ = 0.41833 and qnorm(0.95) = 1.644854.
Consequently, c = qnorm(0.95) · σ = 1.644854 · 0.41833 = 0.6880918.
Answer:
Applying the Central Limit Theorem we carry out the computation using the Normal distribution with the same expectation and standard deviation as the sample
average. The expectation of the sample average is equal to 3.5 and the standard deviation is equal to 0.41833. The probability of being more than 4 is equal to the
difference between one and the probability of being less than or equal to 4. Consequently, the approximate probability is
> 1 - pnorm(4,3.5,0.41833)
[1] 0.1159989
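Here too the approximation can be compared with an exact value, using the standard fact (not stated in the quiz) that a sum of n independent Exponential(λ) variables has a Gamma distribution with shape n and rate λ. A sketch under that assumption:

```r
n <- 70
rate <- 1 / 3.5    # lambda of a single Exponential lifetime

# Sum of n independent Exponential(rate) variables is Gamma(shape = n, rate = rate);
# "average > 4" is the same event as "sum > 4 * n"
exact <- 1 - pgamma(4 * n, shape = n, rate = rate)

# Normal approximation used in the answer above
approx <- 1 - pnorm(4, 3.5, 3.5 / sqrt(n))

c(exact = exact, normal_approx = approx)
```

The quiz asks for the Normal approximation, so the approximate value remains the expected answer.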
Consider the following study: We want to know the average amount of money first-year college students spend at NDD College on accessories. We
randomly survey 22 first-year students at the college. Four out of the 22 students spent $33, $110, $180, and $197, respectively. The target parameter
of this survey is:
Select one:
a. The average amount of money spent by the 22 first-year college students that participated in the survey.
d. The average amount of money spent by all first-year students of the college.
The correct answer is: The average amount of money spent by all first-year students of the college.
Consider the following relative frequency table on hurricanes that have made direct hits on the U.S. between 1851 and 2004. Hurricanes are given a
strength category rating based on the minimum wind speed generated by the storm. (http://www.nhc.noaa.gov/gifs/table5.gif)
Answer:
The sample variance is equal to (the answer may be rounded up to 3 decimal places of the actual value):
Answer:
Distribution of Y
Value Probability
0 0.23
4 0.14
12
16 0.16
20 0.20
Complete the probabilities of the random variable Y in the table above. The expectation of Y is equal to
Answer:
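One way to approach this in R is sketched below. It assumes that the five values shown (0, 4, 12, 16, 20) are all the values Y can take, so the blank probability is whatever makes the total equal to 1. The object names are illustrative.

```r
# Assumption: the table lists every value that Y can take.
Y.val <- c(0, 4, 12, 16, 20)
P.known <- c(0.23, 0.14, NA, 0.16, 0.20)   # NA marks the blank entry

P <- P.known
P[is.na(P)] <- 1 - sum(P.known, na.rm = TRUE)   # probabilities must sum to 1

E.Y <- sum(Y.val * P)   # expectation: sum of value times probability
P
E.Y
```

Under that assumption the missing probability is 0.27 and the expectation is 10.36.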
Meiosis is the process in which a diploid cell that contains two copies of the genetic material produces a haploid cell with only one copy (sperm and
eggs). The resulting molecule of genetic material is a linear molecule that is composed of consecutive segments: a segment that originated from one of
the two copies followed by a segment from the other copy and vice versa. The border points between segments are called points of crossover. The
Haldane model for crossovers states that the number of crossovers between two loci on the genome has a Poisson(λ) distribution. Assume that the
expected number of crossovers between two loci is 0.66. The probability of obtaining at most 1 crossover between the two loci is (The answer may be
rounded up to 3 decimal places of the actual value.)
Answer:
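Since the expectation of a Poisson(λ) random variable is λ, the question sets λ = 0.66. A minimal R sketch of the computation, with illustrative object names:

```r
lambda <- 0.66                    # expected number of crossovers

p.cdf <- ppois(1, lambda)         # P(X <= 1) from the cumulative distribution
p.sum <- dpois(0, lambda) + dpois(1, lambda)   # P(X = 0) + P(X = 1)
p.formula <- exp(-lambda) * (1 + lambda)       # closed form of the same sum

c(p.cdf, p.sum, p.formula)        # the three values agree
```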
The patient recovery time from a particular surgical procedure is Normally distributed with a mean of 9.3 days and a standard deviation of 3.5 days. The
probability of spending between 5 and 10 days in recovery is: (The answer may be rounded up to 3 decimal places of the actual value.)
Answer:
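The probability of an interval under the Normal distribution is the difference between the cumulative probabilities at its two end points. A short sketch in R, with illustrative object names:

```r
mu <- 9.3      # mean recovery time (days)
sigma <- 3.5   # standard deviation (days)

# P(5 < X <= 10) = P(X <= 10) - P(X <= 5)
p <- pnorm(10, mu, sigma) - pnorm(5, mu, sigma)
p
```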
It is claimed that the expected length of time some computer part may work before requiring a reboot is 26 days. In order to examine this claim 80
identical parts are set to work. Assume that the distribution of the length of time the part can work (in days) is Exponential. The central region that
contains 70% of the distribution of the average is of the form E(X) ± c, where E(X) is the expectation of the sample average. The value of c is (You are
asked to apply the Normal approximation to the distribution of the average of the 80 parts that are examined. The answer may be rounded up to 3
decimal places of the actual value.)
Answer:
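The reasoning is the same as for a 90% central region, except that a 70% central region leaves 15% in each tail, so the 0.85 percentile is used. A sketch in R under the Normal approximation, with illustrative object names:

```r
mu <- 26   # expectation of a single measurement (days)
n <- 80    # number of parts examined

# For the Exponential distribution the standard deviation equals the expectation.
sigma.avg <- mu / sqrt(n)            # standard deviation of the sample average
c.value <- qnorm(0.85) * sigma.avg   # half-width of the central 70% region
c.value
```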