
STATISTICAL ANALYSIS WITH SOFTWARE APPLICATION

(STATS 20053) MODULE ANSWERS

MODULE 1: INTRODUCTION TO THE STATISTICAL CONCEPTS

ACTIVITIES/ASSESSMENTS: Read each item carefully. Write the answer on the yellow
paper. Answers Only.

I. A research objective is presented. For each, identify the (A) population and (B) sample in
the study.

1. A polling organization contacts 2141 male university graduates who have a white-collar
job and asks whether or not they had received a raise at work during the past 4 months.

A. Male university graduates with white collar job.

B. 2141 male university graduates with white collar job

2. Every year the PSA releases the Current Population Report based on a survey of 50,000
households. The goal of this report is to learn the demographic characteristics, such as
income, of all households within the Philippines.

A. All households in the Philippines.

B. 50,000 households within the Philippines.

3. Researchers want to determine whether or not higher folate intake is associated with a
lower risk of hypertension (high blood pressure) in women (27 to 44 years of age). To make
this determination, they look at 7373 cases of hypertension in these women and find that
those who consume at least 1000 micrograms per day of total folate had a decreased risk
of hypertension compared with those who consume less than 200.

A. Women with hypertension ages 27 to 44 years old.

B. The 7373 women ages 27 to 44 years old with hypertension whose cases were examined.

II. Indicate whether the following statements require the use of descriptive or inferential
statistics.

Inferential Statistics 1. A teacher wants to know the attitudes of all students towards
abortion.

Descriptive Statistics 2. A market analyst of a sales firm draws a chart showing the
sales figures of a given product for the period 2006-2007.
Inferential Statistics 3. A forecaster predicts the results of an election using the
number of votes cast in 15 out of 25 barangays.

Inferential Statistics 4. Men are better in math than women.

Descriptive Statistics 5. Forty percent of the employees of an organization were
recorded tardy for at least 15 working days.

Inferential Statistics 6. There are very few gender-related occupations.

Descriptive Statistics 7. An accountant predicts the accuracy rate of a client’s financial
resources.

Inferential Statistics 8. A quality control manager wishes to check production output.

Descriptive Statistics 9. Records indicated that 75% of the faculty in the graduate
school are doctoral degree holders.

Inferential Statistics 10. There is no relationship between educational qualification of
parents and academic achievement of their children.

III. Identify the qualitative and quantitative variables and indicate the highest level of
measurement required in each. If quantitative, classify whether discrete or continuous.

Qualitative-nominal 1. Occupation

Quantitative-ratio/discrete 2. Number of government officials

Qualitative-nominal 3. Favorite color

Quantitative-interval/continuous 4. Temperature in Celsius degrees

Qualitative-nominal 5. Type of school

Quantitative-ratio/continuous 6. Volume of mineral water sold daily

Qualitative-nominal 7. Employee number

Qualitative-nominal 8. Civil status

Qualitative-nominal 9. Equity accounts

Qualitative-nominal 10. Brands of soft drinks

Qualitative-ordinal 11. Socioeconomic status

Qualitative-nominal 12. Employment status

Quantitative-ratio/discrete 13. Number of missing teeth

Quantitative-ratio/discrete 14. Number of vehicles registered


Qualitative-nominal 15. Jersey Number

Quantitative-ratio/discrete 16. Number of employees collecting retirement benefits from
GSIS

Quantitative-ratio/continuous 17. Duration of a seizure

Qualitative-nominal 18. Cause of death

Quantitative-ratio/discrete 19. Dividends

Qualitative-nominal 20. Current assets list

Quantitative-ratio/discrete 21. Number of heart attacks

Quantitative-ratio/discrete 22. Account receivable

Qualitative-ordinal 23. Clothing size

Qualitative- nominal 24. Blood type

Qualitative- nominal 25. Ethnic group

MODULE 2: DATA COLLECTION AND BASIC CONCEPTS IN SAMPLING DESIGN

ACTIVITIES/ASSESSMENTS:

I. Determine if the source would be a primary or a secondary source.

Secondary 1. Government Records

Secondary 2. Dictionary

Primary 3. Artifact

Secondary 4. A TV show explaining what happened in Philippines.

Secondary 5. Autobiography about Rodrigo Duterte.

Primary 6. Enrile diary describing what he thought about the World War II.

Primary 7. Audio and video recordings

Primary 8. Speeches

Secondary 9. Newspaper

Secondary 10. Review Articles

II. Determine the sample size of the following problems. Show your solution.

1. A dermatologist wishes to estimate the proportion of young adults who apply
sunscreen regularly before going out in the sun in the summer. Find the minimum
sample size required to estimate the proportion with precision of 3%, and 90%
confidence.

A confidence level of 90% means that $\alpha = 1 - 0.90 = 0.10$, so $\alpha/2 = 0.05$ and
$z_{0.05} = 1.645$. Since there is no prior knowledge of $p$, use the estimate $\hat{p} = 0.5$.
To estimate “to within three percentage points” means that $E = 0.03$.

$n = \dfrac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{E^2} = \dfrac{(1.645)^2(0.5)(1-0.5)}{(0.03)^2} = 751.6736111$, or 752

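2. The administration at a college wishes to estimate, the proportion of all its entering
freshmen who graduate within four years, with 95% confidence. Estimate the
minimum size sample required. Assume that the population standard deviation is
σ = 1.3 and precision level is 0.05.

$\alpha = 1 - 0.95 = 0.05$, so $\alpha/2 = 0.025$; $z_{0.025} = 1.960$; $\sigma = 1.3$; $E = 0.05$

Because a population standard deviation is given, the sample-size formula for a mean applies,
and σ must be squared:

$n = \dfrac{z_{\alpha/2}^2\,\sigma^2}{E^2} = \dfrac{(1.960)^2(1.3)^2}{(0.05)^2} = 2596.9$, or 2597
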
3. A government agency wishes to estimate the proportion of drivers aged 16–24 who
have been involved in a traffic accident in the last year. It wishes to make the
estimate to within 1% error and at 90% confidence. Find the minimum sample size
required, using the information that several years ago the proportion was 0.12.

$n \ge \left(\dfrac{z}{E}\right)^2 p(1-p)$
$n \ge (1.65/0.01)^2\,(0.12)(1-0.12)$
$n \ge 165^2 \times 0.1056$
$n \ge 27{,}225 \times 0.1056$
$n \ge 2875$
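
The sample-size formula for a proportion used in problems 1 and 3 can be checked with a short script. This is a minimal sketch; the function name `sample_size_for_proportion` is made up for illustration, and the z values are passed in as given in the worked answers (1.645 and 1.65) rather than looked up.

```python
import math


def sample_size_for_proportion(z, p_hat, margin):
    """Minimum n to estimate a proportion to within `margin`.

    Implements n = (z / E)^2 * p(1 - p), rounded up to a whole unit.
    """
    n = (z / margin) ** 2 * p_hat * (1 - p_hat)
    return math.ceil(n)


# Problem 1: 90% confidence, no prior estimate (p = 0.5), E = 0.03
print(sample_size_for_proportion(1.645, 0.5, 0.03))

# Problem 3: z = 1.65 as used in the worked answer, p = 0.12, E = 0.01
print(sample_size_for_proportion(1.65, 0.12, 0.01))
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.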

4. An internet service provider wishes to estimate, to within one percentage error, the
current proportion of all email that is spam, with 85% confidence. Last year the
proportion that was spam was 71%. Estimate the minimum size sample required if
the total email that is spam is 10,000.

$n \ge \dfrac{N}{1 + Ne^2}$

$n \ge 10{,}000 / (1 + 10{,}000(0.01^2))$
$n \ge 10{,}000 / 2$
$n \ge 5{,}000$
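
The finite-population formula used in problem 4 (often called Slovin's formula) is easy to evaluate directly. The sketch below assumes a hypothetical helper name, `slovin_sample_size`, and returns the unrounded value so the arithmetic can be compared with the hand computation.

```python
def slovin_sample_size(population, margin):
    """Return n = N / (1 + N * e^2) for population N and margin of error e."""
    return population / (1 + population * margin ** 2)


# Problem 4: N = 10,000 and e = 0.01
n = slovin_sample_size(10_000, 0.01)
print(round(n))
```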

III. Determine the type of sampling. (ex. Simple Random Sampling, Purposive Sampling)

Cluster Sampling 1. To determine customer opinion of its boarding policy,
Southwest Airlines randomly selects 60 flights during a certain week and surveys all
passengers on the flights.

Stratified Random Sampling 2. A member of Congress wishes to determine her
constituency’s opinion regarding estate taxes. She divides her constituency into three
income classes: low-income households, middle-income households, and upper-income
households. She then takes a simple random sample of households from each income class.

Systematic Random Sampling 3. The presider of a guest lecture series at a university
stands outside the auditorium before a lecture begins and hands every fifth person who
arrives, beginning with the third, a speaker evaluation survey to be completed and
returned at the end of the program.

Simple Random Sampling 4. 24 Hour Fitness wants to administer a satisfaction survey to
its current members. Using its membership roster, the club randomly selects 40 club
members and asks them about their level of satisfaction with the club.

Convenience Sampling 5. A radio station asks its listeners to call in their
opinion regarding the use of U.S. forces in peacekeeping missions.

Systematic Random Sampling 6. A tax auditor selects every 1000th income tax return
that is received.

Multi-stage Sampling 7. For a survey, a sample of municipalities was selected
from every province in the country and included all child laborers in the selected
municipalities.

Stratified Random Sampling 8. To determine his DSL Internet connection speed,
Shawn divides up the day into four parts: morning, midday, evening, and late night. He
then measures his Internet connection speed at 5 randomly selected times during each part
of the day.

Stratified Random Sampling 9. A college official divides the student population into five
classes: freshman, sophomore, junior, senior, and graduate student. The official takes a
simple random sample from each class and asks the members’ opinions regarding student
services.

Simple Random Sampling 10. In the game of lotto, 6 balls are selected from a container
with 42 balls.

IV. Using proportional allocation, determine the sample size needed for every school. The
total population of students is 10,679, and the minimum sample is 2,450.

School                               Population   Sample per School
Antipolo National High School        3,360        771
Bagong Nayon National High School    2,540        583
Dela Paz National High School        2,122        487
Sta. Cruz National High School       1,290        296
Tubigan National High School         1,367        314
Total                                10,679       2,451

IV. SOLUTION

1. Antipolo National High School


Sample size = (2,450/ 10,679) x 3,360
= (0.2294 x 3,360)
Sample size = 770.784 or 771
2. Bagong Nayon National High School
Sample size = (2,450/ 10,679) x 2540
= (0.2294 x 2,540)
Sample size = 582.676 or 583
3. Dela Paz National High School
Sample size = (2,450/ 10,679) x 2,122
= (0.2294 x 2,122)
Sample size = 486.787 or 487
4. Sta. Cruz National High School
Sample size = (2,450/ 10,679) x 1,290
= (0.2294 x 1,290)
Sample size = 295.93 or 296
5. Tubigan National High School
Sample size = (2,450/ 10,679) x 1,367
= (0.2294 x 1,367)
Sample size = 313.59 or 314
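
The proportional allocation above can be reproduced in a few lines. This is a sketch under the assumption that each school's share is rounded to the nearest whole student; the function name `proportional_allocation` is illustrative only. It uses the unrounded sampling fraction 2,450/10,679 rather than 0.2294, which gives the same whole-number results.

```python
def proportional_allocation(populations, total_sample):
    """Allocate total_sample across strata in proportion to stratum size."""
    total_population = sum(populations.values())
    return {
        name: round(total_sample * size / total_population)
        for name, size in populations.items()
    }


schools = {
    "Antipolo National High School": 3360,
    "Bagong Nayon National High School": 2540,
    "Dela Paz National High School": 2122,
    "Sta. Cruz National High School": 1290,
    "Tubigan National High School": 1367,
}

allocation = proportional_allocation(schools, 2450)
for school, n in allocation.items():
    print(f"{school}: {n}")
print("Total:", sum(allocation.values()))
```

Because each stratum is rounded separately, the allocated total can differ from the target by a unit or two, which is why the table sums to 2,451 rather than 2,450.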

MODULE 3: DESCRIPTIVE STATISTICS

ACTIVITIES/ASSESSMENTS:

1. Which one do you think is more informative? Why?

- For me, the second bar graph is more informative because it breaks the responses down
by level of likelihood and provides a legend for the corresponding data. Its axes are
titled and labeled clearly, and it includes the units of measurement and an appropriate
data source.

2. What features of the ‘Good Presentation’ make it better than the ‘Bad Presentation’?
- The good presentation is better than the bad presentation because it is more organized
and neater to look at. Its graph shows more detailed and specific data, so readers can
interpret it easily. The left-side graph of the bad presentation lacks detail: it only
lists the wages in a single line. On its right-side graph, the range of the plotted
quantity is very difficult to estimate. The good presentation has a title at the top,
variables that correspond to the subjects of the data, and labels.

3. Review the table and consider questions such as the following.

1. What percentage of the employees originated from within the organization?

-59% of the employees originated within the organization

2. What percentage of the employees are both internal and rated ‘Very Good’?

- 23% of the internal employees are rated as Very Good.

3. What percentage of the employees received ‘Needs Improvement’ or ‘Poor’?

- 10% of the employees received a “Needs Improvement” mark.

4. What category contains the greatest number of employees?

- The internal category contains the greatest number of employees.

5. Do you see any notable differences in the percentage by category?

- There is a notable difference between the percentages of external and internal
employees: the percentage of external employees is 18 percentage points lower than
that of internal employees. Both groups have the same percentage of employees
rated “Excellent”.

4. Consider the above Frequency Distribution of Salaries.


1. What percentage of the employees earns less than or equal 80,000?

- The percentage of the employees who earn less than or equal to 80,000 is
78%.

2. What is the salary range of values?

- The salary range of values is 69,000 (110,000 – 41,000)

3. What salary categories have percentage less than 5?

- 41,000 – 50,000, 91,000 – 100,000 and 101,000 – 110,000.

4. What salary category includes the most employees?

- The salary category that includes the most employees is the 61,000 – 70,000
salary range.

5. The length of life of an instrument produced by a machine has a normal distribution with
a mean of 12 months and standard deviation of 2 months. Find the probability that an
instrument produced by this machine will last
A. less than 7 months.

μ =12months
σ =2 months
z=(X−μ)/σ

For, Z=(7−12)/2=−2.5

P(X<7)= P(Z<−2.5)
=0.0062 (using z-score table)
B. between 7 and 12 months.

The required probability is P(7<X<12)= P[(7-12)/2 <Z< (12-12)/2]

=P(-2.5<Z<0)

=0.4938 (using z-score table).

Be sure to draw a normal curve with the area corresponding to the probability shaded.

6. The lengths of human pregnancies are approximately normally distributed, with
mean μ = 266 days and standard deviation σ = 16 days.
A. What proportion of pregnancies lasts more than 270 days?

μ =266
σ =16
z=(X−μ)/σ

Z=(270−266)/16=0.25
P(X>270) =P(Z>0.25)
= 0.4013

B. What proportion of pregnancies lasts less than 250 days?

μ =266
σ =16
z=(X−μ)/σ

Z=(250−266)/16=-1
P(X<250) =P(Z<-1)
= 0.1587
C. What proportion of pregnancies lasts between 240 and 280 days?

μ =266
σ =16
z=(X−μ)/σ

P(240<X<280)= P[(240-266)/16 <Z< (280-266)/16]

=P(-1.625<Z<0.875)

= 0.7571

D. What is the probability that a randomly selected pregnancy lasts more than 280
days?

The probability that a randomly selected pregnancy lasts more than 280
days is 0.1908. The probability that a randomly selected pregnancy
lasts fewer than 240 days is 0.0521.
μ =266
σ =16
z=(X−μ)/σ

Z=(280−266)/16=0.875
P(X>280) =P(Z>0.875)
= 0.1908

Be sure to draw a normal curve with the area corresponding to the probability shaded.
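
The normal-curve probabilities in items 5 and 6 can be checked without a z-table. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are illustrative, and the means and standard deviations are the ones stated in the solutions (12 and 2 months; 266 and 16 days).

```python
from statistics import NormalDist

instrument = NormalDist(mu=12, sigma=2)    # item 5, in months
pregnancy = NormalDist(mu=266, sigma=16)   # item 6, in days

# Item 5
p_less_7 = instrument.cdf(7)
p_7_to_12 = instrument.cdf(12) - instrument.cdf(7)

# Item 6
p_more_270 = 1 - pregnancy.cdf(270)
p_less_250 = pregnancy.cdf(250)
p_240_to_280 = pregnancy.cdf(280) - pregnancy.cdf(240)
p_more_280 = 1 - pregnancy.cdf(280)

for label, p in [
    ("P(X < 7)", p_less_7),
    ("P(7 < X < 12)", p_7_to_12),
    ("P(X > 270)", p_more_270),
    ("P(X < 250)", p_less_250),
    ("P(240 < X < 280)", p_240_to_280),
    ("P(X > 280)", p_more_280),
]:
    print(f"{label} = {p:.4f}")
```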

7. Construct frequency distribution table based on the scores of 75 randomly selected
students.

Scores      Frequency   Percentage (%)
26 to 30    13          17.33%
31 to 35    10          13.33%
36 to 40    16          21.33%
41 to 45    18          24%
46 to 50    18          24%
Total       75          100%

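A. Based on the frequency distribution, compute measures of central tendency,
measures of variation, Q1, D9, P10, Skewness and kurtosis.

Measures of Central Tendency

Class mark: $x = \dfrac{LL + UL}{2}$, where LL and UL are the lower and upper class limits.

First class: $x = \dfrac{26 + 30}{2} = 28$

Last class: $x = \dfrac{46 + 50}{2} = 48$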

Scores Interval   Frequency (f)   x    f(x)
26 - 30           13              28   364
31 - 35           10              33   330
36 - 40           16              38   608
41 - 45           18              43   774
46 - 50           18              48   864
Total             n = 75               2,940

Mean:

$\bar{x} = \dfrac{\sum f x}{n} = \dfrac{2{,}940}{75} = 39.2$

Scores Interval   Frequency (f)   LB     <cf
26 - 30           13              25.5   13
31 - 35           10              30.5   23
36 - 40           16              35.5   39
41 - 45           18              40.5   57
46 - 50           18              45.5   75
Total             n = 75

$\dfrac{n}{2} = \dfrac{75}{2} = 37.5$

The median class is 36 - 40, because it contains the 37.5th item.

Median:

$\text{Md} = 35.5 + \dfrac{(37.5 - 23)\,5}{16} = 40.03$

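Modal class:

The modal class is the class with the highest frequency. Here the highest frequency, 18,
is shared by the adjacent classes 41 - 45 and 46 - 50. Taking 41 - 45 as the modal class:

d1 = 18 – 16 = 2

d2 = 18 – 18 = 0

$\text{Mode} = 40.5 + \dfrac{2}{2 + 0}(5) = 45.5$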
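
The grouped mean and median above follow directly from the class limits and frequencies. The following is a minimal sketch; `grouped_mean` and `grouped_quantile` are hypothetical helper names, and the class width of 5 and the 0.5 boundary offset are taken from the tables in this item.

```python
# (lower limit, upper limit, frequency) for each class
classes = [(26, 30, 13), (31, 35, 10), (36, 40, 16), (41, 45, 18), (46, 50, 18)]


def grouped_mean(classes):
    """Mean of grouped data: sum(f * class mark) / n."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_quantile(classes, fraction, width=5):
    """Interpolated quantile: LB + ((fraction * n - cf_before) / f) * width."""
    n = sum(f for _, _, f in classes)
    target = fraction * n
    cumulative = 0
    for lo, _, f in classes:
        if cumulative + f >= target:
            lower_boundary = lo - 0.5
            return lower_boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")


mean = grouped_mean(classes)
median = grouped_quantile(classes, 0.5)
print(f"mean = {mean:.2f}, median = {median:.2f}")
```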

Measures of Variations:

Scores Interval   Frequency (f)   x    f(x)    (xi – x̄)²   f(xi – x̄)²   LB     <cf
26 - 30           13              28   364     125.44      1,630.72     25.5   13
31 - 35           10              33   330     38.44       384.40       30.5   23
36 - 40           16              38   608     1.44        23.04        35.5   39
41 - 45           18              43   774     14.44       259.92       40.5   57
46 - 50           18              48   864     77.44       1,393.92     45.5   75
Total             n = 75               2,940               3,692

$\bar{x} = \dfrac{2{,}940}{75} = 39.2$

(x1 – x̄)² = (28 – 39.2)² = 125.44

(x2 – x̄)² = (33 – 39.2)² = 38.44

(x3 – x̄)² = (38 – 39.2)² = 1.44

(x4 – x̄)² = (43 – 39.2)² = 14.44

(x5 – x̄)² = (48 – 39.2)² = 77.44

f(x1 – x̄)² = 13 (125.44) = 1,630.72

f(x2 – x̄)² = 10 (38.44) = 384.40

f(x3 – x̄)² = 16 (1.44) = 23.04

f(x4 – x̄)² = 18 (14.44) = 259.92

f(x5 – x̄)² = 18 (77.44) = 1,393.92

(1) Standard Deviation

$s = \sqrt{\dfrac{\sum f(x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{3{,}692}{75 - 1}} = 7.06341$

(2) Variance

$s^2 = \dfrac{3{,}692}{75 - 1} = 49.89$
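
The grouped variance and standard deviation can be recomputed from the class marks and frequencies. This sketch assumes the sample (n − 1) form used above; the function name `grouped_variance` is illustrative.

```python
import math

# (class mark, frequency) pairs from the table above
marks_and_freqs = [(28, 13), (33, 10), (38, 16), (43, 18), (48, 18)]


def grouped_variance(pairs):
    """Sample variance of grouped data: sum(f * (x - mean)^2) / (n - 1)."""
    n = sum(f for _, f in pairs)
    mean = sum(x * f for x, f in pairs) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in pairs)
    return sum_sq / (n - 1)


variance = grouped_variance(marks_and_freqs)
std_dev = math.sqrt(variance)
print(f"s^2 = {variance:.2f}, s = {std_dev:.5f}")
```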

Q1

$\dfrac{nk}{4} = \dfrac{(75)(1)}{4} = 18.75$

$Q_1 = 30.5 + \dfrac{(18.75 - 13)\,5}{10} = 33.375$

D9

$\dfrac{nk}{10} = \dfrac{(75)(9)}{10} = 67.5$

$D_9 = 45.5 + \dfrac{(67.5 - 57)\,5}{18} = 48.417$

P10

$\dfrac{nk}{100} = \dfrac{(75)(10)}{100} = 7.5$

$P_{10} = 25.5 + \dfrac{(7.5 - 0)\,5}{13} = 28.385$

Q3

$\dfrac{nk}{4} = \dfrac{(75)(3)}{4} = 56.25$

$Q_3 = 40.5 + \dfrac{(56.25 - 39)\,5}{18} = 45.292$

P90

$\dfrac{nk}{100} = \dfrac{(75)(90)}{100} = 67.5$

$P_{90} = 45.5 + \dfrac{(67.5 - 57)\,5}{18} = 48.417$

Karl Pearson’s Measure of Skewness

$Sk = \dfrac{3(\bar{x} - \text{Md})}{s} = \dfrac{3(39.2 - 40.03)}{7.06341} = -0.35252$

Kurtosis

$QD = \dfrac{Q_3 - Q_1}{2} = \dfrac{45.292 - 33.375}{2} = 5.9585$

$k = \dfrac{QD}{P_{90} - P_{10}} = \dfrac{5.9585}{48.417 - 28.385} = 0.29745$
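
The quartiles, percentiles, and the percentile coefficient of kurtosis can be obtained from one interpolation routine. This is a sketch, assuming the same class width (5) and class boundaries as the tables above; `grouped_quantile` is a hypothetical helper, not a library function.

```python
# (lower limit, upper limit, frequency) for each class
classes = [(26, 30, 13), (31, 35, 10), (36, 40, 16), (41, 45, 18), (46, 50, 18)]


def grouped_quantile(classes, fraction, width=5):
    """Interpolated quantile: LB + ((fraction * n - cf_before) / f) * width."""
    n = sum(f for _, _, f in classes)
    target = fraction * n
    cumulative = 0
    for lo, _, f in classes:
        if cumulative + f >= target:
            return (lo - 0.5) + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("fraction must be between 0 and 1")


q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
p10 = grouped_quantile(classes, 0.10)
p90 = grouped_quantile(classes, 0.90)

quartile_deviation = (q3 - q1) / 2
kurtosis = quartile_deviation / (p90 - p10)

print(f"Q1 = {q1:.3f}, Q3 = {q3:.3f}, P10 = {p10:.3f}, P90 = {p90:.3f}")
print(f"QD = {quartile_deviation:.4f}, k = {kurtosis:.5f}")
```

D9 and P90 are the same point of the distribution, so a single call with `fraction=0.90` covers both.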

BASED ON EXCEL:
B. Based on the raw data, compute measures of central tendency, measures of
variation, Skewness and kurtosis using Excel

C. Compute Skewness and kurtosis of grouped and ungrouped data. Make
sure to describe the shape of the distribution

- The negative value of the coefficient of skewness implies a slight skew
to the left: the longer tail is on the side of the lower scores, and most
scores fall in the upper classes. Because the skewness is close to zero,
the distribution is relatively symmetric.
- Kurtosis is negative, which means the distribution is quite flat, with
no strong peak in the middle. A distribution with kurtosis less than
zero has light tails and is called platykurtic.
- The grouped and ungrouped data show a similar shape in terms of
skewness and kurtosis: platykurtic and slightly skewed to the left.
D. Do you think that computed value for grouped and ungrouped data are
the same?

- The computed values for grouped and ungrouped data are not the same.
The grouped data has a larger mean and a smaller standard deviation
than the ungrouped data. If we want a data set with a large mean and
a small standard deviation, the grouped data is preferred here.

8. Begin with the following set of data, call it Data Set I.

5, −2, 6, 14, −3, 0, 1, 4, 3, 2, 5

A. Compute the sample standard deviation and sample mean of Data Set I.

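$\bar{x} = \dfrac{\sum x}{n} = \dfrac{35}{11} = 3.18$

x    5   -2   6    14    -3   0   1   4    3   2   5

x²   25   4   36   196   9    0   1   16   9   4   25

The mean is not rounded before computing deviations. Using the shortcut formula:

$\sum (x - \bar{x})^2 = \sum x^2 - \dfrac{(\sum x)^2}{n} = 325 - \dfrac{35^2}{11} = 213.636$

$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{213.636}{10} = 21.364$

$s = \sqrt{21.364} \approx 4.62$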

B. Form a new data set, Data Set II, by adding 3 to each number in Data Set I.
Calculate the sample standard deviation and sample mean of Data Set II.

Data Set I. 5, -2, 6, 14, -3, 0, 1, 4, 3, 2, 5 (adding 3 each number for Data Set II)

Data Set II. 8, 1, 9, 17, 0, 3, 4, 7, 6, 5, 8

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{68}{11} = 6.18$

x    8    1   9    17    0   3   4    7    6    5    8

x²   64   1   81   289   0   9   16   49   36   25   64

$\sum (x - \bar{x})^2 = \sum x^2 - \dfrac{(\sum x)^2}{n} = 634 - \dfrac{68^2}{11} = 213.636$

$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{213.636}{10} = 21.364$

$s = \sqrt{21.364} \approx 4.62$

C. Form a new data set, Data Set III, by subtracting 6 from each number in Data
Set I. Calculate the sample standard deviation and sample mean of Data Set III.

Data Set I. 5, -2, 6, 14, -3, 0, 1, 4, 3, 2, 5 (subtracting 6 in each number for Data
Set III)

Data Set III. -1, -8, 0, 8, -9, -6, -5, -2, -3, -4, -1

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{-31}{11} = -2.82$

x    -1   -8   0   8    -9   -6   -5   -2   -3   -4   -1

x²   1    64   0   64   81   36   25   4    9    16   1

$\sum (x - \bar{x})^2 = \sum x^2 - \dfrac{(\sum x)^2}{n} = 301 - \dfrac{(-31)^2}{11} = 213.636$

$s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{213.636}{10} = 21.364$

$s = \sqrt{21.364} \approx 4.62$

D. Comparing the answers to parts (a), (b), and (c), can you guess the pattern?
State the general principle that you expect to be true.
The sample standard deviation of all three data sets is the same. The pattern
is that the action “add or subtract the same number from every data point”
doesn’t change the standard deviation of the whole data set. This makes
sense, because that action doesn’t change the spread of the data, only
its location. It slides the data along the number line without changing its
shape.
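
The principle in part D can be checked numerically. The sketch below uses `statistics.mean` and `statistics.stdev` from the Python standard library, works with the unrounded mean, and builds Data Sets II and III by shifting Data Set I; the variable names are illustrative.

```python
from statistics import mean, stdev

data_set_1 = [5, -2, 6, 14, -3, 0, 1, 4, 3, 2, 5]
data_set_2 = [x + 3 for x in data_set_1]   # add 3 to each value
data_set_3 = [x - 6 for x in data_set_1]   # subtract 6 from each value

for name, data in [("I", data_set_1), ("II", data_set_2), ("III", data_set_3)]:
    print(f"Data Set {name}: mean = {mean(data):.2f}, s = {stdev(data):.2f}")
```

The means shift by exactly the constant added or subtracted, while the sample standard deviation stays the same.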

9. Using “Encoded Data file”, construct frequency distribution table for age, sex, marital
status and educational attainment and interpret the table.

Table 1 shows the frequency and percentage distribution of the respondents in terms of
sex. It can be gleaned from the data that, out of 75 respondents, 30 or 40% are male
while 45 or 60% are female.

Table 2 shows the frequency and percentage distribution of the respondents in terms of
age. The largest group of respondents is aged 28-32, with 29.33% of all respondents,
while 6.67% of the 75 respondents are aged 38-42.

Table 3 shows the frequency and percentage distribution of the respondents in terms of
marital status. The table shows that 40 or 53.33% are single, 40% of the 75 respondents
are married, 4% are divorced or separated, and 2.67% are widowed.

Table 4 represents the frequency and percentage distribution of the respondents in
terms of educational attainment. The largest group, 57 out of 75 respondents, are
college graduates, while there are no recorded respondents at the primary level or
among elementary graduates.

MODULE 4: INFERENTIAL STATISTICS

ACTIVITIES/ASSESSMENTS: Determine whether the sampling is dependent or
independent.

Dependent 1. A researcher wishes to compare academic aptitudes of married
mathematicians and their spouses. She obtains a random sample of 287 such
couples who take an academic aptitude test and determines each spouse’s academic
aptitude.

Dependent 2. A political scientist wants to know how a random sample of 18- to 25-
year-olds feel about Democrats and Republicans in Congress. She obtains a random
sample of 1030 registered voters 18 to 25 years of age and asks; do you have
favorable/unfavorable opinion of the Democratic/ Republican party? Each individual
was asked to disclose his or her opinion about each party.

Independent 3. An educator wants to determine whether a new curriculum
significantly improves standardized test scores for third grade students. She
randomly divides 80 third-graders into two groups. Group 1 is taught using the new
curriculum, while group 2 is taught using the traditional curriculum. At the end of
the school year, both groups are given the standardized test and the mean scores
are compared.

Independent 4. A stock analyst wants to know if there is a difference between the
mean rate of return from energy stocks and that from financial stocks. He randomly
selects 13 energy stocks and computes the rate of return for the past year. He
randomly selects 13 financial stocks and computes the rate of return for the past
year.

Independent 5. An urban economist believes that commute times to work in the South
are less than commute times to work in the Midwest. He randomly selects 40
employed individuals in the South and 45 employed individuals in the Midwest and
determines their commute times.

ACTIVITIES/ASSESSMENTS:

Solve the following problems. Make sure to follow the 6-step procedure.

1. A study is designed to test whether there is a difference in mean daily calcium
intake in adults with normal bone density, adults with osteopenia (a low bone
density which may lead to osteoporosis) and adults with osteoporosis. Adults 60
years of age with normal bone density, osteopenia and osteoporosis are selected at
random from hospital records and invited to participate in the study. Each
participant's daily calcium intake is measured based on reported food intake and
supplements. The data are shown below.

Is there a significant difference in mean calcium intake in patients with normal bone
density as compared to patients with osteopenia and osteoporosis?

Set up hypotheses and determine the level of significance

H0: μ1 = μ2 = μ3 H1: Means are not all equal α=0.05

Select the appropriate test statistic.

The test statistic is the F statistic for ANOVA, F=MSB/MSE.

Set up decision rule.

In order to determine the critical value of F, we need degrees of freedom,
df1=k-1, and df2=N-k. In this example, df1=k-1=3-1=2 and df2=N-k=18-3=15.
The critical value is 3.68 and the decision rule is as follows: Reject H0
if F > 3.68.

Compute the test statistic.

To organize our computations we will complete the ANOVA table. In order to
compute the sums of squares, we must first compute the sample means for
each group and the overall mean.

Normal Bone Density   Osteopenia   Osteoporosis
n1=6                  n2=6         n3=6

If we pool all N=18 observations, the overall mean is 817.8.

Normal Bone Density   (X - 938.3)   (X - 938.3333)²
1200                  261.6667      68,486.9
1000                  61.6667       3,806.9
980                   41.6667       1,738.9
900                   -38.3333      1,466.9
750                   -188.333      35,456.9
800                   -138.333      19,126.9
Total                 0             130,083.3

Thus, the sum of squared deviations for the normal bone density group is 130,083.3.

For participants with osteoporosis:

Osteoporosis   (X - 715.0)   (X - 715.0)²
890            175           30,625
650            -65           4,225
1100           385           148,225
900            185           34,225
400            -315          99,225
350            -365          133,225
Total          0             449,750

Thus, the sum of squared deviations for the osteoporosis group is 449,750.

ANOVA TABLE

Source of Variation   Sums of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)   F
Between Treatments    152,477.7              2                         76,238.6            1.395
Error or Residual     819,833.3              15                        54,655.5
Total                 972,311.0              17

Conclusion.

We do not reject H0 because 1.395 < 3.68. We do not have statistically
significant evidence at α = 0.05 to show that there is a difference in mean
calcium intake in patients with normal bone density as compared to
osteopenia and osteoporosis.
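
The F statistic in the ANOVA table follows from the sums of squares and degrees of freedom. The sketch below starts from the table's totals rather than the raw observations, since the osteopenia data are not reproduced in this section; the function name `anova_f` is illustrative, and the critical value 3.68 is taken from the text rather than computed.

```python
def anova_f(ss_between, ss_error, k, n_total):
    """Return (MSB, MSE, F) for a one-way ANOVA with k groups and n_total observations."""
    df_between = k - 1
    df_error = n_total - k
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return ms_between, ms_error, ms_between / ms_error


msb, mse, f_stat = anova_f(ss_between=152_477.7, ss_error=819_833.3, k=3, n_total=18)
F_CRITICAL = 3.68  # from the decision rule in the text (df1 = 2, df2 = 15, alpha = 0.05)

print(f"MSB = {msb:.1f}, MSE = {mse:.1f}, F = {f_stat:.3f}")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```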

2. Some studies have shown that in the United States, men spend more than women
buying gifts and cards on Valentine’s Day. Suppose a researcher wants to test this
hypothesis by randomly sampling nine men and 10 women with comparable
demographic characteristics from various large cities across the United States to be
in a study. Each study participant is asked to keep a log beginning one month before
Valentine’s Day and record all purchases made for Valentine’s Day during that one-
month period. The resulting data are shown below. Use these data and a 1% level of
significance to test to determine if, on average, men actually do spend significantly
more than women on Valentine’s Day. Assume that such spending is normally
distributed in the population and that the population variances are equal.

x̄ 1=106.45 , x̄ 2=88.82
s1=26.80 and s2=23.90 ,
n1=12 and n2=12

Null hypothesis: the average spending of men on Valentine's Day is equal to the
average spending of women on Valentine's Day.
Alternative hypothesis: the average spending of men on Valentine's Day is greater
than the average spending of women on Valentine's Day.

Given,
x̄ 1=106.45, x̄ 2=88.82,n1=12,n2=12,s1=26.8,s2=23.9,α=0.01
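H0: μ1 = μ2
H1: μ1 > μ2

degrees of freedom = n1 + n2 − 2 = 12 + 12 − 2 = 22

critical value of the test statistic: $t_{0.01,\,22} = 2.5083$

Pooled variance:

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{(12 - 1)(26.8)^2 + (12 - 1)(23.9)^2}{12 + 12 - 2} = 644.725$

Test statistic:

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \dfrac{106.45 - 88.82}{\sqrt{644.725\left(\frac{1}{12} + \frac{1}{12}\right)}} \approx 1.70$

Since 1.70 < 2.5083, we do not reject H0. At the 1% level of significance, there is not
sufficient evidence that men spend more on average than women on Valentine's Day.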
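
The pooled-variance computation can be carried through to the test statistic in a few lines. This is a sketch using the summary statistics given above (means, standard deviations, and n = 12 per group); the function name `pooled_t` is made up for illustration, and the critical value 2.5083 is taken from the text rather than computed.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (pooled variance, t statistic) for a two-sample equal-variance t-test."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, (mean1 - mean2) / standard_error


pooled_var, t_stat = pooled_t(106.45, 88.82, 26.8, 23.9, 12, 12)
T_CRITICAL = 2.5083  # one-tailed, alpha = 0.01, df = 22, as stated in the text

print(f"pooled variance = {pooled_var:.3f}, t = {t_stat:.3f}")
print("Reject H0" if t_stat > T_CRITICAL else "Do not reject H0")
```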

3. A researcher is interested whether a training course increases the teaching
performance of the teachers who attended the training courses. Test at 10% level of
significance. The data are shown below:
State the null and alternative hypothesis:
Null hypothesis:
The training course does not increase the teaching performance of teachers who
attended the training course. H0: µ1 ≥ µ2

Alternative hypothesis:
The training course increases the teaching performance of teachers who attended
the training course. H1: µ1 < µ2

Set the level of significance:

α = 0.10

Determine the test distribution to use:

Since we are comparing means of two related groups, we will use the dependent sample t-test.

Calculate the test statistic or P-value:

Indicator                          Treatment   Mean    T-value   P-Value   Decision    Remark
Increase of Teaching performance   Before      85.25   -9.6965   4.3134    Reject H0   Significant
                                   After       94.1

Conclusion:
There is sufficient evidence to support that the training course helped increase the
teaching performance of teachers who attended the training course.

4. A pediatrician wants to determine the relation that may exist between a child’s
height and head circumference. She randomly selects eleven 3- year-old children
from her practice, measures their heights and head circumference, and obtains the
data shown in the table below.

a) The explanatory variable is height and the response variable is head
circumference.

b) A scatter diagram is obtained in excel by following the step,

Select the data > INSERT > Recommended Charts > All Charts > Scatter X Y.
c) The correlation coefficient can be obtained using the excel function
=CORREL(array1, array2)

Correlation coefficient, r=0.866237

d) The critical value of correlation coefficient for degree of freedom = n -2 = 8 - 2 =
6 and significance level = 0.05 is,

rc= 0.707

r= 0.866 > rc = 0.707

Yes, there appears to be a positive linear association because r is positive and is
greater than the critical value.

e) The data values after conversion are,

Sample   Height   Head Circumference
1        68.58    44.196
2        64.77    43.688
3        66.04    43.688
4        65.405   43.18
5        70.485   44.45
6        67.31    43.688
7        66.675   43.688
8        67.945   44.196

The correlation coefficient is obtained using the excel function =CORREL(array1,
array2)

Correlation coefficient, r=0.866237.

There is no change in correlation coefficient after conversion (r unchanged)
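
The Excel `CORREL` result can be cross-checked with a direct implementation of Pearson's r. This is a sketch that uses the eight converted data pairs listed above; `pearson_r` is an illustrative helper, and dividing by 2.54 is an assumed unit conversion (centimetres to inches) used only to show that rescaling leaves r unchanged.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


height = [68.58, 64.77, 66.04, 65.405, 70.485, 67.31, 66.675, 67.945]
head = [44.196, 43.688, 43.688, 43.18, 44.45, 43.688, 43.688, 44.196]

r = pearson_r(height, head)

# Rescale both variables by the same kind of unit conversion; r should not change.
r_rescaled = pearson_r([h / 2.54 for h in height], [c / 2.54 for c in head])

print(f"r = {r:.3f}, r after rescaling = {r_rescaled:.3f}")
```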

5. The following data represent the smoking status from a random sample of 1054 U.S.
residents 18 years or older by level of education.

Test whether smoking status and level of education are independent at the α = 0.05
level of significance.

Objective 1: Are smoking status and level of education independent?

Test used:

χ² test of independence of attributes

Attributes:

1. A: smoking status

2. B: Level of Education

Hypothesis:

To test,
Ho: Attributes A&B are independent
H1: Attributes A & B are not independent.

Data:

Table of Observed Frequencies (Oij):

                     Smoking Status
Level of Education   Current   Former   Never   Total
<12                  178       88       208     474
12                   137       69       143     349
13-15                44        25       44      113
16 or more           34        33       51      118
Total                393       215      446     1054

Table for expected frequency (Eij)

Eij = Ai x Bj / N

                     Smoking Status
Level of Education   Current     Former     Never      Total
<12                  176.7381    96.6888    200.5731   474
12                   130.13      71.1907    147.6793   349
13-15                42.13378    23.05028   47.81594   113
16 or more           43.9981     24.07021   49.93169   118
Total                393         215        446        1054

Table for (Oij²/Eij)

                     Smoking Status
Level of Education   Current     Former     Never      Total
<12                  179.2709    80.092     215.702    475.0648
12                   144.2327    66.87671   138.469    349.5784
13-15                45.94888    27.11463   40.48859   113.5521
16 or more           26.27386    45.24265   52.09117   123.6077
Total                395.7263    219.326    446.7507   1061.803

Under Ho, the test statistic is,

$\chi^2 = \sum \dfrac{O_{ij}^2}{E_{ij}} - N = 1061.803 - 1054 = 7.803$

Calculated value: 7.803

Tabulated value: $\chi^2_{(r-1)(c-1);\,\alpha} = \chi^2_{6;\,0.05} = 12.59$

Decision Rule: Here, Calculated value < Tabulated value, so we do not reject H0 at the 5%
level of significance (l.o.s.).

Conclusion: There is not sufficient evidence to reject Ho, so we conclude that smoking
status and Level of Education are independent.
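
The chi-square statistic can be recomputed from the observed frequencies alone. The sketch below derives the expected counts from the row and column totals and sums (O − E)²/E, which is algebraically the same as ΣO²/E − N; `chi_square_independence` is an illustrative name, and the critical value 12.59 is taken from the text rather than computed.

```python
# Rows: <12, 12, 13-15, 16 or more years of education
# Columns: current, former, never smokers
observed = [
    [178, 88, 208],
    [137, 69, 143],
    [44, 25, 44],
    [34, 33, 51],
]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


chi2, df = chi_square_independence(observed)
CRITICAL_VALUE = 12.59  # chi-square critical value for df = 6, alpha = 0.05, from the text

print(f"chi-square = {chi2:.3f}, df = {df}")
print("Reject H0" if chi2 > CRITICAL_VALUE else "Do not reject H0")
```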

6. A pediatrician wants to determine the relation that may exist between a child’s height
and head circumference. She randomly selects eleven 3-year-old children from her
practice, measures their heights and head circumference, and obtains the data shown in
the table below.
From the correlation coefficient value we can conclude that a linear relationship
exists between Height and head circumference

The correlation coefficient can be obtained using the excel function =CORREL(array1,
array2)

Correlation coefficient, r=0.866237

d) The critical value of correlation coefficient for degree of freedom = n -2 = 8 - 2 = 6 and
significance level = 0.05 is,

rc= 0.707

r= 0.866 > rc = 0.707

Yes, there appears to be a positive linear association because r is positive and is greater
than the critical value.

e) The data values after conversion are,

Sample   Height   Head Circumference
1        68.58    44.196
2        64.77    43.688
3        66.04    43.688
4        65.405   43.18
5        70.485   44.45
6        67.31    43.688
7        66.675   43.688
8        67.945   44.196

The correlation coefficient is obtained using the excel function =CORREL(array1, array2)

Correlation coefficient, r=0.866237.

There is no change in correlation coefficient after conversion (r unchanged)


STATISTICAL ANALYSIS WITH SOFTWARE APPLICATION
UNIT TEST

Directions: Read each item carefully. For each item, write the letter corresponding to the
best answer on yellow paper. Write NONE if no correct choice is given. Make sure to also
write your solution.

1. A bank surveyed all of its 60 employees to determine the proportion who
participate in volunteer activities. Which of the following statements is true?
a) The bank should not use the data from this survey because this is an
observational study.
b) The bank does not need to use an inference procedure to determine
the proportion of employees who participate in volunteer activities
because the survey was a census of all employees.
c) The bank can use the result of this survey to prove that working for
the bank causes employees to participate in volunteer activities.
d) The bank did not select a random sample of employees, so the survey
will not provide the bank with useful information.
2. In the design of a survey, which of the following best explains how to
minimize response bias?
a) Increase the sample size
b) Carefully word and field-test survey questions
c) Randomly select the sample
d) Increase the number of questions in the survey
3. A body of principles that deals with the collection, allocation, analysis,
interpretation, and presentation of numerical facts or data.
a) Statistics
b) Descriptive
c) Inferential
d) Statistics
4. Cluster sampling is an example of:
a) Simple random sampling
b) Probability sampling
c) Nonprobability sampling
d) Stratified sampling
5. Which of the following is an alternative hypothesis?
a) There will be no difference between the length of time taken to
complete a test online and the time taken to complete a test on paper.
b) There are no significant factors.
c) There will be no difference between the length of time taken to
complete tests online and tests completed on paper, and if there is, it
is due to chance.
d) None of the above
6. The alternative hypothesis of F-test is __________.
a) Equal variances assumed
b) Equal variances Not assumed
c) Data follows a normal distribution
d) Data does not follow a normal distribution
7. Which of the following statements regarding a researcher's use of
inferential statistics is true?
a) It is best to measure every member of a population if possible.
b) A random sample provides a perfect estimate of the population
values.
c) Descriptive statistics from a sample are used to estimate the
characteristics of the population.
d) We usually need to take several samples to obtain a good estimate of
the population values.
8. The two forms of t-test are
a) One-way and two-way
b) Independent and dependent
c) Chi-square – Independent
d) Pearson r and chi square
9. If a researcher conducts a study in which the reading ability of a class of
20 second graders is tested at the beginning and at the end of the year, the
appropriate statistical procedure to analyze the results would be
a) One-way ANOVA
b) Independent sample t-test
c) Dependent sample t-test
d) Pearson r
10. Suppose a researcher is conducting a study in which five groups of
adults, each group having a distinct life situation, are assessed on a measure
of stress. The appropriate statistical procedure to compare the group is a(n)
a) One-way ANOVA
b) Independent sample t-test
c) Dependent sample t-test
d) Pearson r
11. The _________ divides the distribution into ten equal parts.
a) Decile
b) Percentile
c) Median
d) Quartile
12. When the value of x variable increases and the value of y variable also
increases. It is known as _________.
a) No relationship
b) Direct relationship
c) Inverse relationship
d) None of the above
13. If the computed correlation coefficient of two continuous variables is 0.97,
then describe the relationship.
a) Weak negative and inverse relationship
b) Strong negative and inverse relationship
c) Strong positive and direct relationship
d) Weak positive and direct relationship
14. If the computed value for Pearson r is negative, this implies that there is
a/an _________ relationship between variables x and y.
a) No relationship
b) Direct relationship
c) Inverse relationship
d) Undefined
15. You find children who take vitamins have higher health index scores
than children who do not take vitamins (p<0.05). You have found that the two
groups of children are
a) Significantly different
b) Different because of chance
c) Positively correlated
d) Negatively correlated
16. A conclusion in a study on Science Teaching in selected Quezon City
high schools states: "Most schools lack adequate facilities." Which is a
proper recommendation for this conclusion?
a) School administrators should be pro-active and skillful in acquiring
adequate facilities.
b) School administrators should conduct science achievement test that
are centralized and uniform
c) School administrators should hire more competent science teachers
for proper handling of the facilities.
d) School administrators should work on the revision of the science
curricula so that lessons may be adapted to the facilities.
17. Which of the following is a positive correlation?
a) Gas mileage decreases as vehicle weight increases
b) As study time decreases, students achieve lower grades
c) As levels of self-esteem decline, levels of depression increase
d) People who exercise regularly are less likely to be obese.
18. What sampling technique is used when the respondents are chosen on
the basis of pre-determined criteria set by the researchers?
a) Cluster sampling
b) Systematic sampling
c) Purposive sampling
d) Convenience sampling
19. In a ________ distribution the mean<median<mode.
a) Normal
b) Unimodal
c) Negatively skewed
d) Positively skewed
20. A friend of mine studies the effects of praise on happiness. She believes
that children who receive praise are happier overall than children who do not
receive praise. She measures happiness by counting the number of times a
child smiles in a one-hour period. She knows that in the population of children
who do not receive praise, children smile an average of 4 times per hour with a
standard deviation of .5, and that these data are normally distributed. She selects a
sample of 200 children whom she knows receive praise and finds that they
smile an average of 3.5 times per hour.
An appropriate null hypothesis for this study is:
a) Children who receive praise smile more than children who do not.
b) Children who receive praise smile the same amount as children who
do not.
c) Children who receive praise are happier than children who do not.
d) Children who receive praise do not smile more than children who do
not.
21. Which one of the following variables is not categorical?
a) Score on the exam
b) Educational attainment: elementary graduate, high school graduate,
college graduate
c) Color: blue, red, white
d) Subject: algebra, calculus, trigonometry
22. Given the data set 40, 50, 70, 70, 60, 90, 80, 80, 90. What will happen if
we replace the data value by set 5, will the standard deviation ________.
a) Increase
b) Decrease
c) Stay the same
d) None of the above
23. If the grades of Karen are 87, 85, 91, 89 and X, what must be the value of
X so that the average is 89?
a) 92
b) 95
c) 93
d) 91
24. In descriptive statistics, we study
a) The description of decision-making process
b) The methods for organizing, displaying, and describing data
c) How to describe the probability distribution
d) None of the above
25. In statistics, conducting a survey means
a) Collecting information from elements
b) Making mathematical calculations
c) Drawing graphs and pictures
d) None of the above
26. Which of the following represents the middle point in a set of numbers
arranged in order of magnitude?
a) Mean
b) Median
c) Mode
d) Variance
27. Mr. Martin had seven students in his after-school statistics tutorial. The
scores they received on their last quiz were as follows: 81, 73, 84, 78, 89, 82,
81. What was the mean score?
a) 81.14
b) 78.5
c) 82
d) 79.5
28. If all the units of a population are surveyed it is called
a) Survey
b) Population
c) Census
d) Sample
29. For percentiles, the total number of partition values are
a) 10
b) 25
c) 99
d) 100
30. Which of the following represents median?
a) First quartile
b) Fiftieth percentile
c) Sixth decile
d) Third quartile
31. 5 is subtracted from each observation of a set, then the mean of the
observation is reduced by
a) 5
b) 1
c) 0
d) 15
32. The standard deviation of 10 observations is 15. If 5 is added to each
observation, the value of the new standard deviation is
a) 5
b) 1
c) 0
d) 15
33. If the minimum value in a set is 9 and range is 57, the maximum value of
the set is
a) 33
b) 66
c) 48
d) 24
34. Which of the following situations exhibit the function of inferential
statistics?
a) The highest score obtained by BSS Section 1 in their first quiz is 48.
b) All the ten scores are closely scattered around the average value.
c) Mathematical anxiety of the students will be related with their
academic performance.
d) Line graphs will be used to exhibit the fluctuating trend of monthly
consumption of electricity.
35. Which of the following situations exhibit the function of descriptive
statistics?
a) Determining the most favored characteristics of the ideal teacher
as perceived by students.
b) Relating the number of absences committed by students with their
academic performance.
c) Citing the differences in perception of the male and female students
towards the NO ID-NO ENTRY policy.
d) Comparing the course grade in statistics of every section who are
taking the subject during the first semester.
For items 36 to 39, consider this situation. There were 200 students of PUP
San Juan enrolled in General Statistics in the first semester. A periodic
examination was given and it was found out that the average score is 93.
When a random section with 50 students is chosen, it was found out that 89
is the average score of the section.
36. What do we call the number 200?
a) Statistic
b) Sample size
c) Parameter
d) Population size
37. What do we call the number 93?
a) Statistic
b) Sample size
c) Parameter
d) Population size
38. What do we call the number 50?
a) Statistic
b) Sample size
c) Parameter
d) Population size
39. What do we call the number 89?
a) Statistic
b) Sample size
c) Parameter
d) Population size
For items 40 to 42, consider this situation. A group of undergraduate
researchers aims to execute stratified random sampling among 63 Section 1
students, 52 Section 2 students, 48 Section 3 students and 37 Section 4
students. The margin of error is 5%.
40. What is the sample size?
a) 124 students
b) 134 students
c) 144 students
d) 154 students
41. How many students of Section 2 will be included in the sample?
a) 15 students
b) 25 students
c) 35 students
d) 45 students
42. How many students of Section 4 will be included in the sample?
a) 13 students
b) 17 students
c) 21 students
d) 25 students
43. Which of the following is an example of primary source data?
a) TV station
b) Encyclopedias
c) Living organisms
d) Scientific journals
44. A marketing team specializing in food products set standards in a mall to
determine the preference of the mall-goers in choosing and consuming
finger-foods. What sampling technique is appropriate in doing this?
a) Cluster sampling
b) Purposive sampling
c) Convenience sampling
d) Systematic sampling
45. A market research company asks a sample of students to rate the taste of
a new soft drink. The response scale is really yummy, yummy, ok, yuck, really
yuck. This is an example of a
a) Nominal level
b) Ordinal level
c) Interval level
d) Ratio level
46. A researcher is studying students in college in PUP. She takes a sample of
400 students from 10 colleges. The average age of selected college students
is
a) Statistic
b) Parameter
c) The median
d) A population
47. A coffee shop wants to know the temperature of coffee that most people
prefer. They brew coffee at the typical temperature for the shop and then ask
customers “Do you prefer coffee to be at this temperature?” and record a yes or no
answer for each customer. What is the level of measurement of the way they
measured preferred temperature?
a) Nominal
b) Ordinal
c) Interval
d) Ratio
48. The same coffee shop later repeats the study but this time they ask “Do
you prefer coffee to be a lot cooler, this temperature, a little warmer or a lot
hotter?” and record the person's response. Now, what is the level of
measurement of the way they measured temperature?
a) Nominal
b) Ordinal
c) Interval
d) Ratio
49. What is the criterion for rejecting the null hypothesis using p value
approach?
a) If p value is less than or equal to the level of significance retain Ho,
otherwise Reject Ho.
b) If p value is less than or equal to the level of significance reject Ho,
otherwise retain Ho.
c) If p value is greater than or equal to the level of significance retain
Ho, otherwise retain Ho.
d) If p value is greater than or equal to the level of significance retain
Ho, otherwise Reject Ho.
50. The alternative hypothesis of the Shapiro-Wilk test is __________.
a) Equal variances assumed
b) Equal variances not assumed
c) Data follows a normal distribution
d) Data does not follow a normal distribution
51. An inspector needs to learn if customers are getting fewer ounces of a
soft drink than the 28 ounces stated on the label. After she collects data from
a sample of bottles, she is going to conduct a test of a hypothesis. She should
use
a) A two tailed test
b) A one tailed test with an alternative to the right.
c) A one tailed test with an alternative to the left.
d) Either a one or a two tailed test because they are equivalent.
52. Determine the characteristics of a normal curve.
I. The normal curve is bell-shaped and symmetric about the mean.
II. The mean, median, and mode are not equal.
III. The total area under the curve is equal to one.
IV. The normal curve approaches, but never touches, the x-axis as it extends
farther and farther away from the mean.
a) I, II and III
b) I, II, III and IV
c) II, III and IV
d) I, III and IV
53. Given a normal distribution, find the area under the curve which lies to
the right of z=1.96.
a) 0.9750
b) 0.0196
c) 0.4750
d) 0.0250
54. A hypothesis test is done in which the alternative hypothesis is that more
than 10% of a population is left-handed. The computed p value is 0.25. Which
statement is correct?
a) We can conclude that more than 10% of the population is left-handed.
b) We can conclude that more than 25% of the population is left-handed.
c) We can conclude that exactly 25% of the population is left-handed.
d) We cannot conclude that more than 10% of the population is left-
handed.
55. There is a negative correlation between the number of absences students have
and their grades. What can we conclude from this research finding?
a) That being absent leads to lower grades
b) That students that are absent more often are likely to have lower
grades
c) That low grade leads to people being absent
d) That this is an illusory correlation
For items 56 to 60, consider this situation. A researcher has collected the
following sample data. 5, 12, 6, 8, 5, 6, 7, 5, 12, 4
56. Find the median.
a) 5
b) 6
c) 7
d) 8
57. Find the mode.
a) 5
b) 6
c) 7
d) 8
58. Find the mean.
a) 5
b) 6
c) 7
d) 8
59. Find the standard deviation.
a) 1.2
b) 2.2
c) 3.2
d) 4.2
60. Find the Pearson coefficient of skewness using the value of median.
a) 1.2
b) 2.2
c) 3.2
d) 4.2

PROBLEM SOLVING

A. The PUPCET scores for the math portion of the test were normally
distributed, with a mean of 23.4 and a standard deviation of 4.8. Find the
probability that a randomly selected student who took the math portion of
the PUPCET has a score that is

(a) less than 18.

μ = 23.4
σ = 4.8
z = (X − μ)/σ

For X = 18, z = (18 − 23.4)/4.8 = −1.125

P(X < 18) = P(Z < −1.125)
= 0.13029 (using z-score table)

(b) between 21 and 26.

μ = 23.4
σ = 4.8
z = (X − μ)/σ

P(21 < X < 26) = P[(21 − 23.4)/4.8 < Z < (26 − 23.4)/4.8]
= P(−0.5 < Z < 0.5417)
= 0.3974
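
Both probabilities can be verified without a z-table. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8) with the mean and standard deviation given in the problem. The variable names are illustrative only.

```python
from statistics import NormalDist

# PUPCET math scores: normal with mean 23.4 and standard deviation 4.8
scores = NormalDist(mu=23.4, sigma=4.8)

# (a) P(X < 18)
z_a = (18 - 23.4) / 4.8
p_less_than_18 = scores.cdf(18)

# (b) P(21 < X < 26) = P(X < 26) - P(X < 21)
p_between = scores.cdf(26) - scores.cdf(21)

print(f"z for X = 18: {z_a:.3f}")
print(f"P(X < 18) = {p_less_than_18:.4f}")
print(f"P(21 < X < 26) = {p_between:.4f}")
```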

B. Given the following frequency distribution.


Class Interval    Frequency    x        f(x)       LB       <cf
240 – 259         5            249.5    1,247.5    239.5    50
220 – 239         5            229.5    1,147.5    219.5    45
200 – 219         12           209.5    2,514      199.5    40
180 – 199         13           189.5    2,463.5    179.5    28
160 – 179         5            169.5    847.5      159.5    15
140 – 159         10           149.5    1,495      139.5    10
Total             n = 50                9,715

Compute the following:

(a) Mean

x̄ = Σf(x) / n = 9,715 / 50
x̄ = 194.3

(b) Median

n/2 = 50/2 = 25

x̃ = 179.5 + [(25 − 15)(20)] / 13

x̃ = 194.88
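
The grouped mean and median above can be reproduced with a small Python sketch. It assumes the class limits and frequencies from the table, a class width of 20, and lower boundaries 0.5 below each lower limit. The function name grouped_median is a hypothetical helper.

```python
# (lower limit, upper limit, frequency), listed from lowest to highest class
classes = [
    (140, 159, 10),
    (160, 179, 5),
    (180, 199, 13),
    (200, 219, 12),
    (220, 239, 5),
    (240, 259, 5),
]
WIDTH = 20  # class size i

n = sum(f for _, _, f in classes)

# Grouped mean: sum of f * midpoint, divided by n
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n

def grouped_median(classes, width):
    """Median = LB + (n/2 - cf_before) * width / f for the median class."""
    total = sum(f for _, _, f in classes)
    target = total / 2
    cum_before = 0
    for lo, _, f in classes:
        if cum_before + f >= target:
            lower_boundary = lo - 0.5
            return lower_boundary + (target - cum_before) * width / f
        cum_before += f
    raise ValueError("no classes given")

median = grouped_median(classes, WIDTH)
print(f"n = {n}, mean = {mean:.1f}, median = {median:.2f}")
```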

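(c) Mode

The modal class is 180 – 199 (f = 13). Its adjacent classes are 160 – 179 (f = 5) below
and 200 – 219 (f = 12) above.

d1 = 13 – 5 = 8

d2 = 13 – 12 = 1

x̂ = 179.5 + [8 / (8 + 1)](20)

x̂ = 197.28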

Class Interval    Frequency    x        f(x)       (xi – x̄)²    f(xi – x̄)²
240 – 259         5            249.5    1,247.5    3,047.04     15,235.20
220 – 239         5            229.5    1,147.5    1,239.04     6,195.20
200 – 219         12           209.5    2,514      231.04       2,772.48
180 – 199         13           189.5    2,463.5    23.04        299.52
160 – 179         5            169.5    847.5      615.04       3,075.20
140 – 159         10           149.5    1,495      2,007.04     20,070.40
Total             n = 50                9,715      7,162.24     47,648

x̄ = 9,715 / 50
x̄ = 194.3

(x1 – x̄)² = (249.5 – 194.3)² = 3,047.04

(x2 – x̄)² = (229.5 – 194.3)² = 1,239.04

(x3 – x̄)² = (209.5 – 194.3)² = 231.04

(x4 – x̄)² = (189.5 – 194.3)² = 23.04

(x5 – x̄)² = (169.5 – 194.3)² = 615.04

(x6 – x̄)² = (149.5 – 194.3)² = 2,007.04

f(x1 – x̄)² = 5 (3,047.04) = 15,235.20

f(x2 – x̄)² = 5 (1,239.04) = 6,195.20

f(x3 – x̄)² = 12 (231.04) = 2,772.48

f(x4 – x̄)² = 13 (23.04) = 299.52

f(x5 – x̄)² = 5 (615.04) = 3,075.20

f(x6 – x̄)² = 10 (2,007.04) = 20,070.40

(d) Standard Deviation

s = √[Σf(xi – x̄)² / (n − 1)] = √[47,648 / (50 − 1)]

s = 31.18
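
The sum of squares and the sample standard deviation can be checked with the following Python sketch, which uses the class midpoints and frequencies from the table and the n − 1 divisor used in the solution.

```python
import math

midpoints = [249.5, 229.5, 209.5, 189.5, 169.5, 149.5]
freqs = [5, 5, 12, 13, 5, 10]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Sum of f * (x - mean)^2 over all classes
sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))

# Sample standard deviation for grouped data (divisor n - 1)
s = math.sqrt(sum_sq / (n - 1))

print(f"sum of squares = {sum_sq:.2f}, s = {s:.2f}")
```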

Class Interval    Frequency    x        f(x)       LB       <cf
240 – 259         5            249.5    1,247.5    239.5    50
220 – 239         5            229.5    1,147.5    219.5    45
200 – 219         12           209.5    2,514      199.5    40
180 – 199         13           189.5    2,463.5    179.5    28
160 – 179         5            169.5    847.5      159.5    15
140 – 159         10           149.5    1,495      139.5    10
Total             n = 50                9,715

(e) Q1

nk/4 = (50)(1)/4 = 12.5

Q1 = 159.5 + [(12.5 − 10)(20)] / 5

Q1 = 169.5

(f) Q3

nk/4 = (50)(3)/4 = 37.5

Q3 = 199.5 + [(37.5 − 28)(20)] / 12

Q3 = 215.33

(g) D1

nk/10 = (50)(1)/10 = 5

D1 = 139.5 + [(5 − 0)(20)] / 10

D1 = 149.50

(h) D9

nk/10 = (50)(9)/10 = 45

D9 = 219.5 + [(45 − 40)(20)] / 5

D9 = 239.50

(i) P10

nk/100 = (50)(10)/100 = 5

P10 = 139.5 + [(5 − 0)(20)] / 10

P10 = 149.50

(j) P90

nk/100 = (50)(90)/100 = 45

P90 = 219.5 + [(45 − 40)(20)] / 5

P90 = 239.50
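
Quartiles, deciles, and percentiles all follow the same pattern, so one function covers parts (e) to (j). The sketch below assumes the same class table, a class width of 20, and lower boundaries 0.5 below each lower limit. The name grouped_quantile is a hypothetical helper.

```python
# (lower limit, frequency), listed from lowest to highest class
classes = [(140, 10), (160, 5), (180, 13), (200, 12), (220, 5), (240, 5)]
WIDTH = 20

def grouped_quantile(classes, width, k, parts):
    """k-th of `parts` partition values: LB + (n*k/parts - cf_before) * width / f."""
    n = sum(f for _, f in classes)
    target = n * k / parts
    cum_before = 0
    for lower_limit, f in classes:
        if cum_before + f >= target:
            lower_boundary = lower_limit - 0.5
            return lower_boundary + (target - cum_before) * width / f
        cum_before += f
    raise ValueError("k must not exceed parts")

q1 = grouped_quantile(classes, WIDTH, 1, 4)
q3 = grouped_quantile(classes, WIDTH, 3, 4)
d1 = grouped_quantile(classes, WIDTH, 1, 10)
d9 = grouped_quantile(classes, WIDTH, 9, 10)
p10 = grouped_quantile(classes, WIDTH, 10, 100)
p90 = grouped_quantile(classes, WIDTH, 90, 100)

print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print(f"D1 = {d1:.2f}, D9 = {d9:.2f}")
print(f"P10 = {p10:.2f}, P90 = {p90:.2f}")
```

D1 coincides with P10 and D9 with P90 because they cut the distribution at the same cumulative positions (5 and 45).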

(k) Karl Pearson’s Measure of Skewness

Sk = 3(x̄ − x̃) / s = 3(194.30 − 194.88) / 31.18

Sk = -0.05581

(l) Kurtosis

QD = (Q3 − Q1) / 2 = (215.33 − 169.50) / 2 = 22.915

k = QD / (P90 − P10) = 22.915 / (239.50 − 149.50)

k = 0.25461
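
The two shape measures can be recomputed from the rounded values reported in the earlier parts. This Python sketch takes those rounded figures as inputs, so its results match the hand computation rather than a full-precision calculation.

```python
# Rounded values reported in the earlier parts of this problem
mean, median, s = 194.30, 194.88, 31.18
q1, q3 = 169.50, 215.33
p10, p90 = 149.50, 239.50

# Pearson's coefficient of skewness using the median
skewness = 3 * (mean - median) / s

# Percentile coefficient of kurtosis: quartile deviation over the P90 - P10 range
quartile_deviation = (q3 - q1) / 2
kurtosis = quartile_deviation / (p90 - p10)

print(f"Sk = {skewness:.5f}")
print(f"QD = {quartile_deviation:.3f}, k = {kurtosis:.5f}")
```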

C. Construct a frequency distribution table.


No. of Children    Frequency    Percentage (%)
0                  5            10
1                  5            10
2                  12           24
3                  13           26
4                  5            10
5                  10           20
Total              50           100

(a) What percentage of couples married seven years has two children?

- 24% of the couples married seven years have two children (12 of 50).

(b) What percentage of couples married seven years has at least two
children?

- 80% of the couples married seven years have at least two children
(24% + 26% + 10% + 20%).
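
The percentage column and both answers can be derived from the frequencies alone. The following Python sketch assumes the frequency table above. The dictionary name counts is illustrative.

```python
# Number of children -> number of couples (from the frequency table)
counts = {0: 5, 1: 5, 2: 12, 3: 13, 4: 5, 5: 10}

total = sum(counts.values())

# Percentage for each category
percent = {k: 100 * f / total for k, f in counts.items()}

# (a) exactly two children
pct_two = percent[2]

# (b) at least two children: add the percentages for 2, 3, 4 and 5
pct_at_least_two = sum(p for k, p in percent.items() if k >= 2)

for k in sorted(percent):
    print(f"{k} children: {counts[k]} couples, {percent[k]:.0f}%")
print(f"exactly two: {pct_two:.0f}%, at least two: {pct_at_least_two:.0f}%")
```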

D. The ACT is a college entrance exam. ACT has determined that a score of
22 on the mathematics portion of the ACT suggests that a student is ready
for college-level mathematics. To achieve this goal, ACT recommends that
students take a core curriculum of math courses: Algebra I, Algebra II, and
Geometry. Suppose a random sample of 2020 students who completed this core set
of courses results in a mean ACT math score of 22.6 with a standard
deviation of 3.9. Do these results suggest that students who complete the
core curriculum are ready for college-level mathematics? That is, are they
scoring above 22 on the math portion of the ACT?

(a) State the appropriate null and alternative hypotheses.

H0: μ = 22
H1: μ > 22

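(b) If the p-value is 0.001, write your decision and conclusion.

The smaller the p-value, the greater the evidence against the null
hypothesis. A p-value of 0.001 is below the usual levels of significance (0.05 and 0.01),
so the result is highly statistically significant and we reject H0. There is sufficient
evidence that students who complete the core curriculum score above 22 on the math
portion of the ACT.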
