
Name: Jeremy Cordero Martinez

Section: H60
Date: 10/February/2023
Assigned Exercises - Sampling and Data
1.1 Definitions of Statistics, Probability, and Key Terms
Use the following information to answer the next five exercises. Studies are often
done by pharmaceutical companies to determine the effectiveness of a treatment
program. Suppose that a new AIDS antibody drug is currently under study. It is
given to patients once the AIDS symptoms have revealed themselves. Of interest is
the average (mean) length of time in months that patients live once they start the
treatment. Two researchers each follow a different set of 40 patients with AIDS
from the start of treatment until their deaths. The following data (in months) are
collected.

Researcher A:
3; 4; 11; 15; 16; 17; 22; 44; 37; 16; 14; 24; 25; 15; 26; 27; 33; 29; 35; 44; 13; 21;
22; 10; 12; 8; 40; 32; 26; 27; 31; 34; 29; 17; 8; 24; 18; 47; 33; 34

Researcher B:
3; 14; 11; 5; 16; 17; 28; 41; 31; 18; 14; 14; 26; 25; 21; 22; 31; 2; 35; 44; 23; 21; 21;
16; 12; 18; 41; 22; 16; 25; 33; 34; 29; 13; 18; 24; 23; 42; 33; 29

Determine what the key terms refer to in the example for Researcher A.

1. Population: All AIDS patients who receive the treatment once their symptoms
have revealed themselves.
2. Sample: The 40 AIDS patients whom Researcher A followed from the start of
treatment until their deaths.
3. Parameter: The mean length of time (in months) that all AIDS patients in the
population live once they start the treatment.
4. Statistic: The mean length of time (in months) that the 40 patients in the
sample live once they start the treatment.
5. Variable: The variable X is the length of time (in months) that one patient lives
once he or she starts the treatment.
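As a rough illustration of the difference between the data and the statistic, the
sample mean for Researcher A can be computed directly from the 40 values listed
above. This is only a minimal sketch; the variable names are illustrative and are
not part of the exercise.

```python
from statistics import mean

# Survival times in months recorded by Researcher A (copied from the exercise).
researcher_a = [
    3, 4, 11, 15, 16, 17, 22, 44, 37, 16, 14, 24, 25, 15, 26, 27, 33, 29, 35, 44,
    13, 21, 22, 10, 12, 8, 40, 32, 26, 27, 31, 34, 29, 17, 8, 24, 18, 47, 33, 34,
]

# The sample size should match the 40 patients that each researcher followed.
sample_size = len(researcher_a)

# The statistic: the mean survival time of the patients in this sample.
sample_mean = mean(researcher_a)

print(f"n = {sample_size}, sample mean = {sample_mean} months")
```

The result describes only these 40 patients, so it is a statistic. The parameter,
the mean for all AIDS patients who receive the treatment, remains unknown.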
For each of the following eight exercises, identify: a. the population, b. the sample,
c. the parameter, d. the statistic, e. the variable, and f. the data. Give examples
where appropriate.
43. Ski resorts are interested in the mean age that children take their first ski and
snowboard lessons. They need this information to plan their ski classes optimally.
Population: All children who take ski or snowboard lessons at ski resorts.
Sample: A group of these children selected from the ski resorts.
Parameter: The mean age of all children in the population when they take their
first ski and snowboard lessons.
Statistic: The mean age of the children in the sample when they take their first
ski and snowboard lessons.
Variable: The variable X is the age of one child when he or she takes a first ski
and snowboard lesson.
Data: The data are the ages of the children taking a lesson for the first time.
Examples of the data are:
1. age: 10
2. age: 12
3. age: 8
4. age: 9
5. age: 11

44. A cardiologist is interested in the mean recovery period of her patients who
have had heart attacks.
Population: All of the cardiologist’s patients who have had heart attacks.
Sample: A group of the cardiologist’s patients who have had heart attacks.
Parameter: The mean recovery period of all of her patients who have had heart
attacks.
Statistic: The mean recovery period of the patients in the sample.
Variable: The variable X is the recovery period of one patient who has had a
heart attack.
Data: The data are the recovery periods of the individual patients in the sample.
Examples of the data are:
1. 14 days (2 weeks)
2. 30 days (1 month)
3. 60 days (2 months)
4. 90 days (3 months)
5. 37 days (1 month and 1 week)

45. Insurance companies are interested in the mean health costs each year of their
clients, so that they can determine the costs of health insurance.

Population: All clients of the insurance companies who have health insurance.
Sample: A randomly selected group of those clients.
Parameter: The mean yearly health costs of all the clients in the population.
Statistic: The mean yearly health costs of the clients in the sample.
Variable: The variable X is the health costs of one client in one year.
Data: The data are the yearly health costs of each client in the sample.
Examples of the data are:
1. $9,000
2. $20,000
3. $1,000
4. $800
5. $5,000

46. A politician is interested in the proportion of voters in his district who think he
is doing a good job.
Population: All the voters in the politician’s district.
Sample: A group of voters selected from the politician’s district.
Parameter: The proportion of all voters in the district who think he is doing a
good job.
Statistic: The proportion of voters in the sample who think he is doing a good job.
Variable: The variable X is whether a voter thinks the politician is doing a good
job.
Data: The data are yes or no responses that show whether each voter approves of
the politician’s work.

47. A marriage counselor is interested in the proportion of clients she counsels who
stay married.
Population: All of the marriage counselor’s clients.
Sample: A group of clients selected from those the marriage counselor counsels.
Parameter: The proportion of all of her clients who stay married.
Statistic: The proportion of the clients in the sample who stay married.
Variable: The variable X is whether a client stays married.
Data: The data are yes or no values that record whether each client in the sample
stayed married.
48. Political pollsters may be interested in the proportion of people who will vote
for a particular cause.
Population: All the people who may vote on the particular cause.
Sample: A randomly selected group of those voters.
Parameter: The proportion of voters in the population who will vote for the
particular cause.
Statistic: The proportion of voters in the sample who will vote for the
particular cause.
Variable: The variable X is whether a person will vote for the particular cause.
Data: The data are the yes or no answers given by the people in the sample.
Examples of the data are:
1. Yes, I will vote for the cause.
2. No, I will not vote for the cause.

49. A marketing company is interested in the proportion of people who will buy a
particular product.
Population: All the people who could buy the particular product.
Sample: A group of those people selected by the marketing company.
Parameter: The proportion of people in the population who will buy the product.
Statistic: The proportion of people in the sample who will buy the product.
Variable: The variable X is whether a person will buy the particular product.
Data: The data are yes or no answers: yes, I would buy the particular product, or
no, I would not buy the particular product.
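Exercises 46 through 49 each use a sample proportion as the statistic. A minimal
sketch of how such a proportion could be computed from yes/no data follows. The ten
responses and the function name `sample_proportion` are made up for illustration
and do not come from the exercises.

```python
# Hypothetical yes/no responses from a sample of ten people (made-up values).
responses = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "yes"]


def sample_proportion(answers, category="yes"):
    """Return the fraction of answers that equal `category`."""
    matching = sum(1 for answer in answers if answer == category)
    return matching / len(answers)


# The statistic: the proportion of "yes" answers in this sample.
p_hat = sample_proportion(responses)
print(f"sample proportion of 'yes' = {p_hat}")
```

Each individual answer is a value of the variable X, while the single number
`p_hat` summarizes the whole sample.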

Use the following information to answer the next three exercises: A Lake Tahoe
Community College instructor is interested in the mean number of days Lake
Tahoe Community College math students are absent from class during a quarter.
50. What is the population she is interested in?
a. all Lake Tahoe Community College students.
b. all Lake Tahoe Community College English students.
c. all Lake Tahoe Community College students in her classes.
d. all Lake Tahoe Community College math students. * (the answer chosen by me)
51. Consider the following:
X = number of days a Lake Tahoe Community College math student is absent
In this case, X is an example of a:
a. variable. * (the answer chosen by me)
b. population.
c. statistic.
d. data.
1.2 Data, Sampling, and Variation in Data and Sampling
6. “Number of times per week” is what type of data?
a. qualitative (categorical);
b. quantitative discrete; * (the answer chosen by me)
c. quantitative continuous

Use the following information to answer the next four exercises: A study was done
to determine the age, number of times per week, and the duration (amount of time)
of residents using a local park in San Antonio, Texas. The first house in the
neighborhood around the park was selected randomly, and then the resident of
every eighth house in the neighborhood around the park was interviewed.

7. The sampling method was


a. simple random;
b. systematic; * (the answer chosen by me)
c. stratified;
d. cluster
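
The selection rule described in the study (a random first house, then every eighth
house) can be sketched as follows. The neighborhood of 80 numbered houses, the seed,
and the choice to draw the starting house from the first eight are assumptions made
only for this illustration.

```python
import random


def systematic_sample(items, step, rng):
    """Pick a random start among the first `step` items, then take every `step`-th item."""
    start = rng.randrange(step)
    return items[start::step]


# Hypothetical neighborhood of 80 houses, numbered 1 to 80 (an assumption).
houses = list(range(1, 81))

# A fixed seed is used only so that the run is repeatable.
rng = random.Random(2023)

chosen = systematic_sample(houses, step=8, rng=rng)
print(chosen)
```

Only the starting point is random. Once it is fixed, every other selected house is
determined by the step size, which is what makes the method systematic.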

8. “Duration (amount of time)” is what type of data?


a. qualitative (categorical);
b. quantitative discrete;
c. quantitative continuous * (the answer chosen by me)
9. The colors of the houses around the park are what kind of data?
a. qualitative (categorical); * (the answer chosen by me)
b. quantitative discrete;
c. quantitative continuous

10. The population is _the residents of the neighborhood around the local park in
San Antonio, Texas_
For the following exercises, identify the type of data that would be used to describe
a response (quantitative discrete, quantitative continuous, or qualitative), and give
an example of the data.

53. number of tickets sold to a concert


• quantitative discrete. Examples: 1,000 tickets sold, 256 tickets sold, 777 tickets sold
54. percent of body fat
• quantitative continuous. Examples: 20% body fat, 10% body fat, 40% body fat
55. favorite baseball team
• qualitative. Examples: New York Mets, Chicago Cubs, St. Louis Cardinals
56. time in line to buy groceries
• quantitative continuous. Examples: 55 minutes and 42 seconds, 1 hour 30
minutes and 25 seconds, 15 minutes and 22 seconds
57. number of students enrolled at Evergreen Valley College
• quantitative discrete. Examples: 10,000 enrolled students, 222,000 enrolled
students, 6,780 enrolled students
58. most-watched television show
• qualitative. Examples: NFL Sunday Night Football (NBC), This Is Us (NBC),
Grey’s Anatomy (ABC)
59. brand of toothpaste
• qualitative. Examples: Colgate, Aquafresh, Sensodyne
60. distance to the closest movie theater
• quantitative continuous. Examples: Caribbean Cinemas of Dorado: 20.8 km,
Caribbean Cinemas of Barceloneta: 18.7 km, Caribbean Cinemas of San
Juan: 38.2 km
61. age of executives in Fortune 500 companies
• quantitative continuous, because age is measured rather than counted
(whole-number ages are rounded values). Examples: 45 years, 60 years, 50 years
62. number of competing computer spreadsheet software packages
• quantitative discrete, because the packages are counted. Examples: 3 packages,
8 packages, 12 packages (Microsoft Excel and Google Sheets would each count as
one package)
Use the following information to answer the next two exercises: A study was done
to determine the age, number of times per week, and the duration (amount of time)
of resident use of a local park in San Jose. The first house in the neighborhood
around the park was selected randomly and then every 8th house in the
neighborhood around the park was interviewed.

63. “Number of times per week” is what type of data?


a. qualitative
b. quantitative discrete * (the answer chosen by me)
c. quantitative continuous
64. “Duration (amount of time)” is what type of data?
a. qualitative
b. quantitative discrete
c. quantitative continuous * (the answer chosen by me)
65. Airline companies are interested in the consistency of the number of babies on
each flight, so that they have adequate safety equipment. Suppose an airline
conducts a survey. Over Thanksgiving weekend, it surveys six flights from Boston
to Salt Lake City to determine the number of babies on the flights. It determines
the amount of safety equipment needed by the result of that study.
Using complete sentences, list three things wrong with the way the survey was
conducted.
• First, the survey was conducted over a holiday weekend (Thanksgiving), when
the number of people taking flights is unusually high. Ordinary non-holiday
flights are likely to carry a very different number of babies, so the results
cannot be used to forecast the number of babies on a typical flight.
• Second, only one route (Boston to Salt Lake City) was surveyed. Because all
six flights follow the same route, their results are likely to be similar to
each other and say little about flights to other destinations.
• Third, only six flights were surveyed. Six flights are too few to capture the
variation in the number of babies per flight, which is what matters when
deciding how much safety equipment to carry.
66. Using complete sentences, list three ways that you would improve the survey if
it were to be repeated.
• First, conduct the survey on multiple days, including both holiday and
non-holiday flights, so that the two can be compared.
• Second, conduct the survey on flights to different destinations, so that the
data cover more than one route.
• Third, survey a larger number of flights, so that there are enough data to
support a reliable result.
66. Suppose you want to determine the mean number of students per statistics
class in your state.
Describe a possible sampling method in three to five complete sentences. Make the
description detailed.
• One possible method is stratified sampling. Divide the schools in the state
into strata, for example by county or by type of institution. From each
stratum, randomly select a number of statistics classes and record the number
of students enrolled in each one. The mean of these class sizes is the sample
statistic, which is then used to estimate the mean number of students per
statistics class in the state.
67. Suppose you want to determine the mean number of cans of soda drunk each
month by students in their twenties at your school. Describe a possible sampling
method in three to five complete sentences. Make the description detailed.
• One possible method is systematic sampling. Restrict the population to
students in their twenties, pick a random starting point, and then interview
every fifth such student who enters the school’s cafeteria. Ask each selected
student how many cans of soda he or she drinks in a month and record the
answer. The mean of the recorded values estimates the mean number of cans of
soda drunk each month.
68. List some practical difficulties involved in getting accurate results from a
telephone survey.
• The following practical difficulties are involved in getting accurate results
from a telephone survey:
1. There is no way to verify who is on the other end of the line, or even whether
it is a real person (for example, AI-generated voices, voice mail, or prank
answers).
2. Many people hang up when they do not recognize a phone number. They may
think the caller is a salesperson or a scammer trying to steal their information,
or they may simply not care enough to answer.
3. The interviewer could call specific people who are known to give the response
the interviewer wants, which biases the survey from the very beginning.
4. Part of the population may not have access to a telephone in the area where
they live, so the people reached do not represent the whole population.
5. Some people in the population may not have active phone service to receive
calls, so they can never be included in the sample.
69. List some practical difficulties involved in getting accurate results from a
mailed survey.
• The following practical difficulties are involved in getting accurate results
from a mailed survey:
1. Part of the population may not have a P.O. box at which to receive the survey,
so those people are left out of the sample.
2. People who receive the survey may not check their mailbox or may not care
enough to respond, so fewer responses come back than the survey needs.
3. The sender could mail the survey only to specific people who are known to give
the expected responses, which biases the survey from the very beginning.
4. The area where the survey is conducted may not have a functioning postal
service, which could jeopardize the entire survey because the mailed survey might
never reach the public.
5. The people who receive the survey could answer dishonestly or not take the
questions seriously, which makes the results less accurate.
70. With your classmates, brainstorm some ways you could overcome these
problems if you needed to conduct a phone or mail survey.
• Mail survey:
1. If a large part of the population has no P.O. box or no functioning postal
service, hire a delivery company such as UPS to deliver the survey to people at
their homes.
2. To prevent the sender from choosing specific recipients, have the recipients
selected at random, so that the selection cannot bias the results.
3. If people may not check their mailbox or may not care enough to respond, offer
an incentive, such as company benefits or money, to motivate them to answer the
survey.
• Phone survey:
1. To confirm that the person on the line is real, ask a phone company such as
Verizon for data on the active users in the area.
2. To keep people from hanging up, send a text message beforehand that explains
who you are, whom you work for, and what the survey is about, and offer an
incentive for honest answers at the end of the survey.
3. If many people in the population have no phone service, arrange with the phone
service provider to reinstate a customer’s service in exchange for completing the
survey. The customer gets phone service back, the provider gets paid by the
company doing the survey, and the company gets its data.
71. The instructor takes her sample by gathering data on five randomly selected
students from each Lake Tahoe Community College math class. The type of
sampling she used is
a. cluster sampling
b. stratified sampling * (the answer chosen by me)
c. simple random sampling
d. convenience sampling
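The instructor’s method, a few randomly chosen students from every class, can be
sketched as follows. The class names, roster sizes, student labels, and seed are
hypothetical and serve only to make the example runnable.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Draw `per_stratum` members at random, without replacement, from every stratum."""
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


# Hypothetical class rosters (class names and sizes are assumptions).
classes = {
    "Algebra": [f"A{i}" for i in range(1, 31)],
    "Statistics": [f"S{i}" for i in range(1, 26)],
    "Calculus": [f"C{i}" for i in range(1, 21)],
}

# A fixed seed is used only so that the run is repeatable.
rng = random.Random(71)

picked = stratified_sample(classes, per_stratum=5, rng=rng)
for name, students in picked.items():
    print(name, students)
```

Every class (stratum) contributes to the sample, which is what distinguishes this
design from cluster sampling, where only some groups are chosen.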
72. A study was done to determine the age, number of times per week, and the
duration (amount of time) of residents using a local park in San Jose. The first
house in the neighborhood around the park was selected randomly and then every
eighth house in the neighborhood around the park was interviewed. The sampling
method was:
a. simple random
b. systematic * (the answer chosen by me)
c. stratified
d. cluster
73. Name the sampling method used in each of the following situations:
a. A woman in the airport is handing out questionnaires to travelers asking them to
evaluate the airport’s service. She does not ask travelers who are hurrying through
the airport with their hands full of luggage, but instead asks all travelers who are
sitting near gates and not taking naps while they wait.
• The sampling method used in this situation is convenience.
b. A teacher wants to know if her students are doing homework, so she randomly
selects rows two and five and then calls on all students in row two and all students
in row five to present the solutions to homework problems to the class.
• The sampling method used in this situation is cluster.
c. The marketing manager for an electronics chain store wants information about
the ages of its customers. Over the next two weeks, at each store location, 100
randomly selected customers are given questionnaires to fill out asking for
information about age, as well as about other variables of interest.
• The sampling method used in this situation is stratified.
d. The librarian at a public library wants to determine what proportion of the library
users are children. The librarian has a tally sheet on which she marks whether
books are checked out by an adult or a child. She records this data for every fourth
patron who checks out books.
• The sampling method used in this situation is systematic.
e. A political party wants to know the reaction of voters to a debate between the
candidates. The day after the debate, the party’s polling staff calls 1,200 randomly
selected phone numbers. If a registered voter answers the phone or is available to
come to the phone, that registered voter is asked whom he or she intends to vote
for and whether the debate changed his or her opinion of the candidates.
• The sampling method used in this situation is simple random.
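Situation b above, where the teacher picks whole rows at random and calls on
everyone in them, is cluster sampling. A minimal sketch follows; the classroom
layout of six rows of four seats, the labels, and the seed are assumptions for
illustration only.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Pick `n_clusters` whole clusters at random and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Hypothetical classroom with six rows of four students (made-up labels).
rows = {row: [f"row{row}-seat{seat}" for seat in range(1, 5)] for row in range(1, 7)}

# A fixed seed is used only so that the run is repeatable.
rng = random.Random(73)

called = cluster_sample(rows, n_clusters=2, rng=rng)
print(called)
```

The randomness applies to the rows, not to the individual students. Once a row is
chosen, all of its students are included.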
74. A “random survey” was conducted of 3,274 people of the “microprocessor
generation” (people born since 1971, the year the microprocessor was invented). It
was reported that 48% of those individuals surveyed stated that if they had $2,000
to spend, they would use it for computer equipment. Also, 66% of those surveyed
considered themselves relatively savvy computer users.
a. Do you consider the sample size large enough for a study of this type? Why or
why not?
• Yes. A sample of 3,274 people is large enough for a study of this type,
provided that the people were selected at random. Whether the group reflects
the U.S. population depends on how the sample was chosen.
b. Based on your “gut feeling,” do you believe the percents accurately reflect the
U.S. population for those individuals born since 1971? If not, do you think the
percents of the population are actually higher or lower than the sample statistics?
Why?
• No. The percents in the sample are likely higher than those of the U.S.
population born since 1971: 66% of those surveyed consider themselves savvy
computer users, and the respondents were visitors to the Los Angeles
Convention Center rather than a cross-section of that generation.
Additional information: The survey, reported by Intel Corporation, was filled out
by individuals who visited the Los Angeles Convention Center to see the
Smithsonian Institute's road show called “America’s Smithsonian.”
c. With this additional information, do you feel that all demographic and ethnic
groups were equally represented at the event? Why or why not?
• No, this is a convenience sample taken from individuals who visited an
exhibition at the Los Angeles Convention Center. Such a narrow group is not
representative of the U.S. population, so demographic and ethnic groups were
probably not equally represented.
d. With the additional information, comment on how accurately you think the
sample statistics reflect the population parameters.
• The sample statistics probably do not accurately reflect the population
parameters, because the respondents were a convenience sample of visitors to a
single exhibition rather than a random sample of the population born since 1971.
75. The Well-Being Index is a survey that follows trends of U.S. residents on a
regular basis. There are six areas of health and wellness covered in the survey: Life
Evaluation, Emotional Health, Physical Health, Healthy Behavior, Work
Environment, and Basic Access. Some of the questions used to measure the Index
are listed below.
Identify the type of data obtained from each question used in this survey:
qualitative, quantitative discrete, or quantitative continuous.
a. Do you have any health problems that prevent you from doing any of the
things people your age can normally do?
• The type of data this question represents is qualitative.
b. During the past 30 days, for about how many days did poor health keep you
from doing your usual activities?
• The type of data this question represents is quantitative discrete.
c. In the last seven days, on how many days did you exercise for 30 minutes or
more?
• The type of data this question represents is quantitative discrete.
d. Do you have health insurance coverage?
• The type of data this question represents is qualitative.
79. A scholarly article about response rates begins with the following quote:
“Declining contact and cooperation rates in random digit dial (RDD)
national telephone surveys raise serious concerns about the validity of
estimates drawn from such research.” (Scott Keeter et al., “Gauging the
Impact of Growing Nonresponse on Estimates from a National RDD
Telephone Survey,” Public Opinion Quarterly 70 no. 5 (2006),
http://poq.oxfordjournals.org/content/70/5/759.full (accessed May 1, 2013).)

The Pew Research Center for People and the Press admits:

“The percentage of people we interview – out of all we try to interview – has
been declining over the past decade or more.” (Frequently Asked Questions,
Pew Research Center for the People & the Press,
http://www.people-press.org/methodology/frequently-asked-questions/#dont-you-have-trouble-getting-people-to-answer-your-polls
(accessed May 1, 2013).)

a. What are some reasons for the decline in response rate over the past
decade?
• Reasons for the decline in response rate over the past decade:
1. Calls may interfere with people’s busy schedules. When people are overrun
by many tasks, they have little time for other things, especially calls
related to a survey or questionnaire.
2. Calls may be placed at the wrong time, for example during work or office
hours, when people are not at home to receive them.
3. The public may have lost interest in surveys. People who regard the calls
as a waste of time avoid them.
b. Explain why researchers are concerned with the impact of the declining
response rate on public opinion polls.
• Researchers are concerned about the declining response rate to telephone
surveys for three reasons. First, when few people participate, the sample may
fail to cover the true population. Second, when few people respond,
interviewers must call many more numbers, which takes more time and effort to
reach even the minimum number of responses. Finally, results based on such a
sample have low reliability and validity.

80. Fifty part-time students were asked how many courses they were taking
this term. The (incomplete) results are shown below:

# of Courses   Frequency            Relative Frequency   Cumulative Relative Frequency
1              30                   0.6                  0.6
2              15                   15/50 = 0.3          0.6 + 0.3 = 0.9
3              50 - (30 + 15) = 5   5/50 = 0.1           0.9 + 0.1 = 1.0
a. Fill in the blanks in Table 1.33.
b. What percent of students take exactly two courses?
• The percent of students who take exactly two courses is 30%, because
0.3 × 100 = 30%.
c. What percent of students take one or two courses?
• The percent of students who take one or two courses is 90%, because
0.6 × 100 = 60% and 0.3 × 100 = 30%, so 60% + 30% = 90%.
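
The arithmetic used to fill in the table can be checked with a short script. This
is only a sketch of the standard definitions (relative frequency = frequency / total,
cumulative relative frequency = running sum of the relative frequencies); the
variable names are illustrative.

```python
from itertools import accumulate

# Frequencies from the completed table: number of courses -> number of students.
frequencies = {1: 30, 2: 15, 3: 5}
total = sum(frequencies.values())

# Relative frequency: the share of all students in each row.
relative = {courses: count / total for courses, count in frequencies.items()}

# Cumulative relative frequency: running sum of the relative frequencies.
cumulative = dict(zip(frequencies, accumulate(relative.values())))

for courses in frequencies:
    print(f"{courses}  {frequencies[courses]:>3}  {relative[courses]:.2f}  {cumulative[courses]:.2f}")
```

The last cumulative value should always come out to 1, which is a quick way to
check that no row has been left out.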

81. Sixty adults with gum disease were asked the number of times per week
they used to floss before their diagnosis. The (incomplete) results are shown
in Table 1.34.

# Flossing per Week   Frequency                     Relative Frequency   Cumulative Relative Freq.
0                     27                            0.4500               0.4500
1                     18                            18/60 = 0.3000       0.4500 + 0.3000 = 0.7500
3                     60 - (27 + 18 + 3 + 1) = 11   11/60 = 0.1833       0.7500 + 0.1833 = 0.9333
6                     3                             0.0500               0.9333 + 0.0500 = 0.9833
7                     1                             0.0167               0.9833 + 0.0167 = 1.0000

a. Fill in the blanks in Table 1.34.


b. What percent of adults flossed six times per week?
• The percent of adults who flossed six times per week is 5%, because
0.0500 × 100 = 5%.
c. What percent flossed at most three times per week?
• The percent who flossed at most three times per week is 93.33%, because
“at most three” includes 0, 1, and 3 times per week, so the cumulative relative
frequency applies: 0.9333 × 100 = 93.33%.
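
“At most three” counts everyone who flossed 0, 1, or 3 times per week, so it is
read from the cumulative column rather than from the single row for 3. A minimal
sketch of that calculation follows; the function name `percent_at_most` is
hypothetical.

```python
# Frequencies from Table 1.34: times flossed per week -> number of adults.
floss = {0: 27, 1: 18, 3: 11, 6: 3, 7: 1}


def percent_at_most(freq, limit):
    """Percent of observations whose value is less than or equal to `limit`."""
    total = sum(freq.values())
    count = sum(n for value, n in freq.items() if value <= limit)
    return 100 * count / total


print(round(percent_at_most(floss, 3), 2))
```

Here 27 + 18 + 11 = 56 of the 60 adults flossed at most three times per week.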
