Chau Tran
Dr. Alana Unfried
STAT 330 – Sampling Design and Analysis
9 May 2019
Google Form:
https://docs.google.com/forms/d/e/1FAIpQLSeyYDG6uP7XgT3EqtOYCvHqVKk0Pa98X3kN9Odl0J-KPhBzrw/viewform
Google Spreadsheet:
https://docs.google.com/spreadsheets/d/1r5NjOHsSPKQyjgZ1qUpvcw5Smdsbjsqawdz8XTimslA/edit#gid=581602063
My sampling project is called “Driver’s License: A Study.” For this sampling project, I
am trying to learn the average age of first-time drivers. The target population is those who
possess a driver's license in the United States. This topic is interesting because the legal driving
age varies from state to state. Additionally, in recent years, many factors have contributed to
the decline in first-time driver’s license holders, including the time and effort needed to
fulfill the licensing requirements, the availability of ridesharing
services such as Uber and Lyft, and financial problems. My sampling frame is 15,000 United
States residents. I will determine who is eligible for the sample by restricting the frame
to people who are above the legal driving age in their state. Everyone will have an
equal chance of participating, except those who do not have email accounts. Once I take my
sample, I will send the survey via email. I will define my strata by gender because I believe gender
affects the age of obtaining a driver's license. I will use proportional allocation to better represent
the population.
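To illustrate proportional allocation, here is a minimal sketch in Python. It assumes each stratum's share of the sample is n_h = n * N_h / N, rounded so the shares add up to n. The gender counts and the sample size of 300 are hypothetical values chosen for illustration; they are not taken from the project data.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    # Ideal (possibly fractional) share for each stratum: n_h = n * N_h / N
    exact = {name: sample_size * size / total for name, size in stratum_sizes.items()}
    # Take the whole-number part of each share first.
    allocation = {name: int(share) for name, share in exact.items()}
    # Hand any leftover units to the strata with the largest fractional remainders.
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical gender counts in the 15,000-person frame (assumption, not project data).
frame_counts = {"female": 7650, "male": 7200, "other": 150}
allocation = proportional_allocation(frame_counts, sample_size=300)
print(allocation)
```

With these assumed counts, each stratum contributes the same 2% of its members, which is what makes the allocation proportional.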
This project's objective is to gather and analyze data collected from a poll of a simple
random sample regarding the correlation between age at the time of obtaining a driver's license
and other underlying factors such as education level and gender. Two links that are necessary
for this project are the Google Form and the Google Spreadsheet. The first question on my
survey asks the participant to identify with a gender. The second question asks for the current
age of the participant. The third question asks for the participant’s highest level of education,
followed by occupation in the fourth question. The fifth question is a skip-logic question,
which asks “do you have a driver’s license?” If the answer is no, the survey closes, and if it is
yes, the participant is directed to the next question, which asks “at what age did you obtain your
driver's license?” Then, the participant is asked to rate the convenience level of the driver’s
license, in other words, the ability to drive, on a scale of 1 to 5, with 1 being not at all
convenient and 5 being extremely convenient. The last question asks for the recommended age
A simple random sample takes a random portion of the entire population to represent
the entire data set, in this case, 15,000 people, where each person has an equal probability of
being chosen. A lottery or random draw is an appropriate method for this type
of sample. There is no need to divide the population into sub-populations. A simple random
sample is preferred due to its lower risk of bias. A computer will be useful for selecting the random sample in
this study.
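As a sketch of how a computer could carry out the random draw, the Python snippet below numbers the 15,000 people in the frame and draws IDs without replacement. The sample size of 300 and the seed are assumptions made for illustration; the project does not state either.

```python
import random

FRAME_SIZE = 15_000  # size of the sampling frame stated in the text
SAMPLE_SIZE = 300    # hypothetical; the project does not state a sample size

rng = random.Random(330)              # arbitrary fixed seed so the draw can be reproduced
frame_ids = range(1, FRAME_SIZE + 1)  # one ID number per person in the frame

# random.sample draws without replacement, so every person has the same
# chance of selection and nobody can be picked twice.
srs_ids = sorted(rng.sample(frame_ids, SAMPLE_SIZE))
print(srs_ids[:10])
```

Fixing the seed is a design choice that lets someone else reproduce the same draw; dropping it gives a fresh sample on each run.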
separating samples based on education levels. I believe there is a difference between the rate of
driver's license holders among high school students and that of undergraduate students. This
approach would reduce the effect of extreme values on the mean and median in the
data. For example, giving high school students and undergraduate students a fair chance of
being represented in the study would be helpful given that driver's license holders in high school are fewer
than driver's license holders in college. The pitfall of this type of sample for my project would
be that it does not take into account cases of people obtaining driver's licenses regardless of their
Systematic sampling might work for this study due to its simplicity. Compared to simple
random sampling, where we have to generate random numbers to pick out our participants,
systematic sampling allows us to take advantage of a fixed, periodic interval which is determined
manipulation. Therefore, I do not recommend incorporating systematic sampling into this study.
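For comparison, a systematic draw can be sketched in a few lines of Python. This version assumes the common textbook form: an interval k = N // n and a random starting point within the first interval. The sample size of 300 is again a hypothetical value, and the function name is illustrative.

```python
import random


def systematic_sample(frame_size, sample_size, rng):
    """Pick every k-th ID after a random start, where k = frame_size // sample_size."""
    k = frame_size // sample_size  # fixed, periodic sampling interval
    start = rng.randint(1, k)      # random start within the first interval (inclusive)
    return [start + i * k for i in range(sample_size)]


rng = random.Random(330)  # arbitrary fixed seed so the draw can be reproduced
systematic_ids = systematic_sample(15_000, 300, rng)
print(systematic_ids[:5])
```

Only the starting point is random here; every later ID follows from it, which is why an ordering pattern in the list can carry over into the sample.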
Cluster sampling is not an ideal method because there is no need to define subgroups
within the clusters. The study does not call for specific data. This type of random sampling
would be useful if the study were to choose certain regions in the United States to demonstrate the
findings; however, there is no need to do so. Compared to simple random sampling and
systematic sampling, cluster sampling is more cost-effective; however, the clusters may not be
Based on the comparison of the four types of random samples, I have concluded that
a simple random sample would be the best option. This type of random sample will create a
balanced subset of the larger group that carries the greatest potential for representing
that group as a whole without bias. I am going to conduct the simple random sample
as an online survey. I am going to obtain a list of emails from a database and
give each one a number. Then, I will use the lottery system to pick out my random sample.
Everyone will have an equal chance of participating, except those who do not have email
accounts. I expect there to be occurrences of nonresponse and cases of response bias where
participants submit false information regarding their level of education and age at obtaining a
driver’s license. Overall, the result should reflect the true population in the United States and the