Chapter 1: Introduction to Statistics

1.1 An overview of Statistics

• Data consist of information coming from observations, counts, measurements, or responses.

• Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to
make decisions.

• There are two types of data sets we will use in this course: populations and samples.

• A Population is the collection of all outcomes, responses, measurements, or counts that are of
interest.

• A Sample is a subset, or part, of the population.

“A goal of statisticians is to draw reasonable conclusions about population characteristics based off of
the analysis of sample data. It is extremely important to obtain sample data that are representative of
the population from which the data are drawn.”

Examples:

1. In a recent survey from 2012, 1500 new college graduates were asked if they had taken out
student loans to finance their education. 69% said yes. Identify the population and the sample.
Describe the data set.
2. Thirty nurses working in the San Diego area were surveyed concerning their opinions of working
conditions.
3. A survey of 200 Filipino adults found that 32% drink coffee daily.

• A parameter is a number that describes a population characteristic. For example, the average
age of all people in the U.S.
• A statistic is a number that describes a sample characteristic. For example, the average age of
people from a sample of three states.
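
The difference between a parameter and a statistic can be sketched in a few lines of Python. The ages below are made-up numbers generated only for illustration (they are not real census data), and the variable names are hypothetical.

```python
import random
from statistics import mean

# Hypothetical population: ages of 10,000 people (made-up data, not real).
rng = random.Random(1)  # fixed seed only so the sketch is repeatable
population_ages = [rng.randint(0, 90) for _ in range(10_000)]

# Parameter: a number computed from EVERY member of the population.
parameter_mean_age = mean(population_ages)

# Statistic: a number computed from a sample (a subset) of the population.
sample_ages = rng.sample(population_ages, 100)
statistic_mean_age = mean(sample_ages)

print(f"parameter (population mean age): {parameter_mean_age:.2f}")
print(f"statistic (sample mean age):     {statistic_mean_age:.2f}")
```

The two printed values are usually close but not equal, which is why a statistic is treated as an estimate of the parameter.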

Examples:

1. In 2012, Major League Baseball teams spent a total of $2,940,657,192 on players’ salaries.
2. In a survey of 1000 U.S. adults, 74% said they care about the upcoming presidential
election.
3. In a recent study of math majors at a university, 10 students were minoring in physics.
4. The 2182 students who accepted admission offers to Northwestern University in 2009 had an
average SAT score of 1442.

The study of statistics has two major branches: descriptive statistics and inferential statistics.

• Descriptive Statistics involves organizing, summarizing, and displaying data.

• Inferential Statistics involves using sample data to draw conclusions about a population.

Examples:

1. Expenditures for the cable industry were $5.66 billion in 1996 (Source: USA TODAY).
2. The report by the Medicare Office of the Actuary estimated that health spending will grow by an
average of 5.8 percent a year through 2020, compared to 5.7 percent without the health
overhaul.
3. The mean travel time to work (years 2008–2012) for San Diego County workers age 16+ is 24.2
minutes (Source: census.gov).
4. Allergy therapy makes bees go away (Source: Prevention).

1.2 Data Classification


Types of Data: Qualitative and Quantitative

• Qualitative Data consists of attributes, labels, categories, or nonnumerical entries, for example
hair color, place of birth, or major.

• Quantitative Data consists of numerical measurements or counts, for example age, height, or
temperature.

Examples:

1. A Gallup poll asked Americans, “Which subject (Math, English, Art, ...), if any, has been the most
valuable in your life?”
2. The number of donuts sold each week at Dunkin Donuts.
3. The different flavors of donuts at Dunkin Donuts.
4. Heights of NBA athletes.
5. The numbers on the back of NBA athletes’ uniforms.
6. Bank account numbers of each person in this class.
7. Zip code of each person’s residence.

• A Discrete Variable is a variable that assumes values that can be counted. A discrete variable has gaps
between the values it can take on. Examples of discrete variables: number of people, number of
donuts, performance rating (1, 2, 3, 4 or 5), shoe size (5.5, 6, 6.5, 7, 7.5, 8, ...).

• A Continuous Variable is a variable that can be equal to any number between any two specific values,
that is, any number in a connected interval. Examples of continuous variables: time, temperature,
volume.

Examples:

1. The number of people who buy a coffee from Starbucks today.
2. The volume of water (in cubic feet) of each of the Great Lakes.
3. The number of drones Amazon.com wants to have delivering orders by 2020.
4. The amount of time it takes to answer this question.
5. The number of nuclear power plants in the world.

1.3 Data Collection and Experimental Design

How to Design a Statistical Study

1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is
representative of the population.
3. Collect the data.
4. Describe the data using descriptive statistics techniques.
5. Interpret the data and make decisions about the population using inferential statistics.
6. Identify any possible errors.

Types of Statistical Studies: A statistical study can be categorized as an observational study or an
experiment.

• Observational study: A researcher observes and measures characteristics of interest of part of a
population, but does not change existing conditions.

• Experiment: In performing an experiment, a treatment is applied to part of a population, called a
treatment group, and responses are observed. Another part of the population may be used as a
control group, in which no treatment is applied. (The subjects in the treatment and control
groups are called experimental units.) In many cases, subjects in the control group are given a
placebo, which is a harmless, fake treatment that is made to look like the real treatment. The
responses of the treatment group and control group can then be compared and studied to help
determine whether the treatment was effective or whether it causes side effects. In most cases,
it is a good idea to use the same number of subjects for each group.
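
One way to form a treatment group and a control group of equal size is sketched below. The notes do not say how subjects are assigned to groups; shuffling the subjects at random and splitting the list in half is an assumption made for this sketch, and the subject labels are hypothetical.

```python
import random

# Hypothetical subject labels; the notes do not give a roster.
subjects = [f"subject_{i}" for i in range(1, 21)]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
shuffled = subjects[:]   # copy, so the original list is left untouched
rng.shuffle(shuffled)

# Split the shuffled list in half so both groups have the same size.
half = len(shuffled) // 2
treatment_group = shuffled[:half]  # would receive the treatment
control_group = shuffled[half:]    # would receive a placebo or no treatment

print("treatment:", treatment_group)
print("control:  ", control_group)
```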

Examples:

1. Researchers demonstrated in people at risk for cardiovascular disease that 2000 milligrams per
day of acetyl-L-carnitine over a 24-week period lowered blood pressure and improved insulin
resistance.
2. Researchers conduct a study to determine whether a drug used to treat hypothyroidism works
better when taken in the morning or when taken at bedtime. To perform the study, 90 patients
are given one pill to take in the morning and one pill to take in the evening (one containing the
drug and the other a placebo). After 3 months, patients are instructed to switch pills.
3. Researchers conduct a study to determine the number of falls women had during pregnancy. To
perform the study, researchers contacted 3997 women who had recently given birth and asked
them how many times they fell during their pregnancies.

Methods of Data Collection

There are several ways to collect data. Often the focus of the study dictates the best way to collect data.

The four ways we collect data are:

1. by conducting an observational study
2. by conducting an experimental study
3. by conducting a simulation
4. by conducting a survey

• Simulation uses a mathematical or physical model to reproduce the conditions of a situation or
process, often by use of computers. Simulations allow you to study situations that are impractical or too
dangerous to create in real life, and they often save time and money. For instance, automobile
manufacturers use simulations with dummies to study the effects of crashes on humans.
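
As a small illustration of a computer simulation, the sketch below uses random numbers to imitate rolling two dice many times and estimates how often the sum is 7. The dice scenario and the function name `estimate_prob_sum_seven` are hypothetical choices for this sketch; they do not come from the notes.

```python
import random

def estimate_prob_sum_seven(trials, seed=0):
    """Estimate P(sum of two dice == 7) by simulating `trials` rolls."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    hits = 0
    for _ in range(trials):
        # Each call to randint(1, 6) models one fair six-sided die.
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials

estimate = estimate_prob_sum_seven(100_000)
print(f"simulated estimate: {estimate:.4f}")
print(f"exact value 6/36:   {6 / 36:.4f}")
```

The simulated estimate should land close to the exact value, and it generally gets closer as the number of trials grows.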

• A Survey is an investigation of one or more characteristics of a population. The most common types of
surveys are done by interview, Internet, phone, or mail. An official count or survey of a population is
called a Census.

Examples:

1. A study of the effect of changing flight patterns on the number of airplane accidents.
2. A study of the effect of eating carrots on lowering blood pressure.
3. A study of how sixth grade students solve a puzzle.
4. A study of U.S. residents’ approval rating of the U.S. president.

Sampling Techniques

In a Random Sample, every member of the population has an equal chance of being selected.

The five sampling techniques presented here are: Simple Random Sampling, Stratified Sampling,
Cluster Sampling, Systematic Sampling and Convenience Sampling.

Simple Random Sample: Every possible sample of the same size has the same chance of being selected.
One way to collect a simple random sample is to assign a number to each member of the population.
Random numbers can then be generated by a random number table, a software program or a calculator.
Members of the population that correspond to these numbers become members of the sample.

Example: There are 37 students in Mr. Busken’s Friday stats class. You wish to form a sample of five
students to answer some survey questions. Select the students who will belong to the simple random
sample. Use the calculator’s “randInt” function.
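
If a calculator is not at hand, the same selection can be sketched in Python. This is an alternative to the calculator's function, not a reproduction of it: `random.sample` draws without repeats, whereas generating random integers one at a time may repeat a number, in which case you would draw again.

```python
import random

# Number the students 1 through 37, as described above.
students = list(range(1, 38))

# Pass a seed, e.g. random.Random(2024), if you want a repeatable draw.
rng = random.Random()

# Five distinct students; every group of five is equally likely.
sample = rng.sample(students, k=5)
print(sorted(sample))
```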

Stratified Sample: Divide a population into groups (strata) so that subjects within the same subgroup
share the same characteristics (such as gender or age bracket) and select a random sample from each
group.

Example: To collect a stratified sample of Mr. Busken’s students, you could divide the students up into
age groups, then randomly select a couple people from each age group.
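
A minimal Python sketch of this example follows. The age-group labels and the rule that assigns each student number to a group are made up for illustration, since the notes do not list the students' ages.

```python
import random
from collections import defaultdict

def age_group(student):
    """Hypothetical rule assigning a student number to an age group."""
    if student <= 15:
        return "18-20"
    if student <= 28:
        return "21-24"
    return "25+"

# Build the strata: age group -> list of student numbers in that group.
strata = defaultdict(list)
for student in range(1, 38):
    strata[age_group(student)].append(student)

# Randomly select two students from EVERY stratum.
rng = random.Random(7)  # fixed seed only so the sketch is repeatable
stratified_sample = {
    group: rng.sample(members, k=2) for group, members in strata.items()
}
print(stratified_sample)
```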

Cluster Sample: Divide the population into groups (clusters) and select all of the members in one or
more, but not all, of the clusters.

Example: To collect a cluster sample of Mr. Busken’s students, you could divide the students up into
age groups, then select all the students in one or more age brackets.
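
The contrast with stratified sampling can be seen in a short Python sketch: here whole groups are chosen at random, and everyone in a chosen group is included. The age brackets and their membership are hypothetical.

```python
import random

# Hypothetical clusters: age bracket -> student numbers in that bracket.
clusters = {
    "18-20": list(range(1, 16)),
    "21-24": list(range(16, 29)),
    "25+": list(range(29, 38)),
}

rng = random.Random(3)  # fixed seed only so the sketch is repeatable

# Pick one cluster at random (one or more is allowed, but not all of them).
chosen_clusters = rng.sample(list(clusters), k=1)

# Include EVERY member of each chosen cluster.
cluster_sample = [s for name in chosen_clusters for s in clusters[name]]
print(chosen_clusters, cluster_sample)
```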

Systematic Sample: Choose a starting value at random. Then choose every kth member of the
population.

Example: To collect a systematic sample of Mr. Busken’s students, you could number the students 1
through 37, randomly choose a starting number, then select every 4th person.
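
This example can be sketched in Python as follows. The notes do not say which range the starting number is drawn from; choosing it among the first k students is an assumption made here.

```python
import random

students = list(range(1, 38))  # students numbered 1 through 37
k = 4                          # select every 4th person

rng = random.Random(11)  # fixed seed only so the sketch is repeatable

# Assumption: the random starting number is one of the first k students.
start = rng.randint(1, k)

# Take the starting student, then every kth student after that.
systematic_sample = students[start - 1::k]
print(start, systematic_sample)
```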

Convenience Sample: Choose only members of the population that are easy to get. This often leads to
biased studies (not recommended).
