
Statistical Analysis

Forensic Science Laboratory


Syracuse University Chemistry 113

Introduction

Statistics is the science of making effective use of numerical data relating to groups of
individuals or experiments. It deals with all aspects of this, including not only the collection,
analysis, and interpretation of such data, but also the planning of the most effective way to
collect the data in the first place.

A statistician is someone who specializes in the application of statistical analysis, whether
by hand or with software. Such people have often gained this experience after starting work in
one of the many fields in which statistics is applied.

The bridge between Route 18 and Route 94 has collapsed. The investigators
believe that the bridge was built with unsatisfactory materials, and wish to
examine the metal used during the building of the bridge. If the county can
prove the materials were substandard, it can force the construction company
to make all repairs, and possibly recover additional fines for the hazard that substandard
materials have posed to any and all drivers…

Objectives

In this lab you will examine a variety of galvanized nails to determine how much of the
zinc in the nails dissolves in water, an occurrence that would be expected during normal weather
conditions and with possible pooling of water in certain locations of the bridge’s construction.
You will then statistically analyze the data collected.

Background
Experimental and observational studies

A common goal for a statistical research project is to investigate causality, and in
particular to draw a conclusion on the effect of changes in the values of predictors or
independent variables on dependent variables or responses. There are two major types of causal
statistical studies: experimental studies and observational studies. In both types of studies, the
effect of differences of an independent variable (or variables) on the behavior of the dependent

Statistical Analysis 1 Copyright James T. Spencer 2010


variable is observed. The difference between the two types lies in how the study is actually
conducted. Each can be very effective.

An experimental study involves taking measurements of the system under study,
manipulating the system, and then taking additional measurements using the same procedure to
determine if the manipulation has modified the values of the measurements. In contrast, an
observational study does not involve experimental manipulation. Instead, data are gathered and
correlations between predictors and response are investigated. An example of an experimental
study is the famous Hawthorne study, which attempted to test changes to the working
environment at the Hawthorne plant of the Western Electric Company. The researchers were
interested in determining whether increased illumination would increase the productivity of the
assembly line workers. The researchers first measured the productivity in the plant, then
modified the illumination in an area of the plant and checked if the changes in illumination
affected productivity. It turned out that productivity indeed improved (under the experimental
conditions). However, the study is heavily criticized today for errors in experimental procedures,
specifically for the lack of a control group and of blinding. The Hawthorne effect refers to the
finding that an outcome (in this case, worker productivity) changed due to observation itself:
those in the Hawthorne study became more productive not because the lighting was changed but
because they were being observed.

An example of an observational study is one that explores the correlation between
smoking and lung cancer. This type of study typically uses a survey to collect observations about
the area of interest and then performs statistical analysis on the survey results. Surveys will
often use a version of a Likert scale to quantify the opinions and beliefs of the subjects. In this
case, the researchers would collect observations of both smokers and non-smokers, perhaps
through a case-control study, and then look for the number of cases of lung cancer in each group.

The basic steps of an experiment are:

1. Planning the research, including determining information sources, research subject
selection, and ethical considerations for the proposed research and method.
2. Design of experiments, concentrating on the system model and the interaction of
independent and dependent variables.
3. Summarizing a collection of observations to feature their commonality by suppressing
details. (Descriptive statistics)
4. Reaching consensus about what the observations tell about the world being observed.
(Statistical inference)
5. Documenting / presenting the results of the study.

Levels of measurement

There are four types of measurements, or levels of measurements, used in statistics:

• nominal,
• ordinal,
• interval, and
• ratio.

They have different degrees of usefulness in statistical research. Ratio measurements have
both a meaningful zero value and meaningful distances between different measurements; this
provides the greatest flexibility in the statistical methods that can be used for analyzing the
data. Interval measurements have meaningful distances between measurements, but no
meaningful zero value (as is the case with IQ measurements or with temperature
measurements in Fahrenheit). Ordinal measurements have imprecise differences between
consecutive values, but have a meaningful order to those values. Nominal measurements have no
meaningful rank order among values.

Since variables conforming only to nominal or ordinal measurements cannot be reasonably
measured numerically, they are sometimes grouped together as categorical variables, whereas
ratio and interval measurements are grouped together as quantitative variables due to
their numerical nature.

Key terms used in statistics


Null hypothesis

Interpretation of statistical information often involves the development of a null
hypothesis: the assumption that whatever is proposed as a cause has no effect on the
variable being measured.

The best illustration for a novice is the predicament encountered in a jury trial. The null
hypothesis, H0, asserts that the defendant is innocent, whereas the alternative hypothesis, H1,
asserts that the defendant is guilty.

The indictment comes because of suspicion of the guilt. The H0 (status quo) stands in
opposition to H1 and is maintained unless H1 is supported by evidence "beyond a reasonable
doubt". However, "failure to reject H0" in this case does not imply innocence, but merely that the
evidence was insufficient to convict. So the jury does not necessarily accept H0 but fails to reject
H0.

Error

When working from a null hypothesis, two basic forms of error are recognized:

• Type I errors, where the null hypothesis is falsely rejected, giving a "false positive".
• Type II errors, where the null hypothesis fails to be rejected and an actual difference
between quantitative values (populations) is missed, giving a "false negative".

Confidence intervals

Most studies sample only part of a population, and the result is then used to interpret
the null hypothesis in the context of the whole population. Any estimates obtained from the
sample only approximate the population value. Confidence intervals allow statisticians to express
how closely the sample estimate matches the true value in the whole population. Often they are
expressed as 95% confidence intervals. Formally, a 95% confidence interval is a range produced
by a procedure such that, given repeated sampling under the same conditions, the interval covers
the true population value 95% of the time.
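
As an illustration only, the sketch below computes a 95% confidence interval for the mean of three measurements. It is not part of the lab procedure. It assumes the usual t-based interval, mean ± t·s/√n, with the two-sided critical value t = 4.303 for 2 degrees of freedom taken from standard t tables, and it uses the sample standard deviation (which divides by n − 1), not the formula given later in this handout. The data values are made up.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 2 degrees of freedom
# (three measurements). Taken from standard t tables, not from the handout.
T_CRIT_95_DF2 = 4.303


def confidence_interval_95(three_values):
    """Return (low, high) for the 95% CI of the mean of exactly three values."""
    if len(three_values) != 3:
        raise ValueError("this sketch hard-codes the t value for three values")
    mean = statistics.mean(three_values)
    s = statistics.stdev(three_values)  # sample SD, divides by n - 1
    margin = T_CRIT_95_DF2 * s / math.sqrt(len(three_values))
    return mean - margin, mean + margin


# Hypothetical percent-zinc results for three nails (made-up numbers).
hypothetical_percent_zinc = [4.0, 4.5, 5.0]
low, high = confidence_interval_95(hypothetical_percent_zinc)
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

With only three measurements the interval is wide, which is one reason the class data are pooled before conclusions are drawn.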

Significance

Statistics will rarely give a simple Yes/No type answer to the question asked.
Interpretation often comes down to the level of statistical significance applied to the numbers,
which is usually judged from the p-value: the probability of obtaining a result at least as
extreme as the one observed if the null hypothesis were true.

Probability

When every outcome is equally likely, the probability of a particular event is easily
calculated: the number of ways that event can occur, divided by the total number of possible
outcomes, gives the probability. If you’re looking to combine events, two methods are advised.
If you’re looking for one event OR a different event, and the two cannot happen together,
calculate the probabilities separately and then ADD them together. If you’re looking to have one
event AND a second event occur, calculate the probabilities separately and then MULTIPLY them
together, using for the second event its probability given that the first has already occurred.
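
The ADD and MULTIPLY rules can be sketched with exact fractions. The die and marble examples below are illustrative assumptions, not part of the lab, and the helper name `probability` is hypothetical.

```python
from fractions import Fraction


def probability(ways_event_can_occur, total_outcomes):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(ways_event_can_occur, total_outcomes)


# OR: rolling a 1 OR a 2 on one fair six-sided die. The two events cannot
# happen together, so the probabilities are added.
p_one_or_two = probability(1, 6) + probability(1, 6)

# AND, independent events: rolling a 6 on each of two separate rolls.
# The first roll does not change the second, so the probabilities multiply.
p_two_sixes = probability(1, 6) * probability(1, 6)

# AND, without replacement: drawing two red marbles from a bag holding
# 3 red and 7 blue. After one red is removed, 2 red remain out of 9.
p_two_reds = probability(3, 10) * probability(2, 9)

print(p_one_or_two, p_two_sixes, p_two_reds)
```

The last case shows why the second probability has to be recalculated whenever the first event changes what is left to choose from.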

References:

http://en.wikipedia.org/wiki/Statistics

Vitha, M.F. and Carr, P.W., Journal of Chemical Education • Vol. 74 No. 8 August 1997

Experimental Methods and Data Table


Procedure:

(1) Obtain three nails from one of the three samples of galvanized nails provided.
Perform the experiment on all three nails at the same time, basically performing all
three trials at once.
(2) Weigh each of the three nails and record the original mass in the data table.
(3) Obtain 3 large test tubes – the test tube must be large enough for the nail to fit into
AND be completely covered by the hydrochloric acid.
(4) Fill each test tube with 1 M HCl so that the nail will be completely covered.
(5) Drop the nails into the test tubes, one nail per test tube, and place the test tubes in a
test tube rack.
(6) Observe the chemical reaction between the zinc in the galvanized nail and the
hydrochloric acid – the reaction will produce bubbling.
(7) When the bubbling has finished (this may be left overnight, if desired) remove the
nail by pouring the remaining solution into the waste container, catching the nail
before it can fall – be sure your hands are gloved - and rinse the nail completely.
Wipe the nail dry before obtaining the final mass of the nail, recording each mass in
the data table provided.
(8) Clean up your lab area completely before continuing onto the calculations.

Data Table:

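                                            Trial 1      Trial 2      Trial 3

Original Mass of Galvanized Nail            ________     ________     ________

Final Mass of Galvanized Nail               ________     ________     ________

Mass difference                             ________     ________     ________

Percent Composition of Zinc in the nail     ________     ________     ________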
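
The last two rows of the data table can be filled in with a short calculation. The sketch below assumes that all of the mass lost in the acid is zinc coating, so that percent zinc = (mass difference / original mass) × 100. The handout does not state this formula explicitly, the function name is hypothetical, and the masses shown are made-up values in grams.

```python
def percent_zinc(original_mass, final_mass):
    """Percent of the nail's original mass that was lost in the acid.

    Assumes the whole mass difference is zinc (an assumption, not a
    statement from the handout).
    """
    mass_difference = original_mass - final_mass
    return mass_difference / original_mass * 100


# Hypothetical (original, final) masses in grams for three trials.
hypothetical_trials = [(2.50, 2.40), (2.60, 2.47), (2.45, 2.35)]

for trial_number, (original, final) in enumerate(hypothetical_trials, start=1):
    result = percent_zinc(original, final)
    print(f"Trial {trial_number}: {result:.1f}% zinc")
```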

Class Data Table:

Nail Sample 1 Nail Sample 2 Nail Sample 3

Statistical Analysis:

Measures of Central Tendency: give information about the average, or typical, data
point when a large grouping of data is considered:

Mean: the calculated average – the sum of all data points divided by the number of data
points collected…

Median: the data point that sits in the middle when the data are placed in numerical
order (with an even number of data points, the average of the two middle values)…

Mode: the data point that appears the most times in the data set
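
The three measures above can be checked with Python's standard `statistics` module. The data set below is made up for illustration and is not taken from the lab or the pre-lab problems.

```python
import statistics

# Hypothetical data set (made-up values), already in numerical order.
scores = [3, 5, 5, 6, 8, 9]

mean_score = statistics.mean(scores)      # sum of the points / number of points
median_score = statistics.median(scores)  # average of the two middle values here
mode_score = statistics.mode(scores)      # the value that appears most often

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

Because this set has an even number of points, the median falls halfway between the two middle values, 5 and 6.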

Measures of Variability: give information about how spread out the data are as a collective set

Range: based on only two scores, the highest and the lowest, listed as a single
value and is the difference between the high and low data points…

Standard Deviation: takes into account all the scores in the data set and indicates
how far the scores typically lie from the mean… To calculate it, square every
data point and add the squares together to find the numerator; the
denominator is simply the number of data points collected (N). Once the
quotient is determined, subtract the square of the mean (M) from it, and then
take the square root of the result:

$$ SD = \sqrt{\frac{\sum x^2}{N} - M^2} $$

Variance: describes the overall spread found among the data set, and is simply the
standard deviation squared….
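
The range, standard deviation, and variance can be sketched as follows, using the handout's recipe for the standard deviation (mean of the squares, minus the square of the mean, then the square root). The function name and the data are made up for illustration. The result is compared with `statistics.pstdev`, the standard library's population standard deviation, which divides by N in the same way.

```python
import math
import statistics


def standard_deviation(data):
    """SD = sqrt( (sum of x^2) / N - M^2 ), following the handout's recipe."""
    n = len(data)
    mean = sum(data) / n
    mean_of_squares = sum(x * x for x in data) / n
    return math.sqrt(mean_of_squares - mean ** 2)


# Hypothetical data set (made-up values).
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)  # highest minus lowest
sd = standard_deviation(data)
variance = sd ** 2                  # the standard deviation squared

print("range:", data_range)
print("standard deviation:", sd)
print("variance:", variance)
print("library check:", statistics.pstdev(data))
```

Note that many calculators and spreadsheets default to the sample standard deviation, which divides by N − 1 and gives a slightly larger value than this formula.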

Statistical Analysis
Post-lab Assignment
Name Laboratory Section
Instructor Lab Period

(1) Were there any data points that did not appear to fit the generally viewed pattern in this
lab?

If you found some, what should you do with such data points? Why?

(2) What did this lab teach you about the kinds of materials that are available at a local
hardware store?

(3) If you had tested the bridge’s materials, and found them to be within the standard
deviation for such materials, what would you conclude about the liability of the
construction company?

Statistical Analysis
Pre-lab Assignment
Name Laboratory Section
Instructor Lab Period

(1) Using the definitions provided in the calculations section above, identify the mean,
median, and mode for the following problems:

(a) A set of eight fingerprints is compared to a standard, and the following are the
numbers of minutiae that are found to match:

14, 12, 9, 10, 12, 12, 11, 13

(b) Immediately following this laboratory, the class is taking a 20-item quiz on the
material. The previous year’s scores, based on number of questions answered
correctly, are as follows:

12, 12, 14, 15, 16, 19, 20, 10, 7, 5, 2, 3, 3

(2) Using the definitions provided in the calculations section above, identify the range,
standard deviation, and variance for each of the above problems:

(a)

(b)

(3) (a) The morgue has 12 cold drawers for bodies, and is ¾ full at this point in time. If you
are asked to open a drawer at random, what is the probability that you will open an empty
drawer?

(b) What is the probability that you will open an empty drawer AND another empty
drawer?
