Statistics is the science concerned with developing and studying methods for
collecting, analyzing, interpreting, and presenting empirical data.
Descriptive statistics summarizes and graphs the data for a group that you choose.
Central tendency: Use the mean or the median to locate the center of the
dataset.
Dispersion: Use the range or the standard deviation to measure the
dispersion.
Skewness and kurtosis: Skewness tells you whether the distribution of values
is symmetric or skewed; kurtosis describes how heavy its tails are.
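The descriptive measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `describe` helper and the `scores` data are made up for the example, and the skewness formula used is the simple moment-based (population) version, which is only one of several conventions.

```python
import statistics


def describe(values):
    """Return basic descriptive measures for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    pstdev = statistics.pstdev(values)  # population standard deviation
    # Moment-based skewness: average cubed deviation divided by pstdev cubed.
    skewness = sum((x - mean) ** 3 for x in values) / (n * pstdev ** 3)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "skewness": skewness,
    }


# Hypothetical data with one large value that pulls the mean to the right.
scores = [2, 4, 4, 5, 7, 9, 25]
summary = describe(scores)
for name, value in summary.items():
    print(name, value)
```

In this made-up data set the mean (8) is larger than the median (5) and the skewness is positive, which is what a right-skewed distribution looks like.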
Inferential statistics takes data from a sample and makes inferences about the larger
population from which the sample was drawn.
Hypothesis testing: Tests a claim about a population, for example whether the
population mean is greater than or less than a particular value.
Confidence intervals (CIs): They incorporate the uncertainty and sampling error to
create a range of values the actual population value is likely to fall within.
Correlation and regression: They describe the relationship between a set of
independent variables and a dependent variable.
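Two of the inferential tools above, a confidence interval for a mean and a simple correlation and regression, can be sketched as follows. The helper names and the `hours` and `scores` data are hypothetical. The interval uses a normal (z) critical value as an approximation; for a sample this small a t critical value would normally be used and would give a wider interval.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate CI for the population mean (normal approximation)."""
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return mean - margin, mean + margin


def correlation_and_line(xs, ys):
    """Return Pearson's r plus the least-squares slope and intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: hours studied (independent) and test score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

low, high = mean_confidence_interval(scores)
r, slope, intercept = correlation_and_line(hours, scores)
print("95% CI for the mean score:", low, high)
print("r:", r, "slope:", slope, "intercept:", intercept)
```

Here the fitted line is roughly score = 47.7 + 4.1 × hours, and r is close to 1, indicating a strong positive linear relationship in this made-up sample.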
A population is the entire group that you want to draw conclusions about.
A sample is the specific group that you will collect data from.
Data are the values (measurements or observations) that the variables can assume.
A collection of data values forms a data set. Each value in the data set is called a data
value or a datum.
Qualitative variables are variables that can be placed into distinct categories.
Continuous variables can assume an infinite number of values between any two specific
values.
The ordinal level of measurement classifies data into categories that can be ranked.
The interval level of measurement ranks data, and precise differences between units of
measure do exist; however, there is no meaningful zero.
Telephone surveys have an advantage over personal interview surveys in that they are
less costly.
Mailed questionnaire surveys can be used to cover a wider geographic area than
telephone surveys or personal interviews.
Sampling Methods
Probability sampling involves random selection, allowing you to make strong
statistical inferences about the whole group.
Non-probability sampling involves non-random selection based on convenience
or other criteria, allowing you to easily collect data.
In simple random sampling, every member of the population has an equal chance of being
selected.
Systematic sampling is similar to simple random sampling, but it is usually
slightly easier to conduct. Every member of the population is listed with a
number, but instead of randomly generating numbers, individuals are chosen at
regular intervals.
Stratified sampling involves dividing the population into subpopulations (strata) that may
differ in important ways, and then sampling from each subpopulation.
Cluster sampling also involves dividing the population into subgroups, but each
subgroup should have similar characteristics to the whole sample. Entire subgroups
are then selected at random.
In a non-probability sample, individuals are selected based on non-random criteria,
and not every individual has a chance of being included.
A convenience sample simply includes the individuals who happen to be most
accessible to the researcher.
Purposive sampling, also known as judgement sampling, involves the researcher
using their expertise to select a sample that is most useful to the purposes of the
research.
Snowball sampling can be used to recruit participants via other participants.
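The four probability sampling methods described above can be sketched as follows. This is an illustration under stated assumptions: the function names, the numbered population of 100 members, the odd/even strata, and the clusters of ten are all made up for the example, and a seeded `random.Random` is used only so that runs are repeatable.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random starting point, then take members at regular intervals."""
    interval = len(population) // n
    start = rng.randrange(interval)
    return [population[start + i * interval] for i in range(n)]


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Split the population into strata, then sample from every stratum."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep all of their members."""
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]


rng = random.Random(0)  # fixed seed so the example is repeatable
population = list(range(1, 101))  # hypothetical members numbered 1 to 100

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(
    population, lambda m: "even" if m % 2 == 0 else "odd", 5, rng
)
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: the stratified sample takes a few members from every subgroup, while the cluster sample takes every member from a few subgroups.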
In an observational study, the researcher merely observes what is happening or what has
happened in the past and tries to draw conclusions based on these observations.
In an experimental study, the researcher manipulates one of the variables and tries to
determine how the manipulation influences other variables.
The independent variable in an experimental study is the one that is being
manipulated by the researcher.
The resultant variable is called the dependent variable or the outcome variable.
The outcome variable is the variable that is studied to see if it has changed significantly due to
the manipulation of the independent variable.
A confounding variable is one that influences the dependent or outcome variable
but was not separated from the independent variable.
Small sample size: The first thing to consider is the sample that was used in the
research study; a very small sample may not represent the population.
Changing the subject: Another type of statistical distortion can occur when different
values are used to represent the same data.
Detached statistics: A claim that uses a detached statistic is one in which no comparison
is made.
Implied connections: Many claims attempt to imply connections between variables that
may not actually exist.
A frequency distribution is the organization of raw data in table form, using classes and
frequencies.
A categorical frequency distribution is used for data that can be placed in specific
categories, such as nominal or ordinal-level data.
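As a small illustration of a categorical frequency distribution, the sketch below tallies raw nominal data into classes, frequencies, and percentages. The function name and the blood-type data are hypothetical and chosen only for the example.

```python
from collections import Counter


def categorical_frequency_distribution(raw_data):
    """Return (class, frequency, percent) rows, sorted by class name."""
    counts = Counter(raw_data)
    total = len(raw_data)
    return [
        (category, frequency, 100 * frequency / total)
        for category, frequency in sorted(counts.items())
    ]


# Hypothetical raw data: blood types of ten people.
blood_types = ["A", "B", "O", "O", "AB", "A", "O", "B", "O", "A"]
rows = categorical_frequency_distribution(blood_types)

print("Class  Frequency  Percent")
for category, frequency, percent in rows:
    print(f"{category:<6} {frequency:<10} {percent}")
```

The frequencies add up to the number of data values (10 here), and the percentages add up to 100.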
Types of graphs
A histogram is a graph that displays the data by using contiguous vertical bars
(unless the frequency of a class is 0) of various heights to represent the
frequencies of the classes.
A frequency polygon is a graph that displays the data by using lines that connect
points plotted for the frequencies at the midpoints of the classes.
An ogive is a graph that represents the cumulative frequencies for the classes in a
frequency distribution. The cumulative frequency is the sum of the frequencies
accumulated up to the upper boundary of a class in the distribution.
A bar graph represents the data by using vertical or horizontal bars whose heights
or lengths represent the frequencies of the data.
A Pareto chart is used to represent a frequency distribution for a categorical
variable, and the frequencies are displayed by the heights of vertical bars, which
are arranged in order from highest to lowest.
A time series graph represents data that occur over a specific period of time.
Pie graphs are used extensively in statistics. The purpose of the pie graph is to
show the relationship of the parts to the whole by visually comparing the sizes of
the sections.
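Two of the graphs above depend on a small computation before anything is drawn: an ogive needs cumulative frequencies, and a Pareto chart needs the categories ordered from highest to lowest frequency. The sketch below shows only that preparation step, not the plotting. The function names, class boundaries, and counts are hypothetical.

```python
from itertools import accumulate


def ogive_points(upper_boundaries, frequencies):
    """Pair each class upper boundary with its cumulative frequency."""
    return list(zip(upper_boundaries, accumulate(frequencies)))


def pareto_order(category_counts):
    """Order categories from highest to lowest frequency."""
    return sorted(category_counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical grouped data: four classes and their frequencies.
upper_boundaries = [104.5, 109.5, 114.5, 119.5]
frequencies = [2, 8, 18, 13]
points = ogive_points(upper_boundaries, frequencies)
# points == [(104.5, 2), (109.5, 10), (114.5, 28), (119.5, 41)]

# Hypothetical categorical data: complaint types and their counts.
complaints = {"late": 5, "damaged": 12, "wrong item": 8}
ordered = pareto_order(complaints)
# ordered == [("damaged", 12), ("wrong item", 8), ("late", 5)]
```

The last cumulative frequency always equals the total number of data values, which is a quick check that the classes were tallied correctly.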
A statistic is a characteristic or measure obtained by using the data values from a sample.
A parameter is a characteristic or measure obtained by using all the data values from a
specific population.
The mean is the sum of the values, divided by the total number of values.
The mode is the value that occurs most often in the data set. It is sometimes said to be the
most typical case.
The midrange is defined as the sum of the lowest and highest values in the data set divided
by 2.
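The three measures just defined can be checked on a small example. The data values below are made up, and `midrange` is a helper written for this sketch, since Python's `statistics` module provides the mean and the mode but has no midrange function.

```python
import statistics


def midrange(values):
    """Sum of the lowest and highest values, divided by 2."""
    return (min(values) + max(values)) / 2


data = [8, 12, 12, 15, 18, 25]  # hypothetical data set

mean = statistics.mean(data)        # (8 + 12 + 12 + 15 + 18 + 25) / 6 = 15
modes = statistics.multimode(data)  # [12], the value that occurs most often
mid = midrange(data)                # (8 + 25) / 2 = 16.5

print(mean, modes, mid)
```

`multimode` is used rather than `mode` because a data set can have more than one mode; it returns every value tied for the highest frequency.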