Weeks 6 Statistics in Research

8/14/23, 12:06 PM Statistics in Research Analysis
Statistics in Research Analysis

Objectives
Determine the best way to prepare and organize data according to your research design
Identify the different levels of statistics, the types of tests at each level, and when they are most likely to be used.
Identify the purpose of statistical analysis.
Differentiate between parametric and non-parametric statistics and tests

Preparing and Organizing Data-Quantitative
Many of you will be using the UW-L stat center for help with your data analysis. It is very
important that you submit your data to them in a universal format to avoid misunderstandings
and time delays in analysis.
7 steps in preparing data for analysis
1. Check data for accuracy and completeness
The data should be checked by someone other than the person who entered it. Ideally, the data should be checked by more than one person
(if groups of 3, the other 2 group members should check the data; if groups of 2, the other group member should check the data)
2. Label each variable in a codebook, on the instrument, or in a direct data entry program (Excel, Statistical Analysis Software (SAS),
Statistica, or the Statistical Package for the Social Sciences (SPSS); the latter is used by the UW-L statistical center)
For many of you, this will mean entering each variable you intend to measure or anything that has value to your study
results. For example, if you are measuring 8 metrics in planning, you would have 8 variables to list in your data entry program. If
those metrics were dependent on age or histology of disease, those would also be noted per data point. However, it's important to
think about the goal of your research. If you want to study incidence of breast cancer among 20-30 year olds, age is important to
include. If you're only looking at incidence of breast cancer, age is not relevant.
3. Assign variable labels to computer locations using the naming conventions of the particular statistical application you plan to use.
4. Develop a comprehensive codebook
Steps 3 and 4 go hand in hand. It is your choice to include the labels and codebook in one file or to separate them.
One example here is histological abbreviations. If one metric is histology, your column name would be listed as "histology" and
instead of listing "ductal carcinoma in situ" in the data entry, you would use the abbreviation DCIS and include a table legend in the
Excel file for the statisticians to use as needed.
5. Enter data using double verification or implement other quality-control procedures
It is recommended to enter your data twice in different locations to confirm that the two entries match.
6. Clean raw data files to address missing or miskeyed information
This can be done with some of Excel's built-in formulas.
7. Develop summative scores
Creating summative scores is useful for drawing conclusions from your data, especially when using descriptive statistics.
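The codebook and double-verification ideas from the data-preparation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the patient rows, and the extra abbreviation IDC are made up for the example (only DCIS comes from the lesson), and the stat center may prefer a different workflow such as Excel or SPSS.

```python
# Hypothetical codebook: column name -> {abbreviation: full label}.
# DCIS is the lesson's example; IDC is an added illustration.
CODEBOOK = {
    "histology": {
        "DCIS": "ductal carcinoma in situ",
        "IDC": "invasive ductal carcinoma",
    }
}


def find_mismatches(first_entry, second_entry):
    """Compare two independently keyed copies of the data, row by row.

    Returns (row_number, column, first_value, second_value) for every cell
    where the two entries disagree.
    """
    mismatches = []
    for row_number, (row_a, row_b) in enumerate(zip(first_entry, second_entry), start=1):
        for column in row_a:
            if row_a[column] != row_b.get(column):
                mismatches.append((row_number, column, row_a[column], row_b.get(column)))
    return mismatches


def find_uncoded_values(rows, codebook):
    """Return (row_number, column, value) for coded columns whose value is not in the codebook."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for column, legend in codebook.items():
            if row.get(column) not in legend:
                problems.append((row_number, column, row.get(column)))
    return problems


# Made-up rows, keyed twice by different people.
entry_one = [
    {"patient": 1, "age": 27, "histology": "DCIS"},
    {"patient": 2, "age": 31, "histology": "IDC"},
]
entry_two = [
    {"patient": 1, "age": 27, "histology": "DCIS"},
    {"patient": 2, "age": 13, "histology": "IDC"},  # transposed digits in age
]

mismatches = find_mismatches(entry_one, entry_two)
uncoded = find_uncoded_values(entry_one, CODEBOOK)
print("Cells that differ between the two entries:", mismatches)
print("Values missing from the codebook:", uncoded)
```

Any cell reported as a mismatch would be checked against the original data sheet before the file is submitted for analysis.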
https://softchalkcloud.com/lesson/files/iLo901ScPqgMQH/StatisticsInResearch_2020_print.html
Preparing and Organizing Data-Quantitative Continued
This is an example from your textbook. This table is easy to read and lists each variable and its corresponding data point in an
organized fashion.

Preparing and Organizing Data-Qualitative
This concept still applies to qualitative data collection: many of you will be using the UW-L stat
center for help with your data analysis. It is very important that you submit your data to them in a
universal format to avoid misunderstandings and time delays in analysis.
Steps to organizing qualitative data:
1. Transcription: involves transforming an audio recording into a written record
2. Organizing data
Similar to the quantitative approach, you must determine the best way to organize your data for ease of statistical interpretation.
Manual data entry in a tool such as Excel is useful, but many survey tools such as Qualtrics and Survey Monkey offer transcription and data
organization directly in their software.

Why use statistics?

Definition of statistical analysis: the organization and interpretation of data according to well-defined,
systematic and mathematical procedures and rules.
Use it to make sense of the data collected
Present data to others in an understandable way
Help the researcher (and interested others) make decisions about what was learned.

Analyzing Quantitative Data

When do you use quantitative data?

Must be able to represent data with a number
Nominal data: a number is assigned to a non-numerical variable
Ordinal data: numbers are assigned in a logical, ranking order but intervals are not known or are unequal
Interval data: data is logically ordered and has equal intervals, but no true zero point. Differences between values are
meaningful (+, -), but ratios are not, because there is no true zero
Ratio data: continuous scale of numbers with equal intervals and a true zero

Analyzing Quantitative Data

Nominal: means "name"; used simply to label attributes, a way to classify (e.g., cancer site, eye color,
yes/no, 1, 2)
No ordered relationship; the numbers are discrete codes (a value such as 2.5 has no meaning), and the numbers
assigned have no meaning when added or subtracted
Ordinal: means to provide order. This value is also arbitrary or symbolic, and does
not show the magnitude of differences between levels.
Rank or Likert scales, e.g., grade of skin reaction
Interval data: similar to ordinal and nominal data, but has equal spacing between categories.
Fits temperature readings, IQ scores
Ratio: highest order of measurement.

Descriptive Analysis (Level One)

Describe specific characteristics of data
Purpose: to reduce large sets of observations into more compact and interpretable forms
First step of analytical process
Includes:
Frequencies
Central tendencies
Variance
Correlations

Frequencies
Usually displayed in tabular or graphic form; the most basic descriptive statistic

Year    # of programs with open seats    Avg % open    # open seats
2000    12                               52%           107
2001    17                               40.5%         105
2002    17                               34%           86

Histograms
Histograms show frequencies

Charts
Charts also show frequencies
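A frequency table and a rough text histogram can be produced with Python's standard library. The sketch below assumes made-up ordinal data (skin reaction grades 0 to 3 for ten patients); the variable and function names are illustrative only.

```python
from collections import Counter

# Made-up ordinal data: grade of skin reaction recorded for ten patients.
skin_reaction_grades = [1, 2, 2, 0, 1, 2, 3, 1, 2, 1]

# Count how often each grade occurs.
frequencies = Counter(skin_reaction_grades)
total = len(skin_reaction_grades)


def frequency_table(counts, n):
    """Return (value, count, percent) rows sorted by value."""
    rows = []
    for value in sorted(counts):
        count = counts[value]
        rows.append((value, count, 100 * count / n))
    return rows


table = frequency_table(frequencies, total)

# Tabular form plus a simple text histogram (one '#' per observation).
print("Grade  Count  Percent  Histogram")
for grade, count, percent in table:
    print(f"{grade:<6} {count:<6} {percent:<8.1f} {'#' * count}")
```

The same counts could be charted in Excel or a plotting library; the point is that frequencies reduce the raw list to one row per category.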
Central Tendencies
Mode
Most common score
Median
Score that divides the sample in half
Mean
The arithmetic average = Σx / n
Percentile = percent of scores below a particular score. (Would you rather be in the 98th percentile or
the 2nd percentile of IQ scores? What percentile is the median score in a distribution?)
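The measures of central tendency can be computed with Python's built-in statistics module. The scores below are made up for illustration, and percent_below is a hypothetical helper that applies the lesson's definition of percentile (percent of scores below a particular score); statistical packages may use slightly different percentile conventions.

```python
import statistics

# Made-up test scores for five participants.
scores = [70, 85, 85, 90, 100]

mode = statistics.mode(scores)      # most common score
median = statistics.median(scores)  # score that divides the sample in half
mean = statistics.mean(scores)      # arithmetic average, sum of x divided by n


def percent_below(values, score):
    """Percent of values strictly below the given score."""
    below = sum(1 for value in values if value < score)
    return 100 * below / len(values)


percentile_of_90 = percent_below(scores, 90)

print("Mode:", mode)
print("Median:", median)
print("Mean:", mean)
print("Percent of scores below 90:", percentile_of_90)
```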

The Normal Distribution

In measuring anything in the natural world or any large population, a normal (bell-shaped) curve
is often described. In other words, it is normal to see fewer individuals at either end of a continuum
than in the middle.
Frequencies can be asymmetrical, symmetrical, or skewed
Based on the mean and standard deviation.
Standard deviation tells the researcher how scores deviate on average from the mean.
In general, on a normal curve, about 68% of values will fall within 1 SD above or below the mean
In a normal curve, the mean, median, and mode are in the same location. In a skewed
distribution the three measures fall in different places.

Variance
How scores are spread out from center
The range
Difference between highest and lowest scores
Variance = the sum of the squared deviations from the mean divided by the number of scores (n, or n - 1 for a sample)
Standard deviation: square root of variance
Variance is a measure of dispersion, or how much the individual scores differ from the mean. If there
is little dispersion, the scores are similar. It shows how scores or numbers vary and provides
information about the scoring patterns of the entire group.

Correlational Analysis
Expresses quantitatively the degree and direction of the relationship between variables.
Helps determine validity and reliability
Cause and effect: a correlation alone does not establish that one variable causes the other
May be positive or negative (-1 to +1)
+1 = perfect positive correlation (each variable changes at the same rate); -1 = perfect negative correlation
e.g., Reading level goes up one level with each year of school = perfect positive correlation

Scatter Diagrams
Scatter diagrams show correlation
Positive
As the value of one variable increases or decreases, the other variable changes in the same
direction
Negative
The variables change in opposing directions
Non-linear
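The dispersion measures (range, variance, standard deviation) and the correlation coefficient described in this lesson can be worked through on small made-up data sets. In this sketch pearson_r is a hand-written illustrative helper, not a reference implementation, and the reading-level data simply mirror the lesson's example of one level gained per year of school.

```python
import math
import statistics

# Made-up scores for eight participants.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)        # highest minus lowest score
pop_variance = statistics.pvariance(scores)    # sum of squared deviations / n
sample_variance = statistics.variance(scores)  # sum of squared deviations / (n - 1)
std_dev = math.sqrt(pop_variance)              # square root of the variance


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired interval or ratio data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - mean_x) ** 2 for x in xs)
    sum_sq_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_products / math.sqrt(sum_sq_x * sum_sq_y)


# Reading level rises one level with each year of school.
years_of_school = [1, 2, 3, 4, 5]
reading_level = [1, 2, 3, 4, 5]

r_positive = pearson_r(years_of_school, reading_level)
r_negative = pearson_r(years_of_school, list(reversed(reading_level)))

print("Range:", value_range)
print("Population variance:", pop_variance)
print("Sample variance:", round(sample_variance, 3))
print("Standard deviation:", std_dev)
print("r (positive example):", r_positive)
print("r (negative example):", r_negative)
```

Whether to divide by n or n - 1 depends on whether the scores are the whole population or a sample from it; both are shown so the difference is visible.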


Correlation Coefficients
Compare pairs of numbers
Pearson's r for parametric data
Spearman's rho (ρ) for non-parametric data
Once calculated, can be used to predict the value of one variable given the value of the other

Inferential Analysis (Level Two)
Tools to determine the extent to which the observations of the sample are representative of the
population
Also determines whether conclusions about the population can be drawn from the sample
Statistical difference: answers the question, what is the probability that this change occurred
because of events in the research study, and what is the probability that this change would
have occurred anyway, by chance?

Parametric vs Nonparametric
Parametric includes data that is either interval or ratio type
T-Test, ANOVA
Nonparametric includes nominal and ordinal data.
Chi-square, Kruskal-Wallis one-way analysis of variance
Category affects the type of statistical tests that can be utilized with that data

Rules that must apply to use parametric tests:
Sample must represent the target population so that the variables fall within the normal
distribution for that population
Must generate interval or ratio data
Random assignment to groups or matching must have occurred

Steps in the Inferential Process
I. State the hypothesis: restate the working hypothesis as the null hypothesis
II. Select a significance level: convention is p<0.05
III. Compute a calculated value: choose a test related to type of data
IV. Obtain a critical value from a table
V. Reject or fail to reject the null hypothesis

Null hypothesis: states that there is NO relation between groups; it is easier to test a statement of no relation than to prove
that there is definitely a relationship.
Significance level: the researcher decides in advance the threshold at which a result counts as significant
Choose a test: parametric or non-parametric (again, parametric for ratio and interval data, non-parametric
for nominal or ordinal data)

Types of Tests
For parametric data
t-test: compares means from 2 sample groups on the same variable (to determine if differences are
significant)
ANOVA: one-way analysis of variance
Same as above but can handle more than 2 groups
Also called F test
Type I error is when the null hypothesis is rejected when it is actually true; Type II error is
when the null hypothesis is not rejected when it is actually false

Tests for nonparametric data
Chi-square: tests to see if observed frequencies of events in certain categories fall within the range
of frequencies expected to fall there.
Can compare two groups or same group pre and post test
Other tests include the Mann-Whitney U test, Kruskal-Wallis One-Way analysis of variance, and
Friedman Two-Way analysis of variance by ranks

Associations and Relationships (Level Three)
Identify relationships between one set of variables, then determine if this relationship can be
applied to other sets of data.
Multiple regression, factor analyses, discriminant function analysis

Qualitative data analysis
Is used for qualitative or naturalistic inquiry as the data is not easily quantified. In order to make
sense, it must be described.
This can be done by developing categories for the observations, etc.
Then the category groups can be analyzed by comparing them and looking for similarities and
differences. In other types of research underlying themes can be described.

How to correctly state your hypothesis in reporting
When describing your result in terms of hypothesis testing, you use one of the following 2 explanations:
"Reject the null hypothesis"
This implies that your result met the significance requirement; therefore, you reject the null hypothesis.
"Do not reject the null hypothesis" or "failed to reject the null hypothesis"
This implies that your result did NOT meet the significance requirement and you cannot reject the null hypothesis.
YOU NEVER USE THE WORD "ACCEPT" TO DESCRIBE HYPOTHESES BECAUSE THERE IS THE POTENTIAL FOR TYPE 1 OR TYPE
2 ERRORS. REFER TO THE SAMPLE PAPERS FOR EXAMPLES.
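The five inferential steps (state the null hypothesis, select a significance level, compute a calculated value, obtain a critical value, then reject or fail to reject) can be walked through with a small two-group t-test. This is a minimal sketch under stated assumptions: the two groups of scores are made up, the groups are treated as independent with equal variances (a pooled-variance t-test), and the critical value 2.306 is the standard two-tailed table value for alpha = 0.05 with 8 degrees of freedom. In practice the stat center's software would report the p-value directly.

```python
import math
import statistics

# Step I: null hypothesis = there is no difference between the group means.
# Made-up interval data for two independent groups of five participants.
group_a = [12, 14, 11, 15, 13]
group_b = [16, 18, 17, 19, 15]

# Step II: select a significance level.
ALPHA = 0.05


def pooled_t_statistic(a, b):
    """Step III: calculated t value for two independent groups (equal variances assumed)."""
    n_a, n_b = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance, divides by n - 1
    var_b = statistics.variance(b)
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


t_value = pooled_t_statistic(group_a, group_b)
degrees_of_freedom = len(group_a) + len(group_b) - 2

# Step IV: critical value from a t table (two-tailed, alpha = 0.05, df = 8).
CRITICAL_T = 2.306

# Step V: reject or fail to reject; the word "accept" is never used.
if abs(t_value) > CRITICAL_T:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"

print(f"t = {t_value:.3f}, df = {degrees_of_freedom}, alpha = {ALPHA}")
print(decision)
```

The critical value must be looked up again whenever the group sizes or the significance level change, since it depends on both.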
