20231005-RESEACH METHODS-Ch1
Academic Research Methods and Ethics
Looking at Data
• E-MAIL : ufuk.turen@ostimteknik.edu.tr
Course Objective
• The main purpose of this course is to examine the research process (problem identification, data
collection, data analysis and interpretation of results), to review certain scientific research methods
(experimental method, descriptive method, historical method, etc.), and to enable students to learn
literature research, data collection, data evaluation and report writing through practice. The statistics
and software packages (SPSS 25.0) required for data evaluation and report writing will also be used
during the course.
• This course covers the structure of science and scientific research, scientific methods and different
views on these methods, problem definition, research models, population (universe) and sample, data
collection methods (quantitative and qualitative techniques), data recording, analysis, interpretation
and reporting. It explains research and writing techniques together with basic concepts related to the
social sciences. The course also discusses ethical considerations in conducting and reporting
scientific research.
Course Content
WEEK 1 Introduction
WEEK 9 Introduction to SPSS
Clinical Data Example
Types of Variables: Overview
• Categorical
• Quantitative
Categorical Variables
Also known as “qualitative.”
Categories.
• Treatment groups
• Exposure groups
• Disease status
Categorical Variables
• Dichotomous (binary) – two levels
• Dead/alive
• Treatment/placebo
• Disease/no disease
• Exposed/Unexposed
• Heads/Tails
• Pulmonary Embolism (yes/no)
• Male/female
Categorical Variables
• Ordinal variable – Ordered categories. Order matters!
Quantitative Variables
• Numerical variables; may be arithmetically
manipulated.
– Counts
– Time
– Age
– Height
Quantitative Variables
• Discrete Numbers – a limited set of distinct values, such as
whole numbers.
Quantitative Variables
• Continuous Variables - Can take on any number within a
defined range.
Looking at Data
• How are the data distributed?
The first rule of statistics:
USE COMMON SENSE!
Frequency Plots (univariate)
Categorical variables
– Bar Chart
Continuous variables
– Box Plot
– Histogram
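A bar chart is a plot of category frequencies, so the first step is a frequency table. The sketch below rebuilds one from the pulmonary embolism (PE) counts reported elsewhere in these slides (181 yes, 750 no); the labels "YES"/"NO", the variable names and the text-bar scale are illustrative assumptions, not part of the original material.

```python
from collections import Counter

# Rebuild the PE column from the counts reported in these slides:
# 181 patients with pulmonary embolism (YES) and 750 without (NO).
pe_status = ["YES"] * 181 + ["NO"] * 750

counts = Counter(pe_status)      # frequency of each category
total = sum(counts.values())     # 931 patients in all

# Percent of observations in each category, rounded for display.
percents = {category: round(100 * count / total, 2)
            for category, count in counts.items()}

# A crude text bar chart: one '#' per 25 patients (scale chosen for illustration).
for category in ("NO", "YES"):
    bar = "#" * (counts[category] // 25)
    print(f"{category:>3} {bar} {counts[category]} ({percents[category]}%)")
```

The same table would feed any plotting tool; only the counts per category matter for a bar chart.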
Bar Chart
Bar Chart: categorical variables
[Figure: bar chart of a binary variable with the categories NO and YES]
Bar Chart for SI categories
[Figure: bar chart of Number of Patients (vertical axis, 0 to 200) by Shock Index Category (horizontal axis, 1 to 10)]
Much easier to extract information from a bar chart than from a table!
Box plot and histograms: for
continuous variables
23/43
Box Plot: Shock Index
[Figure: box plot of SI, vertical axis in Shock Index units from 0.0 to 2.0, with these labels]
• maximum (1.7)
• Outliers
• “whisker”: Q3 + 1.5 IQR = 0.8 + 1.5(0.25) = 1.175
• 75th percentile (0.8)
• median (0.66)
• 25th percentile (0.55)
• interquartile range (IQR) = 0.8 − 0.55 = 0.25
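The whisker arithmetic on this slide can be checked with a few lines. This is a minimal sketch: the function name `upper_fence` and the default multiplier argument `k` are assumptions made for illustration, while the quartiles and the maximum are the values printed on the slide.

```python
def upper_fence(q1: float, q3: float, k: float = 1.5) -> float:
    """Return Q3 + k * IQR, the largest value the upper whisker may reach."""
    iqr = q3 - q1
    return q3 + k * iqr


# Values read off the Shock Index box plot.
q1, q3 = 0.55, 0.80
iqr = q3 - q1
fence = upper_fence(q1, q3)

# Any observation above the fence is drawn as an outlier.
maximum = 1.7
is_outlier = maximum > fence

print(round(iqr, 3), round(fence, 3), is_outlier)
```

With these inputs the maximum of 1.7 lies above the fence, which is why the plot shows it among the outliers rather than at the end of the whisker.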
[Figure: histogram of SI; Percent (vertical axis) against SI (horizontal axis, 0.0 to 2.0)]
Histogram
100 bins (too much detail)
[Figure: histogram of SI with 100 bins; Percent (vertical axis) against SI (horizontal axis, 0.0 to 2.0)]
Histogram
2 bins (too little detail)
[Figure: histogram of SI with 2 bins; Percent (vertical axis) against SI (horizontal axis, 0.0 to 2.0)]
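The effect of the number of bins can be reproduced on a small scale. The Shock Index data behind the figures are not included in these slides, so this sketch bins the eight participant ages used in the later examples instead; the helper `histogram_counts` is a hypothetical name, and it assumes equal-width bins and data that are not all identical.

```python
def histogram_counts(data, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    low, high = min(data), max(data)
    width = (high - low) / bins
    counts = [0] * bins
    for value in data:
        # The maximum would land one past the last bin, so clamp it.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


ages = [17, 19, 21, 22, 23, 23, 23, 38]

two_bins = histogram_counts(ages, 2)      # too little detail
seven_bins = histogram_counts(ages, 7)    # shows the cluster and the gap

print(two_bins)
print(seven_bins)
```

With two bins almost everything collapses into one bar; with seven bins the cluster in the low twenties and the isolated value at 38 both become visible.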
Box Plot: Shock Index
[Figure: box plot of SI, vertical axis from 0.0 to 2.0]
Box Plot: Age
[Figure: box plot of AGE in years, vertical axis from 0 to 100, labeled with minimum, 25th percentile, median, interquartile range and maximum]
More symmetric
Histogram: Age
[Figure: histogram of AGE (Years); Percent (vertical axis) against AGE (horizontal axis, 0 to 100)]
Some histograms from your class
(n=24)
Starting with politics.
Feelings about math and writing
Optimism
Diet
Habits
Measures of central tendency
• Mean
• Median
• Mode
Central Tendency
• Mean – the average; the balancing point
In math shorthand:
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
Mean: example
Some data:
Age of participants: 17 19 21 22 23 23 23 38
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{17 + 19 + 21 + 22 + 23 + 23 + 23 + 38}{8} = 23.25$$
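The mean of the eight ages can be verified directly. A minimal sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import mean

ages = [17, 19, 21, 22, 23, 23, 23, 38]

total = sum(ages)              # numerator: the sum of all observations
by_hand = total / len(ages)    # divide by n = 8
library = mean(ages)           # same calculation via the standard library

print(total, by_hand, library)
```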
Mean of age in Kline’s data
[Table: Means Section of AGE, statistical software output with Geometric and Harmonic mean columns; values lost in extraction]
[Figure: histogram of AGE (Years); Percent (vertical axis) against AGE (horizontal axis, 0 to 100), with the mean marked as the balancing point]
The balancing point
Mean
• The mean is affected by extreme values
(outliers)
[Figure: two dot plots on a 0 to 10 axis, one per data set]
$$\text{Mean} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3 \qquad \text{Mean} = \frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$$
Central Tendency
• Median – the exact middle value
Calculation:
• If there are an odd number of observations, find the middle value
• If there are an even number of observations, find the middle two
values and average them.
Median: example
Some data:
Age of participants: 17 19 21 22 23 23 23 38
[Figure: histogram of AGE (Years); Percent (vertical axis) against AGE (horizontal axis, 0 to 100)]
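The calculation rule for the median (middle value for an odd count, average of the two middle values for an even count) can be applied to the eight ages. This is a sketch with illustrative variable names; the standard-library call is included only as a cross-check.

```python
from statistics import median

ages = [17, 19, 21, 22, 23, 23, 23, 38]

ordered = sorted(ages)
n = len(ordered)

if n % 2 == 1:
    # Odd number of observations: take the single middle value.
    by_hand = ordered[n // 2]
else:
    # Even number of observations: average the two middle values.
    by_hand = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(by_hand, median(ages))
```

Here n = 8 is even, so the two middle values, 22 and 23, are averaged.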
Does PE have a median?
• Yes, if you line up the 0’s and 1’s, the middle number is 0.
Median
• The median is not affected by extreme
values (outliers).
[Figure: two dot plots on a 0 to 10 axis; Median = 3 for both data sets]
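The contrast between the two measures is easy to see in code. The sketch below assumes the median slide uses the same two data sets as the mean slide (1, 2, 3, 4, 5 and 1, 2, 3, 4, 10); that pairing is an inference from the matching axes, not something the slides state.

```python
from statistics import mean, median

without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]   # the last value is replaced by an extreme one

means = (mean(without_outlier), mean(with_outlier))
medians = (median(without_outlier), median(with_outlier))

print("means:  ", means)     # the mean moves toward the outlier
print("medians:", medians)   # the median stays where it was
```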
Central Tendency
• Mode – the value that occurs most frequently
Mode: example
Some data:
Age of participants: 17 19 21 22 23 23 23 38
[Figure: histogram of AGE (Years); Percent (vertical axis) against AGE (horizontal axis, 0 to 100)]
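For the eight ages, the mode can be found by counting how often each value occurs. A minimal sketch; variable names are illustrative.

```python
from collections import Counter
from statistics import mode

ages = [17, 19, 21, 22, 23, 23, 23, 38]

# Tally every value, then take the one with the highest count.
value, frequency = Counter(ages).most_common(1)[0]

print(value, frequency, mode(ages))
```

The value 23 occurs three times and every other age occurs once, so 23 is the mode.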
Range of PE?
• 1-0 = 1
Quartiles
[Diagram: the ordered observations divided into four groups of 25% each by Q1, Q2 and Q3]
◼ The first quartile, Q1, is the value for which
25% of the observations are smaller and 75%
are larger
◼ Q2 is the same as the median (50% are
smaller, 50% are larger)
◼ Only 25% of the observations are greater than
the third quartile
Interquartile Range
[Diagram: number line divided into four segments, each holding 25% of the observations]
minimum = 15, Q1 = 35, Median (Q2) = 49, Q3 = 65, maximum = 94
Interquartile range
= 65 – 35 = 30
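Quartiles and the interquartile range can also be computed from raw data. The sketch below first repeats the slide's subtraction, then applies `statistics.quantiles` to the eight participant ages from the earlier examples. The choice of `method="inclusive"` is an assumption: software packages use several quartile conventions, and they can give slightly different values on small samples.

```python
from statistics import quantiles

# The slide's example: Q1 = 35 and Q3 = 65.
slide_iqr = 65 - 35

# Quartiles of the age data under one common convention.
ages = [17, 19, 21, 22, 23, 23, 23, 38]
q1, q2, q3 = quantiles(ages, n=4, method="inclusive")
iqr = q3 - q1

print(slide_iqr)
print(q1, q2, q3, iqr)
```

Note that Q2 equals the median of the same data, as the quartile definition requires.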
Variance
• Average (roughly) of squared deviations of values from
the mean
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$$
Why squared deviations?
• Adding deviations will yield a sum of 0.
• Absolute values are tricky!
• Squares eliminate the negatives.
• Result:
– Increasing contribution to the variance as you go farther from
the mean.
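These points can be checked on the eight participant ages used in the other examples. A sketch with illustrative variable names: the raw deviations cancel out, while the squared deviations do not, and the most distant value dominates the total.

```python
ages = [17, 19, 21, 22, 23, 23, 23, 38]
mean_age = sum(ages) / len(ages)          # 23.25

deviations = [x - mean_age for x in ages]

sum_of_deviations = sum(deviations)               # negatives cancel positives
sum_of_squares = sum(d ** 2 for d in deviations)  # squares cannot cancel

# Share of the squared total contributed by the farthest value (age 38).
largest_share = max(d ** 2 for d in deviations) / sum_of_squares

print(sum_of_deviations, sum_of_squares, round(largest_share, 2))
```

A single observation far from the mean accounts for most of the sum of squares here, which illustrates the increasing contribution of distant values.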
Standard Deviation
$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$$
Calculation Example:
Sample Standard Deviation
Age data (n=8) : 17 19 21 22 23 23 23 38
n = 8, Mean = $\bar{X}$ = 23.25
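The worked result for this example did not survive in the slide text, so the following sketch carries the calculation through with the sample formula (divisor n − 1). Variable names are illustrative, and `statistics.stdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import stdev

ages = [17, 19, 21, 22, 23, 23, 23, 38]
n = len(ages)
mean_age = sum(ages) / n                                   # 23.25

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum((x - mean_age) ** 2 for x in ages) / (n - 1)

# Sample standard deviation: square root of the variance.
sample_sd = sqrt(sample_variance)

print(round(sample_variance, 3), round(sample_sd, 3), round(stdev(ages), 3))
```

The standard deviation is in the same units as the data (years), which makes it easier to interpret than the variance.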
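[Figure: histogram of AGE (Years); Percent (vertical axis) against AGE (horizontal axis, 0 to 100)]
Std. Deviation age
[Figure: histogram of SI (horizontal axis, 0.0 to 2.0)]
Std. Deviation SI
[Table: Variation Section of SI; values lost in extraction]
[Figure: histogram of PE with bars of 80.56% and 19.44%]
Std. Deviation PE
[Table: Variation Section of PE, with columns Parameter, Variance and Standard Deviation; values lost in extraction]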
[Figure: three dot plots on a shared axis from 11 to 21]
S = 3.338 (first data set)
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.570
◼ Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall
Bienaymé-Chebyshev Rule
• Regardless of how the data are distributed,
a certain percentage of values must fall
within K standard deviations from the mean:
[Figure: region labeled "68% of the data"]
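The percentages for this rule are not preserved in the slide text. The standard statement of the Bienaymé-Chebyshev inequality is that at least 1 − 1/K² of the values lie within K standard deviations of the mean, for any K greater than 1. The sketch below tabulates that bound and checks it against the eight participant ages from the earlier examples; the function name is hypothetical, and the population standard deviation is used because the inequality is stated for it.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


bounds = {k: chebyshev_lower_bound(k) for k in (2, 3, 4)}

# Empirical check on a small, clearly non-normal data set.
ages = [17, 19, 21, 22, 23, 23, 23, 38]
center, spread = mean(ages), pstdev(ages)
within_2sd = sum(abs(x - center) <= 2 * spread for x in ages) / len(ages)

print(bounds)
print(within_2sd, within_2sd >= bounds[2])
```

The bound is deliberately conservative: it holds for every distribution, so real data usually exceed it by a wide margin.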
from: ER Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut,
1983, p.69
Notice the X-axis
Correctly scaled X-axis…
Report of the Presidential Commission on the Space Shuttle Challenger Accident, 1986
(vol 1, p. 145)
The graph excludes the observations where no O-rings failed.
Smooth curve at least shows the trend toward failure at high and low temperatures…
◼ http://www.math.yorku.ca/SCS/Gallery/
Even better: graph all the data (including non-failures) using a logistic
regression model
Tappin, L. (1994). "Analyzing data relating to the Challenger disaster". Mathematics Teacher, 87, 423-426
What’s wrong with
this graph?
from: ER Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut,
1983, p.74
What’s the message here?
http://www.melanoma.org/mrf_facts.pdf
Example 1: projected statistics
• How do you think these statistics are calculated?
The answer?
Closer to 1/150 (one order of magnitude off)
• No citations given.
And…
• Fact Sheet on eating disorders:
• No citation given.
And…
• “Studies report between 15% and 62% of college
women engage in problematic weight control behaviors
(Berry & Howe, 2000).” (in The Sport Journal, 2004)
• Citations:
Steen SN. The competitive athlete. In: Rickert VI, ed.
Adolescent Nutrition: Assessment and Management. New
York, NY: Chapman and Hall; 1996:223–47.
Tofler IR, Stryer BK, Micheli LJ. Physical and emotional
problems of elite female gymnasts. N Engl J Med.
1996;335:281–3.
Where did the statistics come
from?
• The 15%: Dummer GM, Rosen LW, Heusner WW, Roberts PJ, and Counsilman
JE. Pathogenic weight-control behaviors of young competitive swimmers.
Physician Sportsmed 1987; 15: 75-84.
• The “to”: Rosen LW, McKeag DB, O’Hough D, Curley VC. Pathogenic weight-
control behaviors in female athletes. Physician Sportsmed. 1986; 14: 79-86.
• Population/sample size?
– Convenience samples
– Rosen et al. 1986: 182 varsity athletes from two midwestern universities
(basketball, field hockey, golf, running, swimming, gymnastics, volleyball,
etc.)
– Dummer et al. 1987: 486 swimmers aged 9 to 18 at a swim camp
– Rosen et al. 1988: 42 college gymnasts from 5 teams at an athletic
conference
Where did the statistics come
from?
• Measurement?
– Instrument: Michigan State University Weight Control Survey
– Disordered eating = at least one pathogenic weight control behavior:
• Self-induced vomiting
• fasting
• Laxatives
• Diet pills
• Diuretics
• In the 1986 survey, they required use 1/month; in the 1988 survey, they required use
twice-weekly
• In the 1988 survey, they added fluid restriction
Where did the statistics come
from?
• Findings?
– Rosen et al. 1986: 32% used at least one “pathogenic weight-
control behavior” (ranges: 8% of 13 basketball players to 73.7%
of 19 gymnasts)
– Dummer et al. 1987: 15.4% of swimmers used at least one of
these behaviors
– Rosen et al. 1988: 62% of gymnasts used at least one of these
behaviors
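One way to read these findings critically is to convert each percentage back into an approximate head count using the sample sizes given on the earlier slide. The counts below are back-calculated by rounding and are not reported in the slides; the labels and the tuple layout are illustrative.

```python
# (label, reported proportion, sample size) taken from the slides.
findings = [
    ("Rosen et al. 1986, all athletes", 0.32, 182),
    ("Rosen et al. 1986, basketball", 0.08, 13),
    ("Rosen et al. 1986, gymnasts", 0.737, 19),
    ("Dummer et al. 1987, swimmers", 0.154, 486),
    ("Rosen et al. 1988, gymnasts", 0.62, 42),
]

# Approximate number of athletes behind each percentage.
approx_counts = {label: round(proportion * n)
                 for label, proportion, n in findings}

for label, proportion, n in findings:
    print(f"{label}: about {approx_counts[label]} of {n} ({proportion:.1%})")
```

Seen this way, the low end of the basketball range rests on roughly one player, and the headline figure of 62% rests on about 26 gymnasts.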
References
• http://www.math.yorku.ca/SCS/Gallery/
• Kline et al. Annals of Emergency Medicine 2002; 39: 144-152.
• Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall
• Tappin, L. (1994). "Analyzing data relating to the Challenger disaster".
Mathematics Teacher, 87, 423-426
• Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire,
Connecticut, 1983.
• Wainer, H. (1997). Visual Revelations: Graphical Tales of Fate and Deception from Napoleon
Bonaparte to Ross Perot.
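Mean of Pulmonary Embolism? (Binary variable?)
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{181 \times 1 + 750 \times 0}{931} = \frac{181}{931} = 0.1944$$
Histogram
[Figure: histogram of PE; Percent (vertical axis) against PE (horizontal axis, 0.0 to 1.0); 80.56% (750) at PE = 0 and 19.44% (181) at PE = 1]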
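For a variable coded 0/1, the mean is simply the proportion of ones. The sketch below rebuilds the PE column from the two counts on this slide; the list layout and variable names are illustrative assumptions.

```python
from statistics import mean, median

# 181 patients with PE (coded 1) and 750 without (coded 0).
pe = [1] * 181 + [0] * 750

proportion_with_pe = mean(pe)     # mean of a 0/1 variable = share of ones
middle_value = median(pe)         # more than half the values are 0
value_range = max(pe) - min(pe)   # 1 - 0

print(round(proportion_with_pe, 4), middle_value, value_range)
```

The median and range agree with the answers given on the earlier PE slides, and the mean matches the 19.44% bar in the histogram.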
THIRD-GENERATION, INNOVATIVE AND ENTREPRENEURIAL
UNIVERSITY MODEL
www.ostimteknik.edu.tr