Summary statistics
Summary from the lectures + all the exercises!
University
Medical University-Pleven
Course
Medical Statistics 15
Uploaded by
Michelle Betschart
Academic year
20/21
LECTURE 1: INTRODUCTION TO STATISTICS, SOURCES AND TYPES OF DATA
Definition and major objectives of Statistics
Statistics is the science that deals with the collection, classification, analysis,
and interpretation of numerical facts or data, and that, by use of mathematical
theories of probability, imposes order and regularity on aggregates of more or less
disparate elements.
Statistical activities
- Statistical description – the process of summarizing the characteristics of the
data under study (at the sample or population level). This process is called
descriptive statistics.
- Statistical relationship analysis – the process of analyzing the relationship
between a dependent variable (effect) and one or more independent variables (causes).
- Statistical inference – the process of generalizing from a sample to a population,
when the observation is performed on a representative sample, usually with
calculated degrees of uncertainty; we call this process inferential statistics.
Basic concepts
POPULATION - The population includes all members of a defined group. It represents
the target of an investigation, and the aim of the process of data collection is to
make inferences (draw conclusions) about the population.
Examples of populations:
- all patients with a certain disease;
- all inhabitants of Bulgaria.
This approach is useful when cases are automatically time-ordered, such as the
arrival or discharge of hospital inpatients. Both the simple and the systematic
sample require a list of the whole population, which is not always available.
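To make the difference between the two list-based methods concrete, here is a minimal sketch in Python. It assumes the usual textbook definitions (a simple random sample gives every member an equal chance; a systematic sample takes every k-th member after a random start). The function names and the list of patient IDs are hypothetical and only serve as an illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each with an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th member of the ordered list."""
    k = len(population) // n      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start within the first interval
    return [population[start + i * k] for i in range(n)]

# Hypothetical sampling frame: 100 patient IDs in order of admission.
patients = list(range(1, 101))

srs = simple_random_sample(patients, 10, seed=1)
sys_sample = systematic_sample(patients, 10, seed=1)

print("Simple random sample:", sorted(srs))
print("Systematic sample:   ", sys_sample)
```

Both functions take the complete list as input, which is exactly the practical limitation mentioned above.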
OTHER SAMPLES
There are also other methods that are less reliable:
- Convenience sample – it includes the subjects who are easiest to select (e.g. the
first 50 people on the street at one time).
- Self-selected sample – postal surveys, for example (non-responders may bias the
results).
These two types of sample are not representative and are not recommended.
SAMPLE SIZE
There is no magic number that we can point to as an optimum sample size. It depends
on the characteristics of an investigation. The sample size must be adequate for
making correct inferences from a sample to a population. It relates to the concept
of sampling error.
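One common way to quantify sampling error for a mean is the standard error, SE = SD / sqrt(n). The notes do not give this formula at this point, so the sketch below is an added illustration: the function name and the standard deviation of 10 are assumptions chosen to show how the error shrinks as the sample grows.

```python
import math

def se_of_mean(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# Hypothetical standard deviation of 10 units, three sample sizes.
for n in (25, 100, 400):
    print(f"n={n}: SE={se_of_mean(10, n)}")
# n=25: SE=2.0
# n=100: SE=1.0
# n=400: SE=0.5
```

Quadrupling the sample size only halves the standard error, which is one reason there is no single optimum sample size.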
Classification of variables
Each variable has its own:
- variable values – every single variable can take two or more different values;
- variable distribution – the frequencies of the values of a single variable.
Classifying variables
- Quantitative (numerical) variables – values of which are expressed by numbers
(e.g. weight, number of patients per day);
- Qualitative (categorical) variables or attributes – values of which are expressed
only by description (e.g. gender, residence, blood group, profession, marital
status, ethnic group, etc.).
TYPES OF VARIABLES
In summary, we usually classify variables into four main types:
- Numerical continuous variables
- Numerical discrete variables
- Categorical ordinal variables
- Categorical nominal variables
Graphical presentation/summarization
Bar charts are used for categorical data, and all bars are separated. They are
appropriate to express changes in rates over time or levels of rates in different
areas (countries, regions, etc.).
In histograms the bars are adjacent to each other, with no gaps between them. They
are appropriate to express changes in rates over time or levels of rates or
proportions in different areas for the same time (countries, regions, etc.).
Maps are appropriate to express different levels of rates in different regions.
Answers:
1-A; 2-B; 3-B; 4-D; 5-B; 6-A; 7-A; 8-B; 9-A; 10-B; 11-B; 12-A; 13-D; 14-C; 15-C;
16-C; 17-B; 18-A; 19-B; 20-B; 21-B; 22-B; 23-A; 24-B; 25-B; 26-C; 27-A; 28-A; 29-B;
30-C.
Secondly, we have to decide whether the data should be grouped and what grouping
interval should be used. As a rough guide we may have 5–20 groups, depending on the
number of observations. If the interval chosen for grouping the data is too wide,
too much detail will be lost, while if it is too narrow the table will be unwieldy.
The starting points of the groups should be round numbers and all the intervals
should be of the same width. There should be no gaps between the groups.
Once the format of the table is decided, the numbers of observations (frequencies)
in each group should be counted.
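The grouping rules above (round starting points, equal widths, no gaps) can be sketched in a few lines of Python. This is only an illustration: the function name, the choice of half-open intervals (lower limit included, upper limit excluded) and the list of ages are assumptions, and the function expects a whole-number start and width.

```python
def frequency_table(values, start, width):
    """Count observations in equal-width groups [lower, lower + width)."""
    counts = {}
    for v in values:
        index = int((v - start) // width)   # which group the value falls in
        lower = start + index * width
        counts[lower] = counts.get(lower, 0) + 1
    # List every group from the first to the last, including empty ones,
    # so that there are no gaps between the groups.
    lowers = range(min(counts), max(counts) + width, width)
    return [(lo, lo + width, counts.get(lo, 0)) for lo in lowers]

# Hypothetical ages of 10 patients, grouped in 10-year intervals from 20.
ages = [23, 27, 31, 35, 38, 41, 44, 44, 52, 67]
table = frequency_table(ages, start=20, width=10)

for lower, upper, freq in table:
    print(f"{lower}-{upper - 1}: {freq}")
# 20-29: 2
# 30-39: 3
# 40-49: 3
# 50-59: 1
# 60-69: 1
```

The frequencies add up to the number of observations, which is a quick check that no value was dropped or counted twice.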
DESCRIBING A DISTRIBUTION
Regarding the number of peaks:
- Unimodal distributions: with a single peak,
- Bimodal distributions: with two peaks,
- Polymodal distributions: with more than two peaks.
THE NORMAL CURVE is a theoretically perfect frequency polygon in which the mean,
median, and mode all coincide and which takes the form of a symmetrical bell-shaped
curve.
ASYMMETRIC DISTRIBUTIONS
Regarding the inclination of the peak, or skewness:
- Positive skewness – distributions with an extended right-hand tail (lower values
more likely);
- Negative skewness – distributions with an extended left-hand tail (higher values
more likely).
POSITIVELY SKEWED: most of the scores are low, but with some scores spreading out
towards the upper end of the distribution; the tail is directed to the right or
positive side of the distribution → mode < median < mean.
NEGATIVELY SKEWED: most of the scores are high, but with some scores spreading out
towards the lower end of the distribution; the tail is directed to the left or
negative side of the distribution → mean < median < mode.
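The ordering of mode, median and mean in skewed distributions can be checked on a small data set. The sketch below uses Python's standard statistics module. Both data sets are made up for illustration; the second is simply a mirror image of the first.

```python
import statistics

def summarize(data):
    """Return (mode, median, mean) for a list of numbers."""
    return (statistics.mode(data),
            statistics.median(data),
            statistics.mean(data))

# Hypothetical positively skewed data: length of hospital stay in days.
stay = [2, 2, 2, 3, 3, 4, 5, 7, 12, 20]
pos_mode, pos_median, pos_mean = summarize(stay)
print(pos_mode, pos_median, pos_mean)   # 2 3.5 6

# Hypothetical negatively skewed data: the same values mirrored around 11.
scores = [22 - x for x in stay]
neg_mode, neg_median, neg_mean = summarize(scores)
print(neg_mode, neg_median, neg_mean)   # 20 18.5 16
```

In the first data set the few long stays pull the mean above the median; in the mirrored data set the few low scores pull it below.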
Important: the type of the distribution determines the statistical tests to be used
for descriptive or inferential statistics.
z scores express how many standard deviations a particular score is from the mean.
A. True B. False
The total area under the standard normal curve is always 1.0.
A. True B. False
The area of a normal curve between any two designated z scores expresses the
proportion or percentage of cases falling between the two points.
A. True B. False
About 10% of scores fall 3 standard deviations above the mean.
A. True B. False
50% of scores fall between z = 0.5 and z = - 0.5.
A. True B. False
In a normal curve, approximately 34% of the scores fall between z = 0 and z = - 1.
A. True B. False
Numerous human characteristics are distributed approximately as a normal curve.
A. True B. False
The height of the rectangle in a histogram is proportional to class frequency and
class width.
A. True B. False
Which of the following statements is true?
A. A z score indicates how many standard deviations a raw score is above or below
the mean.
B. The mean of a standard normal distribution is always 0 (zero).
C. All the above statements are true.
In an anatomy test, your result is equivalent to a z score of -0.2. What does this
z score imply?
A. You performed very well when compared to others.
B. Your result was slightly above average.
C. Your result was slightly below average.
State whether the data reflecting the age at death of individuals in the general
population are likely to be skewed to the right, skewed to the left or symmetrical.
A. Symmetrical.
B. Skewed to the right (positively skewed).
C. Skewed to the left (negatively skewed).
Select the statement which you believe to be true. The Normal distribution:
A. Is a family of distributions which can have a variety of means and standard
deviations.
B. Is the distribution of a variable measured on healthy individuals.
C. Has a mean of zero and a standard deviation of one.
D. Is skewed to the right.
Frequency distribution is another expression for a bar chart.
A. True B. False
A histogram can be used instead of a pie chart to display categorical data.
A. True B. False
A histogram is similar to a bar chart but there are no gaps between the bars.
A. True B. False
A histogram can be used to display either a frequency or a relative frequency
distribution.
A. True B. False
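Several of the statements above about z scores and areas under the normal curve can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module (Python 3.8 or later). The helper names and the test score, mean and standard deviation are hypothetical, and the values in the comments are rounded.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal curve: mean 0, SD 1

def z_score(x, mean, sd):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def area_between(z_low, z_high):
    """Proportion of cases falling between two z scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Hypothetical test: mean 60, SD 10, your score 58.
z = z_score(58, 60, 10)
print(round(z, 2))                          # -0.2

print(round(area_between(-1, 0), 4))        # about 0.3413
print(round(area_between(-0.5, 0.5), 4))    # about 0.3829
print(round(1 - std_normal.cdf(3), 5))      # about 0.00135
```

Multiplying any of these proportions by 100 gives the percentage of cases in that part of the curve.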