To provide comparisons
Statistics
a science that deals with collection, organization, presentation, analysis and interpretation of data
Branches of Statistics
Statistics is divided into two branches:
Descriptive Statistics
consists of methods concerned with collection, organization, summarization and presentation of a set of data
Inferential Statistics
comprises those methods concerned with making predictions or inferences about an entire population
based on information provided by a sample
if the sample is random, its results will be more or less similar to those obtained from data collected on the whole population
Random Sampling - every member of the population is given an equal chance of being selected
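The idea of random sampling can be sketched in a few lines of Python. This is only an illustration: the population of scores is made up (hypothetical, not data from these notes), and the sample size of 100 is an arbitrary choice.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up test scores (assumption, not from the notes).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(50, 100) for _ in range(1000)]

# Simple random sampling: every element has an equal chance of being selected.
sample = rng.sample(population, k=100)

# The population mean is a parameter; the sample mean is its sample counterpart.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

With a random sample, the two printed means are expected to be close, though not identical.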
Sample
A subset of the population
Census
the process of collecting information from the population
Survey
the process of collecting information from the sample
Parameter
summary or numerical measure used to describe a population
Observation
Constant
a characteristic or property of a population or sample which makes the members similar to each other
Variables
any characteristic or information measurable or observable on every element of the population or sample
Continuous Variables
Dependent Variable
EX. “test scores” is dependent on number of hours spent in studying, IQ, attitude towards studying
Independent Variable
Mediating Variable - its output is influenced by the IV (independent variable). Input → Process → Output
Nominal
variables whose values are simply labels, names, or categories without any explicit or implicit ordering of the labels
the most that can be done is to arrange them alphabetically, but that arrangement carries no meaning
Ordinal
variables whose values are simply labels or names or categories with an implied ordering in these labels
ex. rank positions in a military organization and hierarchy in a government (President, Vice President)
Interval
variables whose values can be ordered and the distance between any two labels is of known size
always numeric, with no true zero point (e.g., a temperature of 0 °C does not mean there is no temperature; it is the
freezing point of water)
Ratio
variables whose values have all the properties of the interval scale and the ratio of two values is meaningful
02 | DATA PRESENTATIONS
Presentation of Data
Numerical quantities focus on expected values, graphical summaries on unexpected values (John Tukey)
1. Textual
2. Tabular
3. Graphical
Textual
data are presented in paragraph form
involves enumeration of important characteristics, giving emphasis on significant figures and identifying the
important features of the data
important information
arrange in array
Array - a listing of the data in increasing or decreasing order, which makes the highest and lowest values easy to see; used for data with fewer than ten elements
Tabular
Sometimes we could hardly grasp information from a textual presentation of data.
EXAMPLE
Step 1: Compute the range, denoted by R
R - the difference between the highest value and the lowest value
Step 2: Decide on the number of classes, denoted by k
k - number of non-overlapping intervals
Step 3: Compute the class size, denoted by c
c - quotient of steps 1 and 2
c = R/k
ALWAYS ROUND UP (e.g., 3.2 becomes 4)
Step 4: Identify the class intervals, CI
Step 5: Identify the frequency in each CI by tallying
Example
Range (R) = 29
No. of Classes (k) = 6 (given)
1. Labels:
19 29 32 35 42
21 29 32 36 42
21 30 33 37 45
26 30 33 37 48
27 31 34 38 48
27 31 35 41 50
c = 31/5 = 6.2, rounded up to 7
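The five steps can be traced in a short Python sketch using the 30 observations listed above, read as one sorted array. Two points are assumptions rather than something the notes state: k = 5 is taken from the 31/5 computation, and the first class is started at the lowest value.

```python
import math

# The 30 observations from the example, read as one sorted array.
data = [19, 21, 21, 26, 27, 27, 29, 29, 30, 30,
        31, 31, 32, 32, 33, 33, 34, 35, 35, 36,
        37, 37, 38, 41, 42, 42, 45, 48, 48, 50]

k = 5  # number of classes (assumption: taken from the 31/5 computation)

# Step 1: range = highest value - lowest value
R = max(data) - min(data)

# Step 3: class size, always rounded up
c = math.ceil(R / k)

# Step 4: class intervals (assumption: the first class starts at the lowest value)
lowest = min(data)
intervals = [(lowest + i * c, lowest + (i + 1) * c - 1) for i in range(k)]

# Step 5: frequency of each class interval (tallying)
frequencies = [sum(lo <= x <= hi for x in data) for lo, hi in intervals]

print(f"R = {R}, k = {k}, c = {c}")
for (lo, hi), f in zip(intervals, frequencies):
    print(f"{lo}-{hi}: {f}")
```

Under these assumptions the classes are 19-25, 26-32, 33-39, 40-46 and 47-53, with frequencies 3, 11, 9, 4 and 3, which add up to the 30 observations.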
B. Measures of Central Tendency
describes the “center” of a given data set; it is a single value about which the observations tend to cluster
1. Mean
the sum of all observations divided by the total number of observations, denoted by X̄
Properties:
it is unique
2. Median
Ungrouped Median
Properties:
3. Mode
the observation(s) that occur most frequently in the data set, denoted by Mo
Properties:
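As a quick illustration of the three measures for ungrouped data, the sketch below uses Python's standard `statistics` module. The scores are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
import statistics

# Hypothetical ungrouped data (not from the notes).
scores = [3, 5, 5, 6, 8, 9]

# Mean: sum of all observations divided by the number of observations.
mean = statistics.mean(scores)        # 36 / 6 = 6

# Median: middle value of the sorted data; with an even count,
# the average of the two middle values, here (5 + 6) / 2.
median = statistics.median(scores)    # 5.5

# Mode: the observation(s) that occur most frequently.
# multimode returns a list because a data set can have more than one mode.
modes = statistics.multimode(scores)  # [5]

print(mean, median, modes)
```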
C. Measures of Variability
describes the extent to which the data are dispersed
Measures of variability are descriptive statistics that describe how similar the scores in a set are to each other
The more similar the scores are to each other, the lower the measure of dispersion will be
The less similar the scores are to each other, the higher the measure of dispersion will be
In general, the more spread out a distribution is, the larger the measure of dispersion will be
RANGE (R)
the difference between the highest and lowest value in the data set
R = HV - LV
Two very different sets of data can have the same range
The deviate tells us how far a given score is from the typical or average score
Since squared units of measure are often awkward to deal with, the square root of the variance (the standard deviation) is often used instead
COEFFICIENT OF VARIATION (CV)
used to compare the variability of two populations that are expressed in different units of measurement
expressed as a percentage rather than in terms of the units of the particular data
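The sketch below computes these measures for two hypothetical data sets that share the same range but differ in spread. It assumes the population form of the variance (dividing by N); the notes do not say which form is intended, so treat that choice as an assumption.

```python
import statistics

# Two hypothetical data sets with the same range but different spread.
set_a = [1, 5, 5, 5, 9]
set_b = [1, 1, 5, 9, 9]

def describe(data):
    """Return (range, variance, standard deviation, CV in percent)."""
    mean = statistics.mean(data)
    value_range = max(data) - min(data)      # R = HV - LV
    deviates = [x - mean for x in data]      # distance of each score from the mean
    # Population variance (assumption): mean of the squared deviates.
    variance = sum(d ** 2 for d in deviates) / len(data)
    std_dev = variance ** 0.5                # back in the original units
    cv = std_dev / mean * 100                # unit-free, expressed as a percentage
    return value_range, variance, std_dev, cv

range_a, var_a, sd_a, cv_a = describe(set_a)
range_b, var_b, sd_b, cv_b = describe(set_b)

print(f"A: R={range_a}, variance={var_a:.1f}, sd={sd_a:.2f}, CV={cv_a:.1f}%")
print(f"B: R={range_b}, variance={var_b:.1f}, sd={sd_b:.2f}, CV={cv_b:.1f}%")
```

Both sets have a range of 8, yet the variance of the second set (12.8) is twice that of the first (6.4), which is why the range alone can be misleading.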
A frequency curve that is not symmetrical about the mean is said to be skewed. If it tails off to the right, we
describe it as positively skewed, but if it tails off to the left, we say it is negatively skewed. The relationship
between the mean and the median is related to the direction of skewness.
If the mean is greater than the median, we have a positively skewed curve, but if the mean is less than the median, we
have a negatively skewed curve. Now, with the use of the standard deviation, it is possible to obtain a measure of
skewness which indicates both the direction and the magnitude (or the extent) of skewness of a frequency distribution.
-1.28
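One common measure that combines the mean, the median, and the standard deviation is Pearson's second coefficient of skewness, Sk = 3(mean - median) / s. The notes do not show which formula they use, so the sketch below is an assumption, as are the two data sets and the use of the sample standard deviation.

```python
import statistics

def pearson_skewness(data):
    """Pearson's second coefficient of skewness: 3 * (mean - median) / s."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)  # sample standard deviation (assumption)
    return 3 * (mean - median) / s

# Hypothetical data sets (not from the notes).
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]  # mean 3.5 > median 3
left_tailed = [1, 7, 8, 8, 8, 9, 9, 10]   # mean 7.5 < median 8

sk_right = pearson_skewness(right_tailed)
sk_left = pearson_skewness(left_tailed)

print(f"right-tailed: Sk = {sk_right:.2f}")  # positive value
print(f"left-tailed:  Sk = {sk_left:.2f}")   # negative value
```

The sign gives the direction of skewness and the absolute value gives its extent.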
Measure of Kurtosis
Kurtosis measures whether the scores are spread out more or less than they would be in a normal (Gaussian)
distribution
A distribution is mesokurtic if K = 3, leptokurtic if K > 3, and platykurtic if K < 3
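A minimal sketch of this classification follows, assuming K is the moment coefficient of kurtosis (the fourth central moment divided by the square of the second central moment). The notes do not state the formula, so that choice, the helper names, and the data sets are all hypothetical.

```python
def kurtosis(data):
    """Moment coefficient of kurtosis: K = m4 / m2**2 (assumed formula)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

def classify(K, tol=1e-9):
    """Label a distribution by comparing K with 3 (the normal-curve value)."""
    if abs(K - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if K > 3 else "platykurtic"

# Hypothetical data sets (not from the notes).
flat = [1, 2, 3, 4, 5]                      # K is about 1.7
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]  # K is about 5

k_flat = kurtosis(flat)
k_peaked = kurtosis(peaked)

print(f"flat:   K = {k_flat:.2f} -> {classify(k_flat)}")
print(f"peaked: K = {k_peaked:.2f} -> {classify(k_peaked)}")
```

The small tolerance in `classify` is there because floating-point arithmetic rarely produces exactly 3.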