LBOLYTC Quiz 1 Reviewer

01 | INTRODUCTION & TERMINOLOGIES


Purpose of Statistics
To provide information

To provide comparisons

To help discern relationship (of variables)

To aid in decision making

To justify claims or assertions

To estimate unknown quantities

To predict future outcomes

Statistics
a science that deals with collection, organization, presentation, analysis and interpretation of data

PROCESS: Collection → Organization and Presentation → Analysis → Interpretation (COPAI)

Branches of Statistics
The definition of statistics can be divided into two branches

Descriptive Statistics
consists of methods concerned with collection, organization, summarization and presentation of a set of data

corresponds to the first part of the definition of statistics

Inferential Statistics
comprised of those methods concerned with making predictions or inferences about an entire population
based on information provided by the sample

if the sample is random, its results tend to be similar to those that would be obtained from the whole population

researchers use random sampling for convenience and cost-effectiveness

Random Sampling - every member of the population is given an equal chance of being selected

Population and Sample


Population
consists of the totality of all the elements or entities from which you want to obtain information

Sample
A subset of the population

Census
the process of collecting information from the population

Survey
the process of collecting information from the sample

Parameter
summary or numerical measure used to describe a population

Statistic
summary or numerical measure used to describe a sample
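
The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of ages, the seed, and the sample size of 50 are all made-up assumptions, not values from the lesson.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 people (made-up values).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(18, 65) for _ in range(1000)]

# Parameter: a numerical summary of the whole population (what a census gives).
population_mean = statistics.mean(population)

# Simple random sample: every element has an equal chance of being selected.
sample = rng.sample(population, k=50)

# Statistic: the same summary computed from the sample only (what a survey gives).
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

With a random sample, the statistic usually lands close to the parameter, which is the idea behind inferential statistics.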

Other methods of collecting data


Interview

Observation

Focus Group Discussions (FGD)

Constant
a characteristic or property of a population or sample which makes the members similar to each other

Variables
any characteristic or information measurable or observable on every element of the population or sample

Qualitative (Categorical) Variables


variables that indicate what kind of a given characteristic an individual, object, or event possesses

e.g. school, gender, country, nationality

Quantitative (Numerical) Variables


variables that indicate how much of a given characteristic an individual, object, or event possesses

e.g. age, height

Types of Quantitative Variables


Discrete Variables

variables whose values are obtained through the process of counting

e.g. number of students, number of fruits

Continuous Variables

variables whose values are obtained through the process of measuring

e.g. values read from measuring instruments such as a ruler, weighing scale, or thermometer

Dependent Variable

a variable which is affected by another variable

Ex. “test scores” depends on the number of hours spent studying, IQ, and attitude towards studying

Independent Variable

a variable which affects the dependent variable

Ex. “number of hours spent in studying” affects test scores

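Moderating Variable - has only an indirect relationship with the DV but is still correlated with it; it influences the strength or direction of the relationship between the IV and the DV

Mediating Variable - is influenced by the IV and in turn influences the DV; it is the process through which the IV acts. Input (IV) → Process (mediator) → Output (DV)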

Scales of Measurement of Variables


Nominal

lowest level of measurement; also known as the categorical scale

variables whose values are simply labels, names, or categories without any explicit or implicit ordering of the labels

at most, the labels can be arranged alphabetically, but that arrangement carries no meaning

ex. gender, school/s attended, and nationality.

Ordinal

variables whose values are simply labels or names or categories with an implied ordering in these labels

ranking can be done on the data

the distance between two labels cannot be determined

ex. rank positions in a military organization and hierarchy in a government (President, Vice President)

Interval

variables whose values can be ordered and for which the distance between any two values is of known size

always numeric, with no true zero point (e.g., 0 °C does not mean the absence of temperature; it is only the freezing point of water)

ex. temperature; a grade of 97 at DLSU is equivalent to 4.0

the difference between two values can be determined

Ratio

variables whose values have all the properties of the interval scale and the ratio of two values is meaningful

has a true zero point

highest level of measurement

ex. academic score on a quiz, years of working experience

02 | DATA PRESENTATIONS

Presentation of Data
Numerical quantities focus on expected values, graphical summaries on unexpected values (John Tukey)

1. Textual

2. Tabular

3. Graphical

Textual
data are presented in paragraph form

involves enumeration of important characteristics, giving emphasis on significant figures and identifying the
important features of the data

highlights the important information

data may be arranged in an array

Array - a listing of the data in increasing or decreasing order, which makes the highest and lowest values easy to spot; suited to data with fewer than ten elements

💡 Array - arranged from highest to lowest or lowest to highest

Tabular
Sometimes we could hardly grasp information from a textual presentation of data.

EXAMPLE

Percentage Frequency = (Frequency / N) × 100%

FREQUENCY DISTRIBUTION TABLE


tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes

Steps in Constructing Frequency Distribution Table


Step 1: Determine the range, denoted by R

R - the difference between the highest value and the lowest value

Step 2: Decide on the number of classes, denoted by k

k - number of non-overlapping intervals

Step 3: Compute the class size, denoted by c

c - quotient of steps 1 and 2
c = R/k
ALWAYS ROUND UP (e.g., 3.2 becomes 4)

Step 4: Identify the class intervals, CI

Step 5: Tally the frequency in each CI
Example

Data is arranged from lowest to highest

Range (R) = 29
No. of Classes (k) = 6 (given)

Class Size (c) = R/k
= 29/6
≈ 4.83 (round up)
= 5
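
Steps 1 to 3 can be sketched as a small Python helper. The function name `class_size` is hypothetical, and the endpoints 39 and 10 are made up only so that the range comes out to 29 as in the example; the actual data values are not shown in these notes.

```python
import math

def class_size(highest, lowest, k):
    """Return (R, c): the range and the class size, rounded up to a whole number."""
    r = highest - lowest      # Step 1: range R = highest value - lowest value
    c = math.ceil(r / k)      # Step 3: c = R/k, always rounded up
    return r, c

# Hypothetical endpoints chosen to give R = 29, with k = 6 as given.
print(class_size(39, 10, 6))
```

`math.ceil` is used instead of `round` because 3.2 must become 4, not 3.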

Class Size / Class Width

the difference between the upper (or lower) class limits of consecutive classes

All classes should have same class width

Lower Class Limit


the least value that can belong to a class

Upper Class Limit


the greatest value that can belong to a class

Class Boundaries (CB)


the numbers that separate classes without forming gaps between them

Class Mark / Midpoint (CM)


the middle value of each data class. To find the class midpoint, average the upper and lower class limits.

Relative Frequency (RF)


obtained by dividing the frequency of the given class by the total number of observations

Less than CF (<CF)


total number of observations whose values do not exceed the upper limit of the class (the class frequency plus the frequencies of all lower classes)

Greater than CF (>CF)


total number of observations whose values are not less than the lower limit of the class (the class frequency plus the frequencies of all higher classes)

Cumulative Frequency of a data class


the number of data elements in that class and all previous classes. (may be ascending or descending.)

Graphical
TYPES OF GRAPHS

1. Pie Chart / Circle Graph - any data; shows percentages or division

2. Bar Graph (popular)

a. Bar Chart [with gaps between bars] - discrete data

b. Histogram [no gaps between bars] - continuous data

3. Line Graph (popular)

a. Frequency polygon - continuous data


polygon - enclosed plane figure

RULES TO REMEMBER IN CONSTRUCTING GRAPHS

1. Labels:

Figure number [below the graph]

Figure title [below the graph]

for Pie chart, % should be indicated

for Bar graph, axis should be labeled

2. Textual explanation should also follow any graph

Seatwork

19 29 32 35 42

21 29 32 36 42

21 30 33 37 45

26 30 33 37 48

27 31 34 38 48

27 31 35 41 50

1. Range (R) = 50 - 19 = 31

2. No. of Classes (k) = 5

3. Class Size (c) = R/k
= 31/5
= 6.2 (round up)
= 7

Class Interval   F    CB          CM   RF      <CF   >CF
19-25            3    18.5-25.5   22   0.100   3     30
26-32            11   25.5-32.5   29   0.367   14    27
33-39            9    32.5-39.5   36   0.300   23    16
40-46            4    39.5-46.5   43   0.133   27    7
47-53            3    46.5-53.5   50   0.100   30    3
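
The seatwork table can be reproduced with a short Python sketch. The function name `frequency_table` and the dictionary layout are assumptions made for illustration. The sketch also assumes integer data (so class boundaries sit 0.5 below and above the limits) and that k classes of width c are enough to reach the highest value, which holds for this data set.

```python
import math

# The 30 seatwork observations, listed in increasing order.
data = [19, 21, 21, 26, 27, 27, 29, 29, 30, 30, 31, 31, 32, 32, 33,
        33, 34, 35, 35, 36, 37, 37, 38, 41, 42, 42, 45, 48, 48, 50]

def frequency_table(values, k):
    """Build one row per class with F, CB, CM, RF, <CF and >CF."""
    low, high = min(values), max(values)
    c = math.ceil((high - low) / k)       # class size, always rounded up
    n = len(values)
    rows, less_cf = [], 0
    for i in range(k):
        lower = low + i * c               # lower class limit
        upper = lower + c - 1             # upper class limit (integer data)
        f = sum(lower <= v <= upper for v in values)
        less_cf += f                      # running total from the lowest class
        greater_cf = n - (less_cf - f)    # observations in this class and above
        rows.append({
            "interval": (lower, upper),
            "F": f,
            "CB": (lower - 0.5, upper + 0.5),
            "CM": (lower + upper) / 2,
            "RF": round(f / n, 3),
            "<CF": less_cf,
            ">CF": greater_cf,
        })
    return rows

rows = frequency_table(data, k=5)
for row in rows:
    print(row)
```

The relative frequencies add up to 1, and the last <CF and the first >CF both equal the total number of observations, which is a quick way to check a hand-made table.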

03 | NUMERICAL DESCRIPTIVE MEASURES


Measures of Central Tendency

describes the “center” of a given data set. It is a single value about which the observation tends to cluster

1. Arithmetic Mean (or simply Mean)

the sum of all observations divided by the total number of observations, denoted by X̄ (x-bar)

Properties:

it always exists for quantitative variables

it is unique

takes into account every item of the data

Thus, it is easily affected by extreme values

2. Median

the middle value of an array, denoted by Md

Ungrouped Median

Properties:

Not easily affected by extreme values

it always exists and is unique

3. Mode

the observation(s) that occur most frequently in the data set, denoted by Mo

Properties:

No calculations are required (for the ungrouped mode)

It may not exist

It may not be unique
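
The three measures and their properties can be checked with Python's standard `statistics` module. The quiz scores below are hypothetical values chosen for illustration.

```python
import statistics

scores = [70, 75, 75, 80, 85, 90, 99]      # hypothetical quiz scores

mean = statistics.mean(scores)             # sum of observations / number of observations
median = statistics.median(scores)         # middle value of the sorted array
modes = statistics.multimode(scores)       # list of the most frequent value(s)
print(mean, median, modes)

# An extreme value pulls the mean strongly but barely moves the median.
with_outlier = scores + [400]
print(statistics.mean(with_outlier), statistics.median(with_outlier))

# The mode may not be unique: here two values tie for most frequent.
print(statistics.multimode([1, 1, 2, 2, 3]))
```

Note that `multimode` returns every value when no value repeats, which corresponds to the case where the notes say the mode "may not exist".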

Measures of Variability
describes the extent to which the data are dispersed

Measures of variability are descriptive statistics that describe how similar the scores in a set are to each other

The more similar the scores are to each other, the lower the measure of dispersion will be

The less similar the scores are to each other, the higher the measure of dispersion will be

In general, the more spread out a distribution is, the larger the measure of dispersion will be

RANGE (R)
the difference between the highest and lowest value in the data set

R = HV - LV

The range is rarely used in scientific work as it is fairly insensitive

It depends on only two scores in the set of data, HV and LV

Two very different sets of data can have the same range

it is insensitive because it only looks at the highest and lowest value, not values in between

VARIANCE (s² or σ²)

the mean of the squared differences of the observations from their mean

This difference is called a deviate or a deviation score

The deviate tells us how far a given score is from the typical or average score

Thus, the deviate is a measure of dispersion for a given score

STANDARD DEVIATION (s or σ)
the positive square root of the variance

Since squared units of measure are often awkward to deal with, the square root of the variance is often used instead

Standard deviation = √variance

Variance = (standard deviation)²
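
A minimal sketch of the computation, using hypothetical observations. The formula itself is not reproduced in these notes, so the sketch assumes the usual convention: the population variance (σ²) divides the sum of squared deviations by N, while the sample variance (s²) divides by n - 1.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]             # hypothetical observations
mean = statistics.mean(data)

deviations = [x - mean for x in data]       # deviation score of each observation
squared = [d ** 2 for d in deviations]      # squared deviations

pop_var = sum(squared) / len(data)          # σ²: divide by N
sample_var = sum(squared) / (len(data) - 1) # s²: divide by n - 1

pop_sd = math.sqrt(pop_var)                 # σ = positive square root of σ²
sample_sd = math.sqrt(sample_var)           # s = positive square root of s²

print(pop_var, pop_sd)
print(sample_var, sample_sd)
```

The same results are available directly from `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev`.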

COEFFICIENT OF VARIATION (CV)


the ratio of the standard deviation to its mean expressed in percent

used to compare the variability of two populations that are expressed in different units of measurement

expressed as a percentage rather than in terms of the units of the particular data

the larger the CV, the more dispersed the data; the smaller the CV, the less dispersed
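
As an illustration, the CV lets heights in centimeters be compared with weights in kilograms. The data and the function name `coefficient_of_variation` are hypothetical, and the sketch assumes the sample standard deviation is used.

```python
import statistics

def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [150, 160, 165, 170, 180]   # hypothetical heights
weights_kg = [45, 55, 60, 70, 95]        # hypothetical weights

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(f"CV of heights: {cv_height:.2f}%")
print(f"CV of weights: {cv_weight:.2f}%")
```

Because the CV is unit-free, the larger value identifies the more dispersed data set even though the two sets use different units.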

Measure of Skewness
Skew is a measure of symmetry in the distribution of scores.

A frequency curve that is not symmetrical about the mean is said to be skewed. If it tails off to the right, we
describe it as positively skewed, but if it tails off to the left, we say it is negatively skewed. The relationship
between the mean and the median indicates the direction of skewness.

Perfectly symmetric or not skewed

Positive = right skewed


Negative = left skewed

Pearsonian coefficient of skewness (S_k) formula

If the mean is greater than the median, we have a positively skewed curve, but if the mean is less than the median, we
have a negatively skewed curve. With the use of the standard deviation, it is possible to obtain a measure of
skewness that indicates both the direction and the magnitude (or extent) of skewness of a frequency distribution.

The pink values are larger than the blue values, so the pink has a larger deviation than the blue

-1.28

If Sk < 0, then the distribution has a negative skew

If Sk > 0, then the distribution has a positive skew

If Sk = 0 then the distribution is symmetrical
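
The formula is not reproduced in these notes; the sketch below assumes the standard form of the Pearsonian coefficient, Sk = 3(mean - median) / standard deviation, which matches the description above (mean versus median gives the direction, the standard deviation scales the magnitude). The data sets and the use of the sample standard deviation are assumptions for illustration.

```python
import statistics

def pearson_skewness(values):
    """Pearsonian coefficient of skewness: Sk = 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)          # sample standard deviation (assumed)
    return 3 * (mean - median) / s

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]  # mean 3.5 > median 3
left_tailed = [1, 7, 8, 8, 8, 9, 9, 10]   # mean 7.5 < median 8
symmetric = [1, 2, 3, 4, 5]               # mean = median = 3

for name, values in [("right-tailed", right_tailed),
                     ("left-tailed", left_tailed),
                     ("symmetric", symmetric)]:
    print(name, round(pearson_skewness(values), 3))
```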

Measure of Kurtosis
Kurtosis measures whether the scores are spread out more or less than they would be in a normal (Gaussian)
distribution

130/151.195

A distribution is said to be

mesokurtic if K = 3

leptokurtic if K > 3

platykurtic if K < 3
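
The kurtosis formula used in class is not reproduced in these notes. The sketch below assumes the common moment-based definition, K = m4 / m2², where m2 and m4 are the second and fourth moments about the mean; under that definition a normal distribution has K = 3, which agrees with the classification above. The function names and data are hypothetical.

```python
def kurtosis(values):
    """Moment-based kurtosis K = m4 / m2**2 (assumed definition; normal gives 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second moment about the mean
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth moment about the mean
    return m4 / m2 ** 2

def classify(k, tol=1e-9):
    """Label a distribution from its K value."""
    if abs(k - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"

flat = [1, 2, 3, 4, 5]                # evenly spread values
peaked = [0, 5, 5, 5, 5, 5, 10]       # most values piled at the center

for name, values in [("flat", flat), ("peaked", peaked)]:
    k = kurtosis(values)
    print(name, round(k, 3), classify(k))
```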
