
Ramon Magsaysay Memorial Colleges

GRADUATE SCHOOL
Pioneer Avenue, General Santos City, Philippines
Website: www.rmmcmain.edu.ph

ADVANCED EDUCATIONAL STATISTICS

Reflection Paper on Different Topics

1. Graduate School’s Vision, Mission, and goals and objectives.

2. Introduction to Statistics

2.1 Definition of Statistics

2.2 Descriptive vs. Inferential Statistics

2.3 Levels of Data Measurement

2.4 Data Collection and Presentations

2.5 Frequency Distribution/Graphical/Textual

3. Measures of Central Tendency

3.1 Mean

3.2 Median

3.3 Mode

4. Measures of Location

4.1 Quartile

4.2 Decile

4.3 Percentile

5. Measures of Variation, Skewness, and kurtosis

5.1 Variability

5.2 Range

5.3 Interquartile range

5.4 Semi-Interquartile Range / Quartile Deviation



5.5 Mean Absolute Deviation

5.6 Variance

5.7 Standard Deviation

5.8 Coefficient of Variation

5.9 Skewness

5.10 Kurtosis

ROCHELLE Y. CONTANG
Student

Marianne C. Sarmiento, PhD


Professor

1. GRADUATE SCHOOL’S VISION, MISSION, GOALS AND OBJECTIVES

A vision without action remains a dream, and action without planning and vision is a waste of
time. The vision and mission of Ramon Magsaysay Memorial Colleges combine dream and action. They
originate from the principles of the late President Ramon Magsaysay, who dreamed of nothing but a
better future for all Filipinos who wanted to pursue education at every level.

Every institution has its own mission and vision, which define its objectives and desired future.
RMMC-Gensan is no exception: its mission and vision statements aim to produce future-ready global
citizens who can champion the development of society and continue the legacy that our Founder left
to us. From my point of view, this mission and vision can inspire us as we take the path toward a
successful life. The vision has a huge impact on us because it describes the future that the
institution desires for us. We can become successful, future-ready citizens who make a major
contribution to the development of our country only if we keep in mind and take to heart the vision,
mission, goals, and objectives. Fortitude gives us the strength we need to face problems, as do
Excellence and Uprightness, and no one can be successful without the loyalty and compassion that
remind us to keep our feet on the ground no matter how far or how high our journey takes us. We
should not forget where we came from. As for the mission of this institution, I believe that we
students benefit from it the most, because its objective is to give us a quality education that
uses a technology-enabled and adaptive learning approach. Such an education can develop
industry-responsive and enterprising individuals while sustaining the Founder’s legacy and his
pioneering spirit in education. For me, the mission and vision statements exist to help us become
successful human beings who can make a major contribution to our society, just as our Founder did.

2. INTRODUCTION TO STATISTICS

Statistical knowledge is important to both statisticians and non-statisticians (Broers, 2006). It is,
therefore, recommended that people from all disciplines are given basic skills in statistics.

For this reason, I pursued this course to obtain quantitative skills that I can apply and improve on
in several ways. In particular, I hoped to gain knowledge in designing experiments, collecting and
analyzing data, interpreting results, and drawing conclusions.

This course has given me a clear understanding of the role of statistics in life. Statistics is the
most used research tool in medicine, education, psychology, business and economics, among other
fields. It helps in shaping people’s choices in their daily lives. For example, statistical findings can give a
clear understanding of implications of some behaviors such as smoking and lead to corrective measures.

I have always wanted to build a strong career in research. I wish to have advanced skills in
statistics that will enable me to handle and analyze large and complex research problems. In this
regard, the knowledge I have already obtained will give me a head start.

2.1 DEFINITION OF STATISTICS

Statistics is the study that deals with the collection and analysis of data. Statistics plays a vital
role in our daily lives. It keeps us informed about what is happening in the world around us. It is
important because today we rely on information from books, the internet, and research studies, and
much of this information is determined mathematically with the help of statistics. This means that
statistical concepts are necessary if we are to be correctly informed by data.

As a teacher, I find statistics very important because it helps determine the learners’ performance.
It helps me determine how many pupils should undergo intervention. It is also a big help in my
research study, where I use statistics to collect, organize, and analyze the relevant data. It also helps

me to be a good teacher: using hypothesis tests, I can compare the different teaching methods and
strategies I use and determine which of them are the most effective in the teaching and learning
process.

2.2 DESCRIPTIVE VS. INFERENTIAL STATISTICS

Descriptive statistics uses data to describe a population or sample through numerical calculations,
graphs, or tables, while inferential statistics makes inferences and predictions about a population
based on a sample of data taken from that population. At first glance, collected data may not make
sense; hence, statistics helps organize the data into a more meaningful form so that its users can
understand it better. Descriptive and inferential statistics are the two major branches of
statistics. Descriptive statistics organizes large amounts of data in a sensible way and so forms
the basis for quantitative analysis. The data are first collected from the population or sample and
then summarized. However, the description applies only to the sample or population under study and
cannot be generalized to other samples or populations. Inferential statistics, unlike descriptive
statistics, is more general. It refers to making deductions, predictions, or inferences about a
population from a study conducted on samples. The goal of both descriptive and inferential
statistics is to provide summaries of data in a manner that is useful and simple.

2.3 LEVELS OF DATA MEASUREMENT

There are four (4) levels of data measurement: nominal, ordinal, interval, and ratio. According to
the reporter, nominal is the least precise and informative level because it only names a
characteristic or identity. No ordering of cases is implied. Numerals are used only as labels to
identify items uniquely. It is less precise and informative than the other levels because it does
not indicate quantity, rank, or any other measurement.

Examples of nominal measurement are driver’s license numbers, telephone or cellphone numbers, zip
codes, postal codes, and gender.

The second level of data measurement is ordinal. Ordinal data indicate ordered levels or ranks, such
as economic status (low income, middle income, high income), educational level (high school, BS,
PhD), and satisfaction rating (extremely dislike, dislike, neutral, like, extremely like). The third
level is interval. In interval measurement, the distance between attributes has meaning. For
example, when we measure temperature in Fahrenheit, the distance from 30 to 40 is the same as the
distance from 70 to 80. Lastly, the fourth level of measurement is ratio. Ratio is the highest of
the four hierarchical levels of measurement because, in addition to equal intervals, it has a true
zero point. Height, money, age, and weight are examples of ratio data.

2.4 DATA COLLECTION AND PRESENTATIONS

Data presentation is a method by which people organize, summarize, and communicate information
using a variety of tools such as tables, graphs, and diagrams. It makes the data easier to compare
and to understand. There are three forms of data presentation. The first is textual or descriptive
presentation, from the words “text” and “describe”, which means that the data are written in
paragraph or text form. The second is tabular presentation, in which the data are presented in rows
and columns. A table can represent even large amounts of data in an attractive, easy-to-read, and
organized manner. The third is diagrammatic presentation, a simple and effective method of
presenting the information that statistical data contain. It is a technique of presenting numeric
data through pictograms, cartograms, bar diagrams, and pie diagrams.



2.5 FREQUENCY DISTRIBUTION/GRAPHICAL TEXTUAL

A frequency distribution is a means of organizing a large amount of data. It shows how often each
item or value occurs, in graphical or tabular form. It is a big help with large data sets because it
makes the population easier to interpret. For example, if you want to determine the IQ levels of the
pupils in your classroom, a frequency distribution lets you easily see how many pupils fall in the
low, middle, and high IQ groups. Many graphs can be used to present a frequency distribution, such
as pie, bar, and line graphs.
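
As a minimal sketch of the idea, assuming a hypothetical set of quiz scores (the numbers are made up for illustration), a frequency distribution can be tallied in Python and shown as a small table with a text bar for each score:

```python
from collections import Counter

# Hypothetical quiz scores for a class of 12 pupils (illustrative data only).
scores = [7, 8, 8, 9, 10, 7, 8, 6, 9, 8, 10, 7]

# Count how often each score occurs.
frequency = Counter(scores)

# Print the distribution lowest score first; the '#' bar stands in for a bar graph.
for score in sorted(frequency):
    print(f"{score:>5} | {frequency[score]:>2} | {'#' * frequency[score]}")
```

The tabular part is the score and count columns; the row of `#` characters plays the role of a simple bar graph.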

3. MEASURES OF CENTRAL TENDENCY

3.1 MEAN

According to the reporter, the mean is an essential concept in mathematics and statistics. The mean
is the average of the values in a collection. The arithmetic mean is best known as the AVERAGE.
Listening to the reporter’s discussion of the mean, I noticed that the arithmetic mean is very easy
to calculate and understand, and it gives an exact value.

As a teacher, I find the arithmetic mean very important because it helps determine the learners’
performance. It helps me determine how many pupils should undergo intervention. As a businesswoman,
I also find it a big help to know how to compute the mean. For instance, I may want to calculate a
stock’s average closing price during a particular month. Say there are 23 trading days in the month.
Simply take all the closing prices, add them up, and divide by 23 to get the arithmetic mean.

The mean also has a formula. For ungrouped data, it is $\bar{x} = \frac{\Sigma x}{n}$, where $\bar{x}$
stands for the sample mean, $x$ for the value of each item, $n$ for the number of items in the sample,
and $\Sigma$ for “the sum of”.

For grouped data, the formula is $\bar{x} = \frac{\Sigma fx}{\Sigma f}$, where $\Sigma fx$ stands for
the sum of the products of each class frequency and its midpoint, and $\Sigma f$ for the sum of the
frequencies.

The arithmetic mean is simple, and most people with even a little finance and math skill can
calculate it. It is also a useful measure of central tendency, as it tends to provide useful results
even with large groupings of numbers.
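
The two formulas for the mean can be sketched in Python as follows. This is only an illustration: the function names, the scores, and the class midpoints and frequencies are hypothetical.

```python
def mean_ungrouped(values):
    """Ungrouped mean: the sum of the values divided by the number of values."""
    return sum(values) / len(values)


def mean_grouped(midpoints, frequencies):
    """Grouped mean: sum of (frequency x midpoint) divided by the sum of frequencies."""
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)


# Hypothetical test scores of five pupils.
scores = [80, 85, 90, 75, 95]
print(mean_ungrouped(scores))  # 425 / 5 = 85.0

# Hypothetical grouped scores: classes 70-74, 75-79, 80-84, 85-89.
midpoints = [72, 77, 82, 87]
frequencies = [2, 5, 8, 5]
print(mean_grouped(midpoints, frequencies))  # 1620 / 20 = 81.0
```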

3.2 MEDIAN

Based on what I learned from the reporter, the median has the symbol Md. It is a measure of central
tendency that occupies the middle position in an array of values. The median is another widely used
average that is easy to understand and easy to compute. It cannot be found unless the items are
arranged in ascending or descending order. It is the point that divides the frequency distribution
into two halves.

In the classroom, the median is often used, especially when computing grades and getting the daily
average of the learners’ assessments. The median helps me distinguish those who passed from those
who did not in the teaching and learning process. It also helps me distinguish the mastered items
from the non-mastered ones, which is a big help to me because I can easily identify the hardest and
the easiest questions.

To get the median, the values should first be arranged in ascending or descending order. For
ungrouped data, the position of the median is $\frac{n+1}{2}$; the value in that position is the
median. For grouped data, the formula is $Md = Lb + \left(\frac{\frac{n}{2} - F}{f}\right) i$, where
$Lb$ stands for the lower boundary of the median class, $F$ for the cumulative frequency of the class
interval preceding the median class when the scores are arranged from lowest to highest, $i$ for the
size of the median class, and $f$ for the frequency of the median class.
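
A short Python sketch of both median computations is given below. The function names and the data are hypothetical, and the grouped example assumes classes 70-74, 75-79, 80-84, and 85-89 with frequencies 2, 5, 8, and 5, so that the median class is 80-84.

```python
def median_ungrouped(values):
    """Middle value of the ordered data; the mean of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_boundary, n, cum_freq_before, class_freq, class_size):
    """Grouped median: Lb + ((n/2 - F) / f) * i."""
    return lower_boundary + ((n / 2 - cum_freq_before) / class_freq) * class_size


print(median_ungrouped([5, 1, 3]))     # ordered 1, 3, 5 -> 3
print(median_ungrouped([3, 1, 4, 2]))  # ordered 1, 2, 3, 4 -> 2.5

# n = 20, so n/2 = 10 falls in class 80-84 (Lb = 79.5, F = 7, f = 8, i = 5).
print(median_grouped(79.5, 20, 7, 8, 5))  # 79.5 + (3/8) * 5 = 81.375
```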

3.3 MODE

Mode is the third kind of average. The mode is the number that occurs most frequently in a given
set of data. For example, to get the mode of test scores in ungrouped data, arrange the scores in
ascending order, from lowest to highest, and then find the number that appears on the list most
frequently. There are two types of data sets: ungrouped data and grouped data. For a grouped
distribution, the formula for the mode is $Mo = Lb + \left(\frac{d_1}{d_1 + d_2}\right) c$, where
$Lb$ stands for the lower boundary of the modal class, $d_1$ for the difference between the frequency
of the modal class and the frequency of the class interval lower than the modal class, $d_2$ for the
difference between the frequency of the modal class and the frequency of the class interval higher
than the modal class, and $c$ for the size of the modal class. A set of data with one mode is called
unimodal, and a set with two modes is called bimodal. The mode does not necessarily fall near the
middle of a given set of numbers. It just indicates the number or numbers that are most common in
the set.

In certain cases, mode can be an extremely helpful measure of central tendency. One of its

biggest advantages is that it can be applied to any type of data, whereas both the mean and median

cannot be calculated for nominal data. It is also not affected by extreme values in datasets with

quantitative data. Thus, it can provide insights into almost any dataset despite the data distribution.

On the other hand, the statistical measure also comes with its own limitation. For instance, it

cannot be further treated mathematically. Therefore, the measure cannot be used for more

detailed analysis. In addition, since it is not based on all values in the dataset, it is difficult to draw

conclusions regarding the dataset relying on mode only.
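
To illustrate, here is a small Python sketch of the mode for ungrouped and grouped data. The scores and class frequencies are hypothetical, and the grouped function assumes the usual form Lb + (d1 / (d1 + d2)) x c, with d1 and d2 being the differences between the modal class frequency and the frequencies of the classes below and above it.

```python
import statistics


def mode_grouped(lower_boundary, d1, d2, class_size):
    """Grouped mode: Lb + (d1 / (d1 + d2)) * c."""
    return lower_boundary + (d1 / (d1 + d2)) * class_size


# Ungrouped data: multimode returns every value tied for most frequent.
scores = [2, 3, 3, 5, 7, 7, 9]
modes = statistics.multimode(scores)
print(modes)  # [3, 7], so this hypothetical set is bimodal

# Hypothetical grouped data: classes 70-74, 75-79, 80-84, 85-89.
frequencies = [2, 5, 8, 4]
modal_index = frequencies.index(max(frequencies))          # class 80-84
d1 = frequencies[modal_index] - frequencies[modal_index - 1]  # 8 - 5 = 3
d2 = frequencies[modal_index] - frequencies[modal_index + 1]  # 8 - 4 = 4
grouped_mode = mode_grouped(79.5, d1, d2, 5)
print(round(grouped_mode, 2))  # 79.5 + (3/7) * 5, about 81.64
```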

4. MEASURES OF LOCATION

4.1 Quartile

Quartiles are values that divide your data into quarters. However, quartiles are not shaped like
pizza slices; instead, they divide the data into four segments according to where the numbers fall
on the number line. Because quartiles divide the numbers according to their position on the number
line, you have to put the numbers in order before you can figure out where the quartiles are.

To solve for a quartile in ungrouped data, first arrange the scores in an array, that is, in order
of magnitude. The formula used for the position of the $k$th quartile is
$Q_k = \left[\frac{k}{4}n + \left(1 - \frac{k}{4}\right)\right]^{\text{th}}$ score. For grouped data,
the formula used is $Q_k = Lb + \left(\frac{\frac{kn}{4} - cfp}{fb}\right) C$.

So, when we talk about quartiles, we are dividing the data set into 4 quarters. Each quarter is 25% of

the total number of data points. The first quartile or Q1 is the value in the data set such that 25% of the

data points are less than this value and 75% of the data set is greater than this value. The second

quartile or Q2 is the value in the data set such that 50% of the data points are less than this value and

50% of the data set are greater than this value. The third quartile or Q3 is the value such that 75% of the

values are less than this value and 25% of the values are greater than this value.
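
The sketch below shows one way to locate quartiles in Python. It assumes a linear-interpolation convention in which the k-th quartile sits at position (k/4)n + (1 - k/4) in the ordered list; this is the convention used by `statistics.quantiles` with `method="inclusive"`. Other textbooks and software use slightly different position formulas, so results may differ a little. The scores are hypothetical.

```python
import statistics


def quartile_position(k, n):
    """1-based position of the k-th quartile among n ordered scores (assumed convention)."""
    return (k / 4) * n + (1 - k / 4)


# Hypothetical scores, arranged in ascending order first.
scores = sorted([9, 2, 7, 4, 12, 5, 4, 10, 8])  # [2, 4, 4, 5, 7, 8, 9, 10, 12]
n = len(scores)

positions = [quartile_position(k, n) for k in (1, 2, 3)]
print(positions)  # [3.0, 5.0, 7.0]: the 3rd, 5th, and 7th scores

q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
print(q1, q2, q3)  # 4.0 7.0 9.0
```

Here Q2 lands on the 5th of 9 scores, which is the same position the median formula (n + 1) / 2 gives.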

4.2 Decile

A Decile is a specific type of quantile. It divides a data set into 10 parts where each part contains the

same amount of data (10 equal frequencies). In order to split a dataset into ten parts (deciles), nine

cutpoints or numbers should be found. They are the boundaries, showing where the deciles begin and

end.

There are also other types of quantiles. If a data set is split into 100 parts, with each part containing

the same amount of data, each of these 100 parts would be called a percentile. Deciles can be seen as a

special type of percentiles, where the data is divided into groups of 10%.

There are many formulas for calculating deciles. When a formula is chosen, it is necessary to be
consistent: the same formula should be used for the whole data set. According to our professor, the
formula we should use for the position of the $k$th decile in ungrouped data is
$D_k = \left[\frac{k}{10}n + \left(1 - \frac{k}{10}\right)\right]^{\text{th}}$ score, and for grouped
data, the formula used is $D_k = Lb + \left(\frac{\frac{kn}{10} - cfp}{fd}\right) C$.
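
The grouped formulas for quartiles, deciles, and percentiles share one pattern, so a single Python sketch can cover all three. The function below is hypothetical: it assumes that `cfp` means the cumulative frequency before the class containing the target position, that the divisor is the frequency of that class, and that all classes have the same size. The frequency table is made up.

```python
def grouped_quantile(k, parts, lower_boundaries, frequencies, class_size):
    """Lb + ((k*n/parts - cfp) / f) * C, where parts is 4, 10, or 100."""
    n = sum(frequencies)
    target = k * n / parts  # position k*n/parts within the cumulative frequencies
    cumulative = 0          # cfp: cumulative frequency before the current class
    for lb, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= target:
            return lb + ((target - cumulative) / f) * class_size
        cumulative += f
    raise ValueError("k must be between 1 and parts - 1")


# Hypothetical classes 70-74, 75-79, 80-84, 85-89 (lower boundaries 69.5, 74.5, ...).
lower_boundaries = [69.5, 74.5, 79.5, 84.5]
frequencies = [2, 5, 8, 5]

d4 = grouped_quantile(4, 10, lower_boundaries, frequencies, 5)    # 4th decile
q2 = grouped_quantile(2, 4, lower_boundaries, frequencies, 5)     # 2nd quartile
p90 = grouped_quantile(90, 100, lower_boundaries, frequencies, 5)  # 90th percentile
print(d4, q2, p90)  # 80.125 81.375 87.5
```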

4.3 Percentile

A percentile is a value on a scale of 100 that indicates the percent of a distribution that is equal
to or below it. The symbol P is often used. To solve for a percentile in ungrouped data, first
arrange the scores in an array, that is, in order of magnitude. The formula used for the position of
the $k$th percentile is
$P_k = \left[\frac{k}{100}n + \left(1 - \frac{k}{100}\right)\right]^{\text{th}}$ score. For grouped
data, the formula used is $P_k = Lb + \left(\frac{\frac{kn}{100} - cfp}{fb}\right) C$.

Percentiles report the relative standing of a specific value in a statistical data set. What is important

here is where one stands in relation to everyone else rather than in relation to the mean. Furthermore,

the percentile is a great tool to indicate the relative standing of a value. Percentile tells where a value

falls within a particular distribution of values. Most noteworthy, percentiles tell how a value compares

to other values.

A useful and convenient property of percentiles is universal interpretation. Furthermore, being at

95th percentile means the same thing no matter what. 95th percentile certainly means that 95% of the

values lie below yours, while 5% lie above it. This is irrespective of the fact whether one is analyzing

weights of packages or exam scores. Most noteworthy, this enables a fair comparison of data sets that

have different means and standard deviation.

A percentile is certainly not a percent. Most noteworthy, a percentile is a value in the data set that

marks a specific percentage of the way through the data. For example, an individual’s score is 80th

percentile. This certainly does not mean that the individual scored 80% of the questions correctly.

Rather it means that 80% of the students’ scores are lower than the individual. Moreover, it also means

that 20% of the students’ scores were higher than the individual.

Percentiles help you understand what position you hold in a group. They answer whether you are
above the middle, below it, or at the middle.
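
As a small illustration of the difference between a percentile and a percent, the Python sketch below computes the percentage of scores that fall below a given score. The function name and the class scores are hypothetical, and "below" is taken strictly (ties are not counted), which is only one of several conventions.

```python
def percent_below(scores, value):
    """Percentage of scores that are strictly lower than the given value."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)


# Hypothetical exam scores of ten students.
class_scores = [55, 60, 65, 70, 72, 75, 80, 85, 90, 95]

# A student who scored 90 did not necessarily answer 80% of the items correctly;
# rather, 80% of the class scored lower than this student.
print(percent_below(class_scores, 90))  # 80.0
```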

5. MEASURES OF VARIATION, SKEWNESS, AND KURTOSIS

5.1 Variability

According to the reporter, variability means “scatter or spread”. Thus, measures of variability
refer to the scatter or spread of scores around their central tendency. There are four ways to
measure variability: range, interquartile range, standard deviation, and variance.

Variability is important because it helps us understand how far the data values in a distribution
are spread out, using simple measures that best represent that spread. When we talk about
variability, we are talking about how scattered, dispersed, or spread out the data are.

5.2 Range

Based on what I learned, range is the simplest and easiest measure of variability. Range is only the

difference between the highest and lowest scores in a distribution. The formula for the range is R = h - l,

wherein, R stands for Range, h as highest score and l as lowest score. Range is not very informative

because it is based only on the most extreme scores. It is severely affected by extreme scores in your

data distribution. Just one of these extreme scores can significantly alter the range. Therefore, it is not

used as a reliable measure of variability.
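
A brief Python sketch, using hypothetical scores, shows how a single extreme score changes the range:

```python
def score_range(scores):
    """Range R = h - l: the highest score minus the lowest score."""
    return max(scores) - min(scores)


scores = [70, 72, 75, 78, 80]  # hypothetical scores
with_outlier = scores + [20]   # the same scores plus one extreme low score

print(score_range(scores))        # 80 - 70 = 10
print(score_range(with_outlier))  # 80 - 20 = 60
```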

5.3 Interquartile Range

The interquartile range, or IQR, is a measure of where the “middle fifty” percent of a data set
lies. Where the range measures where the beginning and end of a set are, the interquartile range
measures where the bulk of the values lie. To find the interquartile range, first arrange the data
in ascending or descending order. After that, find the median (middle value) of the lower half and
of the upper half of the data. Those values are quartile 1 and quartile 3.

For grouped data, the interquartile range is solved the same way once the quartiles are found: the
IQR is equal to the value of the first quartile subtracted from the value of the third quartile. It
is less likely to be influenced by extreme scores and therefore gives a better and more stable
measure of variability than the range.

5.4 Semi-Interquartile Range / Quartile Deviation

The semi-interquartile range is a measure of the spread or dispersion of the observations in a
dataset. It is computed as one half of the difference between the third quartile (Q3), or 75th
percentile, and the first quartile (Q1), or 25th percentile: semi-interquartile range = (Q3 - Q1) / 2.

Since half the scores in a distribution lie between Q3 and Q1, the semi-interquartile range is 1/2
the distance needed to cover 1/2 the scores. In a symmetric distribution, an interval stretching
from one semi-interquartile range below the median to one semi-interquartile range above the median
will contain 1/2 of the scores. This will not be true for a skewed distribution, however.

The semi-interquartile range is little affected by extreme scores, so it is a good measure of spread

for skewed distributions. However, it is more subject to sampling fluctuation in normal distributions

than is the standard deviation and therefore not often used for data that are approximately normally

distributed.
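
The interquartile range and the semi-interquartile range can be sketched in Python as follows. This sketch assumes the "median of each half" method for finding Q1 and Q3, and it leaves the middle value out of both halves when the number of scores is odd; other quartile conventions may give slightly different values. The function name and the data are hypothetical.

```python
import statistics


def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the ordered data."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]   # excludes the middle value when n is odd
    upper_half = ordered[-half:]
    return statistics.median(lower_half), statistics.median(upper_half)


scores = [1, 3, 5, 7, 9, 11, 13, 15]  # hypothetical scores
q1, q3 = quartiles_by_halves(scores)
iqr = q3 - q1        # interquartile range
semi_iqr = iqr / 2   # semi-interquartile range (quartile deviation)
print(q1, q3, iqr, semi_iqr)  # 4.0 12.0 8.0 4.0
```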

5.5 Mean Absolute Deviation

In this lesson, we learned how to use the mean absolute deviation to find not only the average of a
set of numbers but also the average distance of each number from that average.

Finding the mean is essentially finding the average of a set of numbers. Once the mean is found, the
next step is the absolute deviation, in other words, determining the distance of each of the
original numbers from the mean found in step one. The mean absolute deviation is the average of
those distances between each number and the mean of the set.

Many professionals use mean in their everyday lives. Teachers give tests to students and then

average the results to see if the average score was high, in between, or too low. Each average tells a

story.

Absolute deviation can further help to see the distance between each of the scores and the

beginning average scores. Analyzing information in this way helps the teacher to see if the test was too

hard, too easy, or just right, based upon the mathematical outcomes.
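
The steps for the mean absolute deviation can be sketched in Python like this, using hypothetical test scores:

```python
def mean_absolute_deviation(values):
    """Average distance of each value from the mean of the values."""
    mean = sum(values) / len(values)               # step 1: find the mean
    distances = [abs(v - mean) for v in values]    # step 2: distance of each value from the mean
    return sum(distances) / len(distances)         # step 3: average those distances


scores = [2, 4, 6, 8, 10]  # hypothetical scores; the mean is 6
mad = mean_absolute_deviation(scores)
print(mad)  # distances 4, 2, 0, 2, 4 -> 12 / 5 = 2.4
```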

5.6 Variance

Variance is the average of the squared differences of scores from the mean score of a distribution.
The variance measures how spread out the data are about their mean. The variance is equal to the
standard deviation squared.

The greater the variance, the greater the spread in the data. Because variance ($\sigma^2$) is a
squared quantity, its units are also squared, which may make the variance difficult to use in
practice. The standard deviation is usually easier to interpret because it is in the same units as
the data. For example, a sample of waiting times at a bus stop may have a mean of 15 minutes and a
variance of 9 minutes$^2$. Because the variance is not in the same units as the data, the variance
is often displayed with its square root, the standard deviation. A variance of 9 minutes$^2$ is
equivalent to a standard deviation of 3 minutes.

5.7 Standard Deviation

One way to measure the spread of data is by looking at the standard deviation. To get the standard
deviation, first subtract the mean from each value and square the differences. Then sum those
squared differences and divide that sum by the number of differences. Finally, take the square root
of that quotient. The reason for subtracting and squaring is pretty clear: whether a value is above
or below the mean, the squared difference between the value and the mean comes out positive. So
whether a difference is positive or negative makes no difference here.
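
The steps for the variance and the standard deviation can be traced in a short Python sketch. The waiting times below are hypothetical, chosen so that the mean is 15 minutes. The sketch divides by the number of values, as described in the steps; when estimating from a sample, many textbooks divide by n - 1 instead.

```python
import math

# Hypothetical waiting times at a bus stop, in minutes.
waiting_times = [12, 18, 12, 18]

mean = sum(waiting_times) / len(waiting_times)                     # 15.0 minutes
squared_differences = [(t - mean) ** 2 for t in waiting_times]     # 9.0 for every value
variance = sum(squared_differences) / len(squared_differences)     # 9.0 squared minutes
standard_deviation = math.sqrt(variance)                           # 3.0 minutes

print(mean, variance, standard_deviation)  # 15.0 9.0 3.0
```

The variance is in squared minutes, while the standard deviation is back in minutes, the same unit as the data.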

5.8 Coefficient of Variation

According to the reporter, the coefficient of variation is also known as the relative standard
deviation or RSD. It is the ratio of the standard deviation to the mean. To get the coefficient of
variation, just divide the standard deviation by the mean and multiply by 100.

Coefficient of variation represents the ratio of the standard deviation to the mean, and it is a useful

statistic for comparing the degree of variation from one data series to another, even if the means are

drastically different from each other.

The coefficient is very useful because the standard deviation of data must always be understood in
the context of the mean of the data. The disadvantage of the coefficient of variation is that when
the mean value is near zero, it is sensitive to small changes in the mean, which limits its
usefulness.
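
As an illustration, the Python sketch below compares two hypothetical data series with very different means and units. The function name and the summary values are assumptions made for the example.

```python
import math


def coefficient_of_variation(standard_deviation, mean):
    """CV in percent: (standard deviation / mean) * 100."""
    return standard_deviation / mean * 100


# Hypothetical summaries of two data series measured in different units.
height_cv = coefficient_of_variation(7.5, 150)  # heights in cm: SD 7.5, mean 150
weight_cv = coefficient_of_variation(5, 50)     # weights in kg: SD 5, mean 50

print(round(height_cv, 2), round(weight_cv, 2))  # 5.0 10.0
```

Although the heights have the larger standard deviation, the weights vary more relative to their mean.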

5.9 Skewness

Skewness assesses the extent to which a variable’s distribution is symmetrical. It is also used to
describe the differences among the mean, median, and mode of a distribution. If the mode exceeds the
mean and median, the distribution is skewed to the left, or negatively skewed. If the mean exceeds
the mode and median, the distribution is skewed to the right, or positively skewed. If the mean,
median, and mode are equal, the distribution is symmetrical.

There are two formulas for measuring skewness. Pearson’s first coefficient of skewness is the
difference between the mean and the mode divided by the standard deviation,
$Sk_1 = \frac{\bar{x} - Mo}{s}$. Pearson’s second coefficient of skewness is three times the
difference between the mean and the median divided by the standard deviation,
$Sk_2 = \frac{3(\bar{x} - Md)}{s}$.
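
Both Pearson coefficients can be sketched in Python as shown below. The function names and the summary values (mean 70, median 75, mode 80, standard deviation 10) are hypothetical.

```python
import math


def pearson_skewness_1(mean, mode, standard_deviation):
    """Pearson's first coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / standard_deviation


def pearson_skewness_2(mean, median, standard_deviation):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / standard_deviation


# Hypothetical class summary in which the mode exceeds the median and the mean.
sk1 = pearson_skewness_1(70, 80, 10)
sk2 = pearson_skewness_2(70, 75, 10)
print(sk1, sk2)  # -1.0 -1.5, both negative: skewed to the left
```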

5.10 Kurtosis

Kurtosis comes from the Greek word “Kurtos” meaning humped. Kurtosis is a measure of whether

the distribution is too peaked. According to the reporter, if the number is greater than +1, the

distribution is too peaked. Likewise, a kurtosis of less than -1 indicates a distribution that is too flat.

In solving the kurtosis for ungrouped data, the formula is
$K = \frac{\Sigma (x - \bar{x})^4}{n s^4}$, wherein $K$ stands for the measure of kurtosis, $x$ for
the sample data, $\bar{x}$ for the mean, $n$ for the number of data in the sample, and $s$ for the
sample standard deviation. For grouped data, the formula used is
$K = \frac{\Sigma f (X_m - \bar{x})^4}{n s^4}$, wherein $K$ stands for kurtosis, $f$ for frequency,
$X_m$ for the midpoint, $\bar{x}$ for the mean, $n$ for the number of data in the sample, and $s$
for the sample standard deviation.

There are also 3 types of Symmetrical Curves; the Platykurtic Distribution wherein kurtosis is less

than 3, Mesokurtic Distribution or Normal Distribution wherein it is equal to 3 and Leptokurtic

Distribution wherein it is greater than 3.
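
The ungrouped kurtosis formula and the three types of curves can be sketched in Python as follows. The function names and the data are hypothetical, and the sketch assumes that s is the sample standard deviation as computed by `statistics.stdev` (dividing by n - 1).

```python
import math
import statistics


def kurtosis_ungrouped(values):
    """K = sum((x - mean)^4) / (n * s^4), with s the sample standard deviation."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return sum((x - mean) ** 4 for x in values) / (n * s ** 4)


def curve_type(k):
    """Classify a kurtosis value against the reference value of 3."""
    if math.isclose(k, 3):
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"


scores = [2, 4, 6, 8, 10]  # hypothetical data: mean 6, s^2 = 10
k = kurtosis_ungrouped(scores)
print(round(k, 3), curve_type(k))  # 544 / (5 * 100) = 1.088 platykurtic
```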

