
STATISTICAL ANALYSIS WITH SOFTWARE APPLICATION

QUIZ 1

diamla, foronda, gan

STATISTICS
● the branch of mathematics that transforms data into useful information for decision makers.
➜ DESCRIPTIVE STATISTICS
★ Collecting, summarizing, and describing data
➜ INFERENTIAL STATISTICS
★ Drawing conclusions and/or making decisions concerning a population based only on sample data

WHAT IS STATISTICS? (Doane & Seward, 2019)
● Statistics is the science of collecting, organizing, analyzing, interpreting, and presenting data.
● A STATISTIC is a single measure, reported as a number, used to summarize a sample data set; for example, the average height of students in a university.
● Examples of statistics:
➜ average height, used to set the length of the gowns
➜ maximum height, used to design the height of the classroom doorways, etc.

USES OF STATISTICS (Doane & Seward, 2019)
● AUDITING
➜ The firm has learned that some invoices are being paid incorrectly, but it doesn’t know how widespread the problem is. A sample of invoices can be used to estimate the proportion of incorrectly paid invoices.
● MARKETING
➜ Many companies use Customer Relationship Management (CRM) to analyze customer data from multiple sources. With statistical and analytics tools such as correlation and data mining, they identify specific needs of different customer groups, and this helps them market their products and services more effectively.
● HEALTH CARE
➜ Evaluate 100 incoming patients using a 42-item physical and mental assessment questionnaire.
● QUALITY IMPROVEMENT
➜ Initiate a triple inspection program, setting penalties for workers who produce poor-quality output.
● PURCHASING
➜ A food producer purchases plastic containers for packaging its product. Inspection of the most recent shipment of 500 containers found that 3 of the containers were defective. The supplier’s historical defect rate is 0.005. Has the defect rate really risen or is this simply a “bad” batch?
● MEDICINE
➜ Determine whether a new drug is really better than the placebo or if the difference is due to chance.
● OPERATIONS MANAGEMENT
➜ Manage inventory by forecasting consumer demand.
● PRODUCT WARRANTY
➜ Determine the average dollar cost of engine warranty claims on a new hybrid engine.

KINDS OF STATISTICS (Doane & Seward, 2019)
● DESCRIPTIVE STATISTICS
➜ refers to the collection, presentation, and summary of data (either using charts and graphs or using numerical summary).
● INFERENTIAL STATISTICS
➜ refers to generalizing from a sample to a population, estimating unknown population parameters, drawing conclusions, and making decisions.

SOME DEFINITIONS
● VARIABLE
➜ is any characteristic of a person or an object that may vary across persons or across different time points.
Age | Student number | Asset size
Quiz scores | Civil status | Place of birth

● DATA
➜ are the values associated with a variable. For the variable civil status, possible data are single,
married, separated, widow/widower.

Height | 5’6” | 1.75 meters | 165 cm
Academic Strand | ABM | STEM | GAS

CLASSIFICATION OF VARIABLES
1. According to nature
a. Categorical or Qualitative Variables
★ Categorical or Qualitative Variables have values which can be placed into categories.
★ Examples of categorical variables are gender, civil status, frequency
b. Numerical or Quantitative Variables
★ Numerical or quantitative variables have values that represent quantities.
★ Examples are height, weight and temperature
2. According to scale
a. Categorical or Qualitative Variables
★ A NOMINAL SCALE is used to classify persons (or objects) into distinct categories in which NO ranking is implied.

Categorical Variables | Categories
Car Ownership | Yes / No
Profession | Engineer, Architect, Teacher, others
Work shift | Day shift / night shift

★ An ORDINAL SCALE is used to classify persons (or objects) into distinct categories in which ranking is implied.

Categorical Variables | Categories
Customer satisfaction | Satisfied, Neutral, Not Satisfied
Student’s Year level | Freshman, Sophomore, Junior, Senior
S&P’s bond ratings | AAA, AA, A, BBB, BB, B, CCC, CC, C, …

b. Numerical or Quantitative Variables
★ An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.
★ A ratio scale is an ordered scale in which the difference between measurements is a meaningful quantity and the measurements have a true zero point.

Numerical Variable | Scale or Level of Measurement
Temperature (in °C or °F) | Interval
Standardized Exam Score (SAT or NSAT) | Interval
Height | Ratio
Weight | Ratio
Age | Ratio
Salary | Ratio
Industry Index | Ratio

STATISTICAL TECHNIQUES FOR DESCRIBING UNIVARIATE DATA
Scale or Level of Measurement | Nominal | Ordinal | Interval | Ratio
Tables | Frequency and Percentage (Summary Table) | Frequency and Percentage (Summary Table) | Frequency and Percentage (FDT) | Frequency and Percentage (FDT)
Charts | Pie chart, bar graph | Pie chart, bar graph | Histogram, line graph | Histogram, line graph
Central Tendency | Mode | Mode and Median | Mean, Median and Mode | Mean, Median and Mode
Variations | | Range and interquartile range | Range, standard deviation, variance and coeff. of variation | Range, standard deviation, variance and coeff. of variation

MEASURES OF CENTRAL TENDENCY
● This is a statistical measure which describes where the center of a frequency distribution lies.
● The three measures commonly used are the mean, the mode and the median.
● Some variations of the mean are the arithmetic mean, geometric mean, weighted mean and the trimmed mean.

MODE
● It is simply the observation or value that occurs most frequently in the data set.
● What is the mode given the following: A, A, B, C, B, C, D, A, A? What do you call the distribution with respect to the mode?
➜ Answer: Mode = A. The distribution is unimodal.
● What is the mode of the distribution: 10, 10, 20, 20, 30, 30, 20, 30? What do you call the distribution with respect to the mode?
➜ Answer: Modes = 20 and 30 (each occurs three times). The distribution is bimodal.
● What is the mode of the distribution: 500, 500, 700, 600, 700, 600? What do you call the distribution with respect to the mode?
➜ Answer: Mode = { } or none. The distribution has no mode.
● The mode is suitable for nominal, ordinal, interval and ratio level variables.
● It is not affected by extreme values.
● There may be no mode.
● There may be several modes.

MEDIAN
● It is simply the middle score or middle value when scores are ranked in order of magnitude.
● It is unique.
● It is relatively unaffected by extreme scores at either end of the distribution.
● Not all values in the distribution contribute to the value of the median.
● It can be used with ordinal, interval and ratio data.
● It is not suitable for nominal data because nominal data have no numerical order.

MEAN
● The mean of a set of numerical data is unique.
● It is the only measure of central tendency where the sum of the deviations of each value from the mean will always be zero.
● It includes precise information from every score; hence, it is affected by a change in any score.
● It is affected by extreme values.
● The means of separate distributions can be combined to get the mean of the total distribution.
● What is the mean of the distribution: 713, 300, 618, 595, 311, 401, and 292?
● The numbers of tornadoes that occurred in the United States over an 8-year period were as follows. What is the mean?
➜ 684, 764, 656, 702, 856, 1133, 1132, 1303

WEIGHTED MEAN

GEOMETRIC MEAN
● This measure is useful for growth rates.
● It mitigates high extremes.
● It is, however, less familiar.
● It requires that all data values be positive.

$G = \sqrt[n]{x_1 x_2 \cdots x_n}$

TRIMMED MEAN
● Computed in the same manner as the arithmetic mean but it omits the highest and lowest k% of data values (e.g., 5%).
● It mitigates the effects of extreme values.
● It has the disadvantage of excluding some data values that could be relevant.

MEASURES OF POSITION
● A measure of position, or quantile, is a general descriptive measurement used to separate quantitative data into distinct groups. To compute quartiles of ungrouped data, the values must first be arranged either in ascending or descending order.
● Quartiles divide the values into four groups of equal size, each comprising 25% of the observations. If n = 50, 25% of the values are less than or equal to Q1.
● Deciles divide the values into ten groups of equal size, each comprising 10% of
observations. If n = 50, 30% of the values are less than or equal to D3.
● Percentiles divide the values into 100 groups of equal size, each comprising 1% of the observations. If n = 200, 65% of the values are less than or equal to P65.

MEASURES OF VARIABILITY
● Variation measures the spread, or dispersion, of values in a data set.
➜ Range
➜ Quartile Deviation
➜ Variance
➜ Standard deviation
➜ Coefficient of Variation
● It measures how much each value differs from the mean.
● It functions as a measure of risk or uncertainty in the field of finance.
● It provides a measure of volatility in considering alternatives for pricing commodities.
● It may be used as a measure of error in the field of forecasting.

RANGE
● The range of a set of data with n observations is defined as the difference between the highest and lowest values.
● The quartile deviation, QD, is the amount of spread within the middle half of the items arranged in an ordered array. It is also called the semi-interquartile range. It is used for ordinal data.

VARIANCE
● The variance is the average (approximately) of squared deviations of values from the mean.

COEFFICIENT OF VARIATION
● The coefficient of variation is the standard deviation divided by the mean, multiplied by 100.
● It is always expressed as a percentage, %.
● It shows variation relative to the mean.
● The CV can be used to compare two or more sets of data measured in different units (e.g. weight in kgs and height in meters).

SUMMARY CHARACTERISTICS
● The more the data are spread out, the greater the range, quartile deviation, variance, and standard deviation.
● The more the data are concentrated, the smaller the range, quartile deviation, variance, and standard deviation.
● If the values are all the same (no variation), all these measures will be zero.
● None of these measures are ever negative.

TABLES AND CHARTS FOR CATEGORICAL DATA
Summary Table
● A summary table indicates the frequency, amount, or percentage of items in a set of categories so that differences in categories can be seen.

Bar Chart
● A bar chart shows a bar for each category, the length of which represents the frequency or percentage of values falling under that category.

Pie Chart
● A pie chart shows a circle broken up into slices that represent categories. The size of each slice of the pie varies according to the percentage in each category.

TABLES AND CHARTS FOR NUMERICAL DATA
Stem and Leaf Display
● A stem and leaf display organizes data into groups (called stems) so that the values within each group (the leaves) branch out to the right on each row.

Frequency Distribution Table
● A frequency distribution table is a summary table in which the data are arranged into numerically ordered class groupings.

Histogram
● A histogram is the graph of data in a frequency distribution where the class boundaries are shown on the horizontal axis while the vertical axis is either frequency, relative frequency or percentage. Bars of the appropriate heights are used to represent the number of observations within each class.

Line Graph
● A percentage polygon is formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.

Scatter Plots
● A scatter plot is used for numerical data consisting of paired observations taken from two numerical variables.
● One variable is measured on the vertical axis while the other variable is measured on the horizontal axis.
● In case of a dependence relationship, the dependent variable is plotted along the vertical axis.
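
The exercises in the MODE, MEAN, GEOMETRIC MEAN and TRIMMED MEAN notes can be checked numerically. The following is a minimal Python sketch, not the course's prescribed software. The helpers `modes` and `trimmed_mean` are hypothetical names introduced here, the "no mode when every value ties" rule follows the answer given in the notes, and the 12.5% trim is an assumption chosen so that exactly one value is dropped from each end of the 8 tornado counts.

```python
import statistics
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency.

    Returns [] when all distinct values occur equally often, which the
    notes treat as a distribution with no mode (assumed convention).
    """
    counts = Counter(values)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return [v for v, c in counts.items() if c == top]


def trimmed_mean(values, k_percent):
    """Arithmetic mean after dropping the lowest and highest k% of values."""
    ordered = sorted(values)
    cut = int(len(ordered) * k_percent / 100)  # values removed from each end
    kept = ordered[cut:len(ordered) - cut]
    return statistics.mean(kept)


# Mean exercises from the notes.
mean_a = statistics.mean([713, 300, 618, 595, 311, 401, 292])
tornadoes = [684, 764, 656, 702, 856, 1133, 1132, 1303]
mean_tornadoes = statistics.mean(tornadoes)

# Mode exercises from the notes: unimodal, bimodal, and no mode.
modes_letters = modes(list("AABCBCDAA"))
modes_numbers = modes([10, 10, 20, 20, 30, 30, 20, 30])
modes_none = modes([500, 500, 700, 600, 700, 600])

# Geometric mean (requires positive data) and a trimmed mean.
geo = statistics.geometric_mean(tornadoes)
trimmed = trimmed_mean(tornadoes, 12.5)

print(mean_a, mean_tornadoes)
print(modes_letters, modes_numbers, modes_none)
print(geo, trimmed)
```

The number of modes returned (one, two, or none) corresponds to the unimodal, bimodal, and no-mode cases in the exercises, and the geometric mean comes out below the arithmetic mean, consistent with the note that it mitigates high extremes.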

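The measures listed under MEASURES OF POSITION and MEASURES OF VARIABILITY can be sketched the same way, reusing the tornado counts from the MEAN exercise. This is an illustration under stated assumptions: textbooks and software use several different quartile conventions, and the values here come from the default ("exclusive") method of Python's `statistics.quantiles`, which may differ slightly from a hand computation taught in class. The variance shown is the sample variance (divisor n - 1), which matches the "average (approximately)" wording in the notes.

```python
import statistics

tornadoes = [684, 764, 656, 702, 856, 1133, 1132, 1303]

# Range: highest value minus lowest value.
data_range = max(tornadoes) - min(tornadoes)

# Quartiles: statistics.quantiles sorts the data and returns the three
# cut points Q1, Q2, Q3 when n=4 (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(tornadoes, n=4)

# Quartile deviation (semi-interquartile range): half the spread of the
# middle 50% of the ordered values.
quartile_deviation = (q3 - q1) / 2

# Sample variance and standard deviation (divisor n - 1).
variance = statistics.variance(tornadoes)
std_dev = statistics.stdev(tornadoes)

# Coefficient of variation: standard deviation relative to the mean, in %.
cv_percent = std_dev / statistics.mean(tornadoes) * 100

print(data_range, q1, q2, q3, quartile_deviation)
print(variance, std_dev, cv_percent)
```

Because the CV is unit-free, the same calculation on a second data set in different units gives a percentage that can be compared directly with this one.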

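A summary table for categorical data, as described under TABLES AND CHARTS FOR CATEGORICAL DATA, is a count and a percentage per category. The sketch below builds one for the letter data from the first MODE exercise. The function name `summary_table` and the one-decimal rounding are assumptions made for illustration.

```python
from collections import Counter


def summary_table(values):
    """Return (category, frequency, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (category, frequency, round(100 * frequency / total, 1))
        for category, frequency in counts.most_common()
    ]


# Data from the first mode exercise: A, A, B, C, B, C, D, A, A
rows = summary_table(list("AABCBCDAA"))

print("Category  Frequency  Percentage")
for category, frequency, percentage in rows:
    print(f"{category:<9} {frequency:<10} {percentage}%")
```

The same rows are what a bar chart or pie chart would draw: bar length from the frequency column, slice size from the percentage column.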