
ENGINEERING DATA ANALYSIS

DATA – a collection of discrete values that convey information, describing quantities, facts, or statistics.

STATISTICS – the study of the collection, organization, analysis, interpretation, and presentation of large sets of data. A statistic can also be defined as a function of the given data.

VARIABLE – a mathematical symbol that may represent a number, a vector, a matrix, a function, the argument of a function, a set, or an element of a set.
– from the Latin word “variabilis”, meaning changeable

OTHER SPECIFIC NAMES FOR VARIABLES:
1. UNKNOWN – a variable in an equation which has to be solved for.
2. INDETERMINATE – a symbol, commonly called a variable, that appears in a polynomial or a formal power series.
   a. POLYNOMIAL – an expression consisting of indeterminates and coefficients that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables.
      Ex. x² – 4x + 7, where x is the variable of the polynomial
   b. PARAMETER – a quantity (usually a number) which is part of the input of a problem and remains constant during the whole solution of the problem.
      Ex. f(x) = ax² + bx + c, where x is the variable of the function and a, b, c are the parameters (constants)

DEPENDENT AND INDEPENDENT VARIABLES
1. DEPENDENT (OUTPUT) – a variable that is implicitly a function of another variable.
2. INDEPENDENT (INPUT) – a variable (often denoted by x) whose variation does not depend on that of another.
   Ex. time, space, mass, density

2 TYPES OF DATA
1. QUALITATIVE (categorical)
   • Describes the object under consideration using a finite set of discrete classes.
   • Can’t be counted or measured easily using numbers.
   Ex. gender of a person
   a. NOMINAL – a set of values that don’t possess a natural ordering.
      Ex. hair color
   b. ORDINAL – values that have a natural ordering while maintaining their class of values.
      Ex. size of a clothing brand
2. QUANTITATIVE
   • Tries to quantify things, and does so by considering numerical values.
   a. DISCRETE
      • consists of numerical values that are easily counted.
      • often identified through graphs.
      Ex. integers or whole numbers
   b. CONTINUOUS
      • a numerical variable that can take an infinite number of possible values.
      Ex. height, temperature, weight

LEVELS OF DATA MEASUREMENT:
4 LEVELS:
• NOMINAL – data can be categorized.
  Ex. city of birth, ethnicity
• ORDINAL – data can be categorized and ranked.
  Ex. top 5 Olympic medalists
• INTERVAL – data can be categorized, ranked, and evenly spaced.
  Ex. test scores, temperature
• RATIO – data can be categorized, ranked, evenly spaced, and has a natural zero.
  Ex. height, weight, age

POPULATION & SAMPLES

POPULATION – includes all the elements from the data set. Measurable characteristics of the population, such as the mean and standard deviation, are known as parameters.

TYPES OF POPULATION:
1. FINITE – also known as a countable population, in which the population can be counted.
   – the population of all individuals or objects that are finite.
   Ex. employees of a company
2. INFINITE – also known as an uncountable population, in which counting the units in the population is not possible.
   Ex. number of germs
3. EXISTENT – a population of concrete individuals.
   Ex. books, students
4. HYPOTHETICAL – a population whose units are not available in solid form.
   Ex. outcome of rolling a die

SAMPLES – includes one or more observations that are drawn from the population.
• SAMPLING – the process of selecting the sample from the population.
  1. PROBABILITY – the population units cannot be selected at the discretion of the researcher; they are chosen by chance.
     Ex. simple random, stratified, cluster, disproportionate
  2. NON-PROBABILITY – the population units can be selected at the discretion of the researcher.
     Ex. quota, purposive, judgement

FREQUENCY DISTRIBUTION
• a visual display that organizes and presents frequency counts so that the information can be interpreted more easily.
• describes the number of observations for each possible value of a variable.
• depicted using graphs and frequency tables.

FREQUENCY OF A VALUE
• the number of times it occurs in a dataset.

TYPES OF FREQUENCY DISTRIBUTION:
1. UNGROUPED – the number of observations of each value of a variable.
   – used for categorical variables.
2. GROUPED – the number of observations of each class interval of a variable.
   • CLASS INTERVALS – ordered groupings of a variable's values.
3. RELATIVE – the proportion of observations of each value or class interval of a variable.
   – used for any type of variable, especially when comparing frequencies.
4. CUMULATIVE – the sum of frequencies less than or equal to each value or class interval of a variable.
   – used for ordinal or quantitative variables when understanding how often observations fall below certain values.

➢ HOW TO MAKE A FREQUENCY TABLE?
FREQUENCY TABLE – an effective way to summarize or organize a dataset.
– usually composed of two columns:
  – values or class intervals
  – their frequencies
a. UNGROUPED
   1. create a table

      VARIABLE’S NAME    FREQUENCY

      • ORDINAL VARIABLES – the values should be ordered from smallest to largest value.
      • NOMINAL VARIABLES – the values can be in any order in the table.
   2. count the frequencies
      • the frequencies are the number of times each value occurs.
      • if the dataset is large, count frequencies by tallying.
b. GROUPED
   1. divide the variable into class intervals
      a. calculate the range: subtract the lowest value in the dataset from the highest.
      b. decide the class interval width
      c. calculate the class intervals
   2. create a table
   3. count the frequencies
c. RELATIVE
   1. create an ungrouped or grouped frequency table.
   2. add a third column to the table for relative frequencies
      o to calculate the relative frequency, divide each frequency by the sample size. The sample size is the sum of the frequencies.
d. CUMULATIVE
   1. create an ungrouped or grouped frequency table.
   2. add a third column to the table for cumulative frequency
      o cumulative frequency is the number of observations less than or equal to a certain value or class interval.
   3. optional: cumulative relative frequency
      o divide each cumulative frequency by the sample size.

GRAPH OF A FREQUENCY DISTRIBUTION
1. PIE CHART – a circular graph that shows the relative frequency distribution of a nominal variable.
2. BAR CHART – a graph that shows the frequency or relative frequency distribution of a categorical variable (nominal or ordinal).
   – y-axis - frequencies / relative frequencies
   – x-axis - values
3. HISTOGRAM – a graph that shows the frequency or relative frequency distribution of a quantitative variable.
   – y-axis - frequencies / relative frequencies
   – x-axis - class intervals

                    BAR CHART                      HISTOGRAM
TYPE OF VARIABLE    Categorical                    Quantitative
VALUE GROUPING      Ungrouped (values)             Grouped (class intervals)
BAR SPACING         Can be a space between bars    Never a space between bars
BAR ORDER           Can be in any order            Can only be ordered from lowest to highest

MEAN, MEDIAN, MODE & RANGE

• MEAN – the sum of all values divided by the number of values.

$$ \bar{x} = \frac{\sum x_i}{n} \quad \text{(sample)} \qquad \text{or} \qquad \mu = \frac{\sum x_i}{N} \quad \text{(population)} $$

• MEDIAN – the middle number in a list of numbers ordered from lowest to highest.

• MODE – the value that appears most often in a set of data.

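  For grouped data, the mode is estimated from the modal class (the class interval with the highest frequency):

$$ Mo = L + \left( \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \right) h $$

  Where: L – lower limit of the modal class
         h – size of the class interval
         f1 – frequency of the modal class
         f0 – frequency of the preceding class
         f2 – frequency of the succeeding class
• RANGE – the difference between the highest and lowest values.
  Range = max. value – min. value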
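
The measures above can be checked quickly in code. The following is a minimal sketch, assuming Python's standard `statistics` module; the dataset `scores`, the modal-class numbers, and the helper names `data_range` and `grouped_mode` are made up for illustration and are not part of these notes.

```python
from statistics import mean, median, mode


def data_range(values):
    """Range = max. value - min. value."""
    return max(values) - min(values)


def grouped_mode(lower_limit, width, f1, f0, f2):
    """Mode of grouped data: Mo = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h.

    lower_limit (L) and width (h) describe the modal class;
    f1, f0, f2 are the frequencies of the modal, preceding,
    and succeeding classes.
    """
    return lower_limit + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Hypothetical raw (ungrouped) data
scores = [70, 85, 85, 90, 95]

scores_mean = mean(scores)          # sum of values / number of values
scores_median = median(scores)      # middle value of the sorted list
scores_mode = mode(scores)          # most frequent value
scores_range = data_range(scores)   # highest minus lowest

# Hypothetical modal class 20-30 (L = 20, h = 10) with f1 = 12, f0 = 8, f2 = 6
modal_estimate = grouped_mode(20, 10, 12, 8, 6)

print(scores_mean, scores_median, scores_mode, scores_range, modal_estimate)
```

For this made-up dataset the mean, median, and mode all equal 85 and the range is 25; the grouped-data estimate of the mode is 24.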
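• VARIANCE – the expectation of the squared deviation of a random variable from its population mean or sample mean.

$$ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \qquad \text{or} \qquad \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} $$

  Where: s² / σ² – sample / population variance
         xᵢ – the value of one observation
         x̄ / μ – sample / population mean
         n − 1 / N – divisor: number of observations minus one (sample) / number of observations (population)
• STANDARD DEVIATION – a measure of the spread of the data around the mean; it is the square root of the variance.

$$ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \qquad \text{or} \qquad \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} $$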
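
As a worked check of the variance and standard deviation formulas, here is a small sketch using Python's standard `statistics` module, together with a hand-written population variance for comparison. The dataset `data` and the helper name `population_variance_by_hand` are assumptions made for illustration only.

```python
from statistics import pstdev, pvariance, stdev, variance


def population_variance_by_hand(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


# Hypothetical observations
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_var = variance(data)       # divides by n - 1
population_var = pvariance(data)  # divides by N
sample_sd = stdev(data)           # square root of the sample variance
population_sd = pstdev(data)      # square root of the population variance

print(sample_var, population_var, sample_sd, population_sd)
print(population_variance_by_hand(data))
```

For this made-up dataset the mean is 5 and the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the population standard deviation is 2, while the sample variance is 32 / 7. Dividing by n − 1 instead of N always gives the slightly larger value.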

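
As a closing worked example, the frequency-table steps described earlier in these notes (ungrouped, relative, cumulative, and grouped) can be sketched in Python. This is only an illustration under stated assumptions: the function names, the datasets, and the choice of half-open class intervals starting at the lowest value are all hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (value, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(values)
    n = sum(counts.values())          # sample size = sum of the frequencies
    rows = []
    cumulative = 0
    for value in sorted(counts):      # ordered from lowest to highest
        freq = counts[value]
        cumulative += freq            # observations less than or equal to value
        rows.append((value, freq, freq / n, cumulative))
    return rows


def grouped_frequency_table(values, width):
    """Rows of (lower limit, upper limit, frequency).

    Assumption: class intervals are [lower, upper) and start at the
    lowest value in the dataset.
    """
    low = min(values)
    counts = Counter((v - low) // width for v in values)
    return [
        (low + k * width, low + (k + 1) * width, counts[k])
        for k in range(max(counts) + 1)
    ]


# Hypothetical dataset: number of defects found in 10 inspected parts
defects = [0, 1, 1, 2, 0, 1, 3, 1, 0, 2]
for row in frequency_table(defects):
    print(row)

# Hypothetical measurements grouped with a class interval width of 5
measurements = [1, 2, 5, 7, 8, 12]
for row in grouped_frequency_table(measurements, 5):
    print(row)
```

Dividing the cumulative column by the sample size would give the optional cumulative relative frequency.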