
LECTURE PACKAGE 3: DATA ANALYSIS

STATISTICS
Topics covered:
Data Measurement
    Error
    Significant figures
Exploratory Data Analysis
    Descriptive statistics (Straight and weighted averages)
    Data visualisation (Types of graphs and their uses)
    Histograms and their uses
Data Modelling
    Inferential statistics (Trend lines)

1. STATISTICS

 “A quantity that is computed from a sample of data”
 A single number used to summarize a larger collection of values (e.g. average)
 “A branch of mathematics dealing with the collection, analysis, interpretation, and
presentation of masses of numerical data.”
 Statistics for performance evaluation (e.g. class performance)

DATA ANALYSIS is PART of a PROCESS of EVALUATING DATA using ANALYTICAL TOOLS and
STATISTICAL TOOLS in order to discover USEFUL INFORMATION and aid in DECISION MAKING.

ANALYSIS TOOLS: usually software, e.g. Excel


STATISTICAL TOOLS:
 straight and weighted averages; trendlines
 significant figures; error bars

2. DATA MEASUREMENT

2.1 DATA MEASUREMENT ERROR [1]

[1] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/

2.1.1 Types of errors in measurements:

a. Gross errors (blunders) [2] A blunder (or gross error) is a significant, unpredictable
mistake caused by human error that often leads to large discrepancies. Blunders are
typically the result of carelessness, miscommunication, fatigue, or poor judgment. A
person may record a wrong value, misread a scale, or drop a digit when reading a scale or
recording a measurement. Such blunders usually stand out when one person checks the work
of another. They should not be included in the analysis of data.

b. Measurement errors (also called observational errors) [3] A measurement error is the
difference between a measured quantity and its true value. Measurement errors are
classified into two types: random errors and systematic errors. For example, if an
electronic scale is loaded with a 1 kg standard weight and the reading is 1002 grams,
then the measurement error is (1002 grams − 1000 grams) = 2 grams.

b1. Random errors [4] Random errors are naturally occurring errors that are to be expected
in any experiment. They are caused by sudden changes in experimental conditions, noise, or
operator fatigue, and may be either positive or negative. Examples of sources of random
error are sudden changes in humidity, unexpected changes in temperature, or fluctuations
in voltage. These errors may be reduced by taking the average of a large number of
readings.
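
The effect of averaging can be illustrated with a small simulation. This is only a sketch: the Gaussian noise model, the 2 g spread, and the function names are assumptions made for illustration, and the true value of 1000 g is borrowed from the 1 kg standard weight example.

```python
import random
import statistics

TRUE_MASS_G = 1000.0   # true value of the standard weight (1 kg example)
NOISE_SD_G = 2.0       # assumed spread of the random error, in grams


def simulated_reading(rng):
    """Return one reading: the true value plus a random (positive or negative) error."""
    return TRUE_MASS_G + rng.gauss(0.0, NOISE_SD_G)


def average_of_readings(n, seed=1):
    """Average n simulated readings; the seed only makes the sketch repeatable."""
    rng = random.Random(seed)
    return statistics.fmean(simulated_reading(rng) for _ in range(n))


if __name__ == "__main__":
    for n in (1, 10, 1000, 100000):
        avg = average_of_readings(n)
        print(f"{n:>6} readings: average = {avg:.3f} g, error = {avg - TRUE_MASS_G:+.3f} g")
```

With more readings, the positive and negative random errors tend to cancel, so the average drifts towards the true value. A systematic offset would not cancel in this way.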

b2. Systematic errors [5] A systematic error (also called systematic bias; a zero error is
a common example) is a consistent, repeatable error associated with faulty equipment,
mis-calibrated instruments, or a flawed experiment design, and it affects all measurements.
These errors can be corrected by fixing or properly calibrating the measurement device, or
by correcting the design flaw in the experiment. Systematic errors may be classified into
different categories: instrumental, environmental, observational, and theoretical errors.

[2] http://www.dot.state.wy.us/files/live/sites/wydot/files/shared/Highway_Development/Surveys/Survey%20Manual/Section%20III%20-%20Measurements%20and%20Errors.pdf; http://www.physics.nmsu.edu/research/lab110g/html/ERRORS.html; https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/
[3] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/; https://www.statisticshowto.datasciencecentral.com/measurement-error/
[4] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/; https://www.statisticshowto.datasciencecentral.com/measurement-error/
[5] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/; https://www.statisticshowto.datasciencecentral.com/measurement-error/; https://www.statisticshowto.com/systematic-error-random-error

b2.1 Instrumental errors [6, 7] Instrumental errors are errors in instrumental
measurements that arise exclusively from a lack of mathematical accuracy in an
instrument. To reduce these errors, correction factors must be applied, or the instrument
must be carefully recalibrated.

b2.2 Environmental errors [8, 9] Environmental errors occur when some external condition
or factor in the environment, such as an uncommon event, leads to error. External
conditions leading to error mainly include pressure, temperature, humidity, and the
presence of magnetic fields. To reduce environmental errors, keep the environmental
conditions as steady as possible during the experiments.

b2.3 Observational errors [10] As the name suggests, these errors occur due to wrong
observations or incorrect reading of the instruments. Incorrect observations can also be
caused by parallax error. To reduce parallax error, highly accurate meters are needed:
meters provided with mirror scales.

b2.4 Theoretical errors [11, 12] Theoretical errors are due to simplification of the model
system or approximations in the equations describing it. For example, if a theory assumes
that the temperature of the surroundings does not affect the readings when it actually
does, this factor becomes a source of error in the measurement.

2.1.2 Data Measurement: Error – Absolute, Relative & Percentage Error

Absolute Error The absolute error is the difference between the actual and measured
value. But when measuring, we don't know the actual value! So we use the maximum
possible error.

Example: a fence is measured as 12.5 metres long, accurate to 0.1 of a metre. Accurate to
0.1 m means it could be up to 0.05 m either way: Length = 12.5 m ± 0.05 m. So it could
really be anywhere between 12.45 m and 12.55 m long. In this example the absolute error
is 0.05 m.

Relative Error The relative error is the absolute error divided by the actual measurement.
We don't know the actual measurement, so the best we can do is use the measured value:

$$\text{Relative Error} = \frac{\text{Absolute Error}}{\text{Measured Value}}$$

Percentage Error The percentage error is the relative error shown as a percentage:

$$\text{Percentage Error} = \frac{\text{Absolute Error}}{\text{Measured Value}} \cdot 100\%$$
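
As a quick check of the two formulas, the fence example (12.5 m ± 0.05 m) can be worked through in Python. The function names are illustrative only.

```python
def relative_error(absolute_error, measured_value):
    """Relative error: absolute error divided by the measured value."""
    return absolute_error / measured_value


def percentage_error(absolute_error, measured_value):
    """Percentage error: the relative error expressed as a percentage."""
    return relative_error(absolute_error, measured_value) * 100.0


if __name__ == "__main__":
    length_m = 12.5      # measured fence length from the example
    abs_error_m = 0.05   # half of the 0.1 m measurement step
    print(f"Relative error:   {relative_error(abs_error_m, length_m):.3f}")     # 0.004
    print(f"Percentage error: {percentage_error(abs_error_m, length_m):.1f}%")  # 0.4%
```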

[6] http://www.finedictionary.com/Instrumental%20errors.html
[7] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/
[8] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/
[9] https://manoa.hawaii.edu/exploringourfluidearth/physical/world-ocean/map-distortion/practices-science-scientific-error
[10] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/
[11] http://www.physics.nmsu.edu/research/lab110g/html/ERRORS.html
[12] https://www.watelectrical.com/different-types-of-errors-in-measurement-and-measurement-error-calculation/

Example: Alex measured the field to the nearest metre, and got a width of 6 m and a length
of 8 m. Measuring to the nearest metre means the true value could be up to half a metre
smaller or larger.
The width (w) could be from 5.5 m to 6.5 m: w = 6 ± 0.5 m, or 5.5 ≤ w ≤ 6.5
The length (l) could be from 7.5 m to 8.5 m: l = 8 ± 0.5 m, or 7.5 ≤ l ≤ 8.5
The area is width × length (A = w ∙ l), so:
The smallest possible area is: 5.5 m ∙ 7.5 m = 41.25 m²
The measured area is: 6 m ∙ 8 m = 48 m²
The largest possible area is: 6.5 m ∙ 8.5 m = 55.25 m²
So: 41.25 m² ≤ A ≤ 55.25 m²

Absolute, Relative and Percentage Error [13]

From 41.25 m² to 48 m² the absolute error is: 6.75 m²
From 48 m² to 55.25 m² the absolute error is: 7.25 m²
The absolute error is the larger of the two calculated errors:
Absolute Error = 7.25 m²

$$\text{Relative Error} = \frac{\text{Absolute Error}}{\text{Measured Value}} = \frac{7.25\ \text{m}^2}{48\ \text{m}^2} = 0.151$$

$$\text{Percentage Error} = \text{Relative Error} \times 100\% = 15.1\%$$
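
The field example can be reproduced step by step in Python. This is a minimal sketch; the helper name `area_bounds` is an assumption made for illustration.

```python
def area_bounds(width, length, half_step):
    """Smallest and largest possible area when each side may be off by half_step."""
    smallest = (width - half_step) * (length - half_step)
    largest = (width + half_step) * (length + half_step)
    return smallest, largest


if __name__ == "__main__":
    width_m, length_m = 6.0, 8.0   # measured to the nearest metre
    half_step_m = 0.5              # so each side may be off by up to 0.5 m

    measured_area = width_m * length_m
    smallest, largest = area_bounds(width_m, length_m, half_step_m)

    # The absolute error is the larger of the two possible deviations.
    absolute_error = max(measured_area - smallest, largest - measured_area)
    relative_error = absolute_error / measured_area

    print(f"Area range:       {smallest} to {largest} m^2")   # 41.25 to 55.25 m^2
    print(f"Absolute error:   {absolute_error} m^2")          # 7.25 m^2
    print(f"Relative error:   {relative_error:.3f}")          # 0.151
    print(f"Percentage error: {relative_error * 100:.1f}%")   # 15.1%
```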

2.2 Significant Figures

 In general, the order of magnitude has more significance in a calculation than the actual
value itself. Numbers written in scientific notation are easier to compare, in terms of
significance, than when expressed as decimals.
o For example,
 when comparing R1 000 000 (1 × 10⁶) to R20 (2 × 10¹), the fact that the first number
has five more zeroes than the second is what primarily determines the significant
difference between the two. Adding them together makes little practical difference:
whether one wins R1 000 000 or R1 000 020 in the Lotto, both amounts make one a
millionaire, and the R20 would probably not even be considered.
 However, R20 (2 × 10¹) compared to 2c (2 × 10⁻²) is far more significant!
 When used in conjunction with a logarithm, this forms the basis of the decibel: an
important comparative technique commonly used in measurements.
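
A short sketch of the comparison in Python follows. The helper names are illustrative, and the decibel formula used here, 10 · log10(ratio) for power-like ratios, is the standard definition rather than something spelled out in these notes.

```python
import math


def orders_of_magnitude(a, b):
    """Number of powers of ten separating a from b (both must be positive)."""
    return math.log10(a / b)


def decibels(ratio):
    """Decibel value of a power-like ratio: 10 * log10(ratio)."""
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    # Amounts from the text, in rand: R1 000 000, R20 and 2c (R0.02).
    print(f"R1 000 000 vs R20: {orders_of_magnitude(1_000_000, 20):.1f} orders of magnitude")
    print(f"R20 vs 2c:         {orders_of_magnitude(20, 0.02):.1f} orders of magnitude")
    print(f"A ratio of 100 expressed in decibels: {decibels(100):.0f} dB")
```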

The significant figures (also known as the significant digits) of a number are DIGITS that
carry meaning CONTRIBUTING to its MEASUREMENT resolution.

013 103 0.0103 10300 103.5673

 Significant figures [14]
Significant figures rules:
o All non-zero digits are considered significant. For example, 91 has two significant
figures (9 and 1), while 123.45 has five significant figures (1, 2, 3, 4 and 5).
o Zeros appearing anywhere between two non-zero digits are significant: 101.1203 has
seven significant figures: 1, 0, 1, 1, 2, 0 and 3.
o Zeros to the left of the significant figures are not significant. For example, 0.00052 has
two significant figures: 5 and 2.

o Zeros to the right of the significant figures are significant if and only if they are justified
by the precision of their derivation.
o For example,
 12.2300 may have six significant figures: 1, 2, 2, 3, 0 and 0.
 0.000122300 still has only six significant figures (the zeros before the 1 are not
significant).
 120.00 has five significant figures since it has three trailing zeros.
o When dividing (or multiplying), the answer should have as many significant figures as the
input with the fewest. For example, 62/41 = 1.512195…: 62 has two significant figures
(6 and 2) and 41 has two significant figures (4 and 1), so the answer should have two
significant figures, i.e. 1.5.

[13] https://www.mathsisfun.com/measure/error-measurement.html
[14] https://en.wikipedia.org/wiki/Significant_figures#Identifying_significant_figures

Scientific notation
 In most cases, the same rules apply to numbers expressed in scientific notation.
However, in the normalized form of that notation, placeholder leading and trailing digits do
not occur, so all digits are significant.
o For example,
0.00012 (two significant figures) becomes 1.2 × 10⁻⁴ and
0.00122300 (six significant figures) becomes 1.22300 × 10⁻³.
 In particular, the potential ambiguity about the significance of trailing zeros is eliminated.
o For example,
1300 to four significant figures is written as 1.300 × 10³, while
1300 to two significant figures is written as 1.3 × 10³.
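
Python's built-in `e` format type produces the same normalized form, which makes the number of significant figures explicit. The wrapper function below is a hypothetical helper written for this illustration.

```python
def to_scientific(value, sig_figs):
    """Format value in normalized scientific notation with sig_figs significant figures."""
    # One digit sits before the decimal point, so sig_figs - 1 digits follow it.
    return f"{value:.{sig_figs - 1}e}"


if __name__ == "__main__":
    print(to_scientific(1300, 4))        # 1.300e+03
    print(to_scientific(1300, 2))        # 1.3e+03
    print(to_scientific(0.00012, 2))     # 1.2e-04
    print(to_scientific(0.00122300, 6))  # 1.22300e-03
```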

Rounding and decimal places


 The basic concept of significant figures is often used in connection with rounding.
Rounding to significant figures is a more general-purpose technique than rounding to 𝑛
decimal places, since it handles numbers of different scales in a uniform way.
o For example,
the population of a city might only be known to the nearest thousand and be stated as
52 000,
while the population of a country might only be known to the nearest million and be
stated as 52 000 000.
 The former might be in error by hundreds, and the latter might be in error by hundreds of
thousands, but both have two significant figures (5 and 2). This reflects the fact that the
significance of the error is the same in both cases, relative to the size of the quantity
being measured.
 To round to 𝒏 significant figures [15] Identify the significant figures before rounding.
These are the 𝑛 consecutive digits beginning with the first non-zero digit. If the digit
immediately to the right of the last significant figure is greater than 5, or is a 5 followed
by other non-zero digits, add 1 to the last significant figure. If it is less than 5, leave
the last significant figure unchanged. A 5 followed by nothing, or only by zeros, needs a
tie-breaking rule, such as round half up or round half to even.
o For example, 1.2459 as the result of a calculation or measurement that only allows for
3 significant figures should be written 1.25.
o Replace non-significant figures in front of the decimal point by zeros. Drop all the digits
after the decimal point to the right of the significant figures (do not replace them with
zeros).
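
The rounding procedure can be sketched in Python as follows. The function name `round_sig` is an assumption, and note that Python's built-in `round` settles exact ties by rounding half to even, which is one of several possible tie-breaking rules.

```python
import math


def round_sig(value, n):
    """Round value to n significant figures."""
    if value == 0:
        return 0.0
    # Position of the first non-zero digit, as a power of ten.
    exponent = math.floor(math.log10(abs(value)))
    # Keep n digits counted from that first non-zero digit.
    return round(value, n - 1 - exponent)


if __name__ == "__main__":
    print(round_sig(1.2459, 3))    # 1.25
    print(round_sig(62 / 41, 2))   # 1.5
```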

[15] https://en.wikipedia.org/wiki/Significant_figures#Identifying_significant_figures

3. EXPLORATORY DATA ANALYSIS

3.1 Descriptive statistics (Straight and Weighted Averages)

In statistics, mean, median, and mode are all known as measures of central tendency, and
in colloquial usage any of these might be called an average value.
 A straight average is used to represent a mean value where all samples have contributed
equally. For example, the class average for a particular course is the sum of all the
students' marks divided by the number of students in the class.
 A weighted average represents a mean value where some samples have more
importance than others, and therefore contribute more significantly towards the final
value.
o For example, your end-of-year mark for first year is more dependent on your
PHYS1014 mark than that of your chosen elective, because PHYS1014 is a higher-
credit course and is therefore weighted more in terms of importance.

In colloquial language, an average is a single number taken as representative of a list of
numbers. Different concepts of average are used in different contexts. Often "average"
refers to the arithmetic mean, the sum of the numbers divided by how many numbers are
being averaged.
https://en.wikipedia.org/wiki/Average

Arithmetic mean The most common type of average is the arithmetic mean. If 𝑛 numbers
are given, each number denoted by 𝑎𝑖 (where 𝑖=1,2,…,𝑛), the arithmetic mean is the sum of
the 𝑎𝑖 divided by 𝑛, or:

$$AM = \frac{1}{n}\sum_{i=1}^{n} a_i = \frac{a_1 + a_2 + \cdots + a_n}{n}$$

Example: the AM of 4, 8 and 15 is:

$$AM = \frac{4 + 8 + 15}{3} = 9$$

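Geometric mean The geometric mean of 𝑛 positive numbers is obtained by multiplying
them all together and then taking the 𝑛th root. In algebraic terms, the geometric mean of
𝑎1,𝑎2,⋯,𝑎𝑛 is defined as:

$$GM = \sqrt[n]{a_1 \cdot a_2 \cdots a_n} = \left(\prod_{i=1}^{n} a_i\right)^{1/n}$$

Example: the GM of 2 and 8 is:

$$GM = \sqrt{2 \cdot 8} = \sqrt{16} = 4$$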
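
Both means are easy to compute directly from their definitions. The sketch below uses illustrative function names and reproduces the two worked examples.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean is defined here for positive numbers only")
    return math.prod(values) ** (1 / len(values))


if __name__ == "__main__":
    print(f"AM of 4, 8 and 15: {arithmetic_mean([4, 8, 15]):.2f}")   # 9.00
    print(f"GM of 2 and 8:     {geometric_mean([2, 8]):.2f}")        # 4.00
```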

Weighted arithmetic mean [16] The weighted arithmetic mean is similar to an ordinary
arithmetic mean (the most common type of average), except that instead of each of the
data points contributing equally to the final average, some data points contribute
more than others. The notion of weighted mean plays a role in descriptive statistics and
also occurs in a more general form in several other areas of mathematics. If all the weights
are equal, then the weighted mean is the same as the arithmetic mean.

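[16] https://en.wikipedia.org/wiki/Weighted_arithmetic_mean

STRAIGHT AVERAGE: equal weighting of each item
WEIGHTED AVERAGE: unequal weighting of each item

Example (course info): the course uses weighted averages, with 2 tests (15% each), 1 project
(40%) and 1 exam (30%). A student scores 60% and 51% in the tests, 35% for the project and
61% in the exam. Has he done enough to pass the course?

$$SA = \frac{60 + 51 + 35 + 61}{4} = 51.75\%$$

$$WA = \frac{15\% \cdot 60 + 15\% \cdot 51 + 40\% \cdot 35 + 30\% \cdot 61}{15\% + 15\% + 40\% + 30\%} = 48.95\%$$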
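
The same comparison can be sketched in Python. The function names are illustrative, and the pairing of marks with assessments (60 and 51 for the tests, 35 for the project, 61 for the exam) is assumed from the order of the terms in the WA calculation.

```python
def straight_average(marks):
    """Every mark contributes equally."""
    return sum(marks) / len(marks)


def weighted_average(marks, weights):
    """Each mark contributes in proportion to its weight."""
    if len(marks) != len(weights):
        raise ValueError("marks and weights must have the same length")
    return sum(m * w for m, w in zip(marks, weights)) / sum(weights)


if __name__ == "__main__":
    marks = [60, 51, 35, 61]      # test 1, test 2, project, exam (assumed order)
    weights = [15, 15, 40, 30]    # percentage weight of each assessment

    print(f"Straight average: {straight_average(marks):.2f}%")           # 51.75%
    print(f"Weighted average: {weighted_average(marks, weights):.2f}%")  # 48.95%
```

The heavily weighted project mark (35%) pulls the weighted average below the straight average.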

Example 1: given two school classes, one with 20 students and one with 30 students, the
grades in each class on a test were:
 Morning class = 62, 67, 71, 74, 76, 77, 78, 79, 79, 80, 80, 81, 81, 82, 83, 84, 86, 89, 93,
98
 Afternoon class = 81, 82, 83, 84, 85, 86, 87, 87, 88, 88, 89, 89, 89, 90, 90, 90, 90, 91,
91, 91, 92, 92, 93, 93, 94, 95, 96, 97, 98, 99
 The straight average for the morning class is 80 and
 the straight average of the afternoon class is 90.
 The straight average of 80 and 90 is 85, the mean of the two class means.
 However, this does not account for the difference in number of students in each class (20
versus 30);
 hence the value of 85 does not reflect the average student grade (independent of class).
 The average student grade can be obtained by averaging all the grades, without regard to
classes (add all the grades up and divide by the total number of students):
$$\bar{x} = \frac{4300}{50} = 86$$
 Or, this can be accomplished by weighting the class means by the number of students in
each class (using a weighted mean of the class means):
$$\bar{x} = \frac{20 \times 80 + 30 \times 90}{20 + 30} = 86$$
 Thus, the weighted mean makes it possible to find the average student grade in the case
where only the class means and the number of students in each class are available.
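
Example 1 can be verified with a few lines of Python, using the grades exactly as listed. The variable and function names are illustrative.

```python
morning = [62, 67, 71, 74, 76, 77, 78, 79, 79, 80,
           80, 81, 81, 82, 83, 84, 86, 89, 93, 98]
afternoon = [81, 82, 83, 84, 85, 86, 87, 87, 88, 88,
             89, 89, 89, 90, 90, 90, 90, 91, 91, 91,
             92, 92, 93, 93, 94, 95, 96, 97, 98, 99]


def mean(values):
    """Straight (arithmetic) average."""
    return sum(values) / len(values)


# Straight average of the two class means: ignores the class sizes.
mean_of_means = (mean(morning) + mean(afternoon)) / 2

# Average over all 50 students, regardless of class.
overall_mean = mean(morning + afternoon)

# Weighted mean of the class means, weighted by the number of students.
weighted_mean = (len(morning) * mean(morning) + len(afternoon) * mean(afternoon)) / (
    len(morning) + len(afternoon)
)

if __name__ == "__main__":
    print(mean(morning), mean(afternoon))   # 80.0 90.0
    print(mean_of_means)                    # 85.0
    print(overall_mean, weighted_mean)      # 86.0 86.0
```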

Example 2: a student is enrolled in a biology course where the final grade is determined
based on the following categories:
 tests 40%, final exam 25%, quizzes 25%, and homework 10%.
 The student has earned the following scores for each category: tests-83, final exam-75,
quizzes-90, homework-100.
 We need to calculate the student's overall grade.
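 The final grade is calculated as:
$$FG = \frac{\text{Tests} \times 40\% + \text{Exam} \times 25\% + \text{Quizzes} \times 25\% + \text{Homework} \times 10\%}{100\%}$$
$$FG = \frac{83 \times 0.40 + 75 \times 0.25 + 90 \times 0.25 + 100 \times 0.10}{1} = 84.45$$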

3.2 Data visualisation (Types of graphs and their uses)

(a) Pie chart – to represent fractions, ratios, percentages, etc. of a whole; for example,
the constituents of a concrete mix, where each slice represents a contributing volume / mass
to the total volume / mass. This kind of graph is needed to show percentages effectively.

(b) Bar graph – to show comparative results of several samples; for example, the strengths
of several concrete test specimens. The sample identifiers are on the X-axis and the Y-axis
is used to indicate the strengths.
 A bar graph is used to show relationships between groups
 The two items being compared do not need to affect each other
 It is a fast way to show big differences
 Example: the strengths (Y-axis) of several concrete test specimens (X-axis)

(c) Line graph / Scatter plot – for showing the relationship of a dependent variable versus
a controlled variable; for example, the strength of concrete vs time. The dependent variable
(strength) would be on the Y-axis whilst the controlled / independent variable (time) would
be on the X-axis.

 A mathematical model of such a relationship (or an infinite number of measured data
points) would result in a solid / continuous line. However, a series of measured data
points would result in a scatter of dots plotted against the X-axis.
 Data markers (such as: + x o *) are used to indicate where actual measurements have
been taken, and when plotted together with a modelled relationship, provide confidence in
the model.
 A trend line could also be used, based on the actual data points, to form a representative
mathematical model.
 Several functions could be plotted against a common controlled variable (X-axis) such as
the curing strengths of several concrete specimens; each specimen being represented
with a different colour, or different style, of line / data point.

EXAMPLE: Our task is to analyse the results for the Course ABC1234.
 The course had 50 students, divided into 5 groups of 10 students each (groups: yellow,
red, blue, green and orange).
o The groups submitted an Assignment (20% of the Final Mark),
o the students wrote a Test (20% of the Final Mark) and
o an Exam (60% of the Final Mark).
In the Excel spreadsheet are the results for the course (Lecture 2.xlsx – sheet Course
ABC1234).

PIE CHART [17]

Plot the results in a PIE CHART. A pie chart (or a circle chart) is a circular statistical graphic,
which is divided into slices to illustrate numerical proportion.

[17] https://en.wikipedia.org/wiki/Pie_chart

SCATTER PLOT [18]

Plot the results in a SCATTER PLOT. A scatter plot is a type of plot or mathematical diagram
using Cartesian coordinates to display values for typically two variables for a set of data. If
the points are coded (color/shape/size), one additional variable can be displayed. The data
are displayed as a collection of points, each having the value of one variable determining the
position on the horizontal axis and the value of the other variable determining the position on
the vertical axis.

[18] https://en.wikipedia.org/wiki/Scatter_plot

BAR CHART [19]

Plot the results in a BAR CHART. A bar chart is a graph that presents categorical data with
rectangular bars with heights or lengths proportional to the values that they represent. […] A
bar graph shows comparisons among discrete categories. One axis of the chart shows the
specific categories being compared, and the other axis represents a measured value. Some
bar graphs present bars clustered in groups of more than one, showing the values of more
than one measured variable.

[19] https://en.wikipedia.org/wiki/Bar_chart

(d) HISTOGRAM [20]

Histograms and Their Uses

(e) This is a special type of bar graph that shows how many samples, of the total, fall within
particular ranges (called bins).

[20] https://en.wikipedia.org/wiki/Histogram

 A histogram is a bar graph that shows the frequency of data within equal intervals.
 There is no space in between the bars
 Higher bars represent more data values in a class
 Lower bars represent fewer data values in a class

 For example, the marks for a course might reveal that
o no students scored in the 0-20 range,
o none in the 20-40 range,
o five students scored in the 40-60 range,
o 15 students were in the 60-80 range and
o seven students were in the 80-100 range.
o In this example, five equally-sized bins have been used, and the sum of the students in
all the bins equals the class size, i.e. 27 students. This provides far more
information than the simple straight average.
 The shape of the histogram is often of more importance than the actual numbers
themselves.
o The classical bell curve of subject marks is an example of this.
 Histograms are often used in sampling and measurements,
o for example, measuring the wind speed at a particular location over time, for the
development of a wind-farm. Design engineers would want to know for how much of the
time the wind blows above a certain strength, so that they can determine the size of
the wind turbines required.
o For example in Johannesburg, for most of the time, the wind does not blow, but when
it does, it is gale-force! This would be clearly evident using a histogram, but this
evidence would be lost if only a straight average were used. A weighted average
could be used to better represent this data by giving higher wind-speeds more
importance.
 The resolution of a histogram depends on the number of bins used, which is inversely
related to the size of each bin, i.e.
o many smaller bins show more detail than fewer larger bins, provided each bin still
holds enough samples.
o The shape of the histogram is often represented by a distribution curve, such as the
normal (Gaussian) distribution, and is usually of primary concern in engineering.

Plot the results in a HISTOGRAM. A histogram is an accurate representation of the
distribution of numerical data. […] It differs from a bar graph, in the sense that a bar graph
relates two variables, but a histogram relates only one. To construct a histogram, the first
step is to "bin" (or "bucket") the range of values – that is, divide the entire range of values
into a series of intervals – and then count how many values fall into each interval. The bins
are usually specified as consecutive, non-overlapping intervals of a variable. The bins
(intervals) must be adjacent, and are often (but are not required to be) of equal size. If the
bins are of equal size, a rectangle is erected over the bin with height proportional to the
frequency – the number of cases in each bin.
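
The binning step can be sketched in a few lines of Python. The function name and the list of marks are hypothetical, made up only to show the counting; they are not the course results.

```python
def histogram_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal, adjacent intervals."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            continue  # ignore values outside the overall range
        # A value equal to `high` is placed in the last bin.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical marks, for illustration only.
    marks = [45, 52, 58, 61, 63, 67, 70, 72, 75, 78, 81, 88, 95]
    counts = histogram_counts(marks, low=0, high=100, n_bins=5)
    for i, count in enumerate(counts):
        print(f"{i * 20:>3}-{(i + 1) * 20:<3} {'#' * count}")
```

For these marks the five bins hold 0, 0, 3, 7 and 3 values, and the counts add up to the number of marks.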

TYPES OF GRAPHS [21]

Plot the results in a GRAPH. In mathematics, the graph of a function 𝑓 is, formally, the set of
all ordered pairs (𝑥,𝑓(𝑥)), such that 𝑥 is in the domain of the function 𝑓. In the common case
where 𝑥 and 𝑓(𝑥) are real numbers, these pairs are Cartesian coordinates of points in the
Euclidean plane and form thus a subset of this plane, which is a curve in the case of a
continuous function. This graphical representation of the function is also called the graph of
the function.

3.3 Error bars

(f) Error bars are used to indicate the range within which a data point could exist, based on
the accuracy of the actual measurement.
a. For example, if a weighing scale that measures accurately to 10 g is used to weigh a
specimen of concrete and returns a reading of 200 g, then in reality the specimen
could weigh anything between 195 g and 205 g. A pair of error bars showing these
limits (typically looking like this: I ) is overlaid on the data point.
b. The narrower the range of error, the better the accuracy of the data point; this provides
more confidence in the measurements.
c. The primary source of measurement error is the accuracy to which the instrument
used has been designed. In the old days, the needles of analogue instruments were
hard to see, and this resulted in a reading error. But, due to today's digital readouts,
this form of measurement error rarely exists.

[21] https://en.wikipedia.org/wiki/Graph_of_a_function

Error bars are bars that you include with your DATA to convey the UNCERTAINTY in whatever
you’re trying to show.

Collected results are seldom a full representation of the whole subject; error bars allow for
consideration of the potential range and/or potential errors in the results.

The length of an error bar helps reveal the uncertainty of a data point:
 a SHORT ERROR BAR shows that values are CONCENTRATED, signalling that the plotted
average value is more likely to be representative;
 a LONG ERROR BAR would indicate that the values are MORE SPREAD OUT and LESS
RELIABLE.
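
One common way to set the length of an error bar is to plot the average of repeated readings with a bar of one standard deviation on either side. That choice is an assumption here, since the notes do not prescribe a particular measure of spread, and the two data sets below are hypothetical.

```python
import statistics


def mean_with_error_bar(values):
    """Return the plotted point (mean) and the half-length of its error bar (sample std dev)."""
    centre = statistics.mean(values)
    half_length = statistics.stdev(values)
    return centre, half_length


if __name__ == "__main__":
    # Two hypothetical sets of repeated readings with the same average.
    concentrated = [49, 50, 50, 51, 50]
    spread_out = [30, 70, 45, 60, 45]

    for name, data in (("concentrated", concentrated), ("spread out", spread_out)):
        centre, half = mean_with_error_bar(data)
        print(f"{name:>12}: {centre} ± {half:.1f}")
```

Both sets average 50, but the second set gets a much longer error bar, which flags its average as less reliable.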

OUTLIERS

 An OUTLIER is an observation that is UNLIKE the OTHER OBSERVATIONS:


 It is rare, or distinct, or does not fit in some way.
 Outliers can have many causes, such as:
 Measurement or input error.
 Data corruption.
 True outlier observation (e.g. Michael Jordan in basketball).
 There is no precise way to define and identify outliers in general because of the specifics
of each dataset.
 Interpret raw data and decide/statistically determine whether a value is an outlier or not.
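
As one example of "statistically determining" an outlier, the sketch below uses the interquartile-range (IQR) rule, which flags values more than 1.5 × IQR outside the quartiles. This rule is a widely used convention and not one given in these notes; the function name and the data are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


if __name__ == "__main__":
    readings = [10, 12, 11, 13, 12, 11, 95]   # hypothetical readings
    print(iqr_outliers(readings))              # [95]
```

Whether a flagged value is an error or a true outlier still has to be judged from the context of the dataset.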

[Figure: error bars with lower and upper caps]

4. DATA MODELLING

4.1 Inferential statistics (Trend lines)

(g) Trend lines are a useful tool to make forecasts or predictions based on a batch of
measured data points.
(h) A trend line helps to determine a representative mathematical function that best suits
the data. The goodness of fit is represented by an 𝑅² value (which Excel can display); the
closer this value is to 1, the better the trend line represents the data.
(i) For example, if a trend line is fitted to recordings of population sizes over the last 10
years, it would probably show linear or exponential growth, which can then be extended
into the future.
(j) Once an equation has been determined, this mathematical model offers major benefits to
engineering, and a high 𝑅² value provides confidence in it.
(k) In Excel, for example, when inserting a trend line, one can select a linear (straight
line), power, exponential, logarithmic, etc. function that provides a best fit.
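
A linear trend line and its R² value can also be computed by hand. The sketch below uses ordinary least squares, which is the usual method for a linear fit; that Excel computes its linear trend line the same way is an assumption here. The function names and data points are hypothetical.

```python
def linear_trend(xs, ys):
    """Least-squares straight line y = slope * x + intercept through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variation in ys explained by the trend line (1 = perfect fit)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical measurements that roughly follow a straight line.
    xs = [1, 2, 3, 4, 5]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    slope, intercept = linear_trend(xs, ys)
    print(f"y = {slope:.2f}x + {intercept:.2f}")
    print(f"R^2 = {r_squared(xs, ys, slope, intercept):.4f}")
    # Forecast: extend the trend line beyond the measured range.
    print(f"Predicted y at x = 6: {slope * 6 + intercept:.2f}")
```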

TREND LINES [22]

When plotting data in a graph, you may often want to visualize the general trend in your
data. This can be done by adding a trend line to a chart.

[22] https://www.ablebits.com/office-addins-blog/2019/01/09/add-trendline-excel/

When you want to add a trend line to a chart in Microsoft Graph, you can choose any of the
six different trend/regression types. The type of data you have determines the type of trend
line you should use.
https://support.office.com/en-us/article/choosing-the-best-trendline-for-your-data-1bb3c9e7-0280-45b5-9ab0-d0c93161daa8

A trend line is a LINE on a graph SHOWING the GENERAL DIRECTION that a GROUP OF POINTS
seems to follow.

CONCLUSION

 Your ability to analyse data correctly is very important, but you should not neglect the
collection and representation of the data.
 Proper planning is required prior to embarking on any study or investigation.
 Planning will allow you to:
o Use the correct collection methods
o Use the appropriate data analysis tools
o Interrogate your data
o Present your data concisely
