Engineering Data Analysis
Point Estimation
• 1. Descriptive Statistics
• 2. Exploratory Data Analysis
• 3. Order Statistics
• 4. Maximum Likelihood Estimation
• 5. A Simple Regression Problem
• 6. Asymptotic Distributions of Maximum Likelihood Estimators
• 7. Sufficient Statistics
• 8. Bayesian Estimation
• 9. More Bayesian Concepts
6.1. Descriptive Statistics

Descriptive statistics refers to a set of methods used to summarize and describe the main features of a dataset, such as its central tendency, variability, and distribution. These methods provide an overview of the data and help identify patterns and relationships.

In the previous chapter, we considered probability distributions of random variables whose space S contains a countable number of outcomes: either a finite number of outcomes or outcomes that can be put into a one-to-one correspondence with the positive integers. Such a random variable is said to be of the discrete type, and its distribution of probabilities is of the discrete type.

Of course, many experiments or observations of random phenomena do not have integers or other discrete numbers as outcomes, but instead are measurements selected from an interval of numbers. For example, you could find the length of time that it takes when waiting in line to buy frozen yogurt. Or the weight of a “1-pound” package of hot dogs could be any number between 0.94 pounds and 1.25 pounds. The weight of a miniature Baby Ruth candy bar could be any number between 20 and 27 grams. Even though such times and weights could be selected from an interval of values, they are generally rounded off so that the data often look like discrete data. If, conceptually, the measurements could come from an interval of possible outcomes, we call them data from a distribution of the continuous type or, more simply, continuous-type data.

Given a set of continuous-type data, we shall group the data into classes and then construct a histogram of the grouped data. This will help us better visualize the data. The following guidelines and terminology will be used to group continuous-type data into classes of equal length (these guidelines can also be used for sets of discrete data that have a large range).

1. Determine the largest (maximum) and smallest (minimum) observations. The range is the difference, R = maximum − minimum.

2. In general, select from k = 5 to k = 20 classes, which are nonoverlapping intervals, usually of equal length. These classes should cover the interval from the minimum to the maximum.
3. Each interval begins and ends halfway between two possible values of the measurements, which have been rounded off to a given number of decimal places.

4. The first interval should begin about as much below the smallest value as the last interval ends above the largest.

5. The intervals are called class intervals and the boundaries are called class boundaries. We shall denote these k class intervals by (c0, c1], (c1, c2], ..., (ck−1, ck].

6. The class limits are the smallest and the largest possible observed (recorded) values in a class.

7. The class mark is the midpoint of a class.

A frequency table is constructed that lists the class intervals, the class limits, a tabulation of the measurements in the various classes, the frequency fi of each class, and the class marks. A column is sometimes used to construct a relative frequency (density) histogram. With class intervals of equal length, a frequency histogram is constructed by drawing, for each class, a rectangle having as its base the class interval and a height equal to the frequency of the class. For the relative frequency histogram, each rectangle has an area equal to the relative frequency fi/n of the observations for the class. That is, the function defined by

h(x) = fi / [n(ci − ci−1)],  for ci−1 < x ≤ ci,  i = 1, 2, ..., k,

is called a relative frequency histogram or density histogram, where fi is the frequency of the ith class and n is the total number of observations. Clearly, if the class intervals are of equal length, the relative frequency histogram h(x) is proportional to the frequency histogram fi, for ci−1 < x ≤ ci, i = 1, 2, ..., k. The frequency histogram should be used only in those situations in which the class intervals are of equal length. A relative frequency histogram can be treated as an estimate of the underlying pdf.
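To make the construction concrete, the grouping guidelines and the definition of h(x) can be turned into a few lines of code. The following Python sketch is ours, not the text's; the function name relative_frequency_histogram and its parameters (the left boundary c0, the number of classes k, and the common class width) are choices made for illustration:

    def relative_frequency_histogram(data, k, c0, width):
        """Group data into k equal-length class intervals
        (c0, c0 + width], (c0 + width, c0 + 2*width], ... and return,
        for each class, its boundaries (lo, hi], its frequency f_i,
        and the density-histogram height h(x) = f_i / (n * width)."""
        n = len(data)
        table = []
        for i in range(k):
            lo = c0 + i * width                          # class boundary c_{i-1}
            hi = lo + width                              # class boundary c_i
            f_i = sum(1 for x in data if lo < x <= hi)   # tally this class
            table.append(((lo, hi), f_i, f_i / (n * width)))
        return table

Because each rectangle has area fi · width / (n · width) = fi/n, the areas sum to 1 whenever the classes cover all n observations, which is exactly the property that lets h(x) be read as an estimate of a pdf.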
Example 6.1-1 The weights in grams of 40 miniature Baby Ruth candy bars, with the weights ordered, are given in Table 6.1-1.

Table 6.1-1 Candy bar weights

20.5 20.7 20.8 21.0 21.0 21.4 21.5 22.0 22.1 22.5
22.6 22.6 22.7 22.7 22.9 22.9 23.1 23.3 23.4 23.5
23.6 23.6 23.6 23.9 24.1 24.3 24.5 24.5 24.8 24.8
24.9 24.9 25.1 25.1 25.2 25.6 25.8 25.9 26.1 26.7

We shall group these data and then construct a histogram to visualize the distribution of weights. The range of the data is R = 26.7 − 20.5 = 6.2. The interval (20.5, 26.7) could be covered with k = 8 classes of width 0.8 or with k = 9 classes of width 0.7. (There are other possibilities.) We shall use k = 7 classes of width 0.9. The first class interval will be (20.45, 21.35) and the last class interval will be (25.85, 26.75). The data are grouped in Table 6.1-2.

Table 6.1-2 Frequency table of candy bar weights

Class Interval   Class Limits   Frequency (fi)   h(x)   Class Mark
(20.45, 21.35)   20.5–21.3      5                5/36   20.9
(21.35, 22.25)   21.4–22.2      4                4/36   21.8
(22.25, 23.15)   22.3–23.1      8                8/36   22.7
(23.15, 24.05)   23.2–24.0      7                7/36   23.6
(24.05, 24.95)   24.1–24.9      8                8/36   24.5
(24.95, 25.85)   25.0–25.8      5                5/36   25.4
(25.85, 26.75)   25.9–26.7      3                3/36   26.3

A relative frequency histogram of these data is given in Figure 6.1-1. Note that the total area of this histogram is equal to 1. We could also construct a frequency histogram in which the heights of the rectangles would be equal to the frequencies of the classes. The shape of the two histograms is the same. Later we will see the reason for preferring the relative frequency histogram. In particular, we will be superimposing on the relative frequency histogram the graph of a pdf.
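Reusing the relative_frequency_histogram sketch above on the 40 weights of Table 6.1-1, with c0 = 20.45, k = 7, and width 0.9 as in the example, reproduces the frequency and h(x) columns of Table 6.1-2 (the heights are fi/36 because n · width = 40 × 0.9 = 36):

    weights = [20.5, 20.7, 20.8, 21.0, 21.0, 21.4, 21.5, 22.0, 22.1, 22.5,
               22.6, 22.6, 22.7, 22.7, 22.9, 22.9, 23.1, 23.3, 23.4, 23.5,
               23.6, 23.6, 23.6, 23.9, 24.1, 24.3, 24.5, 24.5, 24.8, 24.8,
               24.9, 24.9, 25.1, 25.1, 25.2, 25.6, 25.8, 25.9, 26.1, 26.7]

    for (lo, hi), f_i, h in relative_frequency_histogram(weights, k=7,
                                                         c0=20.45, width=0.9):
        print(f"({lo:.2f}, {hi:.2f}]  f_i = {f_i}  h(x) = {h:.4f}")
    # Frequencies 5, 4, 8, 7, 8, 5, 3 sum to 40, and the total area is 1.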
Suppose that we now consider the situation in which
we actually perform a certain random experiment n times,
obtaining n observed values of the random variable—say,
x1,x2,...,xn. Often the collection is referred to as a sample. It
is possible that some of these values might be the same, but
we do not worry about this at this time. We artificially
create a probability distribution by placing the weight 1/n
on each of these x-values. Note that these weights are
positive and sum to 1, so we have a distribution we call the
empirical distribution, since it is determined by the data
x1,x2,...,xn. The mean of the empirical distribution is
which is the arithmetic mean of the observations
x1,x2,...,xn. We denote this mean by x and call it the sample
mean (or mean of the sample x1, x2, ... , xn). That is, the
sample mean is
which is, in some sense, an estimate of μ if the latter is unknown. Likewise, the variance of the empirical distribution is

v = (1/n) Σ_{i=1}^{n} (x_i − x̄)²,

which can be written as

v = (1/n) Σ_{i=1}^{n} x_i² − x̄².

The sample variance, however, is defined with n − 1 rather than n in the denominator, because we will see later that, in some sense, s² is a better estimate of an unknown σ² than is v. Thus, the sample variance is

s² = [n/(n − 1)] v = (1/(n − 1)) Σ_{i=1}^{n} (x_i − x̄)².

REMARK It is easy to expand the sum of squares; we have

Σ_{i=1}^{n} (x_i − x̄)² = Σ_{i=1}^{n} x_i² − 2x̄ Σ_{i=1}^{n} x_i + n x̄² = Σ_{i=1}^{n} x_i² − n x̄²,

so that

s² = (Σ_{i=1}^{n} x_i² − n x̄²) / (n − 1).

Many find that the right-hand expression makes the computation easier than first taking the n differences, x_i − x̄, i = 1, 2, ..., n; squaring them; and then summing. There is another advantage when x̄ has many digits to the right of the decimal point. If that is the case, then x_i − x̄ must be rounded off, and that creates an error in the sum of squares. In the easier form, that rounding off is not necessary until the computation is completed. Of course, if you are using a statistical calculator or statistics package on the computer, all of these computations are done for you.

The sample standard deviation, s = √(s²) ≥ 0, is a measure of how dispersed the data are from the sample mean. At this stage of your study of statistics, it is difficult to get a good understanding or meaning of the standard deviation s, but you can roughly think of it as the average distance of the values x1, x2, ..., xn from the mean x̄. This is not true exactly, for, in general,

(1/n) Σ_{i=1}^{n} |x_i − x̄| ≠ s,

but it is fair to say that s is somewhat larger, yet of the same magnitude, as the average of the distances of x1, x2, ..., xn from x̄.
There is an alternative way of computing s², because s² = [n/(n − 1)] v and v = (1/n) Σ_{i=1}^{n} x_i² − x̄², so that

s² = Σ_{i=1}^{n} x_i² / (n − 1) − (Σ_{i=1}^{n} x_i)² / [n(n − 1)] = (Σ_{i=1}^{n} x_i² − n x̄²) / (n − 1).
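The remark and the alternative form can be checked numerically. Here is a minimal Python sketch (the helper names are ours, not the text's) computing s² both from the definition and from the shortcut (Σ xi² − n x̄²)/(n − 1):

    def sample_variance(xs):
        """s^2 by the definition: sum of squared deviations over n - 1."""
        n = len(xs)
        xbar = sum(xs) / n
        return sum((x - xbar) ** 2 for x in xs) / (n - 1)

    def sample_variance_shortcut(xs):
        """s^2 by the expanded form: (sum of x_i^2 - n*xbar^2) / (n - 1)."""
        n = len(xs)
        xbar = sum(xs) / n
        return (sum(x * x for x in xs) - n * xbar * xbar) / (n - 1)

    data = [20.5, 20.7, 20.8, 21.0, 21.0]
    print(sample_variance(data), sample_variance_shortcut(data))  # both about 0.045

One caveat: the rounding advantage described above concerns hand computation. In floating-point arithmetic the shortcut form can lose precision through cancellation when x̄ is large relative to the spread of the data, which is why statistical software usually computes s² from the deviations directly.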
Example 6.1-2 Rolling a fair six-sided die five times could result in the following sample of n = 5 observations:

x1 = 3, x2 = 1, x3 = 2, x4 = 6, x5 = 3.

In this case,

x̄ = (3 + 1 + 2 + 6 + 3)/5 = 3

and

s² = [(3 − 3)² + (1 − 3)² + (2 − 3)² + (6 − 3)² + (3 − 3)²]/(5 − 1) = 14/4 = 3.5.

It follows that s = √3.5 ≈ 1.87. We had noted that s can roughly be thought of as the average distance that the x-values are away from the sample mean x̄. In this example, the distances from the sample mean, x̄ = 3, are 0, 2, 1, 3, 0, with an average of 1.2, which is less than s ≈ 1.87. In general, s will be somewhat larger than this average distance.
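The arithmetic in Example 6.1-2 is easy to verify directly; here is a quick check in Python (only the standard math module is needed):

    import math

    xs = [3, 1, 2, 6, 3]
    n = len(xs)
    xbar = sum(xs) / n                               # 3.0
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # 14/4 = 3.5
    s = math.sqrt(s2)                                # about 1.87
    avg_dist = sum(abs(x - xbar) for x in xs) / n    # 1.2
    print(xbar, s2, round(s, 2), avg_dist)           # 3.0 3.5 1.87 1.2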