Advanced Statistical Approaches to Quality

Statistical Methods using MATLAB

Statistical Process Control using MATLAB

Contents

Probability Distributions
Descriptive Statistics
Estimation theory
Hypothesis testing
Linear Model
Design of experiments

Example 1: Probability of success in test

Example 2: Probability of success in test 2, given that test 1 < 5.5?

[Figure: density of Score with mu=5.72, sigma=1.55. x-axis: Score (0 to 10); y-axis: Density.]

[Figure: histogram of scores. x-axis: Score (0 to 10); y-axis: Frequency (0 to 25).]

Test 1   Test 2
5.6      6.1
5.1      7.5
6.8      6.6
3.4      3.1
6.8      8.4
4.6      6.4
5.6      4.9
6.3      10.0
5.0      4.0

7.6 5.6 8.2 5.8

Example: What are mu and sigma?

Bias
Robustness
Confidence Interval

Example 1: When you have less than 4.5 on test 1, you will not pass

Example 2: Average Test 1 = Average Test 2

[Figure: plot of Score Test 2 (y-axis, 0 to 10) against Score Test 1 (x-axis, 0 to 10).]

To improve estimate

... To improve prediction of model

Decision Makers Use Statistics To:

Present and describe data and information properly
Draw conclusions about large groups of individuals or items, using information collected from subsets of the individuals or items
Make reliable forecasts about a computer software company
Predict the number of software defects and improve software processes

What is Data?

Data consist of information coming from observations, counts, measurements, or responses.

People who eat three daily servings of whole grains have been shown to reduce their risk of stroke by 37%.

70% of the 1500 U.S. spinal cord injuries to minors result from vehicle accidents, and 68% were not wearing a seatbelt.

What is Statistics?

Statistics: Data -> Information

Data: numerical facts, collected together for reference or information.
Information: knowledge communicated concerning some particular fact.

A Computer Science student is anxious about her/his statistics course, since s/he heard the course is difficult. The professor provides last term's final exam marks to the student. What can be discerned from this list of numbers?

Statistics: Data -> Information

Data: list of last term's marks.
95
89
70
65
78
57
:

Information: new information about the statistics class, e.g., class average, proportion of class receiving As, most frequent mark, marks distribution, etc.
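As a minimal sketch of turning this data into information, the MATLAB lines below summarize the six marks visible on the slide. The cut-off of 80 for an A is an assumption made only for illustration, since the slide does not state one, and the original list of marks continues beyond the six shown.

```matlab
% Marks visible on the slide (the original list continues beyond these six)
marks = [95 89 70 65 78 57];

class_average = mean(marks);         % arithmetic mean of the marks
class_median  = median(marks);       % middle value of the sorted marks

a_cutoff = 80;                       % assumed threshold for an A (hypothetical)
prop_A   = mean(marks >= a_cutoff);  % fraction of marks at or above the cutoff

fprintf('average = %.2f, median = %.1f, proportion of As = %.2f\n', ...
        class_average, class_median, prop_A);
```

Each summary value is a statistic in the sense used later in these notes: a numerical measure computed from a sample of marks.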

The base prices of several vehicles are shown in the table. Which data are qualitative data and which are quantitative data?

Qualitative data (names of vehicle models are non-numerical entries)
Quantitative data (base prices of vehicle models are numerical entries)

Data Sets

Population
The collection of all outcomes, responses, measurements, or counts that are of interest.

Sample
A subset of the population.

Branches of Statistics

Descriptive statistics: involves organizing, summarizing, and displaying data.
Inferential statistics: involves using sample data to draw conclusions about a population.

averages

Descriptive Statistics

Collect data
e.g., Survey

Present data
e.g., Tables and graphs

Characterize data
e.g., Sample mean = $\frac{\sum X_i}{n}$

Inferential Statistics

Estimation
e.g., Estimate the population mean weight using the sample mean weight

Hypothesis testing
e.g., Test the claim that the population mean weight is 120 pounds

on a subset of the large group.
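The two inferential tasks above can be sketched in a few lines of base MATLAB. The weights below are hypothetical values invented for illustration, not data from the course, and the critical value is read from a standard Student's t table rather than computed. With the Statistics Toolbox, ttest(weights, 120) performs the same test.

```matlab
% Hypothetical sample of weights in pounds (not from the slides)
weights = [118 122 125 119 121 117 123 120 124 121];

n      = numel(weights);
xbar   = mean(weights);                  % point estimate of the population mean
s      = std(weights);                   % sample standard deviation (n-1 divisor)
mu0    = 120;                            % claimed population mean from the slide
t_stat = (xbar - mu0) / (s / sqrt(n));   % one-sample t statistic

% Two-sided critical value of Student's t for alpha = 0.05 and
% n-1 = 9 degrees of freedom, taken from a standard t table.
t_crit = 2.262;
reject = abs(t_stat) > t_crit;           % true means the claim is rejected

fprintf('xbar = %.2f, t = %.4f, reject claim: %d\n', xbar, t_stat, reject);
```

For this sample the statistic stays below the critical value, so the claim of a 120 pound mean is not rejected at the 5% level.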

VARIABLE
A variable is a characteristic of an item or individual.

DATA
Data are the different values associated with a variable.

POPULATION
A population consists of all the items or individuals about which you want to draw a conclusion.

SAMPLE
A sample is the portion of a population selected for analysis.

PARAMETER
A parameter is a numerical measure that describes a characteristic of a population.

STATISTIC
A statistic is a numerical measure that describes a characteristic of a sample.

The simplest kind of plot is a Cartesian plot of (x,y) pairs, drawn as symbols or connected with lines.

>> x=0:0.05:10*pi;
>> y=exp(-0.1*x).*sin(x);
>> plot(x,y)
>> xlabel('X axis description')
>> ylabel('Y axis description')
>> title('Title for plot goes here')
>> legend('Legend for graph')
>> grid on

[Figure: the resulting plot, with title, legend, axis labels, grid, and a manually inserted text annotation.]

NOTE #1:
Reversing the x,y order to (y,x) swaps the axes, which mirrors the plot about the line y = x rather than simply rotating it.

NOTE #2:
line(x,y) is similar to plot(x,y) but is a lower-level function: it does not accept a line-style string such as 'r--', and it adds the line to the current axes without clearing them.

Kinds of plots:

bar(x) creates a bar graph of the vector x. (Note also the command stairs(x).)

bar(x,y) creates a bar graph of the elements of the vector y, locating the bars according to the elements of the vector x.

m-function Structure

Function definition (returned variable, function name, arguments):
function volume=cylinder(radius, length)

Help comments:
% CYLINDER computes volume of circular cylinder
% given radius and length
% Use:
% vol=cylinder(radius, length)
%

Statements (no end required):
volume=pi.*radius.^2.*length;
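The same formula can be tried at the prompt without creating a file, as in the sketch below. The names cyl_volume and len are chosen here for illustration; they also avoid shadowing the built-in MATLAB functions cylinder and length. Using the element-wise power .^ lets the function accept arrays of radii.

```matlab
% Anonymous-function version of the cylinder volume formula.
% cyl_volume and len are illustrative names, not from the slides.
cyl_volume = @(radius, len) pi .* radius.^2 .* len;

v_single = cyl_volume(1, 2);         % one cylinder: radius 1, length 2
v_many   = cyl_volume([1 2 3], 2);   % three radii at once, same length

disp(v_single)
disp(v_many)
```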

Online help for Statistics Toolbox is available from the MATLAB prompt (>>, a double arrow), both generally (listing of all available commands):

[a long list of help topics follows]

and for specific commands, such as disttool:

DISTTOOL Demonstration of many probability distributions.
DISTTOOL creates interactive plots of probability distributions.
This is a demo that displays a plot of the cumulative distribution function (cdf) or probability density function (pdf) of the distributions in the Statistics Toolbox.

>> disttool

binopdf - Binomial density.

chi2pdf - Chi square density.

exppdf - Exponential density.

fpdf - F density.

gampdf - Gamma density.

geopdf - Geometric density.

hygepdf - Hypergeometric density.

lognpdf - Lognormal density.

mvnpdf - Multivariate normal density.

normpdf - Normal (Gaussian) density.

pdf - Density function for a specified distribution.

poisspdf - Poisson density.

tpdf - T density.

unifpdf - Uniform density.

wblpdf - Weibull density.

For discrete distributions, the pdf assigns a probability to each outcome. In this context, the pdf is often called a probability mass function (pmf).

For example, the discrete binomial pdf

$$f(x) = P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

assigns a probability to x successes in n trials of a Bernoulli process (such as coin flipping) with probability p of success at each trial.

p = 0.5; % Probability of success at each trial (assumed value; not given on the slide)
n = 10; % Number of trials
x = 0:n; % Outcomes
fx = pdf('bino',x,n,p); % Probability mass vector
bar(x,fx); % Visualize the probability distribution
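To connect the toolbox call with the formula, the sketch below evaluates the binomial probabilities directly using only base MATLAB. The value p = 0.5 (a fair coin) is an assumption, since the slide does not give a value for p.

```matlab
p = 0.5;    % assumed probability of success (a fair coin); not from the slide
n = 10;     % number of trials
x = 0:n;    % possible numbers of successes

% Binomial coefficients n-choose-x, one per outcome
coeffs = arrayfun(@(k) nchoosek(n, k), x);

% Probability mass function evaluated directly from the formula
fx = coeffs .* p.^x .* (1 - p).^(n - x);

fprintf('sum of probabilities = %.6f\n', sum(fx));
fprintf('P(X = 5) = %.6f\n', fx(x == 5));
```

The probabilities sum to one, as they must for any probability mass function, and bar(x,fx) plots them in the same way as the toolbox result.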


Descriptive Statistics

corrcoef - Linear correlation coefficient with confidence intervals.

cov - Covariance.

mean - Sample average (in MATLAB toolbox).

median - 50th percentile of a sample.

range - Range.

std - Standard deviation (in MATLAB toolbox).

var - Variance (in MATLAB toolbox).

Example:

>> X = [ 1 2 3 5 6 7 23 45 33 46 22]
X =
1 2 3 5 6 7 23 45 33 46 22
>> mean(X)
ans =
17.5455
>> std(X)
ans =
17.2648

Examples:

A = [ 0 2 5 7 20]
B = [1 2 3
3 3 6
4 6 8
4 7 7];

Mean:
mean(A) = 6.8
mean(B) = 3.0 4.5 6.0 (column-wise mean)
mean(B,2) = 2.0 4.0 6.0 6.0 (row-wise mean)

Median:
median(A) = 5
median(B) = 3.5 4.5 6.5 (column-wise median)
median(B,2) = 2.0 3.0 6.0 7.0 (row-wise median)
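The values above can be reproduced with the short script below, which uses only base MATLAB functions. Note that mean(B,2) and median(B,2) return column vectors (one entry per row of B), so the script transposes them for display.

```matlab
A = [0 2 5 7 20];
B = [1 2 3; 3 3 6; 4 6 8; 4 7 7];

mean_A      = mean(A);        % scalar mean of a vector
col_means   = mean(B);        % 1x3 row vector, one mean per column
row_means   = mean(B, 2);     % 4x1 column vector, one mean per row
col_medians = median(B);      % 1x3 row vector of column medians
row_medians = median(B, 2);   % 4x1 column vector of row medians

disp(col_means);   disp(row_means');
disp(col_medians); disp(row_medians');
```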

std(X) : Calculate the standard deviation of vector X
If X is a matrix, std() will return the standard deviation of each column

Variance (defined as the square of the standard deviation) is calculated using the var() function

var(X) : Calculate the variance of vector X
If X is a matrix, var() will return the variance of each column
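A brief sketch of both behaviours, reusing the vector X and the matrix B from the earlier examples. By default, std and var divide by n-1 (the sample versions).

```matlab
X = [1 2 3 5 6 7 23 45 33 46 22];   % vector from the earlier example
B = [1 2 3; 3 3 6; 4 6 8; 4 7 7];   % matrix from the earlier example

s = std(X);          % sample standard deviation (divides by n-1)
v = var(X);          % sample variance, equal to s^2

col_std = std(B);    % one standard deviation per column of B
col_var = var(B);    % one variance per column of B

fprintf('std(X) = %.4f, var(X) = %.4f\n', s, v);
disp(col_std); disp(col_var);
```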


Descriptive Statistics

Example: The function displaytable.m is posted on the course website

>> X = rand(9,9); %generates 9x9 random matrix

>> displaytable(cov(X)); % plots the covariance matrix of X

>> displaytable(corrcoef(X)); % plots the correlation matrix of X

Data Correlations

% Compute sample correlation
r = corrcoef([var1,var2])

r = 1.0000 0.7051
    0.7051 1.0000

[Figure: scatter plot of Variable 1 (y-axis) against Variable 2 (x-axis).]
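The sketch below shows what corrcoef returns and checks it against the definition of the correlation coefficient (covariance divided by the product of the standard deviations). The two vectors are hypothetical values chosen for illustration; they are not the data behind the figure, so the result differs from 0.7051.

```matlab
% Hypothetical paired measurements (not the data behind the slide's figure)
var1 = [1 2 3 4 5]';
var2 = [2 4 5 4 5]';

R = corrcoef([var1, var2]);   % 2x2 correlation matrix
r = R(1, 2);                  % correlation between var1 and var2

% Same value from the definition: covariance over product of std deviations
C = cov([var1, var2]);        % 2x2 covariance matrix
r_manual = C(1, 2) / (std(var1) * std(var2));

fprintf('r = %.4f, r from definition = %.4f\n', r, r_manual);
```

The diagonal of R is always 1, because each variable is perfectly correlated with itself, and the matrix is symmetric.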


Statistical Plotting

andrewsplot - Andrews plot for multivariate data.

biplot - Biplot of variable/factor coefficients and scores.

boxplot - Boxplots of a data matrix (one per column).

cdfplot - Plot of empirical cumulative distribution function (cdf).

fsurfht - Interactive contour plot of a function.

glyphplot - Plot stars or Chernoff faces for multivariate data.

gplotmatrix - Matrix of scatter plots grouped by a common variable.

gscatter - Scatter plot of two variables grouped by a third.

hist - Histogram (in MATLAB toolbox).

hist3 - Three-dimensional histogram of bivariate data.

normplot - Normal probability plot.

parallelcoords - Parallel coordinates plot for multivariate data.

probplot - Probability plot.

surfht - Interactive contour plot of a data grid.

wblplot - Weibull probability plot.

Create a Pareto chart from data measuring the number of manufactured parts rejected for various types of defects.

>> quantity = [5 3 19 25];
>> pareto(quantity,defects); % defects (the labels for the defect types) is not defined on the slide

boxplot(X) produces a box-and-whisker plot for each column of the matrix X. The box has lines at the lower quartile, median, and upper quartile values. The whiskers are lines extending from each end of the box to show the extent of the rest of the data. Outliers are data with values beyond the ends of the whiskers.

>> boxplot(runout);

Pareto charts display the values in the vector Y as bars drawn in descending order. Values in Y must be nonnegative and not include NaNs. Only the first 95% of the cumulative distribution is displayed.

Examine the cumulative productivity of a group of programmers to see how normal its distribution is:

>> coders = {'Travis','Arash','Emad','Waleed','Farshad','Khaled','Mohamed','Maggie'};
>> pareto(codelines, coders) % codelines (lines of code per student) is not defined on the slide
>> title('Lines of Code by Student')
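The ordering that a Pareto chart applies can be sketched with base MATLAB alone. The example reuses the quantity vector from the rejected-parts example; it illustrates the idea of descending bars plus a cumulative percentage, not the internal implementation of pareto.

```matlab
quantity = [5 3 19 25];   % rejected parts per defect type (from the earlier example)

% Bars are drawn in descending order; order holds the original positions
[sorted_qty, order] = sort(quantity, 'descend');

% Cumulative percentage of the total
cum_pct = 100 * cumsum(sorted_qty) / sum(quantity);

disp(sorted_qty)
disp(order)
disp(cum_pct)
```

Here the two largest categories together account for roughly 85% of all rejected parts, which is the kind of concentration a Pareto chart is meant to expose.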

Scatter plots in 2D and 3D

>> load carbig
>> X = [Acceleration Displacement Horsepower MPG Weight];
>> scatter(X(:,2),X(:,3),'.');
>> scatter3(X(:,1),X(:,2),X(:,3),'.');

3D histogram

>> hist3([X(:,1),X(:,2)]);

>> load carbig
>> X = [MPG,Acceleration,Displacement,Weight,Horsepower];
>> varNames = {'MPG'; 'Acceleration'; 'Displacement'; 'Weight'; 'Horsepower'};
>> gplotmatrix(X,[],Cylinders,['c' 'b' 'm' 'g' 'r'],[],[],false);
>> text([.08 .24 .43 .66 .83], repmat(-.1,1,5), varNames, 'FontSize',8);
>> text(repmat(-.12,1,5), [.86 .62 .41 .25 .02], varNames, 'FontSize',8, 'Rotation',90);

The points in each scatter plot are color-coded by the number of cylinders: blue for 4 cylinders, green for 6, and red for 8. There is also a handful of 5 cylinder cars, and rotary-engined cars are listed as having 3 cylinders. This array of plots makes it easy to pick out patterns in the relationships between pairs of variables. However, there may be important patterns in higher dimensions, and those are not easy to recognize in this plot.

Statistical Plotting

normplot: Normal probability plot for graphical normality test.

>> x = normrnd(0,1,50,1);
>> normplot(x)

[Figure: Normal Probability Plot. x-axis: Data; y-axis: Probability.]

The plot is linear, indicating that you can model the sample by a normal distribution.

The multivariate normal distribution is an n-dimensional extension of a univariate Gaussian. In a single dimension, a normal distribution is the familiar bell-shaped curve. In two dimensions, each variable is itself normally distributed. If the two dimensions are independent (with equal variances), the points tend to cluster as a circular cloud; if they are correlated, they form an ellipse. This extends to any number of dimensions.
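The circle-versus-ellipse behaviour can be read off the covariance matrix, as in the sketch below. The correlation value 0.8 is an arbitrary choice for illustration. With the Statistics Toolbox, mvnrnd(mu, Sigma, N) draws N sample points from such a distribution for plotting.

```matlab
% Covariance matrices for two unit-variance normal variables; the
% correlation value 0.8 is an arbitrary choice for illustration.
Sigma_indep = [1 0; 0 1];       % independent, equal variances
Sigma_corr  = [1 0.8; 0.8 1];   % correlation 0.8

% Eigenvalues are the variances along the principal axes of the cloud;
% their square roots are proportional to the contour axis lengths.
axes_indep = sqrt(sort(eig(Sigma_indep)));
axes_corr  = sqrt(sort(eig(Sigma_corr)));

ratio_indep = axes_indep(2) / axes_indep(1);   % 1 means circular contours
ratio_corr  = axes_corr(2)  / axes_corr(1);    % above 1 means an elongated ellipse

fprintf('axis ratio: independent = %.2f, correlated = %.2f\n', ...
        ratio_indep, ratio_corr);
```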


Statistical process control (SPC) refers to a number of different methods for monitoring

and assessing the quality of manufactured goods. Combined with methods from the

Design of Experiments, SPC is used in programs that define, measure, analyze,

improve, and control development and production processes. These programs are

often implemented using "Design for Six Sigma" methodologies.

capaplot - Capability plot.

ewmaplot - Exponentially weighted moving average plot.

histfit - Histogram with superimposed normal density.

normspec - Plot normal density between specification limits.

schart - S chart for monitoring variability.

xbarplot - Xbar chart for monitoring the mean.

normspec(specs,mu,sigma) plots the normal density between a lower and upper limit defined by the two elements of the vector specs, where mu and sigma are the parameters of the plotted normal distribution.

Example:

Suppose a cereal manufacturer produces 10 ounce boxes of corn flakes. Variability in the process of filling each box with flakes causes a 1.25 ounce standard deviation in the true weight of the cereal in each box. The average box of cereal has 11.5 ounces of flakes. What percentage of boxes will have less than 10 ounces?

>> normspec([10 20],11.5,1.25)

[Figure: "Probability Between Limits is 0.88493". x-axis: Critical Value; y-axis: Density.]

The shaded area, 0.88493, is the probability that a box holds between 10 and 20 ounces. Weights above 20 ounces are negligibly rare, so about 1 - 0.88493 = 0.11507, or roughly 11.5% of boxes, have less than 10 ounces.
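The probability in the cereal example can also be checked numerically without the plot. The sketch below writes the normal cumulative distribution function with the base-MATLAB function erfc; the helper name Phi is chosen here for illustration. With the Statistics Toolbox, normcdf(10, 11.5, 1.25) gives the same lower-tail value.

```matlab
mu    = 11.5;    % mean fill weight in ounces (from the example)
sigma = 1.25;    % standard deviation in ounces (from the example)

% Normal cdf written with the base-MATLAB function erfc
Phi = @(x) 0.5 * erfc(-(x - mu) ./ (sigma * sqrt(2)));

p_between = Phi(20) - Phi(10);   % area between the limits 10 and 20
p_below   = Phi(10);             % fraction of boxes under 10 ounces

fprintf('P(10 < X < 20) = %.5f\n', p_between);
fprintf('P(X < 10)      = %.5f\n', p_below);
```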

Control Charts

A control chart displays measurements of process samples over time. The measurements are plotted together with user-defined specification limits and process-defined control limits. The process can then be compared with its specifications to see if it is in control or out of control.

The chart is just a monitoring tool. Control activity might occur if the chart indicates an undesirable, systematic change in the process. The control chart is used to discover the variation, so that the process can be adjusted to reduce it.

Xbar or mean

Standard deviation

Range

Exponentially weighted moving average

Individual observation

Moving range of individual observations

Moving average of individual observations

Proportion defective

Number of defectives

Defects per unit

Count of defects
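As a minimal sketch of how process-defined control limits arise for the first chart type in the list (Xbar), the script below computes a center line and control limits from subgroup means and ranges. The measurements are hypothetical, and the range-based formula with the tabulated constant A2 is one common textbook choice, not necessarily the method used by the toolbox functions.

```matlab
% Hypothetical measurements: 4 subgroups (rows) of 5 parts each (columns)
data = [10.1  9.9 10.0 10.2  9.8;
        10.0 10.1  9.9 10.0 10.0;
         9.7 10.0 10.3 10.1  9.9;
        10.2 10.0  9.8 10.1  9.9];

xbar = mean(data, 2);                         % subgroup means
R    = max(data, [], 2) - min(data, [], 2);   % subgroup ranges

A2  = 0.577;                % tabulated Xbar-chart constant for subgroup size 5
CL  = mean(xbar);           % center line: grand mean
UCL = CL + A2 * mean(R);    % upper control limit
LCL = CL - A2 * mean(R);    % lower control limit

out_of_control = (xbar > UCL) | (xbar < LCL);   % one flag per subgroup

fprintf('CL = %.4f, LCL = %.4f, UCL = %.4f\n', CL, LCL, UCL);
disp(out_of_control')
```

These limits come from the process data itself, which is what distinguishes them from the user-defined specification limits mentioned above.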
