This tutorial on Notched and Variable Width Box-plots was prepared by the Applied Statistics and Computing Lab at the Indian School of Business, Hyderabad. It is part of our module on Descriptive Statistics.

Applied Statistics and Computing Lab Indian School of Business

Learning goals

• What is a notched box-plot?
• How does one construct such a plot?
• What is a variable width box-plot?
• How does one construct such a plot?
• How are they useful?
• Can one combine these two features of a box-plot?
• How does one construct a box-plot for data with factors?
• What is its use?

Dataset

• For this study of notched and variable width box-plots, we consider a slightly modified version of the scores dataset
• Suppose the score record is blank for some students, for some or all of the exams
• The student could have been absent for the exam
• There could be a data entry error
• In either case, we do not have 50 scores for each of the exams

Variable name First minor Second minor Third minor First semester GPA Second semester scores 47

# of observations available

48

45

48

47

**Notched Box-plots**

• As per the Oxford Advanced Learner’s Dictionary, one of the meanings of notch is ‘a V-shaped cut in an edge or a surface.’
• The notch is used to test whether two or more population medians are equal, at roughly the 5% level of significance
• In a notched box-plot, a notch appears on either side of the median. The interval corresponding to the notch is an approximate confidence interval for the population median
• If the notches of the box-plots of variables in the same frame do not overlap, then we conclude that the population medians are different (using a test at roughly the 5% level of significance)
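As a rough illustration of how the notch interval is obtained, the sketch below recomputes it by hand and compares it with what R reports. It relies on the rule documented for `boxplot.stats()`, namely median ± 1.58 × IQR / √n with the IQR taken between the hinges. The `scores` vector is simulated and hypothetical; it is not the tutorial's dataset.

```r
# Hypothetical sample: 50 simulated exam scores (not the tutorial's data).
set.seed(42)
scores <- round(rnorm(50, mean = 60, sd = 12))

# fivenum() returns: minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(scores)
n <- length(scores)

# Notch limits by hand: median +/- 1.58 * (hinge spread) / sqrt(n).
notch_manual <- five[3] + c(-1.58, 1.58) * (five[4] - five[2]) / sqrt(n)

# boxplot.stats() reports the same interval in its `conf` component.
notch_builtin <- boxplot.stats(scores)$conf

print(notch_manual)
print(notch_builtin)
```

The two printed intervals should agree, which shows that the notch drawn by `boxplot(..., notch = TRUE)` depends only on the median, the hinge spread, and the number of observations.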

Notched box-plots (contd.)

**Width of the box**

• If there is only one batch (variable), the width can be arbitrary.
• If there are several batches, each having the same number of observations, then again the width can be the same for all the variables.
• If there are several batches with varying numbers of observations, it is desirable that the box-plots produced in the same frame exhibit this information.
  – This can be done using the varwidth option in R
  – When this option is used, the width of each box is proportional to the square root of the number of observations
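The square-root rule can be seen in a small sketch. The three batches below are simulated and their names and sizes (25, 100, 400) are assumptions chosen so that the relative widths come out as round numbers; they are not taken from the scores dataset.

```r
# Three hypothetical batches with very different numbers of observations.
set.seed(1)
batches <- list(
  small  = rnorm(25,  mean = 50, sd = 10),
  medium = rnorm(100, mean = 50, sd = 10),
  large  = rnorm(400, mean = 50, sd = 10)
)

# varwidth = TRUE draws each box with width proportional to sqrt(n).
res <- boxplot(batches, varwidth = TRUE,
               main = "Variable width box-plot (simulated data)")

# boxplot() invisibly returns the number of observations per batch.
print(res$n)

# Box widths relative to the widest box, according to the sqrt(n) rule.
rel_width <- sqrt(res$n) / sqrt(max(res$n))
print(rel_width)
```

With sizes of 25, 100 and 400 the relative widths are 0.25, 0.5 and 1, so a sixteen-fold difference in sample size shows up as only a four-fold difference in width.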

Variable width box-plot

What if we combine the features of notches and variable width, to make a variable width notched box-plot?

Variable width Notched Box-plot

**Comments on the Box-plot**

• Earlier we remarked that the medians of the three minors appear to be close. From the preceding plot, it is clear that the notch of First.minor does not overlap with those of the other two
• Thus the earlier belief is refuted
• The upper end of the notch of the box-plot of Second.minor just touches the lower end of the notch of Third.minor
• Thus it cannot be said that the medians of the minors at the population level are the same
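The visual comparison of notches can also be done numerically. The following is a minimal sketch, assuming two hypothetical helper functions (`notch_of` and `notches_overlap`) and made-up vectors standing in for three exams; none of these names or values come from the scores dataset.

```r
# Notch interval of one batch; boxplot.stats() ignores missing values.
notch_of <- function(x) boxplot.stats(x)$conf

# TRUE when the two notch intervals share at least one point.
notches_overlap <- function(x, y) {
  a <- notch_of(x)
  b <- notch_of(y)
  a[1] <= b[2] && b[1] <= a[2]
}

# Made-up batches, each with one missing record.
exam_a <- c(1:49, NA)     # median 25
exam_b <- c(101:149, NA)  # median 125
exam_c <- c(103:151, NA)  # median 127

print(notches_overlap(exam_a, exam_b))  # far apart: notches do not overlap
print(notches_overlap(exam_b, exam_c))  # close together: notches overlap
```

Non-overlapping notches point to different population medians; overlapping notches mean the plot gives no clear evidence of a difference.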

**Box-plot for data with factors**

• Sometimes we have data on a batch with factors
• Research has shown that in the fast-paced world of electronics, the key factor that separates the winners from the losers is actually how slow a firm is in making decisions: the most successful firms take longer to arrive at strategic decisions on product development, adopting new technologies, or developing new products
• The following values are the number of months taken to arrive at a decision, for firms ranked high, medium and low in terms of performance:

High Medium Low 3.5 4.8 3 1 5.5 2.5 3 6 2 6.5 7.5 4 4 8 4.5 6 2 6 6 2 5.5 6.5 9 4.5 2 7 5 3.5 9 5 10 6

2.5 7 1 2

1.5 1.5

3.8 4.5 0.5

**Box-plot for data with factors (contd.)**

• Notice that in such cases, one typically does an ‘analysis of variance’ to test the equality of means. Here, the batch is the data on the number of months taken to arrive at a decision and the factor is the performance: high, medium and low
• In such cases one can use a variable width notched box-plot to examine the equality of medians. This can be used independently or in conjunction with the analysis of variance in arriving at meaningful conclusions on the location behavior of the different factor levels
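A minimal sketch of this workflow uses R's formula interface, with the box-plot and the analysis of variance run on the same model formula. The `firms` data frame is simulated and hypothetical (group sizes 12, 10 and 8 are assumptions); it does not reproduce the values listed in the tutorial.

```r
# Hypothetical decision times (months) for firms in three performance groups.
set.seed(3)
firms <- data.frame(
  months = c(runif(12, 3, 10), runif(10, 2, 8), runif(8, 1, 6)),
  performance = factor(
    rep(c("High", "Medium", "Low"), times = c(12, 10, 8)),
    levels = c("High", "Medium", "Low")
  )
)

# Variable width notched box-plot: one box per level of the factor.
# R may warn that a notch goes outside the hinges when a group is small.
bp <- boxplot(months ~ performance, data = firms,
              varwidth = TRUE, notch = TRUE,
              ylab = "Months to decision")

# Analysis of variance on the same formula, to compare the group means.
fit <- aov(months ~ performance, data = firms)
print(summary(fit))
```

The box-plot compares medians through the notches, while `aov()` compares means, so the two can be read side by side.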

Box-plot for data with factors (contd.)

**Comments on the Box-plot**

• From the plot it is clear that the medians in the population are most unlikely to be equal
• In the box-plot for high performance, the notch lies between the first and third quartiles. However, in the plots for low and medium performance, the lower end of the notch is below the first quartile. Thus the population median could fall below the observed first quartile in these two cases
• It is also worth noting that the sampling variability of the median (as indicated by the length of the notch) is about the same for the three levels of the factor (performance groups).
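Whether a notch extends past a quartile, and how long each notch is, can be read off the values that `boxplot()` returns. The sketch below uses two made-up batches (`few` and `many` are hypothetical names); with only nine observations the notch is wide enough to pass the lower hinge, while with a hundred it stays inside the box.

```r
# Two made-up batches: a very small one and a larger one.
groups <- list(few = 1:9, many = 1:100)

# plot = FALSE returns the summary statistics without drawing anything.
bp <- boxplot(groups, notch = TRUE, plot = FALSE)

# Row 2 of `stats` is the lower hinge; row 1 of `conf` is the lower notch end.
below_q1 <- bp$conf[1, ] < bp$stats[2, ]
names(below_q1) <- bp$names
print(below_q1)

# Length of each notch, a rough guide to the sampling variability of the median.
notch_length <- bp$conf[2, ] - bp$conf[1, ]
names(notch_length) <- bp$names
print(notch_length)
```

When such a plot is actually drawn, R warns that some notches went outside the hinges, which is the situation described for the low and medium performance groups.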

R-codes

| Plot | R-code |
|---|---|
| Notched box-plot | `boxplot(<data name>, notch=TRUE)` |
| Variable width box-plot | `install.packages("aplpack")`; `library(aplpack)`; `boxplot(<data name>, varwidth=TRUE)` |
| Variable width notched box-plot | `boxplot(<data name>, varwidth=TRUE, notch=TRUE)` |
| Box-plot for data with factors | `boxplot(<numeric variable> ~ <factor variable>, varwidth=TRUE, notch=TRUE)` |

Thank you
