Mar 31, 2016

Box ploys

1. Box-and-Whisker Plots

In the previous lecture we learned about several descriptive statistics used to characterize a sample or population of data. These included measures of central tendency (mean, median, and mode) and two elementary measures of spread or dispersion (the range, and the inter-quartile range or IQR).

An easy way to visually describe data is a box plot, also called a box-and-whisker plot.

A box plot contains a central rectangle (box) with lines (whiskers) that extend from both ends. The box plot below shows the median value, Q1 and Q3, the range, and the IQR:

There are many variations on this format. Some box plots show outlier values in the data, denoting these with asterisks (*). In that case, the whiskers are shortened: one strategy is to terminate the whiskers at the 10th and 90th percentiles; another is to draw them at Q1 − 1.5 × IQR (lower whisker) and Q3 + 1.5 × IQR (upper whisker).
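The second whisker rule can be sketched in Python. This is only a minimal illustration, not the lecture's own procedure: the data values and the function name `box_plot_summary` are hypothetical, and the quartiles come from the standard library's `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions, so other software may report slightly different values for Q1 and Q3.

```python
import statistics


def box_plot_summary(values):
    """Return the numbers a box plot displays, plus 1.5 * IQR whisker limits."""
    data = sorted(values)
    # Quartile conventions differ between tools; "inclusive" is an assumption here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr  # lower whisker position
    upper_limit = q3 + 1.5 * iqr  # upper whisker position
    # Values beyond the whisker limits would be drawn as asterisks.
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "iqr": iqr,
        "lower_limit": lower_limit,
        "upper_limit": upper_limit,
        "outliers": outliers,
    }


# Hypothetical sample with one unusually large value.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With this sample, only the value 30 lies beyond the upper limit, so it is the one point that would be flagged as an outlier.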

You can place several box plots in a single figure.

Always include the values of your variable along the x- or y-axis.

07 Box Plots, Variance and Standard Deviation

The range and the inter-quartile range (IQR) are relatively primitive measures of the spread or dispersion of a data set. We'll now discuss the more sophisticated and commonly used variance and standard deviation. The standard deviation is simply the square root of the variance.

To motivate our inquiry, suppose we have the following data and want to characterize the spread of the values:

1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 100

We might choose the range (100 − 1 = 99) to describe the spread. The problem with that is that 100 is an outlier: most of the data fall between 1 and 3. Simply describing the range as 99 is misleading, because it doesn't apply to the bulk of the data.

Using the IQR to summarize the spread might be a little better, but we could come up with similar examples to show how it can be misleading.
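To make the comparison concrete, here is a short Python sketch that computes both measures for the twelve values listed above. The quartile convention (`method="inclusive"` in the standard library's `statistics.quantiles`) is an assumption; other conventions give slightly different quartiles, but the contrast with the range stays the same.

```python
import statistics

# The twelve data values from the example above.
data = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 100]

# Range: depends only on the minimum and maximum.
data_range = max(data) - min(data)

# IQR: depends only on the first and third quartiles.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("range:", data_range)
print("IQR:", iqr)
```

The range is dominated by the single value 100, while the IQR is not affected by it at all.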

The problem with the range and the IQR is that both are based only on a subset of the data: the range considers only two values (the minimum and maximum), and the IQR considers only the first and third quartiles (Q1 and Q3), ignoring how the values are spread within and outside the middle half. What we really want is a measure of spread that considers all the data values.

So let's try a different approach. One way of operationally defining 'how spread out data are' is to consider how far individual values are from the mean. So, in the example below, suppose the mean is 3.


In theory, we can calculate the distance of every value from the mean.

And then we might take the average of all these differences from the mean:

1 − 3, 2 − 3, 2 − 3, 3 − 3, 4 − 3, 4 − 3, 5 − 3

Using summation notation, this could be expressed as:

$$\frac{1}{N}\sum (x - \bar{X})$$

where N is the number of data values (here N = 7), $\bar{X}$ is the average value ($\bar{X} = 3$), and x refers to each of the individual values.

The problem is that we will always have both positive and negative differences, which will cancel each other out, and we are always left with 0 as the answer!
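The cancellation is easy to check numerically. The sketch below assumes the seven values 1, 2, 2, 3, 4, 4, 5, chosen because they are consistent with N = 7 and a mean of 3 as stated above; the variable names are illustrative only.

```python
# Assumed example data: seven values whose mean is 3.
values = [1, 2, 2, 3, 4, 4, 5]

mean = sum(values) / len(values)

# Signed difference of each value from the mean.
deviations = [x - mean for x in values]

# Average of the signed differences: positives and negatives cancel.
mean_deviation = sum(deviations) / len(values)

print("mean:", mean)
print("deviations:", deviations)
print("average deviation:", mean_deviation)
```

Whatever data set is used, the signed differences from the mean sum to zero, so this average carries no information about spread.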

The way to avoid this is simply to square each difference from the mean. All these squared differences will be positive or zero. This gives us the formal definitions of the variance and standard deviation:

variance: the average squared distance from each data value to the mean.

standard deviation: the square root of the variance.

If you remember these simple definitions, you will always be able to correctly calculate the variance and standard deviation.

Applying the first definition to the example above, we get the following formula for the variance:

$$\frac{1}{N}\sum (x - \bar{X})^2$$


This formula simply means: (1) take the sum of (2) the squared differences between each value and the mean, and (3) divide the sum by N.
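The three steps can be followed literally in a few lines of Python. This is a sketch under stated assumptions: the function name `population_variance` is illustrative, and the data are the seven assumed values 1, 2, 2, 3, 4, 4, 5 (N = 7, mean 3).

```python
import math


def population_variance(values):
    """Average squared distance from each value to the mean (divides by N)."""
    n = len(values)
    mean = sum(values) / n
    # Step 2: squared difference between each value and the mean.
    squared_diffs = [(x - mean) ** 2 for x in values]
    # Step 1: take the sum of those squared differences.
    total = sum(squared_diffs)
    # Step 3: divide the sum by N.
    return total / n


values = [1, 2, 2, 3, 4, 4, 5]  # assumed example data
variance = population_variance(values)
std_dev = math.sqrt(variance)  # standard deviation = square root of variance

print("variance:", variance)
print("standard deviation:", std_dev)
```

For these values the squared differences are 4, 1, 1, 0, 1, 1, 4, which sum to 12, so the variance is 12/7.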

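There is, however, one minor complication. The formula for the variance is different depending on whether we are treating the data as a population or as a sample. Specifically:

For a population, the variance has N in the denominator.

For a sample, the variance has n − 1 in the denominator.

$$\text{population variance} = \frac{1}{N}\sum (x - \bar{X})^2$$

$$\text{population standard deviation} = \sqrt{\frac{1}{N}\sum (x - \bar{X})^2}$$

$$\text{sample variance} = \frac{1}{n - 1}\sum (x - \bar{X})^2$$

$$\text{sample standard deviation} = \sqrt{\frac{1}{n - 1}\sum (x - \bar{X})^2}$$

Notice that in the second two equations, we denote the number of cases with a little n, because we are talking about a sample.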
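The only difference between the two cases is the denominator, which a small Python sketch can make explicit. The function name `variance_of` and its `sample` flag are hypothetical, and the data are the assumed seven values 1, 2, 2, 3, 4, 4, 5 used for illustration.

```python
import math


def variance_of(values, sample=False):
    """Variance with N in the denominator (population) or n - 1 (sample)."""
    count = len(values)
    if sample and count < 2:
        raise ValueError("a sample variance needs at least two values")
    mean = sum(values) / count
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    denominator = count - 1 if sample else count
    return sum_of_squares / denominator


values = [1, 2, 2, 3, 4, 4, 5]  # assumed example data

pop_var = variance_of(values)                  # divides by N = 7
samp_var = variance_of(values, sample=True)    # divides by n - 1 = 6

print("population variance:", pop_var)
print("sample variance:", samp_var)
print("population standard deviation:", math.sqrt(pop_var))
print("sample standard deviation:", math.sqrt(samp_var))
```

Because n − 1 is smaller than N, the sample variance is always a little larger than the population variance computed from the same numbers, and the gap shrinks as the number of values grows.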


Excel has several functions for calculating the variance and standard deviation. You use different functions, depending on whether you wish to treat your data as a sample or a population:

| Statistic | Excel function |
| --- | --- |
| sample variance | =var(<range>) |
| population variance | =varp(<range>) |
| sample standard deviation | =stdev(<range>) |
| population standard deviation | =stdevp(<range>) |

where <range> means the range of cells that contain your data, e.g., a:a or a1:a100.

So, for example, if your data are in Column A, the sample variance would be given by the Excel formula:

=var(a:a)
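For readers working outside Excel, Python's standard `statistics` module draws the same sample/population distinction. The sketch below is an aside rather than part of the lecture's Excel workflow; it applies the four functions to the twelve-value data set used earlier.

```python
import math
import statistics

# The twelve data values from the earlier example.
data = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 100]

sample_variance = statistics.variance(data)        # n - 1 in the denominator
population_variance = statistics.pvariance(data)   # N in the denominator
sample_std_dev = statistics.stdev(data)            # square root of sample variance
population_std_dev = statistics.pstdev(data)       # square root of population variance

print("sample variance:", sample_variance)
print("population variance:", population_variance)
print("sample standard deviation:", sample_std_dev)
print("population standard deviation:", population_std_dev)
```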

Reading: pp. 97–106.
