Original Title: Lecture_05_e_book Without Most of Excel

Uploaded by Amit Verma

There are three main measures of central tendency: the mean, median, and mode. The purpose of measures of central tendency is to identify the location of the center of a distribution. For example, let's consider the data below. This data represents the number of miles per gallon that 30 selected four-wheel-drive sport utility vehicles obtained in city driving.

12 16 15 10 19

17 18 16 14 13

16 17 16 15 16

14 16 15 11 18

16 17 16 15 16

18 15 19 15 20

However, in its current form it is difficult to determine where the center of the above data set lies. One way to get a better idea of where the center of a distribution is located is to graph the data. Because the data is numerical, the most appropriate method of graphing it is to create a histogram. After inspecting the data, a bin size of 1 seems reasonable, with a starting point of 10 mpg and an ending point of 20 mpg. The histogram for the gas mileage data is given below.

[Figure: histogram titled "SUV Gas Mileage Data"; x-axis: Miles per Gallon (10 to 20), y-axis: Frequency (0 to 10)]
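The binning described above (bin size 1, from 10 to 20 mpg) can be sketched in Python as follows. This is only an illustration; the variable names `mpg` and `frequencies` are assumptions and are not part of the lecture.

```python
from collections import Counter

# City mpg for the 30 SUVs listed in the table above.
mpg = [12, 16, 15, 10, 19,
       17, 18, 16, 14, 13,
       16, 17, 16, 15, 16,
       14, 16, 15, 11, 18,
       16, 17, 16, 15, 16,
       18, 15, 19, 15, 20]

# With a bin size of 1, each whole number from 10 to 20 is its own bin.
counts = Counter(mpg)
frequencies = {value: counts.get(value, 0) for value in range(10, 21)}

# Print a simple text histogram, one row per bin.
for value, freq in frequencies.items():
    print(f"{value:>2} mpg | {'#' * freq} ({freq})")
```

The tallest row of the printout corresponds to the tallest bar of the histogram.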

If we rely on sight alone, it seems that the middle of the distribution lies at around 15 to 16 miles per gallon; however, because our senses can sometimes deceive us, we want to be a little more scientific in our methodology.

The Mode

The first measure of central tendency we will discuss is the mode. The mode is the observation that occurs most frequently. Thus, to find the mode for the above data set, we simply locate the observation that occurs most frequently. In this case, the number 16 occurs 9 times, which is more than any other observation. Therefore, the mode of the data is 16.

The Median

The median is the middle observation in the data. This means that roughly 50% of the data is below the median and roughly 50% of the data is above the median. To find the median, we must first organize the data in order from the smallest to the largest observation. For example, the above gas mileage data would take on the following form:

10 11 12 13 14 14 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 17 17 17 18 18 18 19 19 20

To help us find the middle, or halfway point, probably the most intuitively appealing action would be to divide the number of observations (n or N) by 2. However, a better method is to divide n+1 by 2. In this case we have 30 observations, so our halfway point is

$\frac{30 + 1}{2} = 15.5$.

Next, to find the center, we count in 15.5 spaces or observations from the starting or ending points of the data. This will put us directly between the two bracketed 16s (the 15th and 16th observations).

10 11 12 13 14 14 15 15 15 15 15 15 16 16 [16] [16] 16 16 16 16 16 17 17 17 18 18 18 19 19 20

We want the number between these two 16s. To overcome this problem we add up the two middle points and divide by 2 (essentially taking the average of the two): (16 + 16)/2 = 16. Thus, the median is 16.

To see how the median is found when there is an odd number of data points, let's remove one of the points (the 10) so that instead of 30 observations we now only have 29.

11 12 13 14 14 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 17 17 17 18 18 18 19 19 20

We still need to find out the location of the middle observation. In this case, the center of the data is

$\frac{29 + 1}{2} = 15$ numbers in from either end of the dataset.

When we count in 15 spaces from the starting and ending points of the data, we land on the same number, 16. Thus, our median, or the middle point of the data, is 16, as shown here:

11 12 13 14 14 15 15 15 15 15 15 16 16 16 [16] 16 16 16 16 16 17 17 17 18 18 18 19 19 20
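The counting procedures for the mode and the median can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names `mode_of` and `median_of` are hypothetical, and `mode_of` assumes the data has a single mode.

```python
import math
from collections import Counter

def mode_of(data):
    """Return the most frequent observation (assumes a single mode)."""
    return Counter(data).most_common(1)[0][0]

def median_of(data):
    """Find the median with the (n + 1) / 2 position rule."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position of the halfway point
    # For odd n the position is a whole number, so both lookups hit the
    # same observation; for even n they hit the two middle observations.
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2

mpg = [12, 16, 15, 10, 19, 17, 18, 16, 14, 13,
       16, 17, 16, 15, 16, 14, 16, 15, 11, 18,
       16, 17, 16, 15, 16, 18, 15, 19, 15, 20]

# The same data with the single 10 removed, leaving 29 observations.
mpg_29 = [x for x in mpg if x != 10]

print("mode:", mode_of(mpg))
print("median, n = 30:", median_of(mpg))
print("median, n = 29:", median_of(mpg_29))
```

Averaging the floor and ceiling positions lets one function handle both the even case (position 15.5) and the odd case (position 15).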

The Mean

The mean is the arithmetic average of all the observations in the data. It is also the fulcrum, or balancing point, of the data. For instance, if you were to place the histogram of the gas mileage data onto a seesaw, the mean would be the point that would allow the histogram to be perfectly balanced. As discussed earlier, the mean is found by adding up all of the observations and dividing by the total number of observations, either N or n depending upon whether you are dealing with the population or a sample. The formulas for the population and sample means are $\mu = \frac{\sum x_i}{N}$ and $\bar{x} = \frac{\sum x_i}{n}$, respectively.

To find the mean of our gas mileage data we should first ask ourselves: is the data based upon a sample, or does the data consist of all the observations in the population? Next, to calculate the average, we sum up all of the observations and then divide this sum by the total number of observations. For the gas mileage data the average is

$\bar{x} = \frac{12 + 16 + \cdots + 20}{30} = \frac{471}{30} = 15.7$
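The arithmetic behind this average can be checked with a short Python sketch. The variable names are illustrative assumptions, not part of the lecture.

```python
# City mpg for the 30 SUVs listed in the table above.
mpg = [12, 16, 15, 10, 19, 17, 18, 16, 14, 13,
       16, 17, 16, 15, 16, 14, 16, 15, 11, 18,
       16, 17, 16, 15, 16, 18, 15, 19, 15, 20]

total = sum(mpg)   # numerator: the sum of all observations
n = len(mpg)       # denominator: the number of observations
mean = total / n   # sample mean

print(f"sum = {total}, n = {n}, mean = {mean:.1f}")
```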

For additional practice, let's find the mean, median, and mode for the data set below. A random sample of 25 elk was taken at a wildlife refuge near Estes Park, Colorado, and each elk's body temperature was recorded (in degrees F).

98 97 102 104 97

97 96 98 103 98

98 99 101 97 100

105 99 95 99 97

Mean ($\bar{x}$) = _________

Median = _________

Mode = _________

Now that we have had a little practice calculating the mean, median, and mode, let's take a closer look at how these measures of central tendency are influenced by different distributions. To start, it may be informative to look at the distribution of the gas mileage data and compare it to the distribution of elk temperatures (notice that we are not making comparisons between the studies, just the distributions).

[Figure: histogram of the gas mileage data; x-axis: Miles per Gallon (10 to 20), y-axis: Frequency]

[Figure: histogram titled "Elk Temps."; x-axis: Bin (96 to 105), y-axis: Frequency]

Notice that for the gas mileage data the measures of central tendency were all very close to the center of the distribution and very similar in value (mean = 15.7, median = 16, and mode = 16). In contrast, the measures of central tendency for the elk data are more divergent in terms of their values (mean = 99, median = 98, mode = 97). Why do you suppose this is the case?
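The elk values quoted above can be checked with Python's standard `statistics` module. This sketch assumes the 20 temperatures listed in the table above are the data behind those values (the text describes a sample of 25); the variable names are illustrative.

```python
import statistics

# The 20 elk body temperatures (degrees F) listed in the table above.
elk_temps = [98, 97, 102, 104, 97,
             97, 96, 98, 103, 98,
             98, 99, 101, 97, 100,
             105, 99, 95, 99, 97]

elk_mean = statistics.mean(elk_temps)
elk_median = statistics.median(elk_temps)
elk_mode = statistics.mode(elk_temps)

print("mean:", elk_mean)
print("median:", elk_median)
print("mode:", elk_mode)
```

For these values the three measures come out in the order mode, median, mean from smallest to largest.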

The shape of a distribution and whether any outliers are present have an effect on the closeness of the mean, median, and mode. When a distribution is symmetric with no outliers, the mean, median, and mode will generally have values close to each other. When a distribution is skewed to the right, the relationship between the mean, median, and mode is usually described by: mode < median < mean (mode is the smallest and mean is the largest). When a distribution is skewed to the left, the opposite is generally true: mean < median < mode (mean is the smallest and mode is the largest).

Before the advent of computers, the mode was often used as a measure of central tendency because it was easily found and quickly calculated. However, in general, the mode is a poor measure of central tendency, mainly because the mode can be found anywhere in the distribution and there can often be more than one mode. We will see later that the mode is a poor measure of central tendency for other reasons as well. However, the mode is the only measure of central tendency that can be used for categorical data.

The best measure of central tendency for skewed data is the median. This is because the median is resistant to the more extreme values in a data set. Extreme values are data points that numerically stray from the majority of the data points and are thus found in the tails of the distribution. In addition, these extreme data points are often classified as what statisticians call outliers. We will provide a more official definition of an outlier in the next lecture. An example of how the median is more resistant to extreme data points can be seen by revisiting our gas mileage data.

10 11 12 13 14 14 15 15 15 15 15 15 16 16 16 16 16 16 16 16 16 17 17 17 18 18 18 19 19 200

The last data point in our gas mileage data has been changed from 20 to 200. Even though 200 is an extreme data point, notice that the alteration of this point does not affect the value of the median; it is still 16.

However, when we calculate the mean, we get a much different result. That is, when we replace 20 with 200, the sum of the observations rises from 471 to 651, so the value of the mean changes from 15.7 to 21.7. Extreme data points tend to influence the mean by pulling the value of the mean towards them. When this shift occurs, it is often the case that the mean is no longer an adequate measure of central tendency.
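The effect of the altered data point can be reproduced with a short Python sketch using the standard `statistics` module. The variable names `original` and `altered` are assumptions made for illustration.

```python
import statistics

# The ordered gas mileage data from the lecture.
original = [10, 11, 12, 13, 14, 14, 15, 15, 15, 15,
            15, 15, 16, 16, 16, 16, 16, 16, 16, 16,
            16, 17, 17, 17, 18, 18, 18, 19, 19, 20]

# Replace the last observation (20) with the extreme value 200.
altered = original[:-1] + [200]

for label, data in (("original", original), ("altered", altered)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label:>8}: mean = {mean:.1f}, median = {median}")
```

Only the mean moves, because the median depends on the order of the middle observations and not on how far the largest value lies from the rest.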

When the data is symmetrical, the mean is often the preferred measure of central tendency over the median and mode. Based on statistical theory, the sample mean is the estimator that does the best job of estimating the population mean over the long run, even if the data is not symmetrical.

To Recap:

The Mode

Appropriate to use when:
- The observation that is most frequently observed is desired.
- A quick estimate of central tendency is desired.
- The data is categorical.

Do not use when:
- The data is multi-modal, highly skewed, or uniform, because in these situations the mode may provide an extremely poor estimate of the center of the distribution.
- A more accurate measure of central tendency, such as the mean or median, is available.

The Median

Appropriate to use when:
- The center or the middle value of the data set is desired.
- One needs to determine whether additional data points fall either above or below the midpoint.
- The data is highly skewed.
- Outliers exist that will affect the mean.

Do not use when:
- The distribution of the data is symmetrical, because the mean is preferred.

The Mean

Appropriate to use when:
- The data is symmetrical or at least not really skewed. When the data is roughly symmetrical, the mean, median, and mode are all somewhat decent measures of central tendency. However, when the data is symmetrical, the mean will provide the best estimate because over the long run it does the best job of estimating the center of the distribution.

Do not use when:
- The distribution of the data is extremely skewed.
- Outliers exist which will affect the mean more than an acceptable amount.

OPTIONAL: Using Excel to find the mean, median, and mode

Using Excel to find the mean, median, and mode will make our lives easy, but remember that for tests and quizzes you will need to find each of these by hand.

Finding the Mode

Step 1: Click on an empty box on the spreadsheet.
Step 2: From the toolbar, click on the fx icon. Once the fx icon has been selected, the Paste Function dialogue box should appear.
Step 3: From the Function category window, select Statistical.
Step 4: Once Statistical is selected, from Function Name choose Mode.
Step 5: Click on the Array box (it might say Numbers), highlight your data, and click OK. Your result should be in the empty box from Step 1.

These same steps can be repeated to find the mean or median by replacing Mode in step 4 with Average or Median.
