Chapter 4
Data Management

Learning Competencies

After completing this chapter, the learner will be able to:
* Compare the forms (textual, tabular, and graphical) of data.
* Identify the essential parts of a table and describe the different kinds of graphs for data presentation.
* Draw the graph/table to present the data.
* Analyze and interpret the data presented in a graph/table.
* Discuss the properties of mean, median and mode.
* Compute the different measures of dispersion for both grouped and ungrouped data.
* Discuss the uses, characteristics, advantages and disadvantages of measures of dispersion.
* Analyze and interpret the data presented in the table.
* Advocate the use of statistical data in making important decisions.
* Use a variety of statistical tools to process and manage numerical data.
* Use linear regression to predict the value of a variable given certain conditions.
* Apply correlation to determine the relationship between two variables.
* Perform operations on mathematical expressions correctly.
* Articulate the importance of mathematics in one's life.
* Express appreciation for mathematics as a human endeavor.
* Support the use of mathematics in various aspects and endeavors in life.

Chapter Outline

Unit 4.1: Introduction to Data Management
Unit 4.2: Measures of Central Tendency
Unit 4.3: Measures of Dispersion
Unit 4.4: Measures of Relative Position
Unit 4.5: Probabilities and Normal Distributions
Unit 4.6: Linear Regression and Correlation

"Whatever exists at all exists in some amount... and whatever exists in some amount can be measured."
- Edward L. Thorndike (1874-1949)

Unit 4.1: Introduction to Data Management

A. Organization of Data

When conducting statistical research, the researcher must gather data for the particular variable under study. To describe situations, draw inferences about events, and make conclusions, the researcher must organize the data gathered in some meaningful way. The easiest and most widely used way of organizing data is to construct a frequency distribution. A frequency distribution is a grouping of the data into categories showing the number of observations in each of the non-overlapping classes.

After organizing data, the next move of the researcher is to present the data so they can be understood easily by those who will benefit from reading the study. The most useful method of presenting data is by constructing graphs and charts. There are many kinds of graphs and charts, and each one has a specific purpose.

This section discusses how to organize data by constructing a frequency distribution and how to present data by constructing graphs and charts. Before we get started in constructing a frequency distribution, we must define some terms that are essential to understand deeper the nature of data that are displayed in a frequency distribution.

* Raw data is the data collected in original form.
* Range is the difference of the highest value and the lowest value in a distribution.
* Frequency distribution is the organization of data in a tabular form, using mutually exclusive classes showing the number of observations in each.
* Class Limits (or Apparent Limits) are the highest and lowest values describing a class.
* Class Boundaries (or Real Limits) are the upper and lower values of a class for a grouped frequency distribution; they have one more decimal place than the class limits and end with the digit 5.
* Interval (or width) is the distance between the class lower boundary and the class upper boundary, and it is denoted by the symbol i.
* Frequency (f) is the number of values in a specific class of a frequency distribution.
* Percentage is obtained by multiplying the relative frequency by 100%.
* Cumulative Frequency (cf) is the sum of the frequencies accumulated up to the upper boundary of a class in a frequency distribution.
* Midpoint is the point halfway between the class limits of each class and is representative of the data within that class.

A grouped frequency distribution is used when the range of the data set is large; the data must be grouped into classes whether it is categorical data or interval data. For interval data, the class is more than one unit in width. The procedure for constructing the frequency distribution is discussed in the succeeding sections.

Categorical Frequency Distribution

The categorical frequency distribution is used to organize nominal-level or ordinal-level type of data. Some examples where we can apply this distribution are gender, business type, political affiliation, and others.

Example 1: Twenty applicants were given a performance evaluation appraisal. The data set is:

[Table: the 20 individual ratings, each High, Average, or Low]

Construct a frequency distribution for the data.

Solution:

Step 1: Construct a table as shown below.

Class | Tally | Frequency | Percentage
High | | |
Average | | |
Low | | |

Step 2: Tally the raw data.

Class | Tally | Frequency | Percentage
High | IIIII-II | |
Average | IIIII-III | |
Low | IIIII | |

Step 3: Convert the tallied data into numerical frequencies.

Class | Tally | Frequency | Percentage
High | IIIII-II | 7 |
Average | IIIII-III | 8 |
Low | IIIII | 5 |

Step 4: Determine the percentage. The percentage is computed using the formula

$\text{Percentage} = \frac{f}{n} \times 100\%$

where f = frequency of the class and n = total number of values.

Class | Tally | Frequency | Percentage | Found by
High | IIIII-II | 7 | 35 | (7 ÷ 20) × 100
Average | IIIII-III | 8 | 40 | (8 ÷ 20) × 100
Low | IIIII | 5 | 25 | (5 ÷ 20) × 100
Total | | 20 | 100 |

For the sample, more applicants received an average performance rating.

Determining Class Interval

Generally, the number of classes for a frequency distribution table varies from 5 to 20, depending primarily on the number of observations in the data set. It is preferred to have more classes as the size of a data set increases. The decision about the number of classes depends on the method used by the researcher.

1. Rule 1. To determine the number of classes, use the smallest positive integer k such that $2^k > n$, where n is the total number of observations.

$\text{Suggested Class Interval} = \frac{\text{Range}}{\text{Number of Classes}} = \frac{HV - LV}{k}$

where:
HV = highest value in a data set
LV = lowest value in a data set
k = number of classes
i = suggested class interval

2. Rule 2. Another way to determine the class interval is by applying the formula below.

$\text{Suggested Class Interval} = \frac{\text{Range}}{1 + 3.322\,(\text{logarithm of total frequencies})}$

Grouped Frequency Distribution

Example 2: The salaries of 80 call center agents are shown below.

17,400 | 32,400 | 20,200 | 21,300 | 26,200 | 22,750 | 24,600 | 27,300 | 23,500 | 29,500
14,000 | 30,500 | 17,950 | 20,250 | 24,750 | 21,750 | 23,700 | 26,500 | 22,900 | 27,500
15,500 | 30,700 | 18,400 | 20,800 | 25,000 | 21,900 | 23,850 | 26,800 | 23,000 | 27,800
17,300 | 32,100 | 20,000 | 21,000 | 26,100 | 22,600 | 24,500 | 27,000 | 23,400 | 29,300
15,700 | 30,700 | 18,700 | 20,500 | 25,150 | 21,900 | 24,100 | 26,900 | 23,200 | 27,900
18,300 | 30,650 | 18,350 | 20,300 | 25,000 | 21,800 | 23,700 | 26,500 | 29,900 | 27,600
17,000 | 30,750 | 18,800 | 20,800 | 26,000 | 22,000 | 24,300 | 27,000 | 23,400 | 27,900
17,800 | 33,500 | 20,250 | 21,600 | 26,300 | 22,800 | 24,700 | 27,400 | 23,700 | 30,400

Construct a frequency distribution using the $2^k$ Rule and determine the following:
a. Range
b. Interval
c. Class limits
d. Relative frequencies
e. Percentages
f. Cumulative frequencies
g. Midpoints

Solution:

Step 1: Arrange the raw data in ascending order to make it easier to tally the data.

[Table: the 80 salaries in ascending order, from 14,000 to 33,500]

Step 2: Determine the classes.

* Find the highest and lowest value.
Highest Value (HV) = 33,500 and Lowest Value (LV) = 14,000

* Find the range.
Range = Highest Value (HV) - Lowest Value (LV) = 33,500 - 14,000 = 19,500

* Determine the number of classes.
The objective is to use just enough classes. We can determine the number of classes (k) using the "2 to the k rule". This will enable us to select the smallest number (k) for the number of classes such that $2^k$ (2 raised to the power of k) is greater than the number of observations (n). Using our example, there are 80 call center agents (or n = 80). If we apply k = 6, which means we would use 6 classes, then $2^6 = 64$, somewhat less than 80. Thus, 6 is not enough classes. If we try k = 7, then $2^7 = 128$, which is greater than 80. Therefore, the recommended number of classes is 7.

* Determine the class interval (or width).
Generally, the class interval (or width) should be equal for all classes. The classes must cover all the values in the raw data (that is, from lowest to highest). The class interval is generated using the formula:

$\text{Suggested Class Interval} = \frac{\text{Range}}{\text{Number of Classes}} = \frac{19,500}{7} \approx 2,785.71$

which is rounded up to a convenient interval of 2,800.

Note: Round the value of the interval up to the nearest whole number if there is a remainder.

* Select a starting point for the lowest class limit.
The starting point can be the smallest data value or any convenient number less than the smallest data value. In our case 14,000 is used.

* Set the individual class limits.
We need to add the interval (or width) to the lowest score taken as the starting point to obtain the lower limit of the next class. Keep adding until we reach the 7 classes: 14,000; 16,800; 19,600; 22,400; 25,200; 28,000 and 30,800. To obtain the upper class limits, we first need to add the interval to the lower limit of the class to obtain the upper limit of the first class. That is, 14,000 + 2,800 = 16,800.
Then add the interval (or width) to each lower limit to obtain all the upper limits.

Class Limits
14,000 < 16,800
16,800 < 19,600
19,600 < 22,400
22,400 < 25,200
25,200 < 28,000
28,000 < 30,800
30,800 < 33,600

Step 3: Tally the raw data.

Class Limits | Tally
14,000 < 16,800 | IIII
16,800 < 19,600 | IIIII-IIII
19,600 < 22,400 | IIIII-IIIII-IIIII-I
22,400 < 25,200 | IIIII-IIIII-IIIII-IIIII-III
25,200 < 28,000 | IIIII-IIIII-IIIII-II
28,000 < 30,800 | IIIII-III
30,800 < 33,600 | III

Step 4: Convert the tallied data into numerical frequencies.

Class Limits | Tally | Frequency
14,000 < 16,800 | IIII | 4
16,800 < 19,600 | IIIII-IIII | 9
19,600 < 22,400 | IIIII-IIIII-IIIII-I | 16
22,400 < 25,200 | IIIII-IIIII-IIIII-IIIII-III | 23
25,200 < 28,000 | IIIII-IIIII-IIIII-II | 17
28,000 < 30,800 | IIIII-III | 8
30,800 < 33,600 | III | 3

Step 5: Determine the relative frequency. It can be found by dividing each frequency by the total frequency.

Class Limits | Frequency | Relative Frequency | Found by
14,000 < 16,800 | 4 | 0.05 | 4 ÷ 80
16,800 < 19,600 | 9 | 0.11 | 9 ÷ 80
19,600 < 22,400 | 16 | 0.20 | 16 ÷ 80
22,400 < 25,200 | 23 | 0.29 | 23 ÷ 80
25,200 < 28,000 | 17 | 0.21 | 17 ÷ 80
28,000 < 30,800 | 8 | 0.10 | 8 ÷ 80
30,800 < 33,600 | 3 | 0.04 | 3 ÷ 80

Step 6: Determine the percentage. It can be found by multiplying each relative frequency by 100%.

Class Limits | Relative Frequency | Percentage
14,000 < 16,800 | 0.05 | 5
16,800 < 19,600 | 0.11 | 11
19,600 < 22,400 | 0.20 | 20
22,400 < 25,200 | 0.29 | 29
25,200 < 28,000 | 0.21 | 21
28,000 < 30,800 | 0.10 | 10
30,800 < 33,600 | 0.04 | 4

Step 7: Determine the cumulative frequencies. They can be found by adding the frequencies class by class, starting from the lowest class.

Class Limits | Frequency | Cumulative Frequency | Found by
14,000 < 16,800 | 4 | 4 | 4
16,800 < 19,600 | 9 | 13 | 4 + 9
19,600 < 22,400 | 16 | 29 | 4 + 9 + 16
22,400 < 25,200 | 23 | 52 | 4 + 9 + 16 + 23
25,200 < 28,000 | 17 | 69 | 4 + 9 + 16 + 23 + 17
28,000 < 30,800 | 8 | 77 | 4 + 9 + 16 + 23 + 17 + 8
30,800 < 33,600 | 3 | 80 | 4 + 9 + 16 + 23 + 17 + 8 + 3

Step 8: Determine the midpoints. The midpoint can be found by getting the average of the upper limit and lower limit in each class. The midpoints below are expressed in thousands.

Class Limits | Frequency | Midpoint | Found by
14,000 < 16,800 | 4 | 15 | (14 + 16) ÷ 2
16,800 < 19,600 | 9 | 18 | (17 + 19) ÷ 2
19,600 < 22,400 | 16 | 21 | (20 + 22) ÷ 2
22,400 < 25,200 | 23 | 24 | (23 + 25) ÷ 2
25,200 < 28,000 | 17 | 27 | (26 + 28) ÷ 2
28,000 < 30,800 | 8 | 30 | (29 + 31) ÷ 2
30,800 < 33,600 | 3 | 33 | (32 + 34) ÷ 2

Example 3: SJS Travel Agency, a nationwide local travel agency, offers special rates during the summer period. The owner wants additional information on the ages of those people taking travel tours. A random sample of 50 customers taking travel tours last summer revealed these ages.

18 | 29 | 42 | 57 | 61 | 67 | 37 | 49 | 53 | 47
24 | 34 | 45 | 58 | 63 | 70 | 39 | 51 | 54 | 48
28 | 36 | 46 | 60 | 66 | 77 | 40 | 52 | 56 | 49
19 | 31 | 44 | 58 | 62 | 68 | 38 | 50 | 54 | 48
27 | 36 | 46 | 59 | 64 | 74 | 39 | 51 | 55 | 48

Construct a frequency distribution using Rule 2.

Solution:

Step 1: Arrange the raw data in ascending order.

18 | 29 | 37 | 42 | 47 | 49 | 53 | 57 | 61 | 67
19 | 31 | 38 | 44 | 48 | 50 | 54 | 58 | 62 | 68
24 | 34 | 39 | 45 | 48 | 51 | 54 | 58 | 63 | 70
27 | 36 | 39 | 46 | 48 | 51 | 55 | 59 | 64 | 74
28 | 36 | 40 | 46 | 49 | 52 | 56 | 60 | 66 | 77

Step 2: Determine the classes.

* Find the highest and lowest value.
Highest Value (HV) = 77 and Lowest Value (LV) = 18

* Find the range.
Range = Highest Value (HV) - Lowest Value (LV) = 77 - 18 = 59

* Determine the class interval.

$\text{Suggested Class Interval} = \frac{77 - 18}{1 + 3.322(\log 50)} \approx 8.88$, rounded up to 9.

* Select a starting point for the lowest class limit. Our starting point is 18.

* Set the individual class limits. We will add 9 to each lower class limit until reaching the number of classes (18, 27, 36, 45, 54, 63, 72). To obtain the upper class limits, we need to add 9 to the lower limit of the class to obtain the upper limit of the first class. Then add the interval (or width) to each upper limit to obtain all the upper limits (27, 36, 45, 54, 63, 72, 81).

Step 3: Tally the raw data.
Class Limits | Tally
18 < 27 | III
27 < 36 | IIIII
36 < 45 | IIIII-IIII
45 < 54 | IIIII-IIIII-IIII
54 < 63 | IIIII-IIIII-I
63 < 72 | IIIII-I
72 < 81 | II

Constructing a Frequency Polygon

Step 1: Find the midpoints of each class.
Step 2: Draw and label the x-axis and y-axis.
Step 3: Represent the frequency on the y-axis and the midpoints on the x-axis.
Step 4: Connect adjacent points with line segments. Draw a line back to the x-axis at the beginning and end of the graph.

[Figure 4.2: Frequency Polygon for Call Center Agents' Salary. x-axis: Salary (in Thousands); y-axis: Frequency]

Constructing an Ogive

Step 1: Find the class boundaries and the cumulative frequency of each class.

Class Limits | Class Boundaries | Frequency | Cumulative Frequency
14,000 < 16,800 | 13,500 - 16,500 | 4 | 4
16,800 < 19,600 | 16,500 - 19,500 | 9 | 13
19,600 < 22,400 | 19,500 - 22,500 | 16 | 29
22,400 < 25,200 | 22,500 - 25,500 | 23 | 52
25,200 < 28,000 | 25,500 - 28,500 | 17 | 69
28,000 < 30,800 | 28,500 - 31,500 | 8 | 77
30,800 < 33,600 | 31,500 - 34,500 | 3 | 80

Step 2: Draw and label the x-axis and y-axis.
Step 3: Represent the cumulative frequency on the y-axis and the upper class boundaries on the x-axis.
Step 4: Connect adjacent points with line segments.

[Figure 4.3: Ogive for Call Center Agents' Salary. x-axis: Real Limit (Salary in Thousands); y-axis: Cumulative Frequency]

As discussed in the previous section, the only allowable calculation on nominal data is to count the frequency of each value of the variable. We can graphically display the counts in three ways: pareto charts, bar charts, and pie charts. This section also covers how to graphically display a time series graph, a pictograph, and a scatter plot.

* Pareto Chart. A pareto chart is a graph used to represent a frequency distribution for categorical (or nominal-level) data. Frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to lowest.

* Bar Chart (Bar Graph). A bar chart is similar to a histogram. The bases of the rectangles are arbitrary intervals whose centers are the codes. The height of each rectangle represents the frequency of that category. It is also applicable for categorical (or nominal-level) data.

* Pie Chart (Circle Graph). A pie chart is a circle divided into portions that represent the relative frequencies (or percentages) of the data belonging to different categories. The data in a pie chart should be categorical or nominal-level.

* Time Series Graph. A time series graph represents data that occur over a specific period of time under observation. In addition, it shows a trend or pattern of increase or decrease over the period of time.

* Pictograph (Pictogram). A pictograph immediately suggests the nature of the data being shown. It combines the attention-getting quality of pictures with the accuracy of the bar chart. Appropriate pictures arranged in a row (sometimes in a column) present the quantities for comparison.

* Scatter Plot. A scatter plot is used to examine possible relationships between two numerical variables. The two variables are plotted on the x-axis and y-axis.

Now we will illustrate how to construct the pareto chart, bar chart, pie chart, time series graph, pictograph, and scatter plot using the succeeding examples.

Example 2: Using the information in the table about the favorite snacks of 870 youth, construct a pareto chart, bar chart, and pie chart.

Products | Sales
Junk Foods | 135
Candy | 250
Ice Cream | 185
Chocolate | 210
Others | 90

Solution:

a. Constructing a Pareto Chart

Step 1: Arrange the data from highest to lowest according to frequency.

Products | Sales
Candy | 250
Chocolate | 210
Ice Cream | 185
Junk Foods | 135
Others | 90

Step 2: Draw and label the x-axis (Products) and y-axis (Sales).
Step 3: Make bars with the same width and draw the heights corresponding to the frequencies, from highest to lowest and from left to right.

Figure 4.4 shows the Pareto Chart on the favorite snacks of the youth.

[Figure 4.4: Pareto Chart for Example 2, "Favorite Snacks". x-axis: Products (Candy, Chocolate, Ice Cream, Junk Foods, Others); y-axis: Sales]

b. Constructing a Bar Chart

Step 1: Draw and label the x-axis (Products) and y-axis (Sales).
Step 2: Make bars with the same width and draw the heights corresponding to the frequencies.

Figure 4.5 shows the Bar Chart on the favorite snacks of the youth.

[Figure 4.5: Bar Chart for Example 2, "Favorite Snacks". x-axis: Products (Junk Foods, Candy, Ice Cream, Chocolate, Others); y-axis: Sales]

The same observation can also be seen in the bar chart: candy is the most preferred snack, followed by chocolate, while other kinds of snacks are least preferred by the youth from the given population.

c. Constructing a Pie Chart

Step 1: Since there are 360° in a circle, the frequency of each class must be converted into a proportional part of the circle. This conversion is done by applying the formula

$\text{Degrees} = \left(\frac{f}{n}\right)(360°)$

where f = frequency of each class, and n = sum of frequencies. Hence, the following conversions are obtained. The degrees should total 360°.

Candy: $\left(\frac{250}{870}\right)(360°) = 103°$
Chocolate: $\left(\frac{210}{870}\right)(360°) = 87°$
Ice Cream: $\left(\frac{185}{870}\right)(360°) = 77°$
Junk Foods: $\left(\frac{135}{870}\right)(360°) = 56°$
Others: $\left(\frac{90}{870}\right)(360°) = 37°$

Step 2: Each frequency must also be converted to a percentage, and the sum of these percentages must total 100%. This conversion is done by applying the formula

$\text{Percentage} = \left(\frac{f}{n}\right)(100\%)$

where f = frequency of each class, and n = sum of frequencies.

Candy: $\left(\frac{250}{870}\right)(100\%) = 29\%$
Chocolate: $\left(\frac{210}{870}\right)(100\%) = 24\%$
Ice Cream: $\left(\frac{185}{870}\right)(100\%) = 21\%$
Junk Foods: $\left(\frac{135}{870}\right)(100\%) = 16\%$
Others: $\left(\frac{90}{870}\right)(100\%) = 10\%$

Step 3: Using a protractor, graph each section and write its name and appropriate percentage, as shown in Figure 4.6.

[Figure 4.6: Pie Chart for Example 2, "Favorite Snacks". Candy 29%, Chocolate 24%, Ice Cream 21%, Junk Foods 16%, Others 10%]

Since candy has the biggest slice in the pie chart, it is the most preferred snack, followed by chocolate, while other kinds of snacks are least preferred by the youth from the given population.
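As a rough check of the pie chart conversions in Example 2, the degree and percentage formulas can be evaluated in a few lines of Python. This is only a sketch: the dictionary name `sales` and the choice to round each result to the nearest whole number are assumptions made here to match the rounded figures in the text.

```python
# Favorite-snack sales from Example 2 (870 youth in total).
sales = {
    "Candy": 250,
    "Chocolate": 210,
    "Ice Cream": 185,
    "Junk Foods": 135,
    "Others": 90,
}

n = sum(sales.values())  # sum of frequencies

# Degrees = (f / n) * 360 and Percentage = (f / n) * 100,
# each rounded to the nearest whole number (an assumption).
degrees = {name: round(f / n * 360) for name, f in sales.items()}
percentages = {name: round(f / n * 100) for name, f in sales.items()}

for name in sales:
    print(f"{name}: {degrees[name]} degrees, {percentages[name]}%")

print("Total degrees:", sum(degrees.values()))
print("Total percentage:", sum(percentages.values()))
```

For this data the separately rounded slices happen to total exactly 360° and 100%. With other data the rounded values can be off by one, in which case one slice may need a small adjustment.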
Example 3: Using the information in the table below about the dollar to peso exchange rate from January to December of 2017, construct a time series graph.

[Table: Month | Exchange Rate, January to December 2017]

Solution:

Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for months and the y-axis for Peso per US Dollar.
Step 3: Plot each point according to the table.
Step 4: Draw line segments connecting adjacent points.

[Figure 4.7: Time Series Graph for Peso-US Dollar Exchange Rate. x-axis: Months; y-axis: Peso per US Dollar]

It can be seen that April has the highest exchange rate of US dollar to Philippine peso, and the rate is lowest in the months of January and February.

Example 4: VSAS Realty Inc. is a real estate company that develops housing in Rizal province. The information in the table shows the number of houses constructed from 2013 to 2017. Construct a pictograph.

Year | No. of Houses
2013 | 400
2014 | 250
2015 | 600
2016 | 550
2017 | 700

Solution:

Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for years and the y-axis for Number of Houses.
Step 3: Draw a house to represent the number of houses.

[Figure 4.8: Pictograph for Example 4. x-axis: Year; y-axis: Number of Houses]

It can be noted in the pictograph that the company built the most houses in 2015 and 2017, while in 2014 it built fewer than half as many as in either of those years.

Example 5: The owner of a chain of halo-halo stores would like to study the effect of atmospheric temperature on sales during the summer season. A random sample of 12 days is selected, with the results given as follows:

Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
Temperature (°F) | 79 | 76 | 78 | 84 | 90 | 83 | 93 | 94 | 97 | 85 | 88 | 82
Total Sales | 147 | 143 | 147 | 168 | 206 | 155 | 192 | 211 | 209 | 187 | 200 | 150

Put the data on a scatter diagram.

Solution:

Step 1: Draw and label the x-axis and y-axis.
Step 2: Label the x-axis for Temperature (°F) and the y-axis for Sales.
Step 3: Plot the points of each ordered pair in the Cartesian coordinate system.

[Figure 4.9: Scatter Plot for Example 5. x-axis: Temperature (X); y-axis: Sales (Y)]

We can deduce from the graph that there is a positive relationship between the temperature and the sales of halo-halo. That is, as the temperature increases, the number of sales also increases.

Good graphical displays tell what the data are conveying. Sadly, many graphs or charts shown in newspapers and magazines are misleading, incorrect, or complicated. In order to correctly develop good graphs/charts, there are some guidelines that one needs to bear in mind, such as:

1. The graph/chart should include a title.
2. The scales for all axes should be included.
3. The scale on the y-axis should start at zero.
4. The graph/chart should not distort the data.
5. The x-axis and y-axis should be properly labeled.
6. The graph/chart should not contain unnecessary decorations.
7. The simplest possible graph/chart should be used for any data set.

Supplementary Exercise 4.1

1. SJS Travel Agency, a nationwide local travel agency, offers special rates during the summer period. The owner wants additional information on the ages of those people taking travel tours. Construct a histogram, frequency polygon, and cumulative frequency polygon. What conclusions can you reach based on the information presented?
Class Limits | Class Boundaries | Midpoints | Frequency | cf
18 - 26 | 17.5 - 26.5 | 22 | 3 | 3
27 - 35 | 26.5 - 35.5 | 31 | 5 | 8
36 - 44 | 35.5 - 44.5 | 40 | 9 | 17
45 - 53 | 44.5 - 53.5 | 49 | 14 | 31
54 - 62 | 53.5 - 62.5 | 58 | 11 | 42
63 - 71 | 62.5 - 71.5 | 67 | 6 | 48
72 - 80 | 71.5 - 80.5 | 76 | 2 | 50

2. The Land Transportation Office (LTO) is interested in the number of brand new cars imported to the Philippines in 2015. The data are as follows:

Country | No. of Cars Imported
Japan | 225,000
South Korea | 78,300
USA | 120,250
United Kingdom | 19,200
Italy | 16,750
China | 40,500
Total | 500,000

Sketch the pareto chart, bar chart and pie chart of the given data and interpret the data.

3. In a Senior High School where the General Mathematics course is a prerequisite for the Statistics and Probability course, a sample of 14 students was drawn. The grades for General Mathematics and for Statistics and Probability were recorded for each student. Construct a scatter plot and interpret the result.

[Table: Student | General Math | Stat. and Prob.]

4. The number of postpaid cellular phone subscribers for each of the last 12 years is given below. Use a time series graph to represent these data. Interpret the result.

[Table: Year | No. of Subscribers (in millions)]

Unit 4.2: Measures of Central Tendency

Any data set can be characterized by measuring its central tendency. A measure of central tendency, commonly referred to as an average, is a single value that represents a data set. Its purpose is to locate the center of a data set. This unit discusses three different measures of central tendency: the mean, the median, and the mode. We will illustrate how to calculate each of these measures for ungrouped and grouped data. Measures of central tendency for both sample and population grouped data are also included in the discussion.

A. Mean

The arithmetic mean, often called the mean, is the most frequently used measure of central tendency.
The mean is the only common measure in which all values play an equal role; that is, to determine its value you need to consider all the values of any given data set. The mean is appropriate for determining the central tendency of interval or ratio data. The symbol $\bar{x}$, called "x bar," is used to represent the mean of a sample, and the symbol $\mu$, called "mu", is used to denote the mean of a population.

Properties of Mean

1. A set of data has only one mean.
2. Mean can be applied for interval and ratio data.
3. All values in the data set are included in computing the mean.
4. The mean is very useful in comparing two or more data sets.
5. Mean is affected by extremely small or large values in a data set.
6. Mean is most appropriate for symmetrical data.

$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$

Sample Mean: $\bar{x} = \frac{\sum x}{n}$

Population Mean: $\mu = \frac{\sum x}{N}$

where:
$\bar{x}$ = sample mean (it is read "x bar")
$\mu$ = population mean (it is read "mu")
x = the value of any particular observation or measurement
$\sum x$ = sum of all x's
n = total number of values in the sample
N = total number of values in the population

Example 1: The daily salaries of a sample of eight employees at GMS Inc. are P550, P420, P560, P500, P700, P670, P860, P480. Find the mean daily rate of employees.

Solution:

$\bar{x} = \frac{\sum x}{n} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4,740}{8} = 592.50$

The sample mean daily salary of employees is P592.50.

Example 2: Find the population mean of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.

Solution:

$\mu = \frac{\sum x}{N} = \frac{53 + 45 + 59 + 48 + 54 + 46 + 51 + 58 + 55}{9} = \frac{469}{9} = 52.11$

The mean population age of middle-management employees is 52.11.

B. Median

The median is the midpoint of the data array. When the data set is ordered, whether ascending or descending, it is called a data array. Median is an appropriate measure of central tendency for data that are ordinal or above, but is more valuable in an ordinal type of data.
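The two mean examples under A. Mean can be re-computed with a short Python sketch. The variable names are illustrative only, and the script simply applies "sum of all values divided by number of values"; the arithmetic is the same for a sample mean and a population mean.

```python
# Example 1: daily salaries (in pesos) of a sample of eight employees.
daily_salaries = [550, 420, 560, 500, 700, 670, 860, 480]
sample_mean = sum(daily_salaries) / len(daily_salaries)

# Example 2: ages of a population of 9 middle-management employees.
# The last age is taken as 55, which gives the stated mean of 52.11.
ages = [53, 45, 59, 48, 54, 46, 51, 58, 55]
population_mean = sum(ages) / len(ages)

print(f"Sample mean daily salary: P{sample_mean:.2f}")
print(f"Population mean age: {population_mean:.2f}")
```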
Properties of Median

1. The median is unique; there is only one median for a set of data.
2. The median is found by arranging the set of data from lowest to highest (or highest to lowest) and getting the value of the middle observation.
3. Median is not affected by extremely small or large values.
4. Median can be applied for ordinal, interval and ratio data.
5. Median is most appropriate for skewed data.

To determine the value of the median for ungrouped data, we need to consider two rules:

1. If n is odd, the median is the middle ranked value.
2. If n is even, the median is the average of the two middle ranked values.

$\text{Median (Rank Value)} = \frac{n + 1}{2}$

Note that n is the population/sample size.

Example 1: Find the median of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.

Solution:

Step 1: Arrange the data in order.
45, 46, 48, 51, 53, 54, 55, 58, 59

Step 2: Select the middle rank value.
$\text{Median (Rank Value)} = \frac{n + 1}{2} = \frac{9 + 1}{2} = 5$

Step 3: Identify the median in the data set.
45, 46, 48, 51, 53, 54, 55, 58, 59 (the 5th value is 53)

Hence, the median age is 53 years.

Example 2: The daily rates of a sample of eight employees at GMS Inc. are P550, P420, P560, P500, P700, P670, P860, P480. Find the median daily rate of employees.

Solution:

Step 1: Arrange the data in order.
P420, P480, P500, P550, P560, P670, P700, P860

Step 2: Select the middle rank value.
$\text{Median (Rank Value)} = \frac{n + 1}{2} = \frac{8 + 1}{2} = 4.5$

Step 3: Identify the median in the data set.
P420, P480, P500, P550, P560, P670, P700, P860 (rank 4.5 falls between the 4th and 5th values)

Since the middle point falls between P550 and P560, we can determine the median of the data set by getting the average of the two values.

$\text{Median} = \frac{550 + 560}{2} = 555$

Therefore, the median daily rate is P555.

C. Mode

The mode is the value in a data set that appears most frequently. Like the median and unlike the mean, extreme values in a data set do not affect the mode. A data set may not contain any mode if none of the values is "most typical".
A data set that has only one value that occurs with the greatest frequency is said to be unimodal. If the data set has two values with the same greatest frequency, both values are considered the mode and the data set is bimodal. If a data set has more than two modes, then the data set is said to be multimodal. There are some cases when all values in a data set have the same frequency. When this occurs, the data set is said to have no mode.

Properties of Mode

1. The mode is found by locating the most frequently occurring value.
2. The mode is the easiest average to compute.
3. There can be more than one mode or even no mode in any given data set.
4. Mode is not affected by extremely small or large values.
5. Mode can be applied for nominal, ordinal, interval and ratio data.

Example 1: The following data represent the total unit sales for smartphones from a sample of 10 Communication Centers for the month of August: 15, 17, 10, 12, 13, 10, 14, 10, 8 and 9. Find the mode.

Solution:

The ordered array for these data is 8, 9, 10, 10, 10, 12, 13, 14, 15, 17.

Because 10 appears 3 times, more times than the other values, the mode is 10.

Example 2: An operations manager in charge of a company's manufacturing keeps track of the number of LED televisions manufactured in a day. The following data represent the number of LED televisions manufactured per day for the past three weeks: 20, 18, 19, 25, 20, 21, 20, 25, 30, 29, 28, 29, 25, 25, 27, 26, 22, and 20. Find the mode of the given data set.

Solution:

The ordered array for these data is 18, 19, 20, 20, 20, 20, 21, 22, 25, 25, 25, 25, 26, 27, 28, 29, 29, 30.

There are two modes, 20 and 25, since each of these values occurs four times.

Example 3: Find the mode of the ages of 9 middle-management employees of a certain company. The ages are 53, 45, 59, 48, 54, 46, 51, 58, and 55.

Solution:

The ordered array for these data is 45, 46, 48, 51, 53, 54, 55, 58, 59.

There is no mode since every value in the data set has the same frequency.

D. Weighted Mean

The weighted mean is particularly useful when various classes or groups contribute differently to the total. The weighted mean is found by multiplying each value by its corresponding weight and dividing by the sum of the weights.

$\bar{x}_w = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n}{w_1 + w_2 + w_3 + \dots + w_n}$

where:
$\bar{x}_w$ = weighted mean
w = corresponding weight
x = the value of any particular observation or measurement

Example 1: At the Mathematics Department of San Sebastian College there are 18 instructors, 12 assistant professors, 7 associate professors, and 3 professors. Their monthly salaries are P30,500, P33,700, P38,600, and P45,000. What is the weighted mean salary?

Solution:

Let
w1 = 18, x1 = 30,500
w2 = 12, x2 = 33,700
w3 = 7, x3 = 38,600
w4 = 3, x4 = 45,000

$\bar{x}_w = \frac{30,500(18) + 33,700(12) + 38,600(7) + 45,000(3)}{18 + 12 + 7 + 3} = \frac{1,358,600}{40} = 33,965$

The weighted mean salary is P33,965.

Example 2: Riana's first quarter grades are shown in the table below. Use the weighted mean formula to find Riana's GPA for the first quarter.

Grade | 90 | 87 | 88 | 95 | 96
Units | 3 | 3 | 3 | 2 | 1

Solution:

Let
w1 = 3, x1 = 90
w2 = 3, x2 = 87
w3 = 3, x3 = 88
w4 = 2, x4 = 95
w5 = 1, x5 = 96

$\bar{x}_w = \frac{90(3) + 87(3) + 88(3) + 95(2) + 96(1)}{3 + 3 + 3 + 2 + 1} = \frac{1,081}{12} \approx 90.08$

Riana's GPA for the first quarter is 90.08.

Example 3: A certain subdivision in Laguna consists of 50 homes. The table shows the frequency distribution of homes with respect to the number of bedrooms each has. Find the mean number of bedrooms for the 50 homes.

No. of Bedrooms | 2 | 3 | 4 | 5 | 6
No. of Homes | 13 | 21 | 10 | 4 | 2

Solution:

Here the number of bedrooms is the value x and the number of homes is the weight w. Let
x1 = 2, w1 = 13
x2 = 3, w2 = 21
x3 = 4, w3 = 10
x4 = 5, w4 = 4
x5 = 6, w5 = 2

$\bar{x}_w = \frac{2(13) + 3(21) + 4(10) + 5(4) + 6(2)}{13 + 21 + 10 + 4 + 2} = \frac{161}{50} = 3.22$

The weighted mean number of bedrooms per home is 3.22.

Supplementary Exercise 4.2

1. A college professor administered a unit exam to one of his classes and found that the majority of the items were too easy. The scores are 45, 39, 40, 48, 35, 37, 36, 37, 40, 44, 41, 49, 28, 28, 32, 36, 37, 41, 40, 36, 39, 30, 25, 43, and 50. Calculate the mean, median, and mode.
Calculate the mean, [...]

2. The Department of Agriculture conducted a survey of farmers in Ilocos Norte. The following is the list of acres farmed by a sample of 20 farming families throughout the province: 200, 1200, 300, 350, 50[...], 500, 550, 1500, 400, 500, 800, 850, 1300, 2000, 2100, 340, 760, [...]

3. [...] The following [...] 17, 21, 23, 43, [...] A manager would like to see each of his sales [...]. A recruit is told to keep a weekly record of the sales [...] previous month: 10, 14, 17, 18, 30, 28, 27, 38, 16, 5, 8, 19, 28, [...], 10. Calculate the mode.

4. A professor grades students on 4 quizzes, a project, and a final examination. Each quiz counts as 12% of the course grade. The project counts as 30% of the course grade and the final examination counts as 22% of the course grade. Achaiah has quiz scores of 75, 80, 85, and 99. His project score is 95 and his final examination score is [illegible]. Use the weighted mean formula to find Achaiah's average for the course.

Unit 4.3: Measures of Dispersion

Another important characteristic of a data set is how it is distributed, or how far each element is from some measure of central tendency (average). There are several ways to measure the variability of the data. Although the most common and most important is the standard deviation, which provides an average distance for each element from the mean, several others are also important, and are hence discussed here. Standard deviation is a statistical term that provides a good indication of volatility. It measures how widely values are dispersed from the average. Dispersion is the difference between the actual value and the average value.

A. Range

Probably the simplest and easiest way to determine a measure of dispersion is the range. The range is the difference between the highest value and the lowest value in the data set. There are two advantages of the range: (i) it is easy to compute and (ii) it is easy to understand. On the other hand, it also has two disadvantages: it can be distorted by a single extreme value (or outlier), and only two values are used in the calculation.

Example 1: The daily rates of a sample of eight employees at GMS Inc. are P550, P420, P560, P500, P700, P670, P860, P480. Find the range.

Solution:
Step 1: Determine the highest value and lowest value in the data set.
Highest Value (HV) = P860
Lowest Value (LV) = P420
Step 2: Solve for the range.
Range = Highest Value (HV) - Lowest Value (LV) = P860 - P420 = P440
The range in daily rate salary is P440.

B. Variance and Standard Deviation

One of the most widely used measures of dispersion is the standard deviation. The more spread apart the data, the higher the deviation. Standard deviation is calculated as the square root of the variance. In finance, standard deviation is applied to the annual rate of return of an investment to measure the investment's volatility. Standard deviation is also known as historical volatility and is used by investors as a gauge for the amount of expected volatility.

Variance is a measure of the dispersion of a set of data points around their mean value. It is a mathematical expectation of the average squared deviations from the mean. Volatility is a measure of risk, so this statistic can help determine the risk an investor might take on when purchasing a specific security.

Sample Variance and Standard Deviation for Ungrouped Data

Sample variance:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Sample standard deviation:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where:
$x$ = the value of any particular observation or measurement
$\bar{x}$ = sample mean
$n$ = number of observations in the sample

Example 2: The daily rates of a sample of eight employees at GMS Inc. are P550, P420, P560, P500, P700, P670, P860, P480. Find the variance and standard deviation.

Solution:
Step 1: Compute the mean of the data set.

$$\bar{x} = \frac{\sum x}{n} = \frac{550 + 420 + 560 + 500 + 700 + 670 + 860 + 480}{8} = \frac{4,740}{8} = 592.50$$

Step 2: Subtract the mean from each value in the data set.

$x$ | $x - \bar{x}$
550 | -42.5
420 | -172.5
560 | -32.5
500 | -92.5
700 | 107.5
670 | 77.5
860 | 267.5
480 | -112.5
$\sum x = 4,740$ | $\sum (x - \bar{x}) = 0$

Step 3: Square each deviation $x - \bar{x}$, then get the sum.
$x$ | $x - \bar{x}$ | $(x - \bar{x})^2$
550 | -42.5 | 1,806.25
420 | -172.5 | 29,756.25
560 | -32.5 | 1,056.25
500 | -92.5 | 8,556.25
700 | 107.5 | 11,556.25
670 | 77.5 | 6,006.25
860 | 267.5 | 71,556.25
480 | -112.5 | 12,656.25
$\sum x = 4,740$ | $\sum (x - \bar{x}) = 0$ | $\sum (x - \bar{x})^2 = 142,950$

Step 4: Solve for the variance and the standard deviation.

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{142,950}{8 - 1} = 20,421.43$$

We can also obtain the standard deviation:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{142,950}{7}} = \sqrt{20,421.43} = 142.90$$

Hence, the variance is 20,421.43 and the standard deviation is P142.90.

Alternative Solution: An alternative solution can be done using the other formulas.

$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} \qquad s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$$

Step 1: Get the sum of the data set.
Step 2: Square the values in the data set and get the sum.

$x$ | $x^2$
550 | 302,500
420 | 176,400
560 | 313,600
500 | 250,000
700 | 490,000
670 | 448,900
860 | 739,600
480 | 230,400
$\sum x = 4,740$ | $\sum x^2 = 2,951,400$

Step 3: Solve for the values of the variance and standard deviation.

$$s^2 = \frac{2,951,400 - \frac{(4,740)^2}{8}}{8 - 1} = \frac{2,951,400 - 2,808,450}{7} = \frac{142,950}{7} = 20,421.43$$

$$s = \sqrt{\frac{2,951,400 - \frac{(4,740)^2}{8}}{8 - 1}} = \sqrt{20,421.43} = 142.90$$

Thus, the variance is 20,421.43 and the standard deviation is P142.90.

Population Variance and Standard Deviation for Ungrouped Data

Population variance:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

Population standard deviation:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

where:
$x$ = the value of any particular observation or measurement
$\mu$ = population mean
$N$ = number of values in the population

Example 3: The monthly incomes of the five research directors of Recoletos schools are P55,000, P59,500, P62,500, P57,000, and P61,000. Find the population variance and population standard deviation.

Solution:
Step 1: Compute the mean of the data set.

$$\mu = \frac{\sum x}{N} = \frac{55,000 + 59,500 + 62,500 + 57,000 + 61,000}{5} = \frac{295,000}{5} = 59,000$$

Step 2: Subtract the population mean from each value in the data set.
Step 3: Square each deviation $x - \mu$, then get the sum.

$x$ | $x - \mu$ | $(x - \mu)^2$
55,000 | -4,000 | 16,000,000
59,500 | 500 | 250,000
62,500 | 3,500 | 12,250,000
57,000 | -2,000 | 4,000,000
61,000 | 2,000 | 4,000,000
$\sum x = 295,000$ | $\sum (x - \mu) = 0$ | $\sum (x - \mu)^2 = 36,500,000$

Step 4: Solve for the population variance and population standard deviation.

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{36,500,000}{5} = 7,300,000$$

$$\sigma = \sqrt{7,300,000} = 2,701.85$$

Hence, the population variance is 7,300,000 and the population standard deviation is P2,701.85.

Supplementary Exercise 4.3

1. A time-study analyst observed a packaging operation and collected the following times (in seconds) required for the operation to fill packages of a fixed volume box: 11, 18, 16, 14, 12, and 17. Find the range, variance, and standard deviation.
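The range, sample variance, and population variance computations in this unit can be reproduced with a minimal Python sketch. The function names below are hypothetical helpers introduced for illustration, not part of the text. The data are the GMS Inc. daily rates and the Recoletos monthly incomes from the worked examples. For the incomes, the squared deviations sum to 36,500,000, so dividing by N = 5 gives a population variance of 7,300,000, whose square root is the 2,701.85 reported for the standard deviation.

```python
import math


def sample_variance(data):
    """Definitional formula: sum of squared deviations divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_variance_shortcut(data):
    """Alternative formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)


def population_variance(data):
    """Sum of squared deviations from the population mean, divided by N."""
    size = len(data)
    mean = sum(data) / size
    return sum((x - mean) ** 2 for x in data) / size


# Data taken from the worked examples in this unit.
daily_rates = [550, 420, 560, 500, 700, 670, 860, 480]
incomes = [55000, 59500, 62500, 57000, 61000]

data_range = max(daily_rates) - min(daily_rates)
print(data_range)  # 440

s2 = sample_variance(daily_rates)
print(round(s2, 2), round(math.sqrt(s2), 2))  # 20421.43 142.9

# Both sample formulas should agree up to floating-point rounding.
print(math.isclose(s2, sample_variance_shortcut(daily_rates)))  # True

sigma2 = population_variance(incomes)
print(sigma2, round(math.sqrt(sigma2), 2))  # 7300000.0 2701.85
```

Python's standard `statistics` module provides `variance`, `stdev`, `pvariance`, and `pstdev`, which can serve as an independent cross-check of hand computations like these.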
