MODULE I: THE STUDY OF STATISTICS

Lesson 1  Basic Concepts on Statistics
Lesson 2  Determining Sample Size
Lesson 3  Tools in Gathering Data
Lesson 4  Criteria for Data Gathering
Lesson 5  Organization and Presentation of Data

INTRODUCTION

This module is the introductory part of Advanced Statistics. It covers the basic concepts of statistics, its functions and types, and the determination of sample size. It also covers the organization and presentation of data.

OBJECTIVES

After studying the module, you should be able to:
1. Determine the functions of statistics and cite concrete examples of its uses in different fields;
2. Differentiate the branches of statistics and the types of tests;
3. Differentiate and provide examples of the scales of measurement, data, and sources of data;
4. Determine the most appropriate way of selecting a sample and collecting data in a particular study;
5. Identify the advantages and disadvantages of each form of presenting data;
6. Recognize the uses of the different forms of presenting data;
7. Organize collected data and present them in an appropriate form; and
8. Construct graphs and charts.

SEME 112 - Advanced Statistics Module I

DIRECTIONS/MODULE ORGANIZER

There are 5 lessons in the module. Read each lesson carefully, then answer the exercises/activities to find out how much you have benefited from it. Work on these exercises carefully and submit your output to your teacher. In case you encounter difficulty, discuss it with your teacher during the face-to-face meeting. Good luck and enjoy reading!

Lesson 1
BASIC CONCEPTS ON STATISTICS

Statistics is a branch of mathematics that deals with the processes of gathering, describing, organizing, analyzing, and interpreting numerical or statistical data, as well as with drawing valid conclusions and making reasonable decisions on the basis of such analysis.

FUNCTIONS OF STATISTICS
1. To describe a group in terms of what is average or typical
   Ex. What is the average salary of DepEd teachers?
2. To describe a group in terms of its dispersion or variability
   Ex. Are the IQs of college students in DMMMSU varied?
3. To determine the existence of a relationship between or among two or more variables
   Ex. Is the salary of employees correlated with their manifestation of work ethics?
4. To compare two or more groups' scores on a variable
   Ex. Is there a significant difference in the level of financial management skills of teachers when grouped according to highest educational attainment?
5. To determine the probability of occurrence of an event or observation
   Ex. What is the probability that a person who is in close contact with a Covid-19 positive patient will be infected with the virus?
6. To estimate the value of a population parameter on the basis of an observed statistic
   Ex. Based on a sample of 20 packs, can it be concluded that a company's claim that its chocolate products weigh 1.5 kg is true?

Examples of the Functions of Statistics in Various Fields

Field | Concrete Use
1. Medicine | Trend of Covid-19 positive cases in the different regions
2. Sports | Number of wins and losses
3. Education | Trend in enrolment, number of graduates
4. Land Transportation Office | Registration of cars
5. Department of Trade and Industry | Number of micro enterprises in the province
6. Local Government Unit | Profile of recipients of the Social Amelioration Program
7. Psychology | Attitudinal patterns, causes and effects of misbehavior
8. Business and Economics | Sales, price indices, revenues, costs, inventories
9. Research | To test claims or inferences about a group of people or events

Branches/Fields of Statistics
1. Descriptive Statistics. This type of statistics is used to describe a group of individuals or the data that have been collected. In short, it is devoted to the summarization and description of data sets.
Statistical tools used: frequencies, percentage distribution, measures of central tendency, graphs, skewness and kurtosis, measures of variability, degree of relationship of group characteristics
Statistics - numerical indices calculated from a sample drawn from a population.
Parameters - numerical indices calculated from the entire population.
2. Inferential Statistics. This type of statistics is used when one makes a decision, estimate, prediction, or generalization about a population based on a sample. In inferential statistics, testing for significant differences and for independence between two or more variables is given emphasis. A hypothesis about the population is made and is rejected or accepted depending on the result of a test based on the available samples.
Some tools are: Normal Distribution (area under the curve), Sampling Distribution (sample size, standard scores), Probability Distribution, Hypothesis Testing
One group: chi-square; Z
Two groups: t; Z; chi-square; Mann-Whitney test; McNemar test of change; Wilcoxon test; sign test; median test
Three or more groups: Analysis of Variance; Kruskal-Wallis test; Cochran's test; chi-square

KINDS OF TESTS IN STATISTICS
a. Parametric test - a test of significance appropriate when the data represent an interval or ratio scale of measurement
   - it is stronger; the sample size is large (n > 30)
   - the distribution is normal
   - sampling is done at random
b. Non-parametric test - a test of significance appropriate when the data represent an ordinal or nominal scale
   - the sample size is small
   - the distribution is free
   - the samples are not randomized (purposive)

Constants and Variables
Constants - fundamental quantities that do not change in value. Ex. Fixed costs and acceleration due to gravity
Variables - quantities that may take any one of a specified set of values.
Qualitative (categorical) variables - non-measurable characteristics that cannot assume a numerical value but can be classified into two or more categories. Ex. Sex (male or female), opinion on an issue (for, against, undecided), smoking habits (always, often, seldom, very seldom, never). Data obtained about a qualitative variable are called qualitative data.
Quantitative (numerical) variables - quantities that can be counted, measured with the use of some measuring device, or calculated using a mathematical formula. Data involving quantitative variables are called quantitative data.
Discrete variables - actual values obtained by counting. Ex. Number of students, number of vehicular accidents
Continuous variables - values obtained by measurement, usually with units, such as height, weight, and time in minutes
   - also obtained by evaluating values using a formula, such as profits, IQ, and final grades

Sources of Data
Data - refers to facts concerning things such as status in life of people, defectiveness
1. Primary - from an eye or ear witness of the past; first-hand information
2. Secondary - information furnished by a person who was not a direct observer of or participant in the event
3. Documentary data - data obtained from records of offices, hospitals, etc.

1.7 Scales of Measurement
1. Nominal - involves naming or labeling, that is, placing cases into categories and counting their frequency of occurrence
   - distinguishes responses into attributes or categories
   Ex. religion, gender (real dichotomy), nationality, aggression (either active or passive; artificial dichotomy)
2. Ordinal - distinguishes among categories arranged in rank order, grouped according to rank or ranges
   Examples of ordinal scale: military rank; comparing and rank-ordering of socio-economic status (high, middle, low); state of happiness (very happy, not so happy, unhappy, very unhappy); rank in an oratorical contest (it cannot be concluded that the 1st placer is twice as good as the 2nd placer).
3. Interval - expressed in terms of numbers, and the differences between successive numbers are consistently the same
   - not only tells about the ordering of categories but also indicates the exact distance between them (ex. score in an exam, IQ, performance)
   - has an arbitrary zero: a zero score does not mean that the examinee has no knowledge of the subject at all
4. Ratio - like interval measurements, these are expressed in numbers, and the differences between any two successive numbers are consistent
   - it has a true zero, meaning measurement starts with zero
   Ex. number of children, height, speed, capacity, years of experience

For a statistical technique to be more manageable, an interval/ratio variable may be converted to an ordinal variable, for example length of service:

Length of Service | Rank
40 years and above | 1
30-39 years | 2
20-29 years | 3
10-19 years | 4
Below 10 years | 5

THINK!
Answer the following exercises.
Exercise 1. Name other fields or agencies of the government and identify specific uses of statistics in those fields or agencies.
Exercise 2. Categorize each of the following according to the level of measurement.
1. sex
2. religious affiliation
3. number of immediate family members
4. highest level of educational attainment
5. monthly income
6. social class you belong to
7. the region where you live
8. math grade
9. first place, second place, third place in a lantern contest
10. rating of a teacher in the licensure exams

Lesson 2
DETERMINING SAMPLE SIZE

Population - consists of all elements considered in a study.
It is a universal set. Ex. Students in DMMMSU
Finite - can be counted
Infinite - cannot be counted
Sample - a representative group taken from a population

Why take a sample instead of the entire population?
1. It is very expensive to study the entire population.
2. It is time consuming.
3. Sampling enables the researcher to make inferences or generalizations.

Let N be the population size, and let the margin of error e denote the allowed probability of committing an error in selecting a small representative of the population. The sample size n can be obtained by using Slovin's formula

$$n = \frac{N}{1 + Ne^2}$$

where:
n = sample size
N = population size
e = desired margin of error

The margin of error, e, could range between 1% (.01) and 10% (.10) depending on the desire of the researcher. However, the researcher should be aware of the Law of Large Numbers, which states: "The larger the size of the sample, the more certain we can be that the sample mean is a good estimate of the population mean. The larger the size of the sample, the closer its characteristics would be to the characteristics of the entire population." (It can be noted that the higher the margin of error, the smaller the computed sample size.) In social research, 5% or .05 is usually used, while in medical research studies, 1% or .01 is used.

Lynch Formula

$$n = \frac{N Z^2\, p(1-p)}{N d^2 + Z^2\, p(1-p)}$$

where:
n = sample size
N = population size
Z = the standard value (2.58) at the 1% level of probability with 0.99 reliability (1.96 at the 5% level)
d = margin of error (.05)
p = largest possible proportion (0.50)

Example: Solve for the sample size for N = 3590 using Slovin's formula and the Lynch formula. Compare the results.

Using Slovin's formula,

$$n = \frac{3590}{1 + 3590(0.05)^2} = \frac{3590}{9.975} \approx 360$$

Using the Lynch formula,

$$n = \frac{3590(1.96)^2(0.5)(1-0.5)}{3590(0.05)^2 + (1.96)^2(0.5)(1-0.5)} = \frac{3447.836}{9.9354} \approx 347$$

Using stratified random sampling, get the sample size for each group of respondents using Slovin's and the Lynch formula.

a. Using Slovin's formula, the multiplier is 360/3590, which is about 10%.

DMMMSU Community Group | Population Size per Group | Sample Size per Group
Administrators | 40 | 4
Teachers | 385 | 39
Staff/Personnel | 65 | 7
Students | 3100 | 310
Total | 3590 | 360

b. Using the Lynch formula, the multiplier is 347/3590, which is about 9.7%.
1. 40 x 347/3590 = 4
2. 385 x 347/3590 = 37

DMMMSU Community Group | Population Size per Group | Sample Size per Group
Administrators | 40 | 4
Teachers | 385 | 37
Staff/Personnel | 65 | 6
Students | 3100 | 300
Total | 3590 | 347

SAMPLING TECHNIQUE

Questions such as "Which TV network is the most popular among the people in town?" or "Who will probably be the next president of the country?" require gathering information from a number of respondents in a population. Complete enumeration, or census taking, is a vital tool if the information gathered will be used for administrative purposes and if it is of local or national concern. Sample surveys are preferred because of material constraints such as money, time, and effort.

Sampling technique - the selection of a part of the population to represent the whole population
1. Probability sampling (also known as random sampling) - every member of the population has an equal chance of being selected for the sample; also called fair sampling
a. Lottery/fishbowl technique - the name of each member of the population is written on a piece of paper, then n out of the N pieces of paper are drawn as the desired sample.
b. Table of random numbers - a computer-generated number representation. Point to an entry in the table, then proceed in any direction (vertically, horizontally, or diagonally) until n distinct numbers are obtained to represent the numerically coded elements in the population.
c. Systematic sampling - this method takes every kth element in the population (ex. arranged alphabetically or by age, experience, or position). By systematic sampling, every kth employee from the listed order will be included in the sample.
If N is known, k can be computed as k = N/n, where N is the population size and n is the sample size.
d. Stratified sampling - the group is divided based on homogeneity, and samples are selected from each stratum. When the population can be partitioned into several strata or subgroups, it may be wiser to employ the stratified technique to ensure that each group is represented in the sample.
   1. Simple stratified random sampling - the same number of respondents is taken from each stratum. Suppose a population of students taking History, of size N = 800, can be grouped according to year level. Then 50 students are taken randomly from each of the four groups, which comprises a sample of 200 students.
   2. Stratified proportional random sampling - the sample is taken from the strata proportionally.
e. Multi-stage sampling - this technique uses several stages in getting the sample from the population. However, the selection is still done at random. Ex. A researcher needs one barangay in the Philippines. Using the lottery method, he can pick first from the different regions, then from the provinces in the region picked, then from the towns, then from the barangays.
2. Non-probability sampling (selective or non-random sampling) - not all members are given an equal chance of being selected; also called biased sampling
a. Purposive (judgement) sampling - representative samples are deliberately chosen based on judgement or criteria. (Ex. A study on the type of credit card plan availed of by customers. In determining the sample, the researcher may consider only the people who seem to have white-collar jobs based on attire.)
b. Quota sampling - the choice of the number of persons or elements to be included in a sample is done at the researcher's own convenience or preference.
c. Cluster sampling - sometimes referred to as area sampling because it is usually applied on a geographical basis.
d. Accidental/incidental sampling - the design is applied to samples which are taken because they are the most available (ex. an interviewer can simply choose to ask the people around him, or those in a coffee shop where he is taking a break).
e. Convenience sampling - this utilizes the easiest way of reaching the subjects (ex. opinions of TV viewers and listeners concerning a controversial issue: get responses and comments from those who call).

THINK!
Solve for the sample size using Slovin's and the Lynch formula by filling in the corresponding boxes.
1. Given the population data below, solve for the sample size and show the distribution to each of the sectors.

Sector | N | n
Faculty | 250 |
Administration | 60 |
Office Personnel | 180 |
Maintenance | 150 |
Students | 5000 |

2.
School | Teachers Population | Teachers Sample | Students Population | Students Sample | Total Population | Total Sample
UNP | 32 | | 320 | | |
ISPSC | 35 | | 114 | | |
NUPSC | 25 | | 118 | | |
DMMMSU NLUC | 62 | | 188 | | |
DMMMSU MLUC | 72 | | 289 | | |
DMMMSU SLUC | 72 | | 460 | | |
PSU Lingayen | 52 | | 306 | | |
PSU Bayambang | 100 | | 350 | | |
PSU Asingan | 23 | | 127 | | |
Total | 473 | | 2,272 | | |

3.
School | Teachers Population | Teachers Sample | Students Population | Students Sample | Total Population | Total Sample
UNP | 32 | | 320 | | |
ISPSC | 35 | | 114 | | |
NUPSC | 25 | | 118 | | |
DMMMSU NLUC | 62 | | 188 | | |
DMMMSU MLUC | 72 | | 289 | | |
DMMMSU SLUC | 72 | | 460 | | |
PSU Lingayen | 52 | | 306 | | |
PSU Bayambang | 100 | | 350 | | |
PSU Asingan | 23 | | 127 | | |
Total | | | | | |

Lesson 3
TOOLS IN GATHERING DATA: Advantages and Disadvantages

1. Direct or Interview - a purposeful face-to-face interaction between two persons: the interviewer, who asks questions to gather information, and the interviewee or respondent, who supplies the information asked for. The interview can be tape-recorded or written.
Advantages: Precise and consistent answers are obtained by rephrasing or recasting the questions, especially for illiterate respondents or children. Follow-up questions can be raised for clarification.
Disadvantages: It consumes money, time, and effort, and it is applicable only to small populations, except when conducting a census.

Steps in the interview
1. Planning step
   - selection of the universe and locale of the study
   - selection of the respondents by any valid sampling method
   - selection of the type of interview
   - preparation of the instrument (questions to be asked)
2. Selecting a place for the interview
3. Establishing rapport
4. Carrying out the interview
5. Recording the interview
6. Closing the interview

What to Avoid in Interviews
1. Avoid exerting undue pressure upon a respondent to make him participate in an interview.
2. Avoid disagreeing or arguing with or contradicting the respondent.
3. Avoid unduly pressing the respondent to make a reply.
4. Avoid using language well above the ability of the respondent to understand.
5. Avoid talking about irrelevant matters.
6. Avoid placing the interviewee in embarrassing situations.
7. Avoid appearing too high above the respondent in education, knowledge, and social status.
8. Avoid interviewing the respondent at an unreasonable hour.

2. Indirect or Questionnaire - an alternative to the interview method
   - a paper-and-pencil data gathering method. Written responses are obtained by distributing questionnaires (a list of questions intended to elicit answers to a given problem, which must be in a logical order and not too personal) to the respondents by mail, online, or by hand.
Advantages: It consumes less time, money, and effort.
Disadvantages: Many responses may not be consistent because of poor construction of the questionnaire. The meaning of the questions may vary from one person to another. Inconsistent responses can no longer be modified, which reduces the number of valid respondents.
Guidelines:
a. Make all directions clear.
b. Use correct grammar.
c. Make all questions unequivocal.
d. Avoid asking biased questions.
e. Objectify the responses.
f. Relate all questions to the topic under study.
g. Create categories or classes for approximate answers.
h. Group the questions in logical sequence.
i. Create a sufficient number of response categories.
j. Word carefully or avoid questions that deal with confidential or embarrassing information.
k. Explain and illustrate difficult questions.
l. State all questions affirmatively.
m. Make as many questions as would supply adequate information for the study.
n. Add a catch-all word or phrase to the options of multiple-response questions.
o. Place all spaces for replies at the left side.
p. Make the respondents anonymous.

3. Observation - a scientific method of investigation that makes possible use of all the senses to measure or obtain outcomes/responses from the object of study. Data which cannot be gathered using the other tools can be gathered using observation. Ex. Teaching performance of Mathematics teachers.
   - a means of gathering information for research; it may be defined as perceiving data through the senses: sight, hearing, taste, touch, and smell. It is widely used in studying behavior.
Advantages: The observation method is usually applied to respondents that cannot be asked or need not speak, especially when the behaviours of persons, the culture of an organization, or the performance outcomes of employees or students are to be considered.
Disadvantages: Subjectivity of the information sought cannot be avoided.

Making Observation More Valid and Reliable
1. Use observation where and when other data gathering devices cannot be used.
2. Use appropriate observation forms.
3. Record immediately.
4. Be as objective as possible.
5. Base evaluation on several observations.

4. Registration Method/Documents or Records - enforced by private organizations or government agencies for recording purposes. It is a process of listing down items of the same kind in some systematic manner for record purposes. Ex. In the Philippine Statistics Authority, data such as population, deaths, etc. can be gathered. Registered matter may be classified alphabetically, chronologically, quantitatively, qualitatively, or otherwise.
Advantages: Organized data are available from different institutions and agencies, which can serve as a ready reference for future study or for personal claims of people's records.
Disadvantages: Sometimes agencies have a poor management and information system. Sometimes the process or system of registration is not implemented well. It requires a rigid protocol to secure data from the different records.

5. Test - a tool used to obtain data about a specific trait or characteristic. It is a device or technique used to measure the performance, skill level, or knowledge of a learner on a specific subject.
   - a specific type of measuring instrument whose general characteristic is that it forces responses from a person, and the responses are considered to be indicative of the person's skill, knowledge, attitudes, etc.
Classification
A. According to Standardization
   1. Standard test - prepared by specialists; norms are established
   2. Non-standard test - prepared by teachers to measure the achievement of their students
B. According to Function
   1. Psychological test, such as intelligence, aptitude, and personality tests and vocational and professional interest inventories
Advantages: The data gathered are a measure of competence, not a perception. This is an objective method of obtaining data as long as the test used has undergone validity and reliability testing.
Disadvantages: Sometimes the data obtained are not valid because of a poorly constructed test. Respondents may hesitate to take the test.

6. Experiment - used when the objective is to determine the cause and effect of a certain phenomenon under some controlled conditions.
Advantages: There is objectivity of information since a scientific method of inquiry is used.
An equal number of respondents with relatively similar characteristics are examined to obtain the different effects of something applied to the experimental group.
Disadvantages: It is difficult to find respondents with almost similar characteristics. The whole method must be repeated if the desired outcome is not reached.

THINK!
Exercise
1. Identify 20 different kinds of data and determine which data gathering tool is most appropriate for each. Justify your answer.
2. Name 20 government agencies and determine what data can be obtained from each agency.

Lesson 4
CRITERIA FOR DATA GATHERING TOOLS

These criteria are applicable to questionnaires and tests. Before they are used to gather data, they should be subjected to validity and reliability tests.
1. Validity - the extent to which the procedure actually accomplishes what it seeks to accomplish, or measures what it intends to measure. Experts are supposed to evaluate the test or questionnaire.
   a. Face validity - the construction, arrangement of items, and overall presentation are good.
   b. Content validity - relevance of the test items; item analysis determines which items are too easy and which are too difficult.
2. Reliability - refers to the degree of consistency, accuracy, stability, repeatability, or precision. Methods: split-half method, test-retest method, parallel-form method, internal consistency method, Kuder-Richardson 20 and 21.
3. Sensitivity - able to detect changes
4. Specificity - gives only one answer
5. Positive predictive value - notes change and improvement
6. Appropriateness - respondents can meet the demands of the instrument
7. Objectivity - free from any influence of the examiner

Reliability
- means the extent to which a test is dependable, self-consistent, and stable. In other words, the test agrees with itself.
- refers to the consistency of the scores obtained: how consistent they are for each individual from one administration of an instrument to another and from one set of items to another

Tools for reliability
1. Test-retest method
The same measuring instrument is administered twice to the same group of subjects. The scores of the first and second administrations of the test are correlated using a correlation coefficient.
The disadvantages are:
1. When the time interval is short, memory effects may operate. The subjects may recall their previous responses, which tends to make the correlation of the test high.
2. When the interval is long, factors such as unlearning and forgetting, among others, may occur and may result in a low correlation of the test.
3. Regardless of the time interval separating the two administrations, varying environmental conditions such as noise, temperature, lighting, and other factors may affect the correlation of the test.

The Spearman rank correlation coefficient, or Spearman rho, may be used to correlate the scores in this method. The formula is

$$r_s = 1 - \frac{6\sum D^2}{N^3 - N}$$

where:
ΣD² = the sum of the squared differences between ranks
N = the total number of cases

Steps:
1. Rank the scores separately for the two administrations, giving the highest score a rank of 1.
2. Obtain the difference between the two sets of ranks.
3. Square each difference and add the squares.
4. Solve for the rank-order correlation coefficient.

For example, seven students in second year high school are used as a pilot sample to test the reliability of an achievement test in Biology. Determine the reliability coefficient given their scores in the two administrations of the test.

Illustrative Example:

Student | Test X | Rx | Test Y | Ry | D | D²
1 | 18 | 1 | 24 | 4 | 3 | 9
2 | 17 | 2 | 28 | 2 | 0 | 0
3 | 14 | 3 | 30 | 1 | 2 | 4
4 | 13 | 4 | 2? | 3 | 1 | 1
5 | 12 | 5 | 2? | 5 | 0 | 0
6 | 10 | 6 | 18 | 6 | 0 | 0
7 | 8 | 7 | 15 | 7 | 0 | 0
ΣD² = 14

Using the formula:

$$r_s = 1 - \frac{6(14)}{7(49 - 1)} = 1 - \frac{84}{336} = 0.75$$

r_s = 0.75, interpreted as high reliability.
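The steps above can be sketched in Python. This is only a minimal illustration and not part of the module: the function names `rank_descending` and `spearman_rho` are hypothetical, and the ranks are the ones listed in the Rx and Ry columns of the illustrative example. Note that the simple formula assumes there are no tied ranks.

```python
def rank_descending(scores):
    """Rank scores so that the highest score gets rank 1.

    Tied scores share the average of the ranks they occupy.
    """
    ordered = sorted(scores, reverse=True)
    ranks = []
    for s in scores:
        first = ordered.index(s) + 1            # first position held by this score
        count = ordered.count(s)                # how many scores are tied with it
        ranks.append(first + (count - 1) / 2)   # average rank for ties
    return ranks


def spearman_rho(rank_x, rank_y):
    """Spearman rho from two lists of ranks: 1 - 6*sum(D^2) / (N^3 - N)."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Ranks taken from the Rx and Ry columns of the example (N = 7).
rx = [1, 2, 3, 4, 5, 6, 7]
ry = [4, 2, 1, 3, 5, 6, 7]

rho = spearman_rho(rx, ry)
print(round(rho, 2))  # 0.75

# Step 1 of the procedure, applied to the Test X scores.
test_x_ranks = rank_descending([18, 17, 14, 13, 12, 10, 8])
print(test_x_ranks)
```

Here the sum of squared differences is 14, so the result agrees with the hand computation of 0.75.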
(In research, the reliability coefficient should be at least .70 for it to be acceptable.)

How to interpret the coefficient of reliability?

Computed Value | Interpretation
0.00-0.10 | negligible
0.11-0.25 | low
0.26-0.50 | moderate
0.51-0.75 | high
0.76-1.00 | very high

2. Split-half Method
In this method the test may be administered once, but the test items are divided into two halves. The common procedure is to divide the test into odd and even items. The two halves of the test must be similar but not identical in content, difficulty, means, and standard deviations. Each student obtains two scores in one test, one on the odd items and the other on the even items. The scores obtained in the two halves are correlated. The result is the reliability coefficient for a half test. Since this reliability holds only for half the test, the reliability coefficient for the whole test may be estimated by using the Spearman-Brown formula:

$$r_{wt} = \frac{2 r_{ht}}{1 + r_{ht}}$$

where:
r_wt = reliability of the whole test
r_ht = correlation coefficient between the odd and even scores, also called the reliability of half the test

Illustrative Example: Given the scores in the odd- and even-numbered items, determine if the test is reliable.

Student | Score (40) | Even (20) | Odd (20)
A | 40 | 20 | 20
B | 28 | 15 | 13
C | 35 | 19 | 16
D | 38 | 18 | 20
E | 22 | 10 | 12
F | 30 | 12 | 18
G | 35 | 16 | 19
H | 33 | 16 | 17
I | 31 | 12 | 19
J | 28 | 14 | 14

From this information it is possible to calculate the correlation using the Pearson product-moment correlation coefficient, a statistical measure of the degree of relationship between the two halves.

Pearson Product-Moment Correlation Coefficient

$$r_{ht} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$

Using the data above, assume that the X values are the scores in the even-numbered items and the Y values are the scores in the odd-numbered items.

Step 1. Complete the columns for XY, X², and Y².
Step 2. Get the summation of each column.

X | Y | XY | X² | Y²
20 | 20 | 400 | 400 | 400
15 | 13 | 195 | 225 | 169
19 | 16 | 304 | 361 | 256
18 | 20 | 360 | 324 | 400
10 | 12 | 120 | 100 | 144
12 | 18 | 216 | 144 | 324
16 | 19 | 304 | 256 | 361
16 | 17 | 272 | 256 | 289
12 | 19 | 228 | 144 | 361
14 | 14 | 196 | 196 | 196
ΣX = 152 | ΣY = 168 | ΣXY = 2595 | ΣX² = 2406 | ΣY² = 2900

Step 3. Using the formula with N = 10, compute the reliability of half the test:

$$r_{ht} = \frac{10(2595) - (152)(168)}{\sqrt{\left[10(2406) - (152)^2\right]\left[10(2900) - (168)^2\right]}} = \frac{414}{\sqrt{(956)(776)}} \approx 0.48$$

r_ht = 0.48 (this is the reliability of half the test)

Step 4. To get the reliability of the whole test, use the Spearman-Brown formula:

$$r_{wt} = \frac{2(0.48)}{1 + 0.48} \approx 0.65$$

r_wt = 0.65. (This is interpreted as high reliability; however, it may not be sufficient for research. Therefore, there is a need to improve the test items.)

Other statistical tools to compute the reliability of tests and questionnaires are KR 21, KR 20, Cronbach's alpha, etc. Technology can also be used to determine reliability. Use the reliability calculator created by Del Siegle (dsiegle@uconn.edu). For a rating scale, just input the ratings given; for a test, encode 1 if the answer to the item is correct and 0 if it is wrong.

3. Kuder-Richardson 21 Formula

$$r = \frac{n\sigma^2 - M(n - M)}{(n - 1)\sigma^2}$$

where:
r = reliability of the whole test
n = product of the number of items in the questionnaire and the highest scale
σ² = variance

$$\sigma^2 = \frac{\sum (x - M)^2}{N}$$

where:
x = total score of each respondent
M = mean score of the respondents

$$M = \frac{\sum x}{N}$$

where:
Σx is the sum of all the scores
N is the number of respondents

THINK!
Solve for the reliability using the appropriate tool.
1. Given the scores below in the even- and odd-numbered items, determine the reliability of the given test using the appropriate method.

Odd Items | Even Items
8 | 6
9 | 7
10 | 6
6 | 8
7 | 7
5 | 6
6 | 8
7 | 3

2. Given the scores of 13 students in the odd- and even-numbered items, determine the reliability of the test.
Odd: 50 50 48 4 45 44 44 43 42 42 41 40 40
Even: 36 34 44 50 32 28 42 36 28 40 50 38 35

3. Given the scores in the first administration and second administration of the same test, find the coefficient of reliability.
82 86 «675 «674 «(68 BOCs HC 99 48 8 «87 «76 «677 «6700 71 66 7H 99 «50

Lesson 5 (Optional)
ORGANIZATION AND PRESENTATION OF DATA

Data collected from primary and secondary sources are still considered raw data. They require manual tallying and classifying of responses. After tallying, an appropriate form of organization and presentation is used to arrive at a meaningful interpretation of the data.

1.1 Forms of Presentation of Data
A. Textual - This form of data presentation combines text and numerical facts in a statistical report. It can be in narrative or enumerative form.
B. Tabular - This presentation of data makes use of statistical tables. Tables are constructed so that relationships can be seen right away and comparisons can be made. Each class is assigned to a particular
Advantages of Tabular Presentation
1. It is brief and concise.
2. It gives the reader a good grasp of the meaning of the quantitative relationships indicated in the report.
3. The whole story is revealed without the necessity of mixing text with figures.
4. The presentation is systematic, with the use of columns and rows making comparison easier.
C. Graphical Presentation - This makes use of graphs. This form is the most effective means of organizing and presenting statistical data because the important relationships are brought out more clearly and creatively in virtually solid and colourful figures.

1.2 Different Kinds of Graphs/Charts
1. Line Graph - It shows the relationship between two sets of quantities. This is done by plotting the points of the X set of quantities along the horizontal axis against the Y set of quantities along the vertical axis in a rectangular coordinate plane. The plotted points are then connected by line segments, which form the line graph. It is used to predict growth trends over a longer period of time.
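As a small illustration of the tallying and tabular presentation described earlier in this lesson, the sketch below counts raw responses and prints a frequency and percentage table. It is a minimal example under stated assumptions: the function name `frequency_table` is hypothetical, and the ten responses are made-up data that reuse the "for, against, undecided" categories from Lesson 1.

```python
from collections import Counter


def frequency_table(responses):
    """Tally raw responses into (category, frequency, percentage) rows.

    Rows are ordered from the most frequent to the least frequent category.
    """
    counts = Counter(responses)          # tallying step
    total = sum(counts.values())
    rows = []
    for category, freq in counts.most_common():
        rows.append((category, freq, round(100 * freq / total, 1)))
    return rows


# Hypothetical raw data: opinions of ten respondents on an issue.
responses = ["for", "against", "for", "undecided", "for",
             "against", "for", "for", "undecided", "against"]

table = frequency_table(responses)

# Tabular presentation: one row per category.
print(f"{'Opinion':<10} {'f':>3} {'%':>7}")
for category, freq, pct in table:
    print(f"{category:<10} {freq:>3} {pct:>6.1f}%")
```

The same rows could then feed a graph, since each percentage is already expressed as a share of the total.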
[Figure: Sample line graph. Number of daily confirmed COVID-19 cases in India, Singapore, Indonesia, Philippines, Japan, Malaysia, Greater China, South Korea, Thailand, Vietnam, and Taiwan, with the peak of each series marked.]

2. Bar Graph - This consists of bars or rectangles of equal width, drawn either vertically or horizontally, segmented or non-segmented. Two or more sets of information can be compared by showing them in a multiple bar graph, with each set shaded in a different color to distinguish it from the others.

[Figure: Sample bar graph. Coronavirus outbreak in Southeast Asia, showing cumulative reported cases and daily increase in cases from February to April.]

3. Circle Graph or Pie Chart - It represents the relationships among the different components of a single total, shown as the sectors of a circle. The angles or sizes of the sectors are proportional to the percentage components of the data, which add up to 100%.

[Figure: Pie chart with sectors for NCR, Luzon*, Visayas, Mindanao, Repatriate, and No Province.]

Distribution of all Covid-19 cases in the Philippines. NCR accounts for 54.5% of all cases, while the rest of Luzon accounts for 12.8%. Visayas accounts for 16.8% of cases, while Mindanao accounts for 2.8% of all cases. There are 1,105 cases (4.9%) classified as repatriates, while 1,855 cases or 8.3% are currently uncategorized (i.e., neither the region of residence of the Covid-19 case nor whether the case is a repatriate is indicated).
Source: https://www.up.edu.ph/covid-19-forecasts-in-the-philippines-ncr-and-cebu-as-of-june-8-2020/

[Figure 5. Favourite movie genres in Mrs. Smyth's Film class. Pie chart with sectors for Comedy, Action, Romance, Drama, Horror, Foreign, and Science fiction.]

Source: https://slideplayer.com/slide/5781935/

4. Picture Graph or Pictogram - It is a visual presentation of statistical quantities by means of pictures or symbols related to the subject under study. The sizes and magnitudes of the pictures drawn should be clear enough to show differences.

[Figure 1. Pictograph. Number of students who like chocolate chip cookies best.]

5. Map Graph or Cartogram - It is used to present geographical data. This kind of graph is always accompanied by a legend, which explains the meaning of the lines, colors, or other symbols used and positioned on the map.

Source: https://ourworldindata.org/world-population-growth

Source: https://stories.thinkingmachin.es/philippine-languages/

6. Scatter Point Diagram - It is a graphical device that shows the relationship between two quantitative variables.

[Figure: Scatter plot showing a positive correlation. Horizontal axis: Calories Consumed. Vertical axis: Weight gained.]

Source: https://www.qimacros.com/scatter-plot-excel/scatter-plot-examples/

THINK!
Activities:
Read and interpret the contents of the sample graphs above.
Research other sample graphs and give a brief interpretation of each.
Construct the most appropriate graph for each data set. Describe and interpret the data using the graphs.

1. Monthly budget of a family with an income of P23,000 per month

Category         Amount
Food             P8,000
Shelter          P5,000
Education        P4,000
Clothing         P1,000
Medical Care     P2,000
Savings          P1,000
Miscellaneous    P2,000

2. Periodic grades of a fourth year high school student in three subjects

Subject        Grading Period
               First    Second    Third    Fourth
English        80       4         85       90
Mathematics    82       85        86       89
Science        78       80        2        82

MODULE SUMMARY

In Module I, you have learned about the study of statistics. Its five lessons are basic concepts on statistics, determining sample size, tools in gathering data, criteria for data gathering, and organization and presentation of data.

Lesson 1 consists of the meaning, functions, and branches/fields of Statistics, as well as the kinds of tests in statistics. Statistics describes a group in terms of what is average and in terms of dispersion. It also determines the existence of relationships and differences. It is functional in all fields and in all agencies of the government. The two types of Statistics are descriptive and inferential, and the kinds of tests are parametric and non-parametric. The sources of data are primary (a direct witness to the event), secondary (information furnished by a person who was not a direct observer of or participant in the event), and documentary (data obtained from the records of offices, hospitals, etc.). The scales of measurement are nominal, ordinal, interval, and ratio.

Lesson 2 deals with determining sample size, which includes the meaning of population and sample and the computation of sample size using the Slovin and Lynch formulas. It also deals with sampling techniques.
Lesson 3 covers the tools for gathering data and their advantages and disadvantages. These tools are the interview, questionnaire, documents or records, observation, test, and experiment.

Lesson 4 considers the criteria for data gathering, namely validity and reliability. It also covers the computation of reliability coefficients using various tools. The tools for reliability are the test-retest method, the split-half method, Kuder-Richardson, Cronbach's alpha, and others. The reliability calculator can also be used to compute reliability.
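
The reliability computations summarized for Lesson 4 can be sketched in Python as follows. This is only a minimal illustration, not part of the module: the function names (`pearson_r`, `spearman_brown`, `kr21`) are hypothetical, and the code assumes the raw-score correlation formula for the half-test reliability, the step-up formula that takes a half-test reliability of .48 to a whole-test reliability of .65, and the Kuder-Richardson 21 formula with the variance divided by the number of respondents N.

```python
from math import sqrt


def pearson_r(x, y):
    """Correlation between two score lists (e.g. odd-item and even-item scores)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def spearman_brown(r_half):
    """Step a half-test reliability up to the reliability of the whole test."""
    return 2 * r_half / (1 + r_half)


def kr21(total_scores, n):
    """KR-21 reliability.

    total_scores: total score of each respondent.
    n: number of items times the highest scale point (as defined in the module).
    """
    n_resp = len(total_scores)
    mean = sum(total_scores) / n_resp
    # Variance with N (not N - 1) in the denominator, following the module's formula.
    variance = sum((s - mean) ** 2 for s in total_scores) / n_resp
    return (n * variance - mean * (n - mean)) / ((n - 1) * variance)


# Check against the worked example: a half-test reliability of .48
# steps up to a whole-test reliability of .65.
print(round(spearman_brown(0.48), 2))  # 0.65
```

For a split-half estimate, one would pass the odd-item and even-item scores to `pearson_r` and then pass the result to `spearman_brown`. For a test-retest estimate, `pearson_r` applied to the first and second administrations gives the coefficient directly.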
