STATISTICS
On this day, Thursday, 19 August 2021, the teaching materials for the Statistics course
of the English Language Education Study Program were verified by the Head of the
Department of English Education and Literature.
General skills:
- Able to apply logical, critical, systematic, and innovative thinking when
developing or applying knowledge and technology, with attention to the humanities
values appropriate to their field of expertise.
- Able to demonstrate independent, high-quality, and measurable work.
- Able to make sound decisions when solving problems in their field of expertise,
based on the analysis of information and data.
Knowledge
- Able to do research related to English language learning & teaching
- Master the theoretical concepts of other fields of study in general and
formulate procedural solutions to problems related to those concepts.
Course Learning Outcomes (CLO)
CLO 1
Able to demonstrate responsibility by integrating norms, values, and academic
etiquette.
CLO 2
Able to make correct decisions in the context of problem solving in their expertise
based on the analysis of information and data.
CLO 3
Able to apply ICT in the learning process
CLO 4
Able to do research related to English language learning and teaching
CLO 5
Master the theoretical concepts of English language learning and teaching as well as
those of other fields of study
TABLE OF CONTENTS
For instance:
As a more concrete example, suppose that an individual, Dave, needs to lose weight for
his upcoming high school reunion. Dave decided that the best way for him to do this was
through dieting or adopting an exercise routine. A health counselor whom Dave has hired to
assist him gave him four options:
Dave, who understands that using data in making decisions can provide additional insight,
decides to analyze some historical trends that his counselor gave him on individuals (similar
in build to Dave) who have used the different diets. The counselor provided the weights for
different individuals who went on these diets over an eight-week period.
Weight, by week
Diet               1    2    3    4    5    6    7    8
South Beach      310  312  308  304  300  295  290  289
Reduced Calorie  310  307  306  303  301  299  297  295
Boot Camp        310  308  305  303  297  294  290  287
Average Weight Loss on Various Diets across Eight Weeks (initial weight listed in week 1)
What if Dave’s goal is to lose the most weight? The Atkins diet would seem like a reasonable
choice, as he would have lost the most weight at the end of the eight-week period. However,
the high-protein nature of the diet might not appeal to Dave. Also, Atkins seems to have
some ebb and flow in the weight loss (see week 7 to week 8). What if Dave wanted to lose
the weight in a steadier fashion? Then perhaps boot camp might be the diet of choice, since
the data show an even, steady decline in weight. This example demonstrates that Dave
can make an educated decision that is congruent with his own personal weight loss goals.
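The comparison Dave makes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the three rows that appear in the table above (the Atkins figures are not shown there, so they are left out), and the helper names `weekly_changes` and `summarize` are hypothetical. It reports the total weight lost and the spread of the week-to-week changes, where a smaller spread suggests a steadier loss.

```python
from statistics import pstdev

# Weekly weights copied from the table above (weeks 1 to 8).
# The Atkins row is not shown in the table, so it is omitted here.
weights = {
    "South Beach":     [310, 312, 308, 304, 300, 295, 290, 289],
    "Reduced Calorie": [310, 307, 306, 303, 301, 299, 297, 295],
    "Boot Camp":       [310, 308, 305, 303, 297, 294, 290, 287],
}


def weekly_changes(series):
    """Return week-to-week changes (a negative value means weight was lost)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def summarize(series):
    """Total loss over the period and the spread of the weekly changes."""
    changes = weekly_changes(series)
    return {
        "total_loss": series[0] - series[-1],
        "weekly_changes": changes,
        "spread": pstdev(changes),  # smaller spread suggests a steadier loss
    }


summary = {diet: summarize(series) for diet, series in weights.items()}

for diet, stats in summary.items():
    print(f"{diet:16s} total loss = {stats['total_loss']:2d}, "
          f"spread = {stats['spread']:.2f}, changes = {stats['weekly_changes']}")
```

Which summary matters depends on the goal: total loss speaks to "lose the most weight", while the spread of weekly changes speaks to "lose weight steadily".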
There are two types of statistics that are often referred to when making a statistical decision
or working on a statistical problem.
Definitions
Descriptive Statistics: Descriptive statistics utilize numerical and graphical methods to look
for patterns in a data set, to summarize the information revealed in a data set, and to present
the information in a convenient form that individuals can use to make decisions. The main
goal of descriptive statistics is to describe a data set. Thus, the class of descriptive statistics
includes both numerical measures (e.g. the mean or the median) and graphical displays of
data (e.g. pie charts or bar graphs).
Inferential Statistics: Inferential statistics utilize sample data to make estimates, decisions,
predictions, or other generalizations about a larger set of data. Some examples of inferential
statistics are a z-statistic or a t-statistic, both of which we will encounter in later
sections of this book.
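To make the distinction concrete, here is a minimal Python sketch. It reuses the Reduced Calorie weights from the diet table as sample data; treating the weekly losses as a simple random sample and testing them against a mean of zero are assumptions made purely for illustration, not a procedure taken from the text. The first part describes the data (mean and median); the second computes a one-sample t-statistic, the kind of quantity an inferential analysis would then compare with a t distribution.

```python
import math
from statistics import mean, median, stdev

# Sample data: Reduced Calorie weights for weeks 1 to 8 (from the diet table).
reduced_calorie = [310, 307, 306, 303, 301, 299, 297, 295]

# Descriptive statistics: summarize the data set itself.
avg_weight = mean(reduced_calorie)
mid_weight = median(reduced_calorie)

# Inferential statistics (illustrative assumption): treat the weekly losses as a
# sample and compute a one-sample t-statistic against a hypothesized mean of 0.
weekly_loss = [a - b for a, b in zip(reduced_calorie, reduced_calorie[1:])]
n = len(weekly_loss)
standard_error = stdev(weekly_loss) / math.sqrt(n)
t_stat = (mean(weekly_loss) - 0) / standard_error

print(f"mean = {avg_weight}, median = {mid_weight}")
print(f"t-statistic for mean weekly loss = {t_stat:.2f} (df = {n - 1})")
```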
What information can be gleaned from the above caloric intake log? The simplest thing
to do is to look at the total calories consumed for each day. These totals are shown below.
The third point is an interesting takeaway from the above analysis. When Dave skips
breakfast, his lunches are significantly higher in caloric content than on the days when he
eats breakfast. What Dave can take away from this is quite clear – he should start each
day by eating a healthy breakfast, as this will most likely lead to eating fewer total calories
during the day.
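The kind of analysis described here can be sketched as follows. The intake log itself is not reproduced in this text, so every figure below is HYPOTHETICAL and serves only to show the two calculations: daily totals, and a comparison of lunch calories on days with and without breakfast.

```python
from statistics import mean

# HYPOTHETICAL intake log (calories per meal); 0 means the meal was skipped.
# These numbers are invented for illustration and are not Dave's actual log.
log = {
    "Mon": {"breakfast": 400, "lunch": 600,  "dinner": 800},
    "Tue": {"breakfast": 0,   "lunch": 1100, "dinner": 900},
    "Wed": {"breakfast": 350, "lunch": 650,  "dinner": 750},
    "Thu": {"breakfast": 0,   "lunch": 1200, "dinner": 850},
}

# Step 1: total calories consumed on each day.
daily_totals = {day: sum(meals.values()) for day, meals in log.items()}

# Step 2: compare lunches on breakfast days with lunches on no-breakfast days.
lunch_with_breakfast = mean(
    meals["lunch"] for meals in log.values() if meals["breakfast"] > 0
)
lunch_without_breakfast = mean(
    meals["lunch"] for meals in log.values() if meals["breakfast"] == 0
)

print(daily_totals)
print(f"average lunch with breakfast:    {lunch_with_breakfast}")
print(f"average lunch without breakfast: {lunch_without_breakfast}")
```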
Key Definitions
Dependent Variable: The variable that represents the effect being tested.
Independent Variable: The variable that represents the inputs to the dependent variable,
or the variable that can be manipulated to see whether it is the cause.
In this example, the dependent variable is whether an individual was able to complete a
marathon. The independent variable is which brand of shoes they were wearing: the CEO’s
brand or a different brand. These variables would be operationalized by adopting a measure
for the dependent variable (did the person complete the marathon) and adopting a measure
for the kind of sneaker they were wearing (whether they were wearing the CEO’s brand when
they ran the race).
After the variables are operationalized and the data collected, the researcher selects a statistical
test to evaluate the data. In this example, the CEO might compare the proportion of people who finished
the marathon wearing the CEO’s shoes against the proportion who finished the marathon
wearing a different brand. If the CEO’s hypothesis is correct (that wearing his shoes helps
to complete marathons) then one would expect that the completion rate would be higher
for those wearing the CEO’s brand and statistical testing would support this.
Since it is not realistic to collect data on every runner in the race, a more efficient option
is to take a sample of runners and collect your dependent and independent measurements
on them. Then conduct your inferential test on the sample and (assuming the test has been
conducted properly) generalize your conclusions to the entire population of marathon runners.
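One way to compare two completion rates is a two-proportion z-test. The text does not name a specific test, so this choice is an assumption, and the counts below are HYPOTHETICAL. The sketch uses only the standard library; `two_proportion_z` is an illustrative helper, not a function from any statistics package.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test with a one-sided p-value for H1: p_a > p_b."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# HYPOTHETICAL sample: 45 of 50 runners in the CEO's brand finished,
# compared with 70 of 100 runners in other brands.
z, p_value = two_proportion_z(45, 50, 70, 100)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

A small p-value would support the CEO's hypothesis for this sample. Generalizing to all marathon runners still depends on how the sample was drawn.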
While descriptive and inferential problems have several commonalities, there are some
differences worth highlighting. The key steps for each type of problem are summarized below:
Quantitative data are measurements that can be recorded on a naturally occurring scale.
Thus, things like the time it takes to run a mile or the amount in dollars that a salesman
has earned this year are both examples of quantitative variables.
Introduction
Educational researchers in every discipline need to be cognisant of alternative
research traditions to make decisions about which method to use when embarking
on a research study. There are two major approaches to research that can be used
in the study of the social and the individual world. These are quantitative and
qualitative research. Although there are books on research methods that discuss the
differences between alternative approaches, it is rare to find an article that examines
the design issues at the intersection of the quantitative and qualitative divide
based on eminent research literature. The purpose of this article is to explain the
major differences between the two research paradigms by comparing them in
terms of their epistemological, theoretical, and methodological underpinnings.
Since quantitative research has well-established strategies and methods but qualitative
research is still growing and becoming more differentiated in methodological
approaches, greater consideration will be given to the latter.
Quantitative research

Assumptions
• Reality is single, tangible, and fragmentable. Social facts have an objective reality.
• Knower and known are independent, a dualism.
• Primacy of method
• Variables can be identified and relationships measured.
• Inquiry is objective, value-free.
Purposes
• Generalisability (time and context free generalisations through nomothetic or generalised statements)
• Prediction
• Causal explanations
Approach
• Begins with hypotheses and theories
• Manipulation and control
• Uses formal, structured instruments
• Experimentation and intervention
• Deductive
• Component analysis
• Seeks consensus, the norm
• Reduces data to numerical indices

Qualitative research

Assumptions
• Realities are multiple, constructed, and holistic. Reality is socially constructed.
• Knower and known are interactive, inseparable.
• Primacy of subject matter
• Variables are complex, interwoven, and difficult to measure.
• Inquiry is subjective, value-bound.
Purposes
• Contextualisation (only time and context bound working hypotheses through idiographic statements)
• Interpretation
• Understanding actors’ perspectives
Approach
• Ends with hypotheses or grounded theory
• Emergence and portrayal
• Researcher as the instrument
• Naturalistic or nonintervention
• Inductive
• Searches for patterns
• Seeks pluralism, complexity
• Makes minor use of numerical indices
© 2013 John Wiley & Sons Ltd
Source: Adapted from Glesne & Peshkin (1992); Lincoln & Guba (1985).
Kaya Yilmaz 315
information about the research topic studied (Denzin & Lincoln, 1998; Patton,
2002; Wolcott, 1994).
Methods of data collection and analysis are also different in the two
approaches. Quantitative research uses questionnaires, surveys and systematic
measurements involving numbers. Quantitative researchers use mathematical
models and statistics to analyse the data and report their findings in impersonal,
third-person prose by using numbers. In contrast, qualitative research uses
participant observation, in-depth interviews, document analysis, and focus groups.
The data are usually in textual, sometimes graphical or pictorial form. Qualitative
researchers disseminate their findings in a first-person narrative with a combination
of etic (outsider or the researcher’s) and emic (insider or the participants’)
perspectives (Denzin & Lincoln, 1998; Miles & Huberman, 1994). Since qualitative
findings are highly context and case dependent, researchers are expected to
keep findings in context and report any personal and professional information that
may have an impact on data collection, analysis, and interpretations. Bracketing
their points of view and biases, the researchers must avoid making any judgement
about whether the situation in which they are involved and participants are
engaged is good or bad, appropriate or inappropriate. In addition, researchers
should make their orientation, predispositions, and biases explicit. Lastly, qualitative
research reports must provide the reader with sufficient quotations from
participants (Patton, 2002).
There are some situations or questions that qualitative research methods illustrate
better than quantitative ones, or vice versa. For instance, qualitative methods
are especially effective for studying a highly individualised programme in which learners
who have different abilities, needs, goals, and interests proceed at their own
pace. In this case, quantitative methods can provide parsimonious statistical data
through mean achievement scores and hours of instruction. But, in order to
understand the meaning of the programme for individual participants, their points
of view and experiences should be illustrated with their own words via such
qualitative methods and techniques as case studies and interviews which provide
the detailed, descriptive data needed to deepen our understanding of individual
variation. On the other hand, some situations require quantitative research design
to be effectively addressed. For example, quantitative methods are more helpful
when conducting research on a broader scale or studying a large number of people,
cases, and situations since they are cost-effective and statistical data can provide a
succinct and parsimonious summary of major patterns (Patton, 2002).
Assumption: Ontological
Question: What is the nature of reality?
Characteristics: Reality is subjective and multiple, as seen by participants in the study.
Implications for practice: Researcher uses quotes and themes in words of participants and
provides evidence of different perspectives.

Assumption: Epistemological
Question: What is the relationship between the researcher and that being researched?
Characteristics: Researcher attempts to lessen distance between himself or herself and that
being researched.
Implications for practice: Researcher collaborates, spends time in field with participants,
and becomes an ‘insider’.

Assumption: Axiological
Question: What is the role of values?
Characteristics: Researcher acknowledges that research is value laden and that biases are present.
Implications for practice: Researcher openly discusses values that shape the narrative and
includes his or her own interpretation in conjunction with the interpretations of participants.

Assumption: Rhetorical
Question: What is the language of research?
Characteristics: Researcher writes in a literary, informal style using the personal voice and
uses qualitative terms and limited definitions.
Implications for practice: Researcher uses an engaging style of narrative, may use
first-person pronoun, and employs the language of qualitative research.

Assumption: Methodological
Question: What is the process of research?
Characteristics: Researcher uses inductive logic, studies the topic within its context, and
uses an emerging design.
Implications for practice: Researcher works with particulars (details) before generalisations,
describes in detail the context of the study, and continually revises questions from
experiences in the field.
cases which are unique and context-bound, ‘traditional thinking about generalizability
falls short . . . and reliability in the traditional sense of replicability is
pointless’ (Denzin & Lincoln, 1998, p. 51). It is believed that because the ontological,
epistemological, and theoretical assumptions of qualitative research are so fundamentally
different from those of quantitative research, it should be judged on its
own terms. Hence, it is proposed that, rather than the concepts of validity and
reliability, an alternative set of criteria based on qualitative concepts be
used to judge the trustworthiness of a qualitative study
(Gibbs, 2007; Lincoln & Guba, 1985; Wolcott, 1994).
The concept of validity in quantitative study corresponds to the concept of
credibility, trustworthiness, and authenticity in qualitative study which means that the
study findings are accurate or true not only from the standpoint of the researcher
but also from that of the participants and the readers of the study (Creswell &
Miller, 2000). The concept of reliability in quantitative study is comparable, but
not identical, with the concept of dependability and auditability in qualitative study,
which means that the process of the study is consistent over time and across
different researchers and different methods or projects (Gibbs, 2007; Miles &
Huberman, 1994). To judge the quality or (a) credibility and (b) dependability of
a qualitative study, the following questions compiled from various studies can be
asked (Miles & Huberman, 1994, pp. 278–279):
Credibility (instead of validity) questions:
• How context-rich and detailed are the basic descriptions?
• Does the account ‘ring true’, make sense, seem convincing or plausible, enable
a ‘vicarious presence’ for the reader?
• Is the account rendered comprehensive, respecting the configuration and temporal
arrangement of elements in the local context?
• Did triangulation among complementary methods and data sources generally
lead to converging conclusions? If not, is there a coherent explanation for this?
• Are the presented data linked to the categories of prior or emergent theory if
used?
• Are the findings internally coherent and concepts systematically related?
• Were guiding principles used for confirmation of propositions made explicit?
• Are areas of uncertainty identified?
• Was negative evidence, or a negative case, sought? Found? What happened then?
• Have rival explanations been actively considered? What happened to them?
• Were the conclusions considered to be accurate by the participants involved in
the study? If not, is there a coherent explanation for this?
Dependability (instead of reliability) questions:
• Are research questions clearly defined and the features of the study design
congruent with them?
• Are basic paradigms and analytic constructs clearly specified?
• Are the researcher’s role and status within the site explicitly described?
• If multiple field-researchers are involved, do they have comparable data collection
protocols?
• Do multiple observers’ accounts converge, in instances, settings, or times when
they might be expected to?
• Were data collected across the full range of appropriate settings, times, and
respondents suggested by the research questions?
• Were coding checks made and did they show adequate agreements?
• Were data quality checks for bias, deceit, informant knowledgeability etc.
made?
• Do findings show meaningful parallelism across data sources (informants,
contexts, and times)?
• Were any forms of peer or colleague review employed?
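The question about coding checks and "adequate agreements" can be made concrete with an agreement statistic. The text does not say which statistic to use; Cohen's kappa is swapped in here as one common choice for two coders, and the codes assigned below are HYPOTHETICAL. The sketch relies only on the standard library, and `cohens_kappa` is an illustrative helper.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labelled the same segments."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of segments.")
    n = len(codes_a)
    # Proportion of segments on which the two coders agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL codes assigned to ten interview segments by two coders.
coder_a = ["A", "A", "B", "B", "A", "C", "C", "B", "A", "C"]
coder_b = ["A", "A", "B", "A", "A", "C", "B", "B", "A", "C"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.3f}")
```

Kappa corrects raw percentage agreement for agreement expected by chance. What counts as "adequate" is a judgement the researcher should state and justify.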
Lincoln and Guba are commonly acknowledged to have made a great contribution
to the criteria debate in qualitative research by developing parallel criteria to the
concepts of validity and reliability (Spencer et al., 2003). Their alternative criteria
are constantly cited in the research literature and still considered to be the yardstick
or the ‘gold standard’ (Spencer et al., 2003; Whittemore, Chase & Mandle,
2001). To assess the rigour of qualitative research, Lincoln and Guba (1985) resort
to the concepts of credibility, transferability, dependability, and confirmability to
express the quantitative concepts of internal validity, external validity (generalisability),
reliability, and objectivity respectively. Credibility means that the participants
involved in the study find the results of the study true or credible.
Transferability is achieved if the findings of a qualitative study are transferable to
other similar settings. Thick description of the setting, context, people, actions, and
events studied is needed to ensure transferability, or external validity in quantitative
terms. The study has dependability (reliability) if the process of selecting, justifying
and applying research strategies, procedures and methods is clearly explained and
its effectiveness evaluated by the researcher and confirmed by an auditor, which is
called an ‘audit trail’. The study enjoys confirmability when its findings are based on
the analysis of the collected data and examined via an auditing process, i.e. the
auditor confirms that the study findings are grounded in the data and that inferences
based on the data are logical and have clarity, high utility or explanatory power
(Table III).
TABLE III. Criteria for Judging the Quality of a Research Study: Quantitative
vs. Qualitative Terms
Aspect Quantitative terms Qualitative terms
reliability and validity (see Note 1). Patton (2002) argues that judging ‘quality’ constitutes the
foundation for perceptions of credibility. Issues related to quality and credibility
correspond to the audience and intended inquiry purposes. Therefore, the criteria
for judging quality and credibility depend on the philosophical underpinnings,
theoretical orientations, and purposes of a particular qualitative study. Taking
this crucial point into account, Patton (2002, pp. 542–544) suggests alternative
sets of five criteria for judging the quality or credibility of a qualitative inquiry: (1)
traditional scientific research criteria, (2) social construction and constructivist
criteria, (3) artistic and evocative criteria, (4) critical change criteria, and (5)
evaluation standards and principles (Table IV).
The credibility of a qualitative study is affected by the extent to which systematic
data collection procedures, multiple data sources, triangulation, thick and rich
description, external reviews or member checking, external audits, and other
techniques for producing trustworthy data are used. According to Patton (2002),
three distinct but related inquiry elements determine the credibility of a qualitative
study:
(1) rigorous methods to do fieldwork that yield high-quality data that
are systematically analysed with attention to issues of credibility; (2) the
credibility of the researcher, which is dependent on training, experience,
track record, status, representation of self; and (3) philosophical beliefs in the
value of qualitative inquiry, i.e. an appreciation of naturalistic inquiry, qualitative
methods, inductive analysis, purposeful sampling, and holistic thinking
(pp. 552–553).
For a qualitative study to be credible and trustworthy, the data must first and
foremost be sufficiently descriptive and include a great deal of pure description of
people, activities, interactions, and settings so that the reader or reviewer can
understand what occurred and how it occurred. The basic criterion to judge the
credibility of data is the extent to which they allow the reader to enter the situation
or setting under study. In other words, rich and detailed or thick description of the
setting and participants is a must. The researcher must provide an accurate picture
of the empirical social world as it exists to those under investigation, rather than as
he or she imagines it to be. The descriptions must be factual, accurate, detailed
but without being overburdened with irrelevant information or trivia. In addition,
researchers should overtly reveal the biases they bring to the study and discuss
how their background, such as gender, ethnicity, disciplinary orientation and ideological
viewpoint, affected the interpretation of the findings. Since the nature of
qualitative inquiry is fundamentally people-oriented, qualitative researchers must
get close enough to the people and situation being studied in order to capture
what actually takes place and what people actually say; i.e. develop an in-depth
understanding of the phenomenon under investigation. To that end, they should
spend prolonged time in the setting with the participants without dismissing the
negative or discrepant cases observed in the setting. Member checking (the participants
check and evaluate the final research report to determine if its descriptions
and themes accurately reflect their viewpoints), peer debriefing (involving another
researcher in reviewing the study report to see if it fits or resonates with the
experience of both the participants and the audience rather than the researcher),
and external auditor (an independent researcher who, unlike the peer debriefer, is
not familiar with the researcher, reviews the study project to evaluate its accuracy)
TABLE IV. Criteria for Judging the Quality and Credibility of Qualitative
Inquiry
Traditional Scientific Research Criteria
• Objectivity of the inquirer [attempts to minimise bias]
• Validity of the data
• Systematic rigour of fieldwork procedures
• Triangulation [consistency of findings across methods and data sources]
• Reliability of coding and pattern analysis
• Correspondence of findings to reality
• Generalisability [external validity]
• Strength of evidence supporting causal hypothesis
• Contributions to theory
Social Construction and Constructivist Criteria
• Subjectivity acknowledged [discusses and takes into account biases]
• Trustworthiness and authenticity
• Triangulation [capturing and respecting multiple perspectives]
• Reflexivity and praxis
• Particularity [doing justice to the integrity of unique cases]
• Enhanced and deepened understanding [Verstehen]
• Contributions to dialogue
Artistic and Evocative Criteria
• Opens the world to us in some way
• Creativity
• Aesthetic quality
• Interpretive vitality
• Flows from self; embedded in lived experience
• Stimulating
• Provocative
• Connects and moves the audience
• Voice distinct and expressive
• Feels true, authentic or real
Critical Change Criteria
• Critical perspective: Increases consciousness about injustice
• Identifies nature and sources of inequalities and injustice
• Represents the perspective of the less powerful
• Makes visible the ways in which those with more power exercise and benefit from this power
• Engages those with less power respectfully and collaboratively
• Builds the capacity of those involved to take action
• Identifies change-making strategies
• Clear historical and values context
Evaluation Standards and Principles
• Utility
• Feasibility
• Propriety
• Accuracy [balance]
• Systematic inquiry
• Evaluator competence
• Integrity/honesty
• Respect for people [fairness]
• Responsibility to the general public welfare [taking into account diversity of interests and
values]
Conclusion
Educational researchers, especially graduate students, need to acquaint themselves
with different research approaches. Quantitative and qualitative research
approaches represent the two ends of the research continuum. They differ in
terms of their epistemological assumptions, theoretical frameworks, methodological
procedures and research methods. Whereas the former is based on positivism
or objective epistemology, relies on quantitative measures for collecting
and analysing data, and aims to make predictions and generalisations, the latter
is based on constructivism, draws on naturalistic methods for data collection and
analysis, and aims to provide an in-depth understanding of people’s experiences
and the meanings attached to them. Having been viewed not only as competing
but also as incompatible research paradigms for some decades, they are now considered
alternative strategies for research. Both approaches have their
own strengths and weaknesses in their design and application. Which approach
should be used when planning a study depends on several factors such as
the type of questions asked, the researcher’s training or experiences, and the
audience.
Kaya Yilmaz, Marmara University, Ataturk Faculty of Education, Department of
Elementary Education, Kadikoy, Istanbul, Turkey, kaya.yilmaz@marmara.edu.tr
NOTE
1. However, the term validity is used by some researchers in relation to qualitative
research. Schwandt (1997) defines validity as the extent to which the
qualitative account accurately represents the research participants’ views of
social phenomena and is credible to them. Likewise, reliability is defined as
the extent to which the qualitative study provides an ‘understanding’ of a
situation, setting, case, programme, or event that otherwise would be confusing
and enigmatic (Eisner, 1991, p. 58). Lincoln and Guba (1985) use the concept
of dependability to refer to reliability in quantitative studies (p. 300).
REFERENCES
BERGMAN, M. M. (2008) Advances in Mixed Methods Research (Los Angeles,
Sage).
BREWER, J. D. (2003) Qualitative research, in: R. L. MILLER & J. D. BREWER (Eds)
The A-Z of Social Research: a dictionary of key social science research concepts
(pp. 238–241) (Thousand Oaks, Sage).
BRYMAN, A. (1988) Quantity and Quality in Social Research (London, Routledge).
COHEN, L., MANION, L. & MORRISON, K. (2007) Research Methods in Education
(6th edition) (New York, Routledge).
CORBIN, J. & STRAUSS, A. (2008) Basics of Qualitative Research: techniques and
procedures for developing grounded theory (3rd edition) (Thousand Oaks, Sage).
CRESWELL, J. W. (1994) Research Design: qualitative & quantitative approaches
(Thousand Oaks, Sage).
CRESWELL, J. W. (2007) Qualitative Inquiry and Research Design: choosing among five
traditions (2nd edition) (Thousand Oaks, Sage).
CRESWELL, J. W. (2009) Research Design: qualitative, quantitative and mixed
approaches (3rd edition) (Thousand Oaks, Sage).
CRESWELL, J. W. & MILLER, D. (2000) Determining validity in qualitative inquiry,
Theory into Practice, 39, pp. 124–130.
CROTTY, M. (1998) The Foundations of Social Research: meaning and perspective in
the research process (Thousand Oaks, Sage).
DAVIES, D. & DODD, J. (2002) Qualitative research and the question of rigor,
Qualitative Health Research, 12, pp. 279–289.
DENZIN, N. K. & LINCOLN, Y. S. (1994) Introduction: entering the field of
qualitative research, in: N. K. DENZIN & Y. S. LINCOLN (Eds) Handbook
of Qualitative Research (pp. 1–17) (London, Sage).
DENZIN, N. K. & LINCOLN, Y. S. (1998) Strategies of Qualitative Inquiry
(Thousand Oaks, Sage).
DENZIN, N. K. & LINCOLN, Y. S. (2005) Handbook of Qualitative Research (3rd
edition) (Thousand Oaks, Sage).
EISNER, E. W. (1991) The Enlightened Eye: qualitative inquiry and the enhancement of
educational practice (New York, NY, Macmillan Publishing Company).
GAY, L. R. & AIRASIAN, P. (2000) Educational Research: competencies for analysis and
application (6th edition) (Upper Saddle River, Prentice Hall).
GELO, O., BRAAKMANN, D. & BENETKA, G. (2008) Quantitative and qualitative
research: beyond the debate, Integrative Psychological Behavioral Science, 42,
pp. 266–290.
GIBBS, G. R. (2007) Analyzing qualitative data, in: U. FLICK (Ed) The Sage
Qualitative Research Kit (London, Sage).
GLESNE, C. & PESHKIN, A. (1992) Becoming Qualitative Researchers: an introduction
(New York, Longman).
GUBA, E. & LINCOLN, Y. (1989) Fourth Generation Evaluation (Newbury Park,
Sage).
HITCHCOCK, G. & HUGHES, D. (1995) Research and the Teacher: a qualitative
introduction to school-based research (2nd edition) (London, Taylor & Francis).
HUCK, S. W. (2000) Reading Statistics and Research (3rd edition) (New York,
Longman).
KEPPEL, G. (1991) Design and Analysis: a researchers’ handbook (Upper Saddle
River, Prentice Hall).
LINCOLN, Y. S. & GUBA, E. G. (1985) Naturalistic Inquiry (Newbury Park,
Sage).
MILES, M. & HUBERMAN, M. (1994) Qualitative Data Analysis (2nd edition)
(Thousand Oaks, Sage).
PATTON, M. Q. (2002) Qualitative Research & Evaluation Methods (3rd edition)
(Thousand Oaks, Sage).
SCHWANDT, T. A. (1997) Qualitative Inquiry: A dictionary of terms (Thousand
Oaks, Sage).
SNAPE, D. & SPENCER, L. (2003) The foundations of qualitative research, in:
J. RITCHIE & J. LEWIS (Eds) Qualitative Research Practice: a guide for social
science students and researchers (pp. 1–23) (Thousand Oaks, Sage).
SPENCER, L., RITCHIE, J., LEWIS, J. & DILLON, L. (2003) Quality in Qualitative
Evaluation: a framework for assessing research evidence (London, National Centre
for Social Research, Government Chief Social Researcher’s Office).
STEINKE, I. (2004) Quality criteria in qualitative research, in: U. FLICK, E. V.
KARDORFF & I. STEINKE (Eds) A Companion to Qualitative Research
(pp. 184–190) (Thousand Oaks, Sage).
STENBACKA, C. (2001) Qualitative research requires quality concepts of its own,
Management Decision, 39, pp. 551–555.
STRAUSS, A. L. & CORBIN, J. (1998) Basics of Qualitative Research: techniques and
procedures for developing grounded theory (2nd edition) (Thousand Oaks, Sage).
TROCHIM, W. M. (2005) Research Methods Knowledge Base.
www.socialresearchmethods.net/kb
WHITTEMORE, R., CHASE, S. K. & MANDLE, C. L. (2001) Validity in qualitative
research, Qualitative Health Research, 11, pp. 522–537.
WOLCOTT, H. F. (1994) Transforming Qualitative Data: description, analysis,
interpretation (Thousand Oaks, Sage).
▪ Numbers (sometimes called raw numbers). For example, the total number of people who live in
poverty.
▪ Percentages: the number per 100 in a population. For example, 30% of British voters regularly
vote Conservative.
▪ Rates: the number per 1000 in a population. For example, if the birth rate in Britain was 1 (it’s
not...), then for every 1000 people one baby would be born each year.
Research data is often expressed as a rate or percentage because it allows accurate comparisons
between and within groups and societies. For example, comparing unemployment between Britain
and America as a raw number wouldn’t tell us very much, since the population of America is
roughly 5 times larger. Expressing unemployment as a percentage or rate on the other hand
allows us to compare “like with like”.
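The arithmetic behind "like with like" comparisons is short enough to sketch in Python. The two countries and all of their figures below are HYPOTHETICAL, chosen so that the raw numbers and the percentages point in opposite directions; `percentage` and `rate_per_1000` are illustrative helper names.

```python
def percentage(count, population):
    """Number per 100 in the population."""
    return 100 * count / population


def rate_per_1000(count, population):
    """Number per 1000 in the population."""
    return 1000 * count / population


# HYPOTHETICAL figures: country B is five times larger than country A.
country_a = {"unemployed": 1_500_000, "population": 60_000_000}
country_b = {"unemployed": 6_000_000, "population": 300_000_000}

pct_a = percentage(country_a["unemployed"], country_a["population"])
pct_b = percentage(country_b["unemployed"], country_b["population"])
rate_a = rate_per_1000(country_a["unemployed"], country_a["population"])
rate_b = rate_per_1000(country_b["unemployed"], country_b["population"])

print(f"Country A: {pct_a:.1f}% unemployed, {rate_a:.1f} per 1000")
print(f"Country B: {pct_b:.1f}% unemployed, {rate_b:.1f} per 1000")
```

With these invented figures, country B has four times as many unemployed people as a raw number, yet a lower percentage and rate than country A. This is why the raw number alone says little.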
Strengths

The ability to express relationships numerically can be useful if the researcher doesn’t need to
explore the reasons for behaviour. If you simply want to know how many murders there are per
year, quantitative data satisfies this purpose.

Quantitative data lends itself to statistical comparisons. This is useful for things like
hypothesis testing or cross-cultural comparisons (such as crime rates in different countries).

Longitudinal studies, where the same group may be researched at different times to track
changes in their behaviour, can exploit this comparative feature of quantitative data.

Kruger (2003): Quantitative methods and data “allow us to summarize vast sources of information
and make comparisons across categories and over time”.

Matveev (2002) notes the ability to control the conditions under which data is collected (such
as using standardised questionnaires) makes quantitative data more reliable. Where the
same questions are asked of different groups - or the same group at different times - we can
be sure that any differences in their answers are not down to being asked different
questions.

The ability to replicate quantitative research also means it is likely to be highly reliable and,
where the researcher has no direct, necessary and personal involvement with the generation
of data, it makes it less likely for personal biases to intrude into the research process.

Kealey and Protheroe (1996), for example, suggest the ability to “eliminate or minimize
subjective judgments” is a major contributory factor to increased data reliability.
Shortcutstv.com 1
Sociology Research Methods
Limitations
Qualitative data tries to capture the quality of people’s behaviour and such data says something
about how people experience the social world. It can be used to understand the meanings they
give to behaviour.
Boyle (1977), for example, studied the behaviour of a juvenile gang from the viewpoint of its
members while Goffman (1961) tried to understand the experiences of patients in an American
mental institution.
Both were trying to capture and express how people feel about and react to different situations.
Where qualitative research generally focuses on the intensive study of relatively small
groups, opportunities to generalise research data may be limited.
For similar reasons it’s difficult to compare qualitative research across time and space
because researchers are unlikely to be comparing “like with like”.
Qualitative data also tends to be structured in ways that make the research difficult to
replicate - a consequence, Cassell and Symon (1994) argue, of the fact that where research
evolves to take account of the input of different respondents the original research objectives
may change.
In general this means qualitative research produces data with lower levels of reliability.
While all data, quantitative as well as qualitative, requires interpretation by the
researcher, qualitative methods - from participant observation to unstructured
interviews - tend to produce vast amounts of data across a wide range of issues. This raises
two potential reliability issues:
2. Different researchers looking at the same data may arrive at different conclusions based
on the data they choose to use.
Levy (2006), however, argues reliability evidenced through the ability to replicate
research is not a useful test for qualitative research methods. She suggests the concept of
trustworthiness might be a more useful measure of the internal reliability of qualitative
methods: “In qualitative research, as there are no numerical measures. It is up to the
researcher to provide evidence of reliability by carefully documenting the data collection and
analysis process, hence “trustworthiness” is used to assess how reliable the results are.
Can we trust that the results are a ‘true’ reflection of our subject?”.
Qualitative methods require different skills from the researcher to those required of a
quantitative researcher and this means qualitative data may be harder to collect. In
something like observational research, for example, the researcher needs to be able to
convincingly and consistently “play a role” within the group they are studying - a very
different set of skills to those needed to deliver a questionnaire or structured interview.
Although we’ve considered quantitative and qualitative data as separate entities, there are
occasions when a researcher may want to combine quantitative and qualitative types of data,
such as collecting quantitative data about educational achievement or the number of people
who visit their doctor each year alongside qualitative data that seeks to explore the satisfaction
levels of pupils or patients.
This technique is called methodological triangulation and is one that can also be used to
improve both research validity - by creating a more-accurate measurement of something - and
reliability by using the strengths of one type of data (the ability to quantify behaviour, for
example) to offset the weaknesses of the other.
A HANDBOOK OF STATISTICS Describing Data Sets
Statisticians can help out decision makers by summarizing the data. Data is summarized
using either graphical or numeric techniques. The goal of this chapter is to introduce these
techniques and how they are effective in summarizing large data sets. The techniques shown
in these chapters are not exhaustive, but are a good introduction to the types of things that
you can do to synthesize large data sets. Different techniques apply to different types of data,
so the first thing we must do is define the two types of data that we encounter in our work.
Quantitative and qualitative data/variables (introduced in chapter 1) are the two main types
of data that we often deal with in our work. Qualitative variables are variables that cannot
be measured on a natural numerical scale. They can only be classified into one of a group
of categories. For instance, your gender is a qualitative variable since you can only be male
or female (i.e. one of two groups). Another example might be your degree of satisfaction
at a restaurant (e.g. “Excellent”, “Very Good”, “Fair”, “Poor”). A quantitative variable is a
variable that is recorded on a naturally occurring scale. Examples of these might be height,
weight, or GPA. The technique that we choose to summarize the data depends on the type
of data that we have.
We can illustrate the frequency table using a simple example of a data set that consists of
three qualitative variables: gender, state, and highest degree earned.
ID   Gender   State   Highest Degree Earned
1    M        VA      Law
2    M        MD      Law
3    M        IA      PhD
4    F        ID      HS Degree
5    F        VA      MBA
6    M        MD      BA
7    M        DC      MS
8    F        CA      MBA
9    F        CA      MBA
10   F        MD      MS
11   F        VA      MS
12   M        IA      MS
13   M        AK      MBA
14   M        AK      MBA
15   M        AR      MBA
16   F        AK      MBA
17   F        DC      PhD
18   F        MD      PhD
19   F        MD      Law
20   M        VA      Law

Highest Degree Earned   Frequency   Relative Frequency
BA                      1           5.00%
HS Degree               1           5.00%
Law                     4           20.00%
MBA                     7           35.00%
MS                      4           20.00%
PhD                     3           15.00%
The table below the raw data summarizes the data by the variable “Highest Degree Earned”.
The first column shows the different levels (classes) that the variable can take. The second
column shows the frequency, the number of individuals in each class, and the third column
shows the relative frequency, the percentage of individuals in each class. For this data set,
the MBA is the most represented degree, with 7 individuals (35%) holding MBAs. The
frequency table summarizes the data so that we can better understand the data set without
having to look at every record. The MBA is the most common degree among our small
business owners, with the Law and MS degrees tied for second at 20% each. In fact, 90% of
the small business owners have some sort of advanced degree.
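As a rough illustration, the frequency table can also be built programmatically. The sketch below uses Python rather than Excel (a choice made for this example only), and the function name `frequency_table` is hypothetical. The list of degrees is copied from the 20 records in the raw data.

```python
from collections import Counter

# Highest degree earned for records 1..20 in the raw data table.
degrees = ["Law", "Law", "PhD", "HS Degree", "MBA", "BA", "MS", "MBA", "MBA", "MS",
           "MS", "MS", "MBA", "MBA", "MBA", "MBA", "PhD", "PhD", "Law", "Law"]

def frequency_table(values):
    """Map each class to (frequency, relative frequency in percent)."""
    counts = Counter(values)
    total = len(values)
    return {level: (count, 100 * count / total)
            for level, count in sorted(counts.items())}

table = frequency_table(degrees)
for level, (freq, rel) in table.items():
    print(f"{level:<10} {freq:>2} {rel:6.2f}%")
```

The relative frequencies always sum to 100%, which is a quick check that no record was dropped.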
[Pie chart: “Highest Degree Earned”, with slices of 5%, 5%, 15%, 20%, 20% and 35%]
The pie chart provides the same information that the frequency chart provides. The benefit
is for the individuals who prefer seeing this kind of data in a graphical format as opposed
to a table. Clearly the MBA is most highly represented. Even without the labels, it would
be easy to see that the MBA degree is the most represented in our data set as the MBA is
the largest slice of the pie.
Frequency Histogram
The final type of graph that we will look at in this chapter is the frequency histogram. The
histogram pictorially represents the actual frequencies of our different classes. The frequency
histogram for the “Highest Degree Earned” variable is shown below.
In the frequency histogram, each bar represents the frequency for each class level. The
graph is used to compare the raw frequencies of each level. Again, it is easy to see that
MBA dominates, but the other higher level degrees (Law, MS, and PhD) are almost equal
to each other.
The mean, or average, is calculated by summing the measurements and then dividing by the
number of measurements contained in the data set. The calculation can be automated in Excel.
=average()
=average(A1:A4) = 5.5
The median of a quantitative data set is the middle number when the measurements are
arranged in ascending or descending order. The formula for the median in Excel is
=median()
=median(A1:A4) = 5.5
The mode is the measurement that occurs most frequently in the data. The mode is the
only measure of central tendency that has to be a value in the data set.
=mode()
Since the mean and median are both measures of central tendency, when is one preferable
to the other? The answer lies with the data that you are analyzing. If the dataset is skewed,
the median is less sensitive to extreme values. Take a look at the following salaries of
individuals eating at a diner.
The mean salary for the above is ~$15K and the median is ~$14K. The two estimates are quite
close. However, what happens if another customer enters the diner with a salary of $33 MM?
The median stays approximately the same (rising only from $13.5K to about $14.75K) while
the mean shoots up to ~$5.5MM. Thus, when you have an outlier (an extreme data point), the
median is not affected by it as much as the mean is. In that case the median is a better
representation of what the data actually looks like.
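The effect of the outlier can be checked with a few lines of code. This is a minimal sketch in Python (not the Excel workflow the chapter uses). It assumes the five diner salaries are the ones listed in the variance formulas later in this chapter.

```python
import statistics

# Salaries of the five diners (assumed to match the values used in =var.p() later on).
salaries = [12000, 13000, 13500, 16000, 20000]
mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)

# A sixth customer with a salary of $33 MM walks in.
with_outlier = salaries + [33_000_000]
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"before: mean={mean_before:,.0f} median={median_before:,.0f}")
print(f"after:  mean={mean_after:,.0f} median={median_after:,.0f}")
```

The mean moves by several million dollars, while the median moves by little more than a thousand.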
The job of a statistician is to explain variance through quantitative analysis. Variance will
exist in manufacturing processes due to any number of reasons. Statistical models and
tests will help us to identify what the cause of the variance is. Variance can exist in the
physical and emotional health of individuals. Statistical analysis is geared towards helping
us understand exactly what causes someone to be more depressed than another person, or
why some people get the flu more than others do. Drug makers will want to understand
exactly where these differences lie so that they can target drugs and treatments that will
reduce these differences and help people feel better.
There are several methods available to measure variance. The two most common measures
are the variance and the standard deviation. The variance is simply the average squared
distance of each point from the mean. So it is really just a measure of how far the data is
spread out. The standard deviation is a measure of how much, on average each of the values
in the distribution deviates from the center of the distribution.
Say we wish to calculate the variance and the standard deviation for the salaries of the five
people in our diner (the diners without Jay Sugarman). The Excel formulas for the variance
and the standard deviation are:
=var.p(12000,13000,13500,16000,20000)
=stdev.p(12000,13000,13500,16000,20000)
The variance is 8,240,000 and the standard deviation is $2870.54. A couple of things
worth noting:
• The variance is not given in the units of the data set. Because it averages squared
distances from the mean, it is expressed in squared units (here, dollars squared).
• The standard deviation is reported in the units of the dataset. Because of this, people
often prefer to talk about the data in terms of the standard deviation.
• The standard deviation is simply the square root of the variance.
• The formulas for the variance and the standard deviation presented above are for the
population variances and standard deviations. If your data comes from a sample use:
=var.s()
=stdev.s()
1. When the standard deviation is 0, the data set has no spread – all observations are
equal. Otherwise the standard deviation is a positive number.
2. Like the mean, the standard deviation is not resistant to outliers. Therefore, if you
have an outlier in your data set, it will inflate the standard deviation.
3. And, as said before, the standard deviation has the same units of measurements as
the original observations.
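For readers working outside Excel, the same four quantities can be sketched with Python's standard `statistics` module. The mapping to the Excel functions in the comments is an informal correspondence, not something the chapter states.

```python
import statistics

salaries = [12000, 13000, 13500, 16000, 20000]

# Population versions: divide by n (comparable to =var.p() and =stdev.p()).
pop_var = statistics.pvariance(salaries)
pop_sd = statistics.pstdev(salaries)

# Sample versions: divide by n - 1 (comparable to =var.s() and =stdev.s()).
samp_var = statistics.variance(salaries)
samp_sd = statistics.stdev(salaries)

print(pop_var, round(pop_sd, 2))
print(samp_var, round(samp_sd, 2))
```

The sample versions are always at least as large as the population versions, because the same sum of squared distances is divided by n - 1 instead of n.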
The data tells us that, on average, Dave’s pours are a bit closer to the mark of two liters
when he fills his beer growler. However, his pours have a higher standard deviation than
Jim’s, which indicates that they fluctuate more. Jim might pour less, but he consistently
pours about the same amount each time. How could you use this information? From a
planning perspective, you know that Jim is performing more consistently than Dave: you
know you are going to get slightly below the 2 liter goal, and you can confidently predict
what his next pour will be. Dave, on the other hand, gets closer to the 2 liter mark on
average, but his pours vary much more than Jim’s.
The Empirical Rule is a powerful rule which combines the mean and standard deviation to
get information about the data from a mound shaped distribution.
Definition
• 68% of all data points fall within 1 standard deviation of the mean.
• 95% of all data points fall within 2 standard deviations of the mean.
• 99.7% of all data points fall within 3 standard deviations of the mean.
Put another way, the Empirical Rule tells us that 99.7% of all data points lie within three
standard deviations of the mean. The empirical rule is important for some of the work
that we will do in the chapters on inferential statistics later on. For now though, one way
that the empirical rule is used is to detect outliers. For instance suppose you know that
the average height of all professional basketball players is 6 feet 4 inches with a standard
deviation of 3 inches. A player of interest to you stands 5 foot 3 inches. Is this person an
outlier in professional basketball? The answer is yes. If we know that the distribution of
the height of pro basketball players is mound shaped, then the empirical rule tells us that
99.7% of all players’ heights will be within three standard deviations of the mean, that is,
within 9 inches of 6 feet 4 inches (between 5 feet 7 inches and 7 feet 1 inch). Our 5 foot
3 inch player is 13 inches below the mean, well beyond 9 inches, which indicates he
is an outlier in professional basketball.
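The outlier check in the basketball example can be written as a small helper. This is a sketch only: the function name `is_outlier` and the default cutoff of three standard deviations are choices made for this illustration, and it presumes a mound shaped distribution as the Empirical Rule requires.

```python
def is_outlier(value, mean, sd, k=3):
    """True if value lies more than k standard deviations from the mean."""
    return abs(value - mean) > k * sd

mean_height = 6 * 12 + 4   # 6 feet 4 inches, in inches
sd_height = 3              # inches
player = 5 * 12 + 3        # 5 feet 3 inches, in inches

print(is_outlier(player, mean_height, sd_height))
```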
The last concept introduced in this chapter is the z-score. If you have a mound shaped
distribution, the z-score makes use of the mean and the standard deviation of the data set in
order to specify the relative location of a measurement. It represents the distance between a
given data point and the mean, expressed in standard deviations. Calculating the z-score is
also known as “standardizing” the data point. The Excel formula to calculate a z-score is
=standardize()
Large positive z-scores tell us that the measurement is larger than almost all other
measurements. Similarly, large negative z-scores tell us that the measurement is smaller
than almost all other measurements. If a z-score is 0, then the observation lies on the mean.
And if we pair the z-score with the empirical rule from above we know that:
• About 68% of the measurements have a z-score between -1 and 1.
• About 95% of the measurements have a z-score between -2 and 2.
• About 99.7% of the measurements have a z-score between -3 and 3.
Example
Suppose 200 steelworkers are selected and the annual income of each is determined. This
is our sample data set. The mean is $34,000 and the standard deviation is $2,000. Joe’s
annual income is $32,000. What is his sample z-score?
=standardize(32000,34000,2000)
= -1.0
Joe’s income falls one standard deviation below the average of all the steelworkers.
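The calculation behind the z-score is simply the distance from the mean divided by the standard deviation. A minimal Python sketch follows, with a hypothetical helper name `z_score`.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, measured in standard deviations."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Joe's income against the steelworker sample from the example.
joe = z_score(32000, 34000, 2000)
print(joe)
```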
1) You survey 20 students to see which new cafeteria menu they prefer: Menu A,
Menu B, or Menu C. The results are A, A, B, C, C, C, A, A, B, B, B, C, A, B, A, A, C,
B, A, B, B. Which menu did they prefer? Make a frequency table, carefully defining
the class, class frequency and relative frequency. Explain your answer.
2) A PC maker asks 1,000 people whether they’ve updated their software to the latest
version. The survey results are 592 say yes, 198 say no and 210 do not respond.
Generate the relative frequency table based on the data. Why would you need
to include the non-responders in your table, even though they don’t answer your
question?
3) What is the benefit of showing the relative frequency (the percentages) instead of
the frequency in a table?
4) The pie chart below represents a random sample of 76 people, of whom 22 are female
and the remainder are male. How could the pie chart below be improved?
Measure of Central Tendency: When two or more data sets are to be compared, it is
usually necessary to condense the data, but condensing a data set into a frequency
distribution and a visual presentation is not enough for comparison. It is then
necessary to summarize the data set in a single value. Such a value usually lies
somewhere in the center and represents the entire data set, and hence it is called a
measure of central tendency or an average. Since a measure of central tendency
(i.e. an average) indicates the location or the general position of the distribution
on the X-axis, it is also known as a measure of location or position.
1. Arithmetic Mean
2. Geometric Mean
3. Harmonic Mean
4. Mode
5. Median
Numerical Example:
Calculate the arithmetic mean for the marks obtained by 9 students, given below:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \quad n = 9, \quad \sum_{i=1}^{n} x_i = 360$$
$$\bar{x} = \frac{360}{9} = 40 \text{ marks}$$
Numerical Example:
Calculate the arithmetic mean for the following data given below:
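$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
where the sums run over all classes, $x_i$ is the midpoint of class $i$, $f_i$ is its frequency, and $\sum f_i$ is the total number of observations.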
The weight recorded to the nearest grams of 60 apples picked out at random from a
consignment are given below:
$$\sum f_i = 60, \quad \sum f_i x_i = 7350.0$$
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7350.0}{60} = 122.5 \text{ grams}$$
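Using formula of short cut method of arithmetic mean for grouped data:
$$\bar{x} = A + \frac{\sum f_i D_i}{\sum f_i}, \quad D_i = x_i - A$$
where A is an assumed mean; here A = 114.5.
$$\sum f_i = 60, \quad \sum f_i D_i = 480$$
$$\bar{x} = A + \frac{\sum f_i D_i}{\sum f_i} = 114.5 + \frac{480}{60} = 114.5 + 8 = 122.5 \text{ grams}$$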
Using formula of step deviation method of arithmetic mean for grouped data:
$$\bar{x} = A + \frac{\sum f_i u_i}{\sum f_i} \times h, \quad u_i = \frac{x_i - A}{h}$$
where h is the width of the class interval. Here A = 114.5 and h = 20.

Weight (grams)   Midpoints (x_i)        Frequency (f_i)   u_i = (x_i - A)/h   f_i u_i
65-84            (65 + 84)/2 = 74.5     9                 -2                  -18
85-104           94.5                   10                -1                  -10
105-124          114.5                  17                0                   0
125-144          134.5                  10                1                   10
145-164          154.5                  5                 2                   10
165-184          174.5                  4                 3                   12
185-204          194.5                  5                 4                   20
Total                                   60                                    24

$$\bar{x} = A + \frac{\sum f_i u_i}{\sum f_i} \times h = 114.5 + \frac{24}{60} \times 20 = 114.5 + 8 = 122.5 \text{ grams (Answer)}$$
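The direct, short cut and step deviation methods are algebraically the same calculation, so they must agree. The Python sketch below checks this on the apple weights. The function names are hypothetical, and the midpoints and frequencies are taken from the table of 60 apples.

```python
# Class midpoints x_i and frequencies f_i for the 60 apples.
midpoints = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
freqs = [9, 10, 17, 10, 5, 4, 5]

def direct_mean(x, f):
    """Sum of f_i * x_i divided by sum of f_i."""
    return sum(fi * xi for xi, fi in zip(x, f)) / sum(f)

def shortcut_mean(x, f, assumed_mean):
    """A + sum(f_i * D_i) / sum(f_i), with D_i = x_i - A."""
    total_fd = sum(fi * (xi - assumed_mean) for xi, fi in zip(x, f))
    return assumed_mean + total_fd / sum(f)

def step_deviation_mean(x, f, assumed_mean, width):
    """A + sum(f_i * u_i) / sum(f_i) * h, with u_i = (x_i - A) / h."""
    total_fu = sum(fi * (xi - assumed_mean) / width for xi, fi in zip(x, f))
    return assumed_mean + total_fu / sum(f) * width

print(direct_mean(midpoints, freqs))
print(shortcut_mean(midpoints, freqs, 114.5))
print(step_deviation_mean(midpoints, freqs, 114.5, 20))
```

The choice of A does not change the answer. It only keeps the intermediate numbers small, which matters for hand calculation more than for code.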
Chapter 03 Measures of Central Tendency
Geometric Mean: “The nth root of the product of “n” positive values is called
geometric mean”
$$\text{Geometric Mean} = \sqrt[n]{\text{product of } n \text{ positive values}}$$
The following are the formulae of geometric mean:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36 (n = 9)
$$G.M = \text{antilog}\left(\frac{\sum_{i=1}^{n} \log x_i}{n}\right)$$

x_i     log x_i
45      1.65321
32      1.50515
37      1.56820
46      1.66276
39      1.59106
36      1.55630
41      1.61278
48      1.68124
36      1.55630
Total   14.38700

$$G.M = \text{antilog}\left(\frac{14.38700}{9}\right) = \text{antilog}(1.59856) = 39.68 \text{ (Answer)}$$
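The logarithm route gives the same value as taking the nth root of the product directly. A short Python sketch, with the hypothetical function name `geometric_mean_via_logs`, makes the comparison on the nine marks.

```python
import math

marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]

def geometric_mean_via_logs(values):
    """antilog of the mean of the base-10 logarithms (all values must be positive)."""
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

gm_logs = geometric_mean_via_logs(marks)
gm_root = math.prod(marks) ** (1 / len(marks))  # nth root of the product

print(round(gm_logs, 2), round(gm_root, 2))
```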
124.2483
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36 (n = 9)
Using formula of harmonic mean for ungrouped data:
$$H.M = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

x_i     1/x_i
45      0.02222
32      0.03125
37      0.02702
46      0.02173
39      0.02564
36      0.02777
41      0.02439
48      0.02083
36      0.02777
Total   0.22862

$$H.M = \frac{9}{0.22862} = 39.36663 \text{ (Answer)}$$
$$H.M = \frac{\sum f_i}{\sum \frac{f_i}{x_i}}$$

Weight (grams)   Midpoints (x_i)        Frequency (f_i)   f_i / x_i
65-84            (65 + 84)/2 = 74.5     9                 0.12081
85-104           94.5                   10                0.10582
105-124          114.5                  17                0.14847
125-144          134.5                  10                0.07435
145-164          154.5                  5                 0.03236
165-184          174.5                  4                 0.02292
185-204          194.5                  5                 0.02571
Total                                   60                0.53044

$$H.M = \frac{\sum f_i}{\sum \frac{f_i}{x_i}} = \frac{60}{0.53044} = 113.11 \text{ grams (Answer)}$$
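Both harmonic mean formulas can be covered by one function if the frequencies are treated as optional weights. The sketch below is in Python and the name `harmonic_mean` is a hypothetical helper written for this example. Because it keeps full precision instead of reciprocals cut to five decimals, the ungrouped result comes out near 39.36 rather than the hand-computed 39.37.

```python
def harmonic_mean(values, freqs=None):
    """sum(f_i) / sum(f_i / x_i); with no frequencies every f_i is 1."""
    if freqs is None:
        freqs = [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))

# Ungrouped data: marks of 9 students.
marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]
hm_marks = harmonic_mean(marks)

# Grouped data: class midpoints and frequencies for the 60 apples.
midpoints = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
freqs = [9, 10, 17, 10, 5, 4, 5]
hm_apples = harmonic_mean(midpoints, freqs)

print(round(hm_marks, 2), round(hm_apples, 2))
```

For positive data the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean; the apple data follow this ordering (113.11 against 122.5 grams).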
Calculate the median for the marks obtained by 9 students, given below:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36
Arrange the data in ascending order:
32, 36, 36, 37, 39, 41, 45, 46, 48. Here n = 9, so “n” is odd.
$$\text{Median} = \text{Size of } \left(\frac{n+1}{2}\right)^{th} \text{ observation} = \text{Size of } \left(\frac{9+1}{2}\right)^{th} \text{ observation} = \text{Size of } 5^{th} \text{ observation}$$
Median = 39 (Answer).
Calculate the median for the marks obtained by 10 students, given below:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36, 50
Arrange the data in ascending order:
32, 36, 36, 37, 39, 41, 45, 46, 48, 50. Here n = 10, so “n” is even.
$$\text{Median} = \frac{\text{Size of } \left(5^{th} + 6^{th}\right) \text{ observation}}{2} = \frac{39 + 41}{2} = 40 \text{ (Answer)}$$
The number of values above the median balances (equals) the number of
values below the median i.e. 50% of the data falls above and below the
median.
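Numerical examples: The following distribution relates to the number of
assistants in 50 retail establishments.

No. of assistants   0   1   2   3   4    5   6   7   8   9
f                   3   4   6   7   10   6   5   5   3   1

$$n = \sum f_i = 50, \quad \text{“n” is even}$$
$$\text{Median} = \frac{\text{Size of } \left[\left(\frac{n}{2}\right)^{th} + \left(\frac{n}{2} + 1\right)^{th}\right] \text{ observation}}{2} = \frac{\text{Size of } \left(25^{th} + 26^{th}\right) \text{ observation}}{2}$$
The cumulative frequency reaches 20 at 3 assistants and 30 at 4 assistants, so the 25th and 26th observations are both 4.
$$\text{Median} = \frac{4 + 4}{2} = 4 \text{ (Answer)}$$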
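One way to check a median taken from a discrete frequency table is to expand the table back into its individual observations. This is a sketch in Python; for very large frequencies a cumulative-frequency lookup would be more economical than expanding.

```python
import statistics

assistants = list(range(10))                 # 0, 1, ..., 9 assistants
freqs = [3, 4, 6, 7, 10, 6, 5, 5, 3, 1]      # establishments per value

# Repeat each value as many times as its frequency, giving a sorted list of 50 observations.
expanded = [value for value, f in zip(assistants, freqs) for _ in range(f)]

median = statistics.median(expanded)  # averages the 25th and 26th observations
print(len(expanded), median)
```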
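Numerical example: Find the median for the distribution of examination marks
given below:
$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right), \quad \text{here } n = \sum f_i$$
where l is the lower boundary of the median class, h is its width, f is its frequency, and C is the cumulative frequency of the class before it.

Class boundaries   Midpoints (x_i)   Frequency (f_i)   Cumulative frequency (c.f)
29.5-39.5          34.5              8                 8
39.5-49.5          44.5              87                95
49.5-59.5          54.5              190               285
59.5-69.5          64.5              304               589
69.5-79.5          74.5              211               800
79.5-89.5          84.5              85                885
89.5-99.5          94.5              20                905

$$\sum f_i = 905$$
$$\frac{n}{2} = \frac{905}{2} = 452.5^{th} \text{ student, which corresponds to marks in the class 59.5-69.5}$$
Therefore
$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right) = 59.5 + \frac{10}{304}\left(\frac{905}{2} - 285\right) = 59.5 + \frac{10}{304}(452.5 - 285)$$
$$\text{Median} = 59.5 + \frac{1675}{304} = 59.5 + 5.5 = 65 \text{ marks (Answer)}$$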
Quartiles: “When the observations are arranged in increasing order, the values that
divide the whole data into four (4) equal parts are called quartiles.”
These values are denoted by Q1, Q2 and Q3. It is to be noted that 25% of the data falls
below Q1, 50% of the data falls below Q2 and 75% of the data falls below Q3.
Deciles: “When the observations are arranged in increasing order, the values that
divide the whole data into ten (10) equal parts are called deciles.”
These values are denoted by D1, D2,…,D9. It is to be noted that 10% of the data falls
below D1, 20% of the data falls below D2,…, and 90% of the data falls below D9.
Percentiles: “When the observations are arranged in increasing order, the values that
divide the whole data into hundred (100) equal parts are called percentiles.”
These values are denoted by P1, P2,…,P99. It is to be noted that 1% of the data falls
below P1, 2% of the data falls below P2,…, and 99% of the data falls below P99.
Q2, D5 and P50 are the same and are equal to the median.
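Calculate the quartiles for the marks obtained by 9 students, given below:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36
Q1 = 36 (Answer).
For the frequency distribution with
$$\sum f_i = 50$$
$$Q_3 = \text{Marks obtained by } \left(\frac{3 \times 50}{4} + 1\right)^{th} \text{ student} = \text{Marks obtained by } \left(\frac{150}{4} + 1\right)^{th} \text{ student}$$
$$Q_3 = \text{Marks obtained by } 38^{th} \text{ student}$$
Q3 = 6 (Answer).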
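Calculate the decile for the marks obtained by 9 students, given below:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36
D4 = 37 (Answer).
For the frequency distribution with
$$\sum f_i = 50$$
$$D_6 = \text{Marks obtained by } \left(\frac{6 \times 50}{10} + 1\right)^{th} \text{ student} = \text{Marks obtained by } \left(\frac{300}{10} + 1\right)^{th} \text{ student}$$
$$D_6 = \text{Marks obtained by } 31^{st} \text{ student}$$
D6 = 5 (Answer).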
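Calculate the percentile for the marks obtained by 9 students, given below:
x_i: 45, 32, 37, 46, 39, 36, 41, 48, 36
Arrange the observations in ascending order.
Percentile = 45 (Answer).
For the frequency distribution with
$$\sum f_i = 50$$
$$P_{49} = \text{Marks obtained by } \left(\frac{49 \times 50}{100} + 1\right)^{th} \text{ student} = \text{Marks obtained by } \left(\frac{2450}{100} + 1\right)^{th} \text{ student}$$
$$P_{49} = \text{Marks obtained by } 25^{th} \text{ student}$$
P49 = 4 (Answer).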
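Numerical example of Quartile, Decile and Percentile for continuous grouped
data:

Class boundaries   Midpoints (x_i)   Frequency (f_i)   Cumulative frequency (c.f)
29.5-39.5          34.5              8                 8
39.5-49.5          44.5              87                95
49.5-59.5          54.5              190               285
59.5-69.5          64.5              304               589
69.5-79.5          74.5              211               800
79.5-89.5          84.5              85                885
89.5-99.5          94.5              20                905

$$\sum f_i = 905$$
$$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C\right), \quad \frac{3n}{4} = \frac{3 \times 905}{4} = 678.75$$
$$Q_3 = 69.5 + \frac{10}{211}(678.75 - 589) = 69.5 + \frac{897.5}{211} = 73.7535 \text{ (Answer)}$$
$$D_4 = l + \frac{h}{f}\left(\frac{4n}{10} - C\right), \quad \frac{4n}{10} = \frac{4 \times 905}{10} = 362$$
$$D_4 = 59.5 + \frac{10}{304}(362 - 285) = 59.5 + \frac{770}{304} = 62.0328 \text{ (Answer)}$$
$$D_8 = l + \frac{h}{f}\left(\frac{8n}{10} - C\right), \quad \frac{8n}{10} = \frac{8 \times 905}{10} = 724$$
$$D_8 = 69.5 + \frac{10}{211}(724 - 589) = 69.5 + \frac{1350}{211} = 75.8981 \text{ (Answer)}$$
$$P_{18} = l + \frac{h}{f}\left(\frac{18n}{100} - C\right), \quad \frac{18n}{100} = \frac{18 \times 905}{100} = 162.9$$
$$P_{18} = 49.5 + \frac{10}{190}(162.9 - 95) = 49.5 + \frac{679}{190} = 53.07 \text{ (Answer)}$$
$$P_{76} = l + \frac{h}{f}\left(\frac{76n}{100} - C\right), \quad \frac{76n}{100} = \frac{76 \times 905}{100} = 687.8$$
$$P_{76} = 69.5 + \frac{10}{211}(687.8 - 589) = 69.5 + \frac{988}{211} = 74.182 \text{ (Answer)}$$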
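All of these grouped-data calculations share one pattern: find the class in which the target position k·n/parts falls, then interpolate inside it with l + (h/f)(k·n/parts − C). The Python sketch below captures that pattern; the function name `grouped_quantile` and its argument layout are assumptions made for this illustration.

```python
# Lower class boundaries and frequencies for the examination marks; class width h = 10.
lower_bounds = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
freqs = [8, 87, 190, 304, 211, 85, 20]
width = 10

def grouped_quantile(k, parts):
    """k-th of `parts` equal divisions, e.g. (3, 4) for Q3 or (18, 100) for P18."""
    n = sum(freqs)
    target = k * n / parts          # position k*n/parts
    cumulative = 0                  # C: cumulative frequency before the current class
    for l, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:
            return l + (width / f) * (target - cumulative)
        cumulative += f
    raise ValueError("k must not exceed parts")

for name, k, parts in [("Median", 1, 2), ("Q3", 3, 4), ("D4", 4, 10),
                       ("D8", 8, 10), ("P18", 18, 100), ("P76", 76, 100)]:
    print(name, round(grouped_quantile(k, parts), 4))
```

Calling the function with (1, 2), (2, 4), (5, 10) and (50, 100) returns the same value, which is the statement that Q2, D5 and P50 all equal the median.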
Calculate Q2, D5, D8, P25, P50 and P88.
The main object (purpose) of the average is to give a bird’s eye view
(summary) of the statistical data. The average removes all the unnecessary
details of the data and gives a concise (to the point or short) picture of the
huge data under investigation.
Average is also of great use for the purpose of comparison (i.e. the
comparison of two or more groups in which the units of the variables are
same) and for the further analysis of the data.
Averages are very useful for computing various other statistical measures
such as dispersion, skewness, kurtosis etc.
Requisites (desirable qualities) of a Good Average: An average will be
considered as good if:
It is mathematically defined.
It utilizes all the values given in the data.
It is not much affected by the extreme values.
It can be calculated in almost all cases.
It can be used in further statistical analysis of the data.
It does not give misleading results.
Mode in case of Ungrouped Data: “A value that occurs most frequently in a data set is
called the mode”
OR
“If two or more values occur the same number of times but more frequently than
the other values, then there is more than one mode”
Mode in case of Discrete Grouped Data: “A value which has the largest frequency
in a set of data is called mode”
xi : 2, 3, 8, 4, 6, 3, 2, 5, 3.
Mode = 3 (Answer).
No. of assistants   f_i
0                   3
1                   4
2                   6
3                   7
4                   10
5                   6
6                   5
7                   5
8                   3
9                   1
Total               50

Mode = 4
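For the distribution of examination marks ($\sum f_i = 905$), the modal class is 59.5-69.5, the class with the largest frequency:
$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 59.5 + \frac{304 - 190}{(304 - 190) + (304 - 211)} \times 10$$
where $f_m$ is the frequency of the modal class, $f_1$ and $f_2$ are the frequencies of the classes before and after it, l is its lower boundary and h is its width.
$$\text{Mode} = 59.5 + \frac{114}{114 + 93} \times 10 = 59.5 + 5.5072$$
Mode = 65.01 (Answer)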
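The mode calculations can be sketched in Python as follows. `statistics.multimode` returns every value tied for the highest count, which matches the “more than one mode” definition. The helper `grouped_mode` is a hypothetical name that applies the grouped formula with frequencies 304, 190 and 211, and it assumes the modal class is not tied with both of its neighbours.

```python
import statistics

# Ungrouped data: the most frequent value(s).
values = [2, 3, 8, 4, 6, 3, 2, 5, 3]
modes = statistics.multimode(values)

def grouped_mode(lower_bounds, freqs, width):
    """l + (fm - f1) / ((fm - f1) + (fm - f2)) * h for the class with the largest frequency."""
    m = max(range(len(freqs)), key=lambda i: freqs[i])   # index of the modal class
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0                    # frequency of the class before
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0       # frequency of the class after
    return lower_bounds[m] + (fm - f1) / ((fm - f1) + (fm - f2)) * width

# Continuous grouped data: examination marks.
lower_bounds = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
freqs = [8, 87, 190, 304, 211, 85, 20]
mode_marks = grouped_mode(lower_bounds, freqs, 10)

print(modes, round(mode_marks, 2))
```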
Purpose:
It removes all the unnecessary details of the data and gives a concise
picture of the huge data.
It is used for the purpose of comparison.
It is very useful in computing other statistical measures such as
dispersion, skewness and kurtosis etc.