
A2Z

PhD THESIS
Practical TIPS for research scholars

Prof Dr S Ramalingam

Na Subbu Reddiar 100


Centenary Committee Publications

Prof Dr S Ramalingam
Head – Management Studies
Dr MGR University
Chennai – 600 095 INDIA
A2Z PhD Thesis
Practical TIPS for research Scholars

ALL RIGHTS RESERVED. No part of this book covered by the copyright herein may be
reproduced, transmitted, stored or used in any form or by any means graphic, electronic, or
mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web
distribution, information networks, or information storage and retrieval systems, without the prior
written permission of the publisher.

© Prof Dr S Ramalingam

First Edition March 2012


First Reprint May 2012
Second Reprint July 2020

Published by:
Na Subbu Reddiar 100 Educational Trust
AD-13, 5th Street, Anna Nagar, Chennai—600 040

Cover Design:
Centenary Committee
Dedicated to:

My Paternal GRANDmother
&

ALL my Teachers
Contents

Foreword
Preface

Chapter I
The Ethics of Academic Research
Trust is the foundation of scholarship in the academic fraternity. Innovation can continue only in an atmosphere of
confidence and fairness. Scholars will strengthen the foundation of trust within the fraternity by gaining knowledge of their
fields and committing themselves to cultivating collegial relationships. It is always advantageous for the research scholar to
be aware of the ethical issues that may arise in the course of academic research.

Chapter II
Journey of a PhD Thesis
The journey of academic research is a fascinating and fabulous one, especially when the research scholar is highly committed.
Here, the various stages of academic research and their intricacies are narrated in an engaging way, and some practical
tips and guidelines are provided to make research scholars comfortable.

Chapter III
Research Proposal
A PhD thesis proposal is an extremely important document, and much thought and planning should go into crafting it.
Writing a clear and effective thesis proposal is the first step to a career in research. The major stages of
academic research are detailed with illustrations, and a suitable format for a research proposal is suggested.

Chapter IV
Selecting a Supervisor
Matching scholar to supervisor for an effective relationship is crucially important. There are several qualities that
research scholars expect to see in their research supervisor. This chapter makes a serious attempt to point out some
astonishing approaches that a scholar would seldom dare to attempt. It is a highly critical chapter indeed.

Chapter V
Finalizing the Topic
The topic represents the core subject matter of scholarly communication and the means by which the scholar arrives at other
possible topics of research and discovers new knowledge. It is important to keep in mind that an initial topic may not be the
exact topic that is finally pursued. This chapter provides a variety of scholarly strategies to design and then finally decide on a research topic.

Chapter VI
Research Problem
A research problem is the topic one would like to address, investigate, or study, whether descriptively or experimentally. It
is the focus of, or reason for, engaging in a research study. The research problem should be stated in such a way that it
leads to analytical thinking on the part of the researcher, with the aim of arriving at possible solutions to the stated
problem. This chapter comprehensively illustrates the various stages and intricacies involved in identifying a research
problem.

Chapter VII
Review of Literature
The "literature" of a literature review refers to any collection of materials on a topic, not necessarily the great literary texts
of the world. Literature reviews provide the research scholar with a handy guide to a particular topic. A literature review is
usually organized around ideas, not around the sources themselves as an annotated bibliography would be. This
means that one will not simply list the sources and go into detail about each of them, one at a time. The literature
review, the fulcrum of academic research, is discussed in this chapter in its full breadth and depth.

Chapter VIII
Scope of Research Study
Scope is simply the boundary of the research. The scope of coverage defines which areas around the subject matter the research
covers and which it does not. This means that the scope of the study refers to the specific elements and content
that the researcher wants to explore in a study. The scope of a ‘research scope’ is nicely presented here.

Chapter IX
Limitations
Without exception, all research is limited in several ways. Even though there may be a large number of limitations in any
thesis, it is not necessary to discuss all of them in the Research Limitations section. The limitless scope of
‘limitations’ is discussed in this chapter.

Chapter X
Objectives
The objectives of the research should be closely related to the research study of the thesis. The main purpose of the
research objectives is to focus on the research problem, avoid the collection of unnecessary data and provide direction to the
research study. Scholars should remember that the objectives of a research study form and define the direction and path
of the research journey. Greater and more meticulous attention given at the stage of designing the objectives will put the
scholar at ease in the later stages of the research study.

Chapter XI
Research Design
The research design refers to the strategy a scholar chooses to integrate the different components of the study in a
cohesive and coherent way in order to address the research problem; it constitutes the blueprint for the collection,
measurement, and analysis of data. Throughout the design construction task, it is important to have in mind some
endpoint, some criteria which we should try to achieve before finally accepting a design strategy.

Chapter XII
Sampling
The size of the sample depends on the type of research design being used; the desired level of confidence in the results;
the amount of accuracy wanted; and the characteristics of the population of interest. Sample size has little to do with the
size of the population, however. Detailed illustrations of various types of sampling techniques are provided to give
the research scholar a comprehensive understanding of the concept.

Chapter XIII
Designing a Questionnaire
Perhaps the most important stage of the survey process is the creation of questions that accurately measure the opinions,
experiences and behaviors of the public. Questionnaire design is a multiple-stage process that requires attention to many
details at the same time. The effects of question wording are one of the least understood areas of questionnaire research.

Chapter XIV
Data Collection
Data collection is the process of gathering and measuring information on variables of interest, in an established
systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. Data
collection is to be considered an art as well as a science. It plays a crucial role in any research study. Selecting the
correct method of data collection keeps a research scholar on the right path and yields results of the desired quality.

Chapter XV
Statistical Tools for Research
Scholars frequently use statistics to analyze their results. Statistics can help one understand a phenomenon by confirming or
rejecting a hypothesis. It is vital to how knowledge is acquired and to most scientific theories. It is the scholar’s primary
responsibility to identify and use the statistical tools that suit the nature of the research study. It is not always
safe to rely entirely on statisticians.

Chapter XVI
Reliability & Validity
Measurement experts believe that every measurement device should possess certain qualities. Perhaps the two most
common technical concepts in measurement are reliability and validity. Any kind of assessment, whether traditional or
"authentic," must be developed in a way that gives the assessor accurate information about the performance of the
individual. This chapter provides an interesting discussion of reliability & validity.

Chapter XVII
Data Analysis
Data analysis is a body of methods that help to describe facts, detect patterns, develop explanations, and test
hypotheses. It is used in all of the sciences. It is used in business, in administration, and in policy. Data analysis is not
about numbers — it uses them.

Chapter XVIII
Findings of Research Study
The value of a scholar’s thesis will stand or fall on the validity and quality of its findings. Identifying and finalizing the
findings is a critical and most significant stage of the thesis. The various components that should appear
in the research findings section are described, and some useful tips for developing good-quality findings are provided in
this chapter.

Chapter XIX
Structure of Thesis
This chapter addresses the problems, issues and difficulties involved in designing and structuring a PhD thesis. A flexible
five-chapter structure is comprehensively illustrated. A highly detailed sequence of a PhD thesis (chapter-wise, section-wise
and subsection-wise) is presented to put a research scholar at ease while drafting the thesis. This is a very interesting
chapter of the book.

Chapter XX
Research Discussion
The discussion section explains the scholar’s interpretation of the findings as they relate to the research problem
investigated. This section consists entirely of new information and focuses on the implications of the findings in relation to
the overall body of other research that has taken place.

Chapter XXI
Writing a Thesis
Research scholars encounter many pitfalls when writing a thesis. A well-written thesis is essentially a sustained analysis
of a research topic, and even the most careful scholar can succumb to commonly made mistakes in a work of this
magnitude. This chapter discusses the planning of the writing process, the issues and difficulties encountered and, mainly, the
commonly made mistakes of thesis writing, such as the danger of disorganization, the problem of writing a worthy
conclusion and the problem of writing an analytical literature review, and offers some strategies to overcome them.

Chapter XXII
Anatomy of an Abstract
An abstract is a condensed version of a longer piece of writing that highlights the major points covered, concisely
describes the content and scope of the writing, and reviews the writing's contents in abbreviated form. Many scholars
struggle to write a good abstract because they know that a poor abstract can wreck their whole thesis. Even if the rest of the
thesis is perfect, a mediocre abstract can turn the reader away from the whole thesis.
Thus a scholar should put maximum effort into writing an outstanding thesis abstract, so that the thesis earns
encouragement and constructive suggestions from its readers.

Chapter XXIII
Endnotes
Endnotes are used: (1) to cite the source of statements quoted or closely paraphrased in the text, (2) to make additional
comments about some point of the text, or (3) to acknowledge someone else for an idea or argument. The quantity and
quality of citations in a research thesis reflect the seriousness and curiosity of the research scholar. Evaluators and
examiners will certainly take note of the scholar's scholarship and appreciate the effort put in.

Chapter XXIV
Research Conclusion
The conclusion should provide a restatement of the thesis, a summary of the author's conclusions, and perhaps a solution
to the problem, if this is the writer's intent. The conclusion of a thesis should close by summarizing everything that has
come before, explaining in simple terms the way in which the research study ended, relating it to the greater environment
of the world at large, and leaving the reader able to draw his or her own conclusions from what has been
described.

Chapter XXV
Editing and Proofreading
Proofreading is the act of searching for errors before you hand in your final research thesis. Individualizing your
proofreading process to match weaknesses in your writing will help you proofread more efficiently and effectively.

Chapter XXVI
Writing an Annotated Bibliography
An annotated bibliography is a list of citations related to a particular subject area or theme, each accompanied by a brief
descriptive or evaluative summary, usually of not more than 150 words. Compiling one leaves you better prepared to develop
your own point of view and contributions to the literature. The format of an annotated bibliography can differ depending on
its purpose and the nature of the assignment. It may be arranged alphabetically by author or chronologically by publication
date. Ask your supervisor for specific guidelines on the length, focus, and type of annotation required.

Chapter XXVII
Research Results
The results section of the research paper is where you report the findings of the research study based upon the
information gathered as a result of the methodology [or methodologies] applied in the research study. The results section
should simply state the findings, without bias or interpretation, and arranged in a logical sequence. The results section
should always be written in the past tense. A section describing results is particularly necessary if the research includes
data generated from the study.

Chapter XXVIII
Defending a Thesis
The thesis defense or viva voce is like an oral examination in some ways. It is different in many ways, however. The chief
difference is that the candidate usually knows more about the syllabus than do the examiners. It would be a mistake,
however, to underestimate the examiners' knowledge of your subject. Think of your defense as a high-level professional
conversation about a topic of interest. No doctoral dissertation committee worthy of the name assembles for the sole
purpose of publicly humiliating a candidate; faculty members are in the business of supporting successful program
completion whenever possible. Have some confidence in yourself!

Chapter XXIX
Reading a Research Paper
Reading research papers effectively is challenging. It often requires a special
approach as well as skill. The aim of reading a research paper varies with the needs of the reader. But it
has to be borne in mind that, whatever the type of reader, one has to be familiar with the standard format of a
research paper and has to have the right perspective while reading it. At the end of the day, the author serves
the community only if the reader gets what he/she wants.

Chapter XXX
Evaluating a Research Paper
While research papers contribute to the community in general, a well-judged and well-balanced evaluation ensures the
quality of the paper and enriches its value and utility. This chapter discusses in detail the stages, intricacies
and strategies involved in evaluating a research paper, along with the purpose of evaluation and the benefits a scholar
gains from it.

Chapter XXXI
Journal Impact Paper
It has become mandatory for academic research scholars to publish a minimum number of research papers in reputed
peer-reviewed national or international journals having a reasonable Impact Factor. There have been many innovative
applications of journal impact factors. The impact factor can be used to provide a gross approximation of the prestige of
journals in which individuals have been published. This is best done in conjunction with other considerations such as peer
review, productivity, and subject specialty citation rates.

Chapter XXXII
Publishing a Research Paper
Publishing your work in a peer-reviewed journal is an indication of quality. Intending researchers need to submit their
articles for review by experts in the field before the article can be approved for publication in a peer-reviewed journal.
Many databases allow the scholar to restrict the search to peer-reviewed journals. The salient feature of this
chapter is that it provides a highly comprehensive set of guidelines [nearly 100] to enable a scholar to publish a paper of
academic quality.

Chapter XXXIII
Plagiarism
Plagiarism is the act of taking another person's writing, conversation, song, or even idea and passing it off as one’s
own. This includes information from web pages, books, songs, television shows, email messages, interviews, articles,
artworks or any other medium. Careful note-taking and a clear understanding of the rules for quoting, paraphrasing, and
summarizing sources can help prevent plagiarism.

Glossary

Appendices

Appendix I : Detailed Guidelines for Chapters

Appendix II : Simple Guide to SPSS

Appendix III : APA Citation Style

Appendix IV : Excel for Statistical Data Analysis

Appendix V : Online Research Sources [available only on the CD]


Foreword
What is a PhD?

A PhD ought to:

o be a report of work which others would want to read;
o tell a compelling story articulately whilst pre-empting inevitable critiques;
o carry the reader into complex realms, and inform and educate him/her;
o be sufficiently speculative or original to command respectful attention.

A PhD is something that is finished.

There are a number of ways of thinking about it. The first thing that comes
to mind for many PhD students is that it is ‘a contribution to
knowledge’. It is also a license to teach in a university, a
signal of expertise and authority, a qualification, and the highest degree that can be
awarded by a university.

A thesis is a research report. The report concerns a problem or series of
problems in the scholar’s area of research, and it should describe what was known
about it previously, what the scholar did towards solving it, and where or how
further progress in the field can be made. If the thesis is for a PhD, the university
requires that it make an original contribution to human knowledge: the research
must discover something hitherto unknown. The thesis will also be used as a
scientific report and consulted in future by members of the academic fraternity who
will want to know, in detail, what was done. Theses are occasionally consulted by
people from other institutions, and the library sends microfilm versions if
requested. More commonly, theses are now stored in entirely digital form, for
example as .pdf files on a server at the university. The advantage is that the
thesis can be consulted much more easily by researchers around the world.

Research is about discovery, the testing of hypotheses and of ideas. It is about
the establishment of facts through enquiry and exploration. The outcome of
research is new knowledge leading to improved understanding of mechanisms
and the development of new and improved procedures. To ensure that the use of
research results is maximised, they must be disseminated in an appropriate manner.
In many senses, the dissemination of the research results is just as important as
the research activity itself.

There are many ways to disseminate research results, and the production of a
research thesis is one of them. Although a research thesis is a usual requirement
for academic degree programmes that include a research element, it is more
than an instrument for the assessment of the research scholar. It must be written
such that the results presented can be validated and can form the basis for further
research. Procedures adopted must be justified; claims and conclusions must be
supported by experiments or reasoned arguments and deductions. A research
thesis contains elements which distinguish it from other types of reports, and
because it is the culmination of several years of work, the publication can be
quite voluminous. Writing one therefore requires some thought, planning and
organisation.

For a research scholar, thesis writing is a very important part of one’s learning
life because passing the course depends on it. Therefore, one should stay focused
while working on the thesis. One should make a systematic beginning by
examining which topic to write the thesis on, and then settle on a topic that one
can realistically substantiate. Some preliminary research on the topic will show
whether it is worth investing one’s time in.

When a research scholar is about to begin, writing a thesis seems a long, difficult
task. That is because it is a long, difficult task. Fortunately, it will seem less
daunting once one has a couple of chapters done. Towards the end, one will even find
oneself enjoying it: an enjoyment based on satisfaction in the achievement, pleasure
in the improvement in the technical writing, and of course the approaching end.

Despite the fact that universities have been assessing doctoral theses for many
years, there has been little research done on the processes involved in drafting a
PhD thesis. There are several scholarly books on ‘Research Methodology’ by
reputed authors in the market. Though they deal comprehensively with the
theoretical aspects of academic research and even offer some guidelines,
elaborate and down-to-earth guidelines are rare. Of course, the research supervisor’s
guidance is always available to scholars. Clearly there is a gap between these
two known sources. This book, “A2Z PhD Thesis”, is perhaps best seen as a serious
attempt to fill that gap.

This set of tips is intended to give some ideas and guidelines on how to go about
writing a research thesis. As one reads through ‘A2Z PhD Thesis’, one will
probably notice that writing a thesis is not as daunting and hard as feared. This is
a general guide for all disciplines, but it is most suitable for scholars in the social and
behavioral sciences. The text is organized according to the stages of the
research-and-writing process as defined by the author: preparation, choosing a
topic, collecting information, organizing information, interpreting results, and
presenting the finished product.

The author's philosophy of the "systematic approach" is that pre-planning and
"structuring" different elements of the thesis can improve performance and the
final product while providing specific tasks that will help the scholar manage the
project. With an informal tone, this book provides help for doctoral students
who feel that they are wandering around the thesis process with no clear
purpose, and helps research scholars to ‘translate’ what research supervisors
say about ‘good referencing’ and ‘clean research questions.’

More specifically, the book addresses the following issues:

o What are the steps involved in the initial stages of planning?
o What is the timeline and how should it be decided?
o How is the process of chapterization designed?
o What is a citation style and how should it be selected?
o Why should the various aspects of a thesis be meticulously planned?
o How is a seamless flow among the chapters created?
o What are the ingredients of the structure of a thesis?
o How should a research paper be read, evaluated and used?
o What is the style of presentation of the research results?
o What are the intricacies involved in editing and proofreading?
o How should a scholar prepare himself/herself to defend the thesis?

The special feature of the book is that it contains [a] practical guidelines for using
SPSS, [b] comprehensive guidelines to statistical tools using Excel, [c] a list of
online research resources and [d] Research 360°, comprising (i) web links to
more than 10 million online papers/articles, (ii) web links to more than 100,000
online journals, (iii) web links to nearly 8,000 online international libraries, (iv) web
links to more than 10,000 international research organizations, (v) web links to
several online TV channels, newspapers, etc., (vi) web links to all countries’ official
websites, WHO, UN, ILO, IMF, World Bank, etc., (vii) web links to online encyclopedias,
dictionaries, etc. and (viii) web links to tools useful to researchers, such as Citation
Style, Writing Skills, Report Writing, Research Methods, etc. All these are made
available on an accompanying CD.

Efforts in bringing out such a comprehensive source of reference are to be
appreciated, and I do hope that research scholars and supervisors will find this
a useful resource for several aspects of thesis writing.

I congratulate the author on the efforts in bringing out resource material for the
benefit of research scholars in the process of thesis writing. I hope research
scholars will make use of this book and become successful in their
research activities.

Dr A Thirunavukkarasu
Dean - Research
Dr MGR University

Chennai – 600 095
20 February 2012

Preface

In most countries a PhD is a basic requirement for a career in academia. It is an
introduction to the world of independent research, a kind of intellectual
masterpiece, created by a research scholar in close
collaboration with a supervisor. The requirements to
complete one vary enormously between countries,
universities and even subjects. There are two classic
ways of doing a PhD. One involves knowing just what
you are doing; you will then go through a clearly defined
path, suffer occasional fits of gloom and despair, emerge
with a PhD, unless you do something remarkably silly or
give up, and then proceed smoothly with the next stage
of your career. The other way is the one followed by most
PhD students, which involves stumbling in, wandering
round in circles for several years, suffering frequent fits of
gloom and despair, and probably but not necessarily
emerging with a PhD, followed by wondering what to do
next in career terms. This book is written for those who
find themselves following the second path.

A PhD, by its very nature, is a very individualistic venture. There is no right way
to do a PhD (there are, however, a multitude of wrong ways). First, a scholar chooses
a topic to research and then finds someone willing to be the supervisor. The scholar
then goes through the procedures to sign up for a PhD at some institution and starts
researching that topic for a year or two, at which point the scholar is assessed to
see whether he/she is doing well enough to continue to the end of the PhD. If that
goes well, then the scholar does another year or two of research. In the third or
fourth year of the PhD, the scholar writes a large document, called a thesis
(typically around 300 pages), about the research. This is read by a panel of experts
who then ask questions about it to check that the scholar’s understanding of the
topic is good enough. They will typically conclude that the scholar needs to make
some changes to it. If these changes are made to their satisfaction within a
specified period, then the scholar will be awarded a PhD. The award of a research
degree effectively says ‘This person knows how to do research in his/her chosen
area’, and ‘research’ is a nebulous, difficult-to-pin-down thing which relies on
insight, lateral thinking, inspiration and a lot of hard work. Clearly the purpose
of this book is to help the scholar who sets out to obtain a PhD.

The ability to conduct research in an area requires deep knowledge in that area,
knowledge about related areas, and the experience of working on research
problems, i.e. problems whose outcomes are not known. To develop these critical
abilities, most PhD programs have three components: some course work to provide
the breadth of knowledge, some methods to develop the depth of knowledge in the
chosen area of study, and a thesis that provides the experience of working on
research problems. A PhD is a mostly self-driven and self-taught degree, with
the supervisor gently aiding the process. The program and supervisor help mostly
in creating an atmosphere and environment in which the scholar gets motivated to
excel. Hence, while doing a PhD, the scholar should be self-motivated and committed,
and willing to work hard and long on problems. Research is often a lonely business
and a PhD is a preparation for a career in it. Research is a tough career, but with
the tips provided in this book it can become easier and more satisfying.

A PhD is a qualification which shows that a person is good enough at research to
be appointable to a university post. If a scholar is thinking of working as an
academic in a university, a PhD is highly advisable. It is also helpful if one wants
a career as a researcher in industry. A further practical point is that PhDs are
recognized around the world, and tend to have pretty good quality control, so a
PhD from one country will be recognized in another without too much snobbery.
Still at the practical level, a person with a PhD usually goes onto a
higher pay scale. At a professional level, a PhD involves a scholar doing a
decent-sized piece of research, writing it up and then discussing it with
professional academics. This demonstrates one’s ability to do proper research
without someone holding one’s hand. A scholar has a supervisor to help and advise,
but in theory at least the PhD is something where the scholar has to take the
initiative. A scholar needs to know what the requisite skills are for the selected
branch of academia (since different disciplines require different skills) and make
sure that the thesis demonstrates mastery of each of them somewhere. If the scholar
is a methodical sort of person, he/she might go so far as to draw up a list of
the skills required and tick off each one as it is represented in the thesis. For
an academic, the skills are things like mastery of formal academic language,
familiarity with the relevant literature in the discipline, knowledge of the main
data collection techniques, adherence to the standards of rigour and so on.

There are other things which look simple until a scholar stops and thinks about
them. For instance, how does one choose a topic, and how does one find a good
supervisor? The standard books give quite a lot of good advice about this, but there
will still be quite a lot of things that one is not sure about. So, what does one
do about this? One good step is to read the rest of this book. The
main thing is that it gives a fair idea of which
things matter, which things are well understood and which things are
comparatively peripheral. For instance, brief guidance is provided on academic
writing as opposed to formal English (because most scholars are pretty bad at
it) and on feeling lost. Similarly, not much is said about statistics or
experimental design, because these are comprehensively covered by numerous
excellent texts and training courses, so one should have no problem getting
access to them if they are needed for the research. On citation style, however, a
brief guideline on the APA system is available in the appendix for the convenience
of scholars.

In the fast-moving world of ICT, any written material becomes obsolete quickly. To
meet this eventuality, a CD accompanies the book; the CD contains exhaustive
information by way of links to various web sites covering almost all the material that
a scholar may require. For example, the CD contains [a] practical guidelines for
using SPSS, [b] comprehensive guidelines to statistical tools using Excel, [c] a list
of online research resources and [d] Research 360°, comprising links to online
libraries, international research organizations, journals, articles/papers, news
media, etc. These sources should be helpful and handy to a research scholar,
enabling him/her to stay up to date. Any suggestions to improve the contents
and quality of the book are always welcome.

A word of gratitude. The members of faculty in the Department of Management
Studies are always encouraging and helpful in this venture.

All the sources, such as books, articles, papers, journals and electronic sources like
webpages, are duly acknowledged, and readers may make use of them if
necessary.

Chennai – 600 095 Dr S Ramalingam


1 March 2012 Prof & Head
Dept of Mgt Studies
Dr MGR University

A2Z

PhD
Thesis

Reflections on Academic Research

Chapter I

The Ethics of Academic Research

THE ETHICS OF ACADEMIC RESEARCH
Introduction

Ethics are moral principles that guide behavior; in an academic environment, these
moral principles expand to become the standard rules of scholarly conduct. Academic
ethics involves such concepts as intellectual property, copyright, fair use, plagiarism,
censorship, freedom of speech, and the use of proprietary and non-proprietary
resources.

Trust is the foundation of scholarship in the academic fraternity. Innovation can continue
only in an atmosphere of confidence and fairness. A scholar must be able to trust that
colleagues are honest in presenting their research, and colleagues must have the same
trust in the scholar's work. The range of research subjects and methods, along with the
systems of analysis and data presentation that guide each field, gives rise to situations of
great moral complexity. Likewise, relationships between research scholars and
supervisors, along with great opportunity, carry important responsibilities and obligations.
Scholars will strengthen the foundation of trust within the fraternity by gaining knowledge
of their fields and committing themselves to cultivating collegial relationships.

Ethical Issues

It is always advantageous if the research scholar is aware of the possible ethical issues
involved in the process of academic research. The following is a summary of
those possible issues:

General Ethical Issues:


Privacy, Voluntary nature, Consent, Deception, Confidentiality, Anonymity, Embarrassment,
Stress, Harm, Discomfort, Pain, Objectivity and Quality of research.

Formulation of Research Topic:


Researcher’s right to absence of sponsor coercion, Sponsor’s right to useful research,
Sponsor’s/Participant’s right to Quality research

Designing research:
Researcher’s right to absence of gatekeeper coercion, Participant’s right to be fully informed,
Participant’s right to privacy, Sponsor’s/Participant’s right to Quality research

Collection of Data:

Researcher’s right to absence of sponsor coercion, Researcher’s right to safety, Participant’s
right to informed consent, Participant’s right to withdraw, Participant’s deception, Participant’s
right to confidentiality/anonymity, Organization’s right to confidentiality/anonymity, Sponsor’s /
Participant’s right to Quality research

Processing of Data:
Participants’ rights as individuals regarding the processing and storing of their personal data

Analysis of Data:
Researcher’s right to absence of sponsor coercion, Organization’s rights to confidentiality /
anonymity, Participant’s right to confidentiality / anonymity, Sponsor’s/Participant’s right to Quality
research

Primary research is conducted all of the time--journalists use it as their primary means of
reporting news and events; national polls and surveys discover what the population
thinks about a particular political figure or proposal; and companies collect data on their
consumer base and market trends. When conducting research in an academic or
professional setting, a scholar needs to be aware of the ethics behind the research
activity.

Here are some specific and important points to consider:

• You should have the permission of the people you will be studying before you conduct
research involving them.
• Not all types of research require permission; for example, if you are interested in
analyzing something that is publicly available (such as commercials, public
message boards, etc.) you do not necessarily need the permission of the authors.
• You should not do anything that would cause physical or emotional harm to your
subjects. This could be something as simple as being careful about how you word
sensitive or difficult questions during interviews.
• Objectivity vs. subjectivity in the research is another important consideration. Be sure
your own personal biases and opinions do not get in the way of your research.
• Many types of research, such as surveys or observations, should be conducted under the
assumption that you will keep your findings anonymous. Many interviews, however, are
not done under the condition of anonymity. You should let your subjects know whether
the research results will be anonymous or not.
• When you are doing research, be sure you are not taking advantage of easy-to-access
groups of people (such as children at a daycare) simply because they are easy to
access. You should choose your subjects based on what would most benefit the
research.
• Some types of research done in a university setting require Institutional Review Board
approval. This means that the research has to be approved by an ethics review committee
to make sure the scholar is not violating any of the above considerations.
• When reporting the results, be sure that you accurately represent what was observed
or what was told to you. Interview responses should not be taken out of context, and
small parts of observations should not be discussed without putting them into the
appropriate context.

Ethical Codes

Given the importance of ethics for the conduct of research, it should come as no surprise
that many different professional associations, government agencies, and universities
have adopted specific codes, rules, and policies relating to research ethics.

The following is a rough and general summary of some ethical principles that various
ethics committees usually address. [Adapted from Shamoo A and Resnik D. 2009. Responsible Conduct
of Research, 2nd ed. (New York: Oxford University Press)].

Honesty
Strive for honesty in all scientific communications. Honestly report data, results, methods and
procedures, and publication status. Do not fabricate, falsify, or misrepresent data. Do not deceive
colleagues, granting agencies, or the public.

Objectivity
Strive to avoid bias in experimental design, data analysis, data interpretation, peer review,
personnel decisions, grant writing, expert testimony, and other aspects of research where
objectivity is expected or required. Avoid or minimize bias or self-deception. Disclose personal or
financial interests that may affect research.

Integrity
Keep your promises and agreements; act with sincerity; strive for consistency of thought and
action.

Carefulness
Avoid careless errors and negligence; carefully and critically examine your own work and the
work of your peers. Keep good records of research activities, such as data collection, research
design, and correspondence with agencies or journals.

Openness
Share data, results, ideas, tools, resources. Be open to criticism and new ideas.

Respect for Intellectual Property


Honor patents, copyrights, and other forms of intellectual property. Do not use unpublished data,
methods, or results without permission. Give credit where credit is due. Give proper
acknowledgement or credit for all contributions to research. Never plagiarize.

Confidentiality
Protect confidential communications, such as papers or grants submitted for publication,
personnel records, trade or military secrets, and patient records.

Responsible Publication
Publish in order to advance research and scholarship, not to advance just your own career.
Avoid wasteful and duplicative publication.

Responsible Mentoring
Help to educate, mentor, and advise students. Promote their welfare and allow them to make
their own decisions.

Respect for Colleagues
Respect your colleagues and treat them fairly.

Social Responsibility
Strive to promote social good and prevent or mitigate social harms through research, public
education, and advocacy.

Non-Discrimination
Avoid discrimination against colleagues or students on the basis of sex, race, ethnicity, or other
factors that are not related to their scientific competence and integrity.

Competence
Maintain and improve your own professional competence and expertise through lifelong
education and learning; take steps to promote competence in science as a whole.

Legality
Know and obey relevant laws and institutional and governmental policies.

Animal Care
Show proper respect and care for animals when using them in research. Do not conduct
unnecessary or poorly designed animal experiments.

Human Subjects Protection


When conducting research on human subjects, minimize harms and risks and maximize benefits;
respect human dignity, privacy, and autonomy; take special precautions with vulnerable
populations; and strive to distribute the benefits and burdens of research fairly.

Ethical Dilemmas

There are many other activities that the government does not define as "misconduct" but
which are still regarded by most researchers as unethical. These are called "other
deviations" from acceptable research practices, and such situations create difficult
decisions for researchers, known as ethical dilemmas. The following is a list:

• Publishing the same paper in two different journals without telling the editors
• Submitting the same paper to different journals without telling the editors
• Not informing a collaborator of your intent to file a patent in order to make sure that you
are the sole inventor
• Including a colleague as an author on a paper in return for a favor even though the
colleague did not make a serious contribution to the paper
• Discussing with your colleagues confidential data from a paper that you are reviewing for
a journal
• Trimming outliers from a data set without discussing your reasons in the paper
• Using an inappropriate statistical technique in order to enhance the significance of your
research
• Bypassing the peer review process and announcing your results through a press
conference without giving peers adequate information to review your work
• Conducting a review of the literature that fails to acknowledge the contributions of other
people in the field or relevant prior work
• Stretching the truth on a grant application in order to convince reviewers that your project
will make a significant contribution to the field
• Stretching the truth on a job application or curriculum vitae
• Giving the same research project to two graduate students in order to see who can do it
the fastest
• Overworking, neglecting, or exploiting graduate or post-doctoral students
• Failing to keep good research records
• Failing to maintain research data for a reasonable period of time
• Making derogatory comments and personal attacks in your review of an author's submission
• Promising a student a better grade for sexual favors
• Using a racist epithet in the laboratory
• Not reporting an adverse event in a human research experiment
• Wasting animals in research
• Exposing students and staff to biological risks in violation of your institution's biosafety
rules
• Rejecting a manuscript for publication without even reading it
• Sabotaging someone's work
• Stealing supplies, books, or data
• Rigging an experiment so you know how it will turn out
• Making unauthorized copies of data, papers, or computer programs
• Owning over a statutory amount in stock in a company that sponsors your research and
not disclosing this financial interest
• Deliberately overestimating the clinical significance of a new drug in order to obtain
economic benefits

Finally, situations frequently arise in research in which different people disagree about
the proper course of action and there is no broad consensus about what should be done.
In these situations, there may be good arguments on both sides of the issue and
different ethical principles may conflict.

Conclusion
If "deviations" from ethical conduct occur in research as a result of ignorance or a failure
to reflect critically on problematic traditions, then a course in research ethics may help
reduce the rate of serious deviations by improving the researcher's understanding of
ethics and by sensitizing him or her to the issues. Finally, training in research ethics
should be able to help researchers grapple with ethical dilemmas by introducing
researchers to important concepts, tools, principles, and methods that can be useful in
resolving these dilemmas. Whistleblowing is one mechanism to help discover
misconduct in research. But apart from these, a supervisor has his/her own ethical role in
advising and guiding the research scholar on these issues and practices, and in ensuring
that scholars do not indulge in these highly unethical practices.

“Even the most rational approach to ethics is defenseless if there isn’t the will to do what is right.”
<< Alexander Solzhenitsyn

Chapter II

Journey of a
PhD Thesis

JOURNEY OF A PhD THESIS
SOME TIPS

Introduction

Although every thesis is unique, all theses aim to persuade the reader of one 'big idea'.
This central claim is otherwise referred to as the ‘thesis’; hence a research thesis is the
development of one central claim. This is reflected in research degree
requirements that require candidates to demonstrate a ‘significant original contribution
to knowledge, and/or to the application of knowledge within the field of study’. When you
are about to begin, writing a thesis seems a long, difficult task. That is because it is a
long, difficult task. But it will seem less daunting once a couple of chapters are done.
Towards the end, you will even find yourself enjoying it---an enjoyment based on
satisfaction in the achievement, pleasure in the improvement in your technical writing,
and of course the approaching end. Like many other tasks, thesis writing usually seems
worst before you begin. Of course, each long journey starts with a single step.

Start Early

Begin working on your essay as soon as the assignment is given. Take advantage of the
time at your disposal to do your research and writing to meet the due date. If you wait
until the last minute, you may have difficulty finding library materials, particularly if other
students are researching the same topic, and you may be pressured by other
assignments.

Do a literature survey

Start the journey. Identify several sources that will help you gather
materials, such as research articles, papers, journals, etc. Go through these materials
seriously and thoroughly, and make a systematic list of all of them so that you can
refer back to them at any time during your research. This list will also be useful when
you compile your bibliography.

Focus on a field/area

Keeping in mind the guidelines your supervisor has set down for the assignment in terms
of length, subject matter, types of sources, etc., choose a topic you would be interested
in pursuing. Your next step is to verify at the library that there is sufficient material to
support your choice. If not, discard your topic and adopt a more realistic one.

Narrow down the Topic

Do not fall into the trap of selecting a topic that is so broad you would have to write a
book to do it justice. Limit your topic to one particular aspect that you will be able to treat
thoroughly within the prescribed limits of your thesis. Background reading in a general or
specialized encyclopedia will give you a clue as to the subject's natural limits and
divisions. The librarian can direct you to the encyclopedia that will be appropriate to your
particular needs.

Draft a rough Outline

Roughly organize your thoughts to produce an outline that will give direction to your
reading and note-taking. Take advantage of the library's varied resources: use the library
catalogue to find books (including government publications) on your topic; consult the
databases to locate articles; and request the advice of your supervisor and the librarian,
who may be able to direct you to other sources pertinent to your subject area.

Review your Documentation

For each source that you have consulted, be sure you have all the information
necessary to cite it in your bibliography. Accuracy at this stage will save you the trouble
of having to re-trace your steps when you are writing your final draft. For a book, mark
down the author, title, place of publication, publisher and copyright date. For an article
from a journal, take note of the author, title of the article, title of the journal, volume and
issue number, date and inclusive page numbers. For a Web document, take down the
author, title, date, URL (Web address) and date consulted.
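
The checklist above can be turned into a small record-keeping aid. The following is only a minimal sketch in Python: the field names, the `SourceRecord` class and the `missing_fields` helper are illustrative assumptions, not part of any reference-management tool. It simply reports which of the details listed above are still missing for a book, a journal article or a Web document.

```python
from dataclasses import dataclass
from typing import Optional

# Details to record for each kind of source, following the checklist above.
# The field names are hypothetical labels chosen for this sketch.
REQUIRED = {
    "book": ["author", "title", "place", "publisher", "date"],
    "article": ["author", "title", "journal", "volume_issue", "date", "pages"],
    "web": ["author", "title", "date", "url", "date_consulted"],
}


@dataclass
class SourceRecord:
    kind: str  # "book", "article" or "web"
    author: Optional[str] = None
    title: Optional[str] = None
    date: Optional[str] = None
    place: Optional[str] = None
    publisher: Optional[str] = None
    journal: Optional[str] = None
    volume_issue: Optional[str] = None
    pages: Optional[str] = None
    url: Optional[str] = None
    date_consulted: Optional[str] = None


def missing_fields(record: SourceRecord) -> list:
    """Return the required details that have not been recorded yet."""
    return [name for name in REQUIRED[record.kind] if not getattr(record, name)]


# A book whose details are all recorded, and a Web document that is incomplete.
book = SourceRecord(
    kind="book",
    author="Shamoo A and Resnik D",
    title="Responsible Conduct of Research",
    place="New York",
    publisher="Oxford University Press",
    date="2009",
)
web = SourceRecord(kind="web", author="Chinneck", title="(title not yet noted)", date="1999")

print(missing_fields(book))  # nothing missing for the book
print(missing_fields(web))   # the URL and the date consulted are still missing
```

Running such a check before leaving the library is one way to avoid re-tracing your steps later; a notebook or index cards serve the same purpose.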

Draft a Final Outline

Map out your approach by composing a detailed sentence outline. First, compose a
thesis statement. This one sentence statement is the most important one of your entire

research paper; so be sure to phrase it carefully. A thesis statement clearly
communicates the subject of your paper and the approach you are going to take to it. It
is the controlling factor to which all information that follows must relate. Secondly, group
and regroup your notes according to the various aspects of your topic until you find a
sequence that seems logical. This can serve as the basis for your outline.

Write a rough Draft

In writing a rough draft you are striving for a flow of ideas. Write non-stop using your final
outline and organized notes as guides. Do not worry about correct spelling or
punctuation at this stage. Remember that the purpose of a rough draft is to see if you
have a logical progression of arguments and sufficient supporting material.

Revise/Review the rough Draft

Make the necessary adjustments until you are satisfied your statements flow logically
and your ideas have been fully presented in clear, concise prose. You may need to
review your documentation if some sections of your text need further development.

Prepare your Bibliography

A bibliography is a listing, in alphabetical order according to the author's last name, of all
the sources you consulted in preparing your research paper. It is presented on a
separate page at the end and is set up according to a standard format that you will find
described in most style manuals. Examples for the most commonly used citation styles
(APA, MLA, Chicago) are available. RefWorks is a Web-based tool that helps organize
the references you find and prepares a bibliography automatically.
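
The ordering rule described above (alphabetical by the author's last name) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys `author_last` and `year` and the function `sort_bibliography` are hypothetical names, and the entries are limited to authors and years already cited in this book. It does not produce APA, MLA or Chicago formatting.

```python
def sort_bibliography(entries):
    """Order entries alphabetically by author's last name, then by year.

    casefold() makes the comparison case-insensitive, so that a name typed
    in lower case is not pushed to the end of the list.
    """
    return sorted(entries, key=lambda e: (e["author_last"].casefold(), e["year"]))


# Entries in the order they happened to be consulted.
consulted = [
    {"author_last": "Shamoo", "year": 2009},
    {"author_last": "Chinneck", "year": 1999},
    {"author_last": "Wilkinson", "year": 1991},
    {"author_last": "Creswell", "year": 1994},
]

for entry in sort_bibliography(consulted):
    print(entry["author_last"], entry["year"])
```

Sorting on the last name alone is a simplification; style manuals give further rules for multiple works by the same author and for works without a named author.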

Proofread your final Draft

You are now ready to focus primarily on the style of your essay rather than the content.
Make use of:

• a dictionary or spell check for correct spelling
• a thesaurus for synonyms
• a grammar book
• a style manual for the mechanics of citing references

Time required

It is strongly recommended that you sit down with your supervisor and make up a timetable
for writing the thesis: a list of dates for when you will give the first and second drafts of each
chapter to your supervisor. This structures your time and provides intermediate targets.
If you merely aim "to have the whole thing done by some distant date", you can deceive
yourself and procrastinate more easily. If you have told your supervisor that you will
deliver a first draft of chapter 2 on Friday, it focuses your attention. You may want to
make your timetable into a chart with items that you can check off as you finish
them. This is particularly useful towards the end of the thesis, when you will find there are
quite a few loose ends here and there.

How much time may it take? Let us hear what Chinneck (1999) has to say: “Longer than
you think. Even after the research itself is all done -- models built, calculations complete
-- it is wise to allow at least one complete term for writing the thesis. It's not the physical
act of typing that takes so long, it's the fact that writing the thesis requires the complete
organization of your arguments and results. It's during this formalization of your results
into a well-organized thesis document capable of withstanding the scrutiny of expert
examiners that you discover weaknesses. It's fixing those weaknesses that takes time.”

In general, after completing and finalizing the preliminary administrative formalities, which
may take up to approximately 18 months, a scholar should allow from 18 to 24 months.
As this suggestion is time-tested, a scholar may take it as a basis and proceed
accordingly. The following is a breakdown of the phases of the research and the time to be
allotted for each one:

(1) Literature Review: The literature review is in many ways the most difficult and time
consuming part of the thesis project. It is also the most important. The review of the literature
provides the context for your thesis project. You will be building on previous researchers’ work so
it is important that you be thoroughly familiar with it. The review of the literature provides your
hypothesis, your methodology, and your context for analysis and interpretation. Therefore you
should spend considerable time on this part of your project. You should initially allot at least four
to six months for this part of your research. However you should also realize that the literature
review will continue for the duration of the project.

(2) Data Collection: Depending on where you will be doing your data collection and whether you
will be doing it full‐time or part‐time, the data collection phase of your project will take between
three and six months. It is very important that you build in enough time to go back and redo some
of your data collection. Most researchers find that their expertise changes over the course of data
collection and you will need to go back and recheck the data that you initially collected.

(3) Data Analysis: Do not underestimate the time you allot here. Learning a statistics package
takes time. Researching the appropriate statistics and learning how to use and apply them to
your data takes time. Plan on at least three months for this phase of your analysis.

(4) Preparation of the Thesis Drafts: You will be writing in drafts. You should count on writing at
least three to four complete drafts before your thesis is complete. As a general guideline, from
first draft to final draft you should count on at least six months.

(5) Final Submission: Once the final draft is completed, it has to be fine-tuned and printed, the
required number of copies made, and the thesis finally submitted in compliance with the
administrative formalities. This may take another two months.

Description & Nature of Activities                              Budgeted Time [in months]

Registration, Coursework and other administrative formalities   12 - 15
Literature Review                                               4 - 6
Data Collection                                                 3 - 6
Data Analysis                                                   3
Drafting of the Thesis                                          6
Final Submission                                                2
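
As a quick check on the budget above, the figures can simply be added up. The sketch below is an illustrative calculation in Python, not a planning tool: the names `PHASES` and `total_range` are assumptions, each phase is treated as a (minimum, maximum) range in months, and the phases are assumed to run one after another with no overlap, although in practice the literature review continues throughout the project.

```python
# Budgeted time for each phase in months, as (minimum, maximum),
# copied from the table above.
PHASES = {
    "Registration, coursework and administrative formalities": (12, 15),
    "Literature review": (4, 6),
    "Data collection": (3, 6),
    "Data analysis": (3, 3),
    "Drafting of the thesis": (6, 6),
    "Final submission": (2, 2),
}


def total_range(phases):
    """Add up the minimum and maximum months, assuming phases do not overlap."""
    low = sum(minimum for minimum, maximum in phases.values())
    high = sum(maximum for minimum, maximum in phases.values())
    return low, high


# The research phases alone, leaving out the preliminary formalities.
research_only = {
    name: months
    for name, months in PHASES.items()
    if not name.startswith("Registration")
}

print(total_range(research_only))  # (18, 23)
print(total_range(PHASES))         # (30, 38)
```

On these figures the research phases alone come to 18 to 23 months, which is in line with the 18 to 24 months suggested earlier, and the whole programme, including the preliminary formalities, comes to roughly 30 to 38 months.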

Iterative solution

Whenever you sit down to write, it is very important to write something. So write
something, even if it is just a set of notes or a few paragraphs of text that you would
never show to anyone else. Most of us find it easier, however, to improve something that
is already written than to produce text from nothing. So put down a draft (as rough as
you like) for your own purposes, then improve it with the help of your supervisor.
Word-processors are wonderful in this regard: in the first draft you do not have to start at
the beginning, you can leave gaps, you can put in little notes to yourself, and then you
can fill the gaps with relevant materials later.

Your supervisor will want your thesis to be as good as possible, because his/her
reputation as well as yours is affected. Scientific writing is a difficult art, and it takes a
while to learn. As a consequence, there will be many ways in which your first draft can
be improved. So take a positive attitude to all the scribbles with which your supervisor
decorates your text: each comment tells you a way in which you can make your thesis
better.

The process of writing the thesis is like a course in scientific writing, and in that sense
each chapter is like an assignment in which you are taught, but not assessed.
Remember, only the final draft is assessed: the more comments your supervisor adds to
the first or second draft, the better. Before you submit a draft to your supervisor, run a spell
check so that s/he does not waste time on spelling errors. If you have any characteristic
grammatical failings, check for them.

What is a thesis?

Your thesis is a research report. The report concerns a problem or series of problems in
your area of research and it should describe what was known about it previously, what
you did towards solving it, what you think your results mean, and where or how further
progress in the field can be made. The readers of a thesis do not know what the
"answer" is. If the thesis is for a PhD, the university requires that it makes an original
contribution to human knowledge: your research must discover something hitherto
unknown.

Who will read it?

Obviously your examiners will read the thesis. They will be experts in the general field of
your thesis but, on the exact topic of your thesis, you are the world expert. Keep this in
mind: you should write to make the topic clear to a reader who has not spent most of the
last three years thinking about it. Your thesis will also be used as a scientific report and
consulted by future workers in your laboratory who will want to know, in detail, what you
did. Theses are occasionally consulted by people from other institutions, and the library
sends microfilm versions if requested (yes, still). More commonly theses are now stored
in an entirely digital form. These may be stored as .pdf files on a server at your
university. The advantage is that your thesis can be consulted much more easily by
researchers around the world.

Practical Tips

A research scholar should make every earnest effort in preparing and drafting a
thesis. Each stage and each minor issue or point has to be taken care of and
meticulously attended to. Throughout the drafting of a thesis, serious thinking in depth
and breadth will help a scholar succeed and derive complete satisfaction. Following
are some of the points one should bear in mind:

1. A thesis is a hypothesis or conjecture.
2. A PhD dissertation is a lengthy, formal document that argues in defense of a particular
thesis.
3. Two important adjectives used to describe a dissertation are "original" and "substantial."
The research performed to support a thesis must be both, and the dissertation must show
it to be so. In particular, a dissertation highlights original contributions.
4. The scientific method means starting with a hypothesis and then collecting evidence to
support or deny it. Before one can write a dissertation defending a particular thesis, one
must collect evidence that supports it. Thus, the most difficult aspect of writing a
dissertation consists of organizing the evidence and associated discussions into a
coherent form.
5. The essence of a dissertation is critical thinking, not experimental data. Analysis and
concepts form the heart of the work.
6. A dissertation concentrates on principles: it states the lessons learned, and not merely
the facts behind them.
7. In general, every statement in a dissertation must be supported either by a reference to
published scientific literature or by original work. Moreover, a dissertation does not repeat
the details of critical thinking and analysis found in published sources; it uses the results
as fact and refers the reader to the source for further details.
8. Each sentence in a dissertation must be complete and correct in a grammatical sense.
Moreover, a dissertation must satisfy the stringent rules of formal grammar. Indeed, the
writing in a dissertation must be crystal clear. Shades of meaning matter; the terminology
and prose must make fine distinctions. The words must convey exactly the meaning
intended, nothing more and nothing less.
9. Each statement in a dissertation must be correct and defensible in a logical and scientific
sense. Moreover, the discussions in a dissertation must satisfy the most stringent rules of
logic applied to mathematics and science.

Conclusion

Remember the following phrase: "No one will ever read your thesis." You'll hear this
phrase a number of times as you finish up, and it's vitally important that you believe it to
be true. The phrase is important because without it you would be tempted to work on
your thesis until everything is perfect, and you would never finish. Writing a thesis is
tough work. It is often said: "You should tell everyone that it's going to be unpleasant,
that it will mess up their lives, that they will have to give up their friends and their social
lives for a while. It's a tough period for almost every student." By the way, there is a key
to success: practice. No one ever learned to write by reading essays like this. Instead,
you need to practise, practise. Every day. On behalf of scholars everywhere, I wish you
good luck!

Some Guidelines for Writing a Thesis

The following are the possible stages involved in the process of pursuing a research
study leading to a PhD degree. A research scholar may have to undergo almost all stages,
though situations vary for each scholar. It is better for a scholar to know all these stages,
their importance and their intricacies well before he/she becomes involved in
full-fledged research activities. One should have a pre-research discussion with the
supervisor.

[A] Thinking Stage

1. Be inclusive with your thinking.
2. Write down your ideas.
3. Don't be overly influenced by others - it's your research.
4. Try and set a realistic goal.
5. Set appropriate time lines.
6. Take a leave of absence when it will do the most good.
7. Try a preliminary study to help clarify your research.

[B] Preparing the Proposal

8. Read other proposals.
9. Prepare a comprehensive review of the literature.
10. Photocopy relevant articles.
11. Proposal should be first 3 chapters of the thesis.
12. Focus your research.
13. Include a title on your proposal.
14. Organize around a set of questions.
15. Some considerations for designing your research:
a. Design your research so the subjects benefit.
b. Choose your methodology wisely.
c. Consider combining methodologies.
d. Carefully select location for your research.
e. Avoid conducting research in conjunction with another agency.
16. Use your advisory committee well.
a. Select faculty who will support you.
b. Your major professor is your ally.
c. Provide committee with well written proposal.
d. Plan the proposal meeting well.

[C] Writing the Thesis

17. Begin writing with sections you know the best.
18. Rewrite your proposal into thesis sections.
19. Use real names/places in early drafts of thesis.
20. Print each draft on a different color paper.
21. Use hand drawings of graphics/tables for early drafts.
22. Make your writing clear and unambiguous.
23. Review other theses before you begin to write.
24. Introduce tables in the text, present the table and then describe it.
25. Use similar or parallel wording whenever possible.
26. Let your Table of Contents help you improve your manuscript.
27. Write real conclusions and implications - don't restate your findings.
28. Make your suggestions for further research meaningful.
29. Chapter 1 should be written last.

[D] Defending the Thesis

30. Attend some defenses before it's your turn.
31. Discuss your research with others.
32. Don't circulate chapters to committee.
33. The defense should be a team effort - you and your supervisor.
34. Don't be defensive at your defense.
35. Organize your defense as an educational presentation.
36. Consider tape recording your defense.
37. Prepare an article on the outcomes of your research.

“It is better to deserve honors and not have them than to have them and not to deserve them.”
<< Mark Twain


Chapter III
Research Proposal

RESEARCH PROPOSAL

A research proposal is similar in a number of ways to a project proposal; however, a
research proposal addresses a particular kind of project: academic or scientific research. The
forms and procedures for such research are well defined by the field of study, so
guidelines for research proposals are generally more exacting than those for less formal
project proposals. Research proposals contain extensive literature reviews and must offer
convincing support of the need for the research study being proposed. Doctoral
dissertations begin with a research proposal; the proposal must be accepted by a panel of
experts (usually professors) before the actual research can begin. In addition to
providing a rationale for the proposed research, the proposal must describe a detailed
methodology for conducting the research--a methodology consistent with the requirements
of the professional or academic field.

A PhD thesis proposal is an extremely important document, and much thought and
planning should go into crafting this document. Its immediate purpose is to secure the
agreement of your thesis committee to allow you to pursue your research, but it has very
long-term implications: not only will the research take several years to complete, it will
have a major impact on your search for grant funding, post-doctoral positions, and the
competition for a tenure-track position. Writing a clear and effective thesis proposal is
the first step you will take on your way to a career in research.

The Elements of a Research Proposal

A : Introduction and Theoretical Framework

1
“The introduction is the part of the paper that provides readers with the
background information for the research reported in the paper. Its purpose is to
establish a framework for the research, so that readers can understand how it is
related to other research” (Wilkinson, 1991, p. 96).

2
In an introduction, the writer should

a] create reader interest in the topic,
b] lay the broad foundation for the problem that leads to the study,
c] place the study within the larger context of the scholarly literature, and
d] reach out to a specific audience (Creswell, 1994, p. 42).

3
If a researcher is working within a particular theoretical framework/line of inquiry,
the theory or line of inquiry should be introduced and discussed early, preferably
in the introduction or literature review. Remember that the theory/line of inquiry
selected will inform the statement of the problem, rationale for the study,
questions and hypotheses, selection of instruments, and choice of methods.
Ultimately, findings will be discussed in terms of how they relate to the theory/line
of inquiry that undergirds the study.

4
Theories, theoretical frameworks, and lines of inquiry may be handled differently
in quantitative and qualitative endeavours.

(a) “In quantitative studies, one uses theory deductively and places it toward the
beginning of the plan for a study. The objective is to test or verify theory. One thus
begins the study advancing a theory, collects data to test it, and reflects on whether
the theory was confirmed or disconfirmed by the results in the study. The theory
becomes a framework for the entire study, an organizing model for the research
questions or hypotheses and for the data collection procedure” (Creswell, 1994,
pp. 87-88).

(b) In qualitative inquiry, the use of theory and of a line of inquiry depends on the nature
of the investigation. In studies aiming at “grounded theory,” for example, theory and
theoretical tenets emerge from findings. Much qualitative inquiry, however, also aims
to test or verify theory, hence in these cases the theoretical framework, as in
quantitative efforts, should be identified and discussed early on.

B : Statement of the Problem

1
“The problem statement describes the context for the study and it also identifies
the general analysis approach” (Wiersma, 1995, p. 404).

2
“A problem might be defined as the issue that exists in the literature, theory, or
practice that leads to a need for the study” (Creswell, 1994, p. 50).

3.
It is important in a proposal that the problem stand out—that the reader can
easily recognize it. Sometimes, obscure and poorly formulated problems are
masked in an extended discussion. In such cases, reviewers and/or committee
members will have difficulty recognizing the problem.

4.
A problem statement should be presented within a context, and that context
should be provided and briefly explained, including a discussion of
the conceptual or theoretical framework in which it is embedded. Clearly and
succinctly identify and explain the problem within the framework of the theory or
line of inquiry that undergirds the study. This is of major importance in nearly all
proposals and requires careful attention. It is a key element that associations
such as AERA and APA look for in proposals. It is essential in all quantitative
research and much qualitative research.

5.
State the problem in terms intelligible to someone who is generally sophisticated
but who is relatively uninformed in the area of your investigation.

6.
Effective problem statements answer the question “Why does this research need
to be conducted?” If a researcher is unable to answer this question clearly and
succinctly, and without resorting to hyperspeaking (i.e., focusing on problems of
macro or global proportions that certainly will not be informed or alleviated by the
study), then the statement of the problem will come off as ambiguous and diffuse.

7.
For conference proposals, the statement of the problem is generally incorporated
into the introduction; academic proposals for theses or dissertations should have
this as a separate section.

C : Purpose of the Study

1
“The purpose statement should provide a specific and accurate synopsis of the
overall purpose of the study” (Locke, Spirduso, & Silverman, 1987, p. 5). If the
purpose is not clear to the writer, it cannot be clear to the reader.

2.
Briefly define and delimit the specific area of the research. You will revisit this in
greater detail in a later section.

3
Foreshadow the hypotheses to be tested or the questions to be raised, as well as
the significance of the study. These will require specific elaboration in subsequent
sections.

4
The purpose statement can also incorporate the rationale for the study. Some
committees prefer that the purpose and rationale be provided in separate
sections, however.

5
Key points to keep in mind when preparing a purpose statement.

(a) Try to incorporate a sentence that begins with “The purpose of this study is ...”
This will clarify your own mind as to the purpose and it will inform the reader directly
and explicitly.
(b) Clearly identify and define the central concepts or ideas of the study. Some
supervisors prefer a separate section to this end. When defining terms, make a
judicious choice between using descriptive or operational definitions.
(c) Identify the specific method of inquiry to be used.
(d) Identify the unit of analysis in the study.

D : Review of the Literature

1
“The review of the literature provides the background and context for the
research problem. It should establish the need for the research and indicate that
the writer is knowledgeable about the area” (Wiersma, 1995, p. 406).

2
The literature review accomplishes several important things.

(a) It shares with the reader the results of other studies that are closely related to the
study being reported (Fraenkel & Wallen, 1990).
(b) It relates a study to the larger, ongoing dialogue in the literature about a topic, filling
in gaps and extending prior studies (Marshall & Rossman, 1989).
(c) It provides a framework for establishing the importance of the study, as well as a
benchmark for comparing the results of a study with other findings.
(d) It “frames” the problem earlier identified.

3
Demonstrate to the reader that you have a comprehensive grasp of the field and
are aware of important recent substantive and methodological developments.

4
Delineate the “jumping-off place” for your study. How will your study refine,
revise, or extend what is now known?

5
Avoid statements that imply that little has been done in the area or that what has
been done is too extensive to permit easy summary. Statements of this sort are
usually taken as indications that the writer is not really familiar with the literature.

6
In a proposal, the literature review is generally brief and to the point. Be
judicious in your choice of exemplars—the literature selected should be pertinent
and relevant (APA, 2009). Select and reference only the more appropriate
citations. Make key points clearly and succinctly.

7
Doctoral Committees may want a section outlining your search strategy, that is, the
procedures you used and sources you investigated (e.g., databases, journals,
test banks, experts in the field) to compile your literature review. Check with your
supervisor.

E : Questions and/or Hypotheses

1
Questions are relevant to normative or census type research (How many of them
are there? Is there a relationship between them?). They are most often used in
qualitative inquiry, although their use in quantitative inquiry is becoming more
prominent. Hypotheses are relevant to theoretical research and are typically used
only in quantitative inquiry. When a writer states hypotheses, the reader is
entitled to have an exposition of the theory that led to them (and of the
assumptions underlying the theory). Just as conclusions must be grounded in the
data, hypotheses must be grounded in the theoretical framework.

2
A research question poses a relationship between two or more variables but
phrases the relationship as a question; a hypothesis represents a declarative
statement of the relations between two or more variables (Kerlinger, 1979;
Krathwohl, 1988).

3
Deciding whether to use questions or hypotheses depends on factors such as
the purpose of the study, the nature of the design and methodology, and the
audience of the research (at times even the taste and preference of committee
members, particularly the Chair).

4
The practice of using hypotheses was derived from using the scientific method in
social science inquiry. They have philosophical advantages in statistical testing,
as researchers should be and tend to be conservative and cautious in their
statements of conclusions (Armstrong, 1974).

5
Hypotheses can be couched in four kinds of statements.

(a) Literary null: a “no difference” form in terms of theoretical constructs. For
example, “There is no relationship between support services and academic
persistence of nontraditional-aged college women.” Or, “There is no difference in
school achievement for high and low self-regulated students.”

(b) Operational null: a “no difference” form in terms of the operation required to test
the hypothesis. For example, “There is no relationship between the number of
hours nontraditional-aged college women use the student union and their
persistence at the college after their freshman year.” Or, “There is no difference
between the mean grade point averages achieved by students in the upper and
lower quartiles of the distribution of the Self-regulated Inventory.” The operational
null is generally the preferred form of hypothesis-writing.

(c) Literary alternative: a form that states the hypothesis you will accept if the null
hypothesis is rejected, stated in terms of theoretical constructs. In other words,
this is usually what you hope the results will show. For example, “The more that
nontraditional-aged women use support services, the more they will persist
academically.” Or, “High self-regulated students will achieve more in their classes
than low self-regulated students.”

(d) Operational alternative: similar to the literary alternative except that the
operations are specified. For example, “The more that nontraditional-aged
college women use the student union, the more they will persist at the college
after their freshman year.” Or, “Students in the upper quartile of the Self-regulated
Inventory distribution achieve significantly higher grade point averages
than do students in the lower quartile.”

6
In general, the null hypothesis is used if theory/literature does not suggest a
hypothesized relationship between the variables under investigation; the
alternative is generally reserved for situations in which theory/research suggests
a relationship or directional interplay.

7
Be prepared to interpret any possible outcomes with respect to the questions or
hypotheses. It will be helpful if you visualize in your mind's eye the tables (or
other summary devices) that you expect to result from your research (Guba,
1961).

8
Questions and hypotheses are testable propositions deduced and directly
derived from theory (except in grounded theory studies and similar types of
qualitative inquiry).

9
Make a clear and careful distinction between the dependent and independent
variables and be certain they are clear to the reader. Be excruciatingly consistent
in your use of terms. If appropriate, use the same pattern of wording and word
order in all hypotheses.
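
As an illustration of how an operational null hypothesis maps onto an analysis, the sketch below tests the example “There is no difference between the mean grade point averages achieved by students in the upper and lower quartiles” of an inventory. The data are invented for illustration, the function names are hypothetical, and a permutation test is used here only because it needs nothing beyond the Python standard library; your committee may expect a t-test or another procedure instead.

```python
import random
import statistics


def quartile_groups(scores, outcomes):
    """Return outcomes for the lower and upper quartiles of the score distribution."""
    if len(scores) != len(outcomes) or len(scores) < 4:
        raise ValueError("need at least four paired observations")
    paired = sorted(zip(scores, outcomes))   # order students by inventory score
    k = len(paired) // 4                     # size of each quartile group
    lower = [outcome for _, outcome in paired[:k]]
    upper = [outcome for _, outcome in paired[-k:]]
    return lower, upper


def permutation_p_value(lower, upper, n_permutations=2000, seed=0):
    """Two-sided permutation test of the operational null: no difference in means."""
    rng = random.Random(seed)
    observed = statistics.mean(upper) - statistics.mean(lower)
    pooled = list(lower) + list(upper)
    n_upper = len(upper)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                  # relabel students at random
        diff = statistics.mean(pooled[:n_upper]) - statistics.mean(pooled[n_upper:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical inventory scores and grade point averages for 20 students.
inventory = [12, 15, 18, 20, 22, 25, 27, 30, 31, 33,
             35, 38, 40, 42, 45, 47, 50, 52, 55, 58]
gpa = [2.1, 2.4, 2.0, 2.6, 2.5, 2.8, 2.7, 3.0, 2.9, 3.1,
       3.0, 3.2, 3.1, 3.4, 3.3, 3.5, 3.6, 3.4, 3.8, 3.7]

lower, upper = quartile_groups(inventory, gpa)
observed, p_value = permutation_p_value(lower, upper)
print(f"mean GPA difference (upper - lower quartile): {observed:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value would lead you to reject the operational null in favour of the operational alternative; note that the conclusion applies strictly to the measured quantities (inventory quartile and grade point average), not directly to the underlying constructs.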

F : The Design--Methods and Procedures

1
“The methods or procedures section is really the heart of the research proposal.
The activities should be described with as much detail as possible, and the
continuity between them should be apparent” (Wiersma, 1995, p. 409).

2
Indicate the methodological steps you will take to answer every question or to
test every hypothesis illustrated in the Questions/Hypotheses section.

3
All research is plagued by the presence of confounding variables (the noise that
covers up the information you would like to have). Confounding variables should
be minimized by various kinds of controls or be estimated and taken into account
by randomization processes (Guba, 1961). In the design section, indicate

(a) the variables you propose to control and how you propose to control them,
experimentally or statistically, and
(b) the variables you propose to randomize, and the nature of the randomizing
unit (students, grades, schools, etc.).

4
Be aware of possible sources of error to which your design exposes you. You will
not produce a perfect, error free design (no one can). However, you should
anticipate possible sources of error and attempt to overcome them or take them
into account in your analysis. Moreover, you should disclose to the reader the
sources you have identified and what efforts you have made to account for them.

5
Sampling

(a) The key reason for being concerned with sampling is that of validity—the
extent to which the interpretations of the results of the study follow from the
study itself and the extent to which results may be generalized to other
situations with other people (Shavelson, 1988).

(b) Sampling is critical to external validity, the extent to which findings of a
study can be generalized to people or situations other than those observed in
the study. To generalize validly the findings from a sample to some defined
population requires that the sample has been drawn from that population
according to one of several probability sampling plans. By a probability
sample is meant that the probability of inclusion in the sample of any element
in the population must be given a priori. All probability samples involve the
idea of random sampling at some stage (Shavelson, 1988). In
experimentation, two distinct steps are involved.

Random selection: participants to be included in the sample have been
chosen at random from the same population. Define the
population and indicate the sampling plan in detail.

Random assignment: participants for the sample have been assigned at
random to one of the experimental conditions.

(c) Another reason for being concerned with sampling is that of internal
validity, the extent to which the outcomes of a study result from the variables
that were manipulated, measured, or selected rather than from other variables
not systematically treated. Without probability sampling, error estimates
cannot be constructed (Shavelson, 1988).

(d) Perhaps the key word in sampling is representative. One must ask oneself,
“How representative is the sample of the survey population (the group from
which the sample is selected) and how representative is the survey
population of the target population (the larger group to which we wish to
generalize)?”

(e) When a sample is drawn out of convenience (a nonprobability sample),
rationale and limitations must be clearly provided.

(f) If available, outline the characteristics of the sample (by gender, race /
ethnicity, socioeconomic status, or other relevant group membership).

(g) Detail procedures to follow to obtain informed consent and ensure anonymity
and/or confidentiality.

6
Instrumentation

(a) Outline the instruments you propose to use (surveys, scales, interview
protocols, observation grids). If instruments have previously been
used, identify previous studies and findings related to reliability and
validity. If instruments have not previously been used, outline
procedures you will follow to develop and test their reliability and
validity. In the latter case, a pilot study is nearly essential.

(b) Because selection of instruments in most cases provides the
operational definition of constructs, this is a crucial step in the
proposal. For example, it is at this step that a literary conception such
as “self-efficacy is related to school achievement” becomes “scores on
the Mathematics Self-Efficacy Scale are related to Grade Point
Average.” Strictly speaking, results of your study will be directly
relevant only to the instrumental or operational statements (Guba,
1961).

(c) Include an appendix with a copy of the instruments to be used or the
interview protocol to be followed. Also include sample items in the
description of the instrument.

(d) For a mailed survey, identify steps to be taken in administering and
following up the survey to obtain a high response rate.

7
Data Collection

(a) Outline the general plan for collecting the data. This may include survey
administration procedures and interview or observation procedures. Include an
explicit statement covering the field controls to be employed. If appropriate,
discuss how you obtained entrée (access to the research site or participants).

(b) Provide a general outline of the time schedule you expect to follow.

8
Data Analysis

(a) Specify the procedures you will use, and label them accurately (e.g.,
ANOVA, MANCOVA, HLM, ethnography, case study, grounded theory). If coding
procedures are to be used, describe in reasonable detail. If you triangulated,
carefully explain how you went about it. Communicate your precise intentions
and reasons for these intentions to the reader. This helps you and the reader
evaluate the choices you made and procedures you followed.

(b) Indicate briefly any analytic tools you will have available and expect to use
(e.g., Ethnograph, NUD*IST, AQUAD, SAS, SPSS, SYSTAT).

(c) Provide a well thought-out rationale for your decision to use the design,
methodology, and analyses you have selected.
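
The two sampling steps described above, random selection from a defined population followed by random assignment to experimental conditions, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the sampling frame of 200 student identifiers, the sample size of 40, the two condition names, and the function names are all hypothetical, and a real design may call for stratified or cluster sampling rather than the simple random sample shown here.

```python
import random


def random_selection(population, sample_size, rng):
    """Simple random sample: every element has the same known chance of inclusion."""
    return rng.sample(population, sample_size)


def random_assignment(participants, conditions, rng):
    """Shuffle the participants, then deal them in turn into the conditions."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# A fixed seed lets the draw be documented in the proposal and reproduced later.
rng = random.Random(42)

# Hypothetical sampling frame: 200 student identifiers.
population = [f"student_{number:03d}" for number in range(1, 201)]

sample = random_selection(population, 40, rng)                     # step 1: selection
groups = random_assignment(sample, ["treatment", "control"], rng)  # step 2: assignment

for condition, members in groups.items():
    print(condition, len(members))
```

Recording the seed and the sampling frame alongside the plan makes it easier to show reviewers that inclusion probabilities were fixed in advance rather than adjusted after the fact.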

G : Limitations and Delimitations

1
A limitation identifies potential weaknesses of the study. Think about your
analysis, the nature of self-report, your instruments, the sample. Think about
threats to internal validity that may have been impossible to avoid or minimize—
explain.

2
A delimitation addresses how a study will be narrowed in scope, that is, how it is
bounded. This is the place to explain the things that you are not doing and why
you have chosen not to do them—the literature you will not review (and why not),
the population you are not studying (and why not), the methodological
procedures you will not use (and why you will not use them). Limit your
delimitations to the things that a reader might reasonably expect you to do but
that you, for clearly explained reasons, have decided not to do.

H : Significance of the Study

1
Indicate how your research will refine, revise, or extend existing knowledge in the
area under investigation. Note that such refinements, revisions, or extensions
may have either substantive, theoretical, or methodological significance. Think
pragmatically (i.e., cash value).

2
Most studies have two potential audiences: practitioners and professional peers.
Statements relating the research to both groups are in order.

3
This can be a difficult section to write. Think about implications—how results of
the study may affect scholarly research, theory, practice, educational
interventions, curricula, counseling, policy.

4
When thinking about the significance of your study, ask yourself the following
questions.

i What will results mean to the theoretical framework that framed the study?
ii What suggestions for subsequent research arise from the findings?
iii What will the results mean to the practicing educator?
iv Will results influence programs, methods, and/or interventions?
v Will results contribute to the solution of educational problems?
vi Will results influence educational policy decisions?
vii What will be improved or changed as a result of the proposed research?
viii How will results of the study be implemented, and what innovations will
come about?

I : References

1
Follow APA (2009) guidelines regarding use of references in text and in the
reference list. This is the requirement of the Department of Management Studies,
Dr MGR University.

2
Only references cited in the text are included in the reference list; however,
exceptions can be found to this rule. For example, committees may require
evidence that you are familiar with a broader spectrum of literature than that
immediately relevant to your research. In such instances, the reference list may
be called a bibliography.

3
Some committees require that reference lists and/or bibliographies be
“annotated,” which is to say that each entry be accompanied by a brief
description, or an abstract. Check with your Supervisor.

J : Appendices

The need for complete documentation generally dictates the inclusion of
appropriate appendices in proposals.

The following materials are appropriate for an appendix. Consult with your
Supervisor.

• Verbatim instructions to participants.
• Original scales or questionnaires. If an instrument is copyrighted, include
  written permission from the copyright holder to reproduce it, or proof of
  purchase of the instrument.
• Interview protocols.
• Sample of informed consent forms.
• Cover letters sent to appropriate stakeholders.
• Official letters of permission to conduct research.
Bibliography

American Psychological Association (APA). (2009). Publication manual of the American
Psychological Association (6th ed.). Washington, DC: Author.
Armstrong, R. L. (1974). Hypotheses: Why? When? How? Phi Delta Kappan, 54, 213-214.
Creswell, J. W. (1994). Research design: Qualitative & quantitative approaches. Thousand Oaks,
CA: Sage.
Fraenkel, J. R., & Wallen, N. E. (1990). How to design and evaluate research in education. New
York: McGraw-Hill.
Guba, E. G. (1961, April). Elements of a proposal. Paper presented at the UCEA meeting, Chapel
Hill, NC.
Kerlinger, F. N. (1979). Behavioral research: A conceptual approach. New York: Holt, Rinehart, &
Winston.
Krathwohl, D. R. (1988). How to prepare a research proposal: Guidelines for funding and
dissertations in the social and behavioral sciences. Syracuse, NY: Syracuse University
Press.
Locke, L. F., Spirduso, W. W., & Silverman, S. J. (1987). Proposals that work: A guide for
planning dissertations and grant proposals (2nd ed.). Newbury Park, CA: Sage.
Marshall, C., & Rossman, G. B. (1989). Designing qualitative research. Newbury Park, CA: Sage.
Shavelson, R. J. (1988). Statistical reasoning for the behavioral sciences (2nd ed.).
Boston: Allyn and Bacon.
Wiersma, W. (1995). Research methods in education: An introduction (6th ed.). Boston:
Allyn and Bacon.
Wilkinson, A. M. (1991). The scientist’s handbook for writing papers and dissertations. Englewood
Cliffs, NJ: Prentice Hall.

Some Guidelines for Writing a Research Proposal

1
Describe the theoretical framework for the dissertation. This section describes the foundations of
the research, especially if the thesis is heavily indebted to a particular approach to a topic, or if it
tests the validity of a given theory.
2
Describe the research problem itself, placed in this theoretical framework. You may choose to
include the principal studies that are relevant to your research proposal, although a fuller
literature review will be included below.
3
Describe the hypothesis. This is basically your statement of what you believe the research might
indicate. You should have some preliminary basis for this statement, in order to demonstrate that
a more thorough investigation is merited.
4
Describe the purpose of the study. This is perhaps the most important section of the PhD thesis
proposal: why should this study proceed? A dissertation at the doctoral level is intended to add to
the body of world knowledge; will this study achieve that goal? You should show how it will
change the direction of current research, confirm current research in a novel manner, or in some
other way contribute positively to what we know about the world.
5
Demonstrate a solid grasp of the parameters of the literature to be reviewed. This need not be
exhaustive for the proposal stage of the dissertation, but the major sources should be identified.
6
Describe the methodology of the study. This is especially important for quantitative proposals in
the sciences and social sciences.
7
Describe the limitations of the study. It may seem like your dissertation is incredibly huge in
scope, but every research project has its self-defined limitations.
8
Conclude with the significance of the proposal. This should be a briefer statement of the ideas
described earlier in the section describing the purpose of the study, but with a broader focus. Why
would the conclusions of this dissertation matter? Would they deepen our understanding of a
topic in a major way, or would they lead to some material benefit for humanity?
9
Attach any relevant appendices, including a preliminary bibliography, samples of survey
instruments, and the like.
10
Plan the proposal meeting well. If graphic presentations are needed to help the committee
understand your plan, prepare them carefully so that they look professional. A well-planned
meeting will show your committee that you are prepared to move forward with well-planned
research. Your presentation style at the meeting should not belittle your committee members
(make it sound as though you know they have read your proposal), but you should not assume
too much either (go through each detail on the assumption that one of the members may have
skipped that section).

Model Research Proposal

A research proposal provides an overview of your proposed plan of work, including the general
scope of your project, your basic research questions, research methodology, and the overall
significance of your study. In short, your proposal explains what you want to study, how you will
study this topic, why this topic needs to be studied, and (generally) when you intend to do this
work. Occasionally, you may also need to explain where your study will take place.

1 Title of the Research Study
Give your project a working title, which may or may not become the title of your Thesis.
2 Statement of purpose
Explain what you hope your research will find or show. State your question or series of questions
before you begin your research. After you have conducted significant research you should be
able to answer your question(s) in one or two sentences, which may help you in your research
study.
3 Background
Explain your interest in and experience with this topic. Describe any previous research you have
conducted on this or related topics, any classes you have taken on this or related topics, or any
reading you have already done in the field. If you have personal experience that has led you to
want to do more research, describe that here too.
4 Significance
Explain why this topic is worth considering, or this question or series of questions is worth
answering. Answer the following questions: why should your supervisor let you select this topic?
What do you hope to learn from it? What will this new knowledge add to the field of knowledge
that already exists on this topic? What new perspective will you bring to the topic? What use
might your final research paper have for others in this field or in the general public? Who might
you decide to share your findings with once the project is complete?
5 Description
Describe the kind of research you will conduct to complete this study (library research, internet
research, interviews, observations, ethnographies, etc.)
6 Methodology
Explain how you will conduct your research in as much detail as possible. If you will consult
others (such as a statistician, an ethnographer, or a librarian) explain what role they will serve
and how you hope they will enhance your development of an appropriate methodology for this
project. Discuss the kinds of sources you hope to consult and the methods you will use to extract
and process the information you gather in as much detail as is possible at this stage. (As the
research study is underway you might find the need to revise your methodology, explore new
types of source material, and/or adopt new methods of gathering and processing data.)
7 Problems
Describe the problems you expect to encounter and how you hope to solve them. For example,
texts might be unavailable, necessitating travel to other libraries or use of inter-library loan
facilities; people you had hoped to interview might be unavailable or unwilling to participate,
necessitating that you select other interviewees or change the focus; internet sites might be down
or no longer available, etc. (Try to imagine every possible problem so that you have contingency
plans and your study doesn't become derailed.)
8 Bibliography
Make a list of texts you plan to consult. If you are writing a library-based research paper you
should aim to make a list of at least 30 potential sources (40 is better), which you will then narrow
down as you conduct the research. Many sources initially seem relevant, but turn out not to be,
so it is always better to list all sources that might be of interest. As you eliminate sources, cross
them off this list. Mark sources that are particularly useful, and add new sources as you come
across them. This will enable you to make a Reference List at the end of your Thesis (i.e., a list of
only the works you have summarized, paraphrased, or quoted from in the Thesis).

“Let not our proposal be disregarded on the score of our youth.” - Virgil


Chapter IV
Selecting a Supervisor

Introduction

A fundamental characteristic of doctoral research is that it is carried out under the
guidance of one or more academic supervisors. Although researchers have paid
attention to many aspects of student learning and research in management education,
one facet still seriously overlooked is that of research supervision (Armstrong, Allison,
& Hayes, 2004). Several problems, such as poor completion rates of research degrees
(Burnett, 1999) and delayed completion of theses (Garcia, Malott, & Brethower, 1988),
have been found in thesis-related work at postgraduate and higher levels of
education. The quality of supervision has often been indicated as the main reason for
these problems (Dillon & Malott, 1981; Zoia, 1981). Students have expressed
dissatisfaction with the process of supervision (Hockey, 1991), for reasons that
include poor direction and structure (Acker, Hill, & Black, 1991),
allocation to a supervisor whose interests do not match those of the student, and
insufficient guidance and time scaling (Eggleston & Delamont, 1983; Wright & Lodwick,
1989). Such dissatisfaction has been found to be higher in the social
sciences than in the natural sciences (Young, Fogarty & McRea, 1987).

Eggleston and Delamont (1983) found that the matching of student to supervisor is
crucially important for an effective relationship. The question that arises is how this
match between student and supervisor can be made. In a doctoral level program, the
student chooses a supervisor and has to develop a relationship with this individual. This
relationship is different in many ways from the relationships that students have had with
the lecturers who delivered most of the courses. For example, research students do
need guidance, but they also need to develop sufficient autonomy and freedom to
design and execute their own projects (Cornwall, Schmithals, & Jaques, 1977; Harding,
1973). Clearly, there are several qualities that a student expects to see in her research
supervisor, though not all of them may be of equal significance to the student.
Consequently, the process of selecting the supervisor becomes one of the critical
factors in determining the degree of fit between the student and her supervisor.

Interviewing a Research Supervisor

Professors enjoy talking to prospective students about their research, and this process is
an excellent opportunity to meet the faculty and to discover their current research
interests. Before you talk to each one, read their selected publications again and think of
the questions you would like to ask them. Some important questions you should ask
everyone you interview are:

• Is the professor taking on new students?
• Would I work on my own project, or on the professor's?
• How many students are currently working for the professor?
• How many students have graduated under the professor in the last few years? Where
are they now?
• How many students left before submission? Why did they leave? Where are they
now?
• How long does it typically take for a student to complete under the professor's
supervision? What is the funding policy in the group, especially after year five?
• What conferences would I have the opportunity to attend? Which of the professor's
students have recently attended conferences?
• Would I have the opportunity to publish papers? Who is typically first author?
• What does the professor expect for a Ph.D. in terms of publications?
• What is the source of the professor's funding? How stable is it? Are the resources
sufficient and available for the work I want to do, especially if it is a new project? How
are resources shared in the research group?
• Is the professor retiring soon, or leaving for an extended period?
• Would I be required to travel abroad? How often and for how long?
• What prospects would I have in this line of research after I complete?

Supervisor Selection Criteria


Supervisor selection is most often based on input from senior doctoral students
and one’s own understanding of the strengths of the faculty members. Three
key elements are worth considering: expertise and interest in the topic pursued by the
student, contacts with academic and other organizations where one wishes to work
after the dissertation, and good relationships among committee members. At a
broad level, these elements cover the majority of the concerns in supervisor selection,
and most doctoral students do consider them while selecting a thesis supervisor.
However, the detailed set of criteria that students consciously use in selecting a
supervisor is rarely documented. The following table offers a more comprehensive set
of selection criteria:

Key elements considered in supervisor selection

Element | Description
Freedom to work | The professor is open to ideas and is flexible about adopting alternative approaches
Time conscious | The professor is conscious about the time taken for completion and is generally willing to work towards it
Job prospect | The professor's ability to help the candidate obtain a suitable job after completion of the thesis
Convergence of interest | The match between the interests of the student and the professor
Reputation/Subject knowledge/Publications | The reputation of the professor in his or her field
Personal relationship with the professor | A cordial and understanding relationship with the professor
Social networks | The professor's social network and relationships with other professors in the institute and outside
Can take a stand | The extent to which the professor will support the student in contentious situations, and defend his or her stand once it has been agreed upon
Number of theses guided | The number of theses guided by the professor; the more the better
Commitment and involvement | The professor's enthusiasm in guiding the thesis

Important Qualities of the Ideal Research Supervisor

Support
Supportiveness is the quality that PhD students value most highly in supervisors.
This involves supervisors being encouraging, mentoring, and aware that students'
lives extend beyond the PhD. Supportive supervisors make an effort to understand
how the student prefers to work. In addition, such supervisors attend to the student
as a whole person, rather than purely as a research student.

Availability
Students value availability in their supervisors. This involves supervisors meeting
with students regularly, setting aside adequate time for students, and being
contactable through several media (e.g., email, phone) – particularly if they are not
physically present.

Interest and Enthusiasm


Students portrayed the ideal supervisor as someone who is interested and
enthusiastic about the student's work. This is achieved by supervisors who are
positive, empowering, motivational, and committed. Such supervisors are often in
the vicinity of their students and are likely to show an interest in the student's
progress.

Knowledge and Expertise in the Field Surrounding the PhD
Ideal supervisors are those who have expertise in the field surrounding the
student's research. Students value highly a supervisor who can use their
knowledge of the area to understand and demonstrate how the student's research
topic fits within the wider field. Students do not necessarily expect the supervisor to
have expertise in the precise topic of their research, however. Having a supervisor
with expertise in the methodologies required in their research is particularly
important.

Interest in the Student's Career


Ideal supervisors are likely to show an interest in the student's career. They help to
provide support for the establishment of the student's career in several ways. These
include having good contacts and introducing students to their network of
colleagues, looking out for and informing students of conferences and seminars
relevant to their research and career, and encouraging and facilitating the
publication of the student's research.

Good Communication
Ideal supervisors have good communication skills. In particular: good listening
skills; the tendency to maintain an open dialogue about the project, its progress and
problems; the ability to communicate in an open, honest, and fair manner about
issues that arise as they arise; and making expectations clear with regard to
matters such as the process of completing a PhD or Master's thesis, budget
considerations, and the role each party must play in performing the project
research.

Constructive Feedback
Students see an ideal supervisor as one who provides feedback and criticism of
their work that is constructive and prompt. In addition, students value consistency in
the feedback given, including consistency over time, which is often a sign that
the supervisor and student share the same focus regarding the project. Further,
where more than one supervisor is responsible for providing feedback, consistency
between supervisors is important.

Provides Direction and Structure


The ideal supervisor is perceived to be one who provides an appropriate amount of
direction and structure to the student's research project. She or he is prepared to
create deadlines, challenge, and push the student a little when required. Such a
supervisor is informative and helpful when it comes to areas of uncertainty. Further,
the ideal supervisor helps to encourage good work habits in the student, thereby
helping the student to help her or himself achieve the desired outcomes from their
research.

Approachability and Rapport


The ideal supervisor is approachable and works to establish a good rapport with
their students.

Experience and Interest in Supervision


Part of being experienced and interested in supervision, a key quality of an ideal
supervisor, is having a complete understanding of the requirements and process of
completing a thesis. In addition, students value supervisors who consider the needs
of particular subgroups of the student population (e.g., international students, those
with children, those with disabilities, and those with cultural differences). It is
important that supervisors recognise the individual supervisory needs of each
student. These vary between students and between different stages of their studies.

Substantial Problems faced by Research Scholars


The Supervisor is too Busy to be Effective in their Role
The most common supervisor-related problem that PhD students face is having a
supervisor whose extensive commitments make them difficult to get hold of. This
arises when supervisors have too many other students and commitments. The
consequences arising from this are numerous. Students see this as the main barrier to
receiving optimal supervision. It is also a likely cause of many of the additional problems
students emphasize.

Poor Feedback
Feedback which conflicts with previous feedback given, too little feedback, delayed and
infrequent feedback, illegible feedback, and too much negative feedback relative to
encouraging and positive comments are all problematic issues for students.

The Supervisor lacks Commitment and Interest


A supervisor who lacks commitment to, or interest in, research poses problems for
graduate research students. Such supervisors fail to show an interest by their lack of
presence and their lack of enquiry into the progress of the work. They tend to make little
or no effort to encourage or motivate the student, fail to give guidance and direction on
issues and questions raised, and don't cooperate well with the student or help the
student to develop skills to help her or himself.

Tensions or Conflicting Perspectives from within the Supervisory Panel


Having to manage the relationship between co-supervisors who do not get along with
each other is a substantial problem for students. Similarly, students find it problematic
when they receive conflicting advice and opinions from each supervisor.

Poor Communication and Disagreements about the Project


Problems arise for students when they are unclear about, or disagree with their
supervisors on, what the aims of the project are or how best to use and interpret their
findings. A failure to discuss the direction and progress of the research poses problems
for the student and their research.

Conflicting or Unrealistic Expectations of Each Other


Students face problems where there is poor communication with their supervisors about
what each person expects of the other. Consequences include misunderstandings
between parties, wasting time, and one or more parties getting frustrated. Another
serious consequence is the student possibly being faced with a project that is too large
to be completed in a reasonable timeframe.

Selfishness and Disrespectfulness


Some supervisors display selfishness and a lack of respect for their students. Students
find it difficult to work with supervisors who only look at their own gains from the
student's research, push the research down paths that interest them but not necessarily
the student, treat the student as "their property", and expect students to do work that
extends beyond the realms of their PhD or Master's research. Students also find it
concerning when they are not treated as colleagues, despite being at the final stages of
their studies. Students struggle when their supervisors fail to recognise and respect that
they have lives that extend beyond their thesis work.

The Supervisor is not Up-to-Date with the Field


A supervisor who is not up to date with the field is unable to help solve problems
or give advice. This is particularly problematic for students
who also lack access to those who do maintain a current knowledge of the literature. In
some areas, being out-of-date with the field means supervisors are ignorant of the
optimal techniques and theories that exist. This has implications for the quality of
research that can be performed.

The Supervisor lacks Experience in Research and / or Supervision


A lack of experience in research or supervision results in problems for students.
Students commented that an inexperienced supervisor is unclear about the amount and
quality of research that is sufficient for a PhD or Master's. Such supervisors are more
likely to allow the student to do far too much research or to submit the thesis despite it
failing to meet the required standards. In addition, a supervisor who lacks research
experience is likely to allow the conduct of research that is badly planned.

Personality Clashes
Students find clashes of personality with their supervisors to be problematic for all
concerned. The majority of students saw a personality clash as the reason most likely to
drive them to abandon their studies or to change supervisors.

Conclusion

Selecting a research supervisor is an extremely significant initial step in one’s journey
into the research world. It is therefore important that the decision be made in as informed
a manner as possible. The supervisor is directly responsible for the ethical process and
outcome of the research. The supervisor is charged with ensuring that the student
conducts research in a manner that is as effective, safe, and productive as is possible. In
addition to assisting in preparing the program of studies for the student, the supervisor is
to arrange for and attend all supervisory committee meetings as well as the student’s
comprehensive examination (where applicable) and the oral thesis/dissertation defense.
It is extremely important to get selection right because, once a student is registered, it is
the responsibility of the institution, department and supervisor to provide all reasonable
support. Where the students are not really suitable for the programme for which they are
registered, the stress and outlay of time of everyone involved can be enormous. Most
institutions, departments and supervisors do their best to get selection right, but stories
do circulate of institutions which need fees so badly that they accept research students
without giving the designated supervisors the opportunity to have an input into the
decision process.

"We can be absolutely certain only about things we do not understand." - Eric Hoffer

Chapter V

FINALIZING THE TOPIC

"It is really important to do the right research as well as to do the research right."

A topic is the major organizing principle guiding the analysis of any research study.
Topics offer the scholar an occasion for writing and a focus which governs what is likely
to be said. Topics represent the core subject matter of scholarly communication and the
means by which the scholar arrives at other possible topics of research and discovers
new knowledge.

Methods for choosing a topic

Thinking early leads to starting early. If the scholar begins thinking about possible topics
when the assignment is given, he has already begun the arduous, yet rewarding, task of
planning and organization. Once he has made the assignment a priority in his mind, he
may begin to have ideas throughout the day. Brainstorming is often a successful way for
scholars to get some of these ideas down on paper. Seeing one's ideas in writing is
often an impetus for the writing process. Though brainstorming is particularly effective
when a topic has been chosen, it can also benefit the scholar who is unable to narrow a
topic. It consists of a timed writing session during which the scholar jots down—often in
list or bulleted form—any ideas that come to his mind. At the end of the timed period, the
scholar will peruse his list for patterns of consistency. If one idea seems
to stand out in his mind more than the others, it may be wise to pursue it as a topic
possibility.

It is important for the scholar to keep in mind that an initial topic may not be the exact
topic about which he ends up writing. Research topics are often fluid, and dictated more
by the scholar's ongoing research than by the original chosen topic. Such fluidity is
common in research, and should be embraced as one of its many characteristics.

Choosing a Topic / How to Begin

Choosing a research topic is not as easy as a scholar might imagine. One should think
about it right from the start of the research study and treat it as a serious, continuous
process until the topic is finalized. There are generally three ways in which a scholar
arrives at a research problem: (a) the supervisor provides a general topic, from which
the scholar picks a particular aspect to study; (b) the supervisor provides a list of
possible topics; or (c) the supervisor leaves it up to the scholar to choose a topic, in
which case the scholar has to obtain the supervisor's permission before beginning the
formal investigation. Following are some strategies for getting started in each scenario.

[A] A single topic is given to write about

Step 1: Identify the concepts and terms that make up the topic statement. For example, your
supervisor wants you to focus on the following research problem: “Will the recent policy of the
Government of India to permit foreign universities into the Indian higher education sector
really introduce a competitive edge?” The main concepts are: Higher education, Foreign
University, Global competition. [Hint: focus on proper nouns, nouns or noun phrases, and action
verbs].

Step 2: Review related literature to help refine how you will approach focusing on the topic and
finding a way to analyze it. Use the main concept terms already developed in Step 1 to retrieve
relevant articles. This will help refine and refocus the analytical approach. Of course, this exercise
has to be done several times before you finalize how to approach writing about the topic.

Step 3: Since social science research studies are generally designed to get you to develop your
own ideas and arguments, look for sources that can help broaden, modify, or strengthen your
initial ideas and arguments [for example, you have decided to argue that foreign
universities are ill prepared to take on the responsibility of providing genuinely cost-effective
higher education]. There are at least four appropriate roles your related literature plays in helping
you formulate how to begin your analysis:

• Sources of Criticism: frequently, you'll find yourself reading materials that are
relevant to your chosen topic, but you disagree with the author's position. Therefore, one
way that you can use a source is to describe the counter-argument, provide evidence
from your review of the literature as to why it is unsatisfactory, and discuss how your own
view is more appropriate based upon your interpretation of the evidence.

• Sources of New Ideas: while a general goal in writing research papers is to
approach a research problem with some basic idea of what position you'd like to take and
what grounds you'd like to stand upon, it is certainly acceptable (and often encouraged)
to read the literature and extend, modify, and refine your own position in light of the ideas
proposed by others. Just make sure that you cite the source!

• Sources for Historical Context: another role your related literature plays in helping
you formulate how to begin your analysis is to place issues and events in proper
historical context. This can help to demonstrate familiarity with developments in relevant
scholarship about your topic, provide a means of comparing historical versus
contemporary issues and events, and identify key people, places, and things that had
an important role related to the topic.

• Sources of Interdisciplinary Insights: one way to decide how to study the topic is to
look at it from a variety of disciplinary perspectives. If the topic concerns immigration
reform, ask how studies in sociological journals differ in their analysis from those in
law journals. Another role of related literature is to provide a means of approaching a
topic from multiple perspectives rather than the perspective offered by just one discipline.

NOTE: Always review the references cited by the authors in footnotes, endnotes, or a bibliography to help
locate additional research on the topic. Also, remember to keep careful notes at every stage. You may think
you will remember what you have searched and where you found things, but it’s easy to forget.

Step 4: Assuming you've done a good job of synthesizing and thinking about the results of your
initial search for related literature, you're ready to prepare a detailed outline for your paper that
lays the foundation for a more in-depth and focused review of relevant research literature (after
consulting with the supervisor, if needed).

[B] A list of possible topics provided to choose from

Step 1: You may be tempted to ask which topic on the list is the easiest to find the most
information on. A good supervisor would not include a topic that is so obscure or complex that no
research is available to review and begin to design a study. Instead of trying to find the path of
least resistance, begin by choosing a topic that you find interesting in some way, that is
controversial or that you have an opinion about, or that has some personal meaning for you. You're
going to be working on your topic for quite some time, so choose one that you find interesting or
that makes you want to take a position.

Once you’ve settled on a topic of interest from the list, follow Steps 1 - 4 listed in Part [A]
above to further develop it into a research study.

NOTE: It may be reasonable to review related literature to help refine how you will approach
analyzing a topic, and then discover that the topic is not all that interesting after all. In that case,
you can choose another from the list. Just don’t wait too long to make a switch and be sure to
consult with your supervisor first.

[C] Your supervisor leaves it up to you to choose a topic

Step 1: The key process here is turning an idea or general thought into a topic that can be cast
as a research problem. When given an assignment where you choose the research topic, don't
begin by thinking about what to write about, but rather, ask yourself the question, "What do I want
to know?" Treat an open-ended assignment as an opportunity to learn about something that's
new or exciting to you.

Step 2: If you lack ideas, or wish to gain focus, try some or all of the following
strategies:

• Review your course readings, particularly the suggested readings, for topic
ideas. Don't just review what you've already read but jump ahead in the syllabus to readings
that have not been covered yet in the course.
• Browse through some current journals in your subject discipline. Even if most
of the articles are not relevant, you can skim through the contents quickly. You only need one
to be the spark that begins the process of wanting to learn more about a topic.
• Think about essays and other coursework you have completed, or lectures and
programs you have attended. Thinking back, what most interested you? What would you like to
know more about?
• Search online resources to see if your idea has been covered. Use this
coverage to refine your idea into something that you'd like to investigate further but in a more
deliberate, scholarly way based on a problem to research.

Step 3: To build upon your initial idea, narrow it, broaden it, or increase its timeliness
so that you can write it out as a research problem.

Once you are comfortable with having turned your idea into a topic, follow Steps 1 - 4
listed in Part [A] above to further develop it into a research paper.

Some Guidelines for choosing a Research Topic

Here are 11 critical points to consider in finding and developing a research topic:

1. Can it be enthusiastically pursued?
2. Can interest be sustained by it?
3. Is the problem solvable?
4. Is it worth doing?
5. Will it lead to other research problems?
6. Is it manageable in size?
7. What is the potential for making an original contribution to the literature in the field?
8. If the problem is solved, will the results be reviewed well by scholars in your field?
9. Are you, or will you become, competent to solve it?
10. By solving it, will you have demonstrated independent skills in your discipline?
11. Will the necessary research prepare you in an area of demand or promise for the future?

Developing and Focussing the Research Topic

Any topic will be difficult to research if it is too broad. A great way to fine-tune a topic is to use the
method traditionally used by newspaper reporters: Who?-What?-Where?-When?-Why?
Who is involved?
A particular age group, occupation, ethnic group, men, women, etc. For example, if you are
interested in writing about the environment, you might focus on the effects of air pollution on
infants and children.
What is the problem?
What is the issue facing the "who" in your topic: health concerns, job and economic trends,
contaminated drinking water? Try stating your topic as a question. For example, if you’re
interested in finding out about drinking water, you might ask: Are there preventive measures that
government can take to keep the drinking water supply from being contaminated?
Where is it happening?
A specific country, region, city, physical environment, rural vs. urban? For example: What
environmental issues are most important in the southern plains area of the U.S.?
When is this happening?
Is this a current issue or a historical event? Will you discuss the historical development of a
current problem? Example: How does environmental awareness affect business practices today?
Why is it happening / Why is this a problem?
You may want to focus on causes, or argue the importance of this problem by outlining historical
or current ramifications. Or you may want to persuade your instructor or class why they should
care about the issue. Example: Why are some states seriously investigating wind power
opportunities now?

Be flexible. It is common to modify your topic during the research process.

Identify key concepts

• Analyze your topic for its main concepts.
• Keep track of the words used to describe your topic.
• To successfully search online article databases and the Internet you need to be specific,
and sometimes creative, in asking for what you want.

"I never really know the title of a book until it's finished." - Mary Wesley

Chapter VI

RESEARCH PROBLEM

Introduction

A research problem is the situation that causes the researcher to feel apprehensive,
confused and ill at ease. It is the demarcation of a problem area within a certain context
involving the WHO or WHAT, the WHERE, the WHEN and the WHY of the problem
situation.

There are many problem situations that may give rise to research. Three sources
usually contribute to problem identification. One's own experience or the experience of
others may be a source of problems. A second source could be scientific literature. You
may read about certain findings and notice that a certain field was not covered. This
could lead to a research problem. Theories could be a third source. Shortcomings in
theories could be researched.

Research can thus be aimed at clarifying or substantiating an existing theory, at
clarifying contradictory findings, at correcting a faulty methodology, at correcting the
inadequate or unsuitable use of statistical techniques, at reconciling conflicting opinions,
or at solving existing practical problems.

What Is a Research Problem?


A research problem is the topic you would like to address, investigate, or study, whether
descriptively or experimentally. It is the focus or reason for engaging in your research. It
is typically a topic, phenomenon, or challenge that a research scholar is interested in
and with which the scholar is at least somewhat familiar.

Definition

A research problem is an area of concern, a condition to be improved, a difficulty to be
eliminated, or a troubling question that exists in scholarly literature, in theory, or in
practice that points to the need for meaningful understanding and investigation. In some
social science disciplines the research problem is typically posed in the form of a
question. A research problem does not state how to do something, offer a vague or
broad proposition, or present a value question.

Identification of the Problem

The prospective researcher should think about what caused the need to do the research
(problem identification). The question that he/she should ask is: Are there questions
about this problem to which answers have not been found up to the present?

Research originates from a need that arises. A clear distinction between the PROBLEM
and the PURPOSE should be made. The problem is the aspect the researcher worries
about, thinks about, and wants to find a solution for. The purpose is to solve the problem, i.e.,
to find answers to the question(s). If there is no clear problem formulation, the purpose
and methods are meaningless.

Keep the following in mind:

• Outline the general context of the problem area.
• Highlight key theories, concepts and ideas current in this area.
• What appear to be some of the underlying assumptions of this area?
• Why are these issues identified important?
• What needs to be solved?
• Read around the area (subject) to get to know the background and to identify unanswered
questions or controversies, and/or to identify the most significant issues for further
exploration.

The research problem should be stated in such a way that it leads to analytical
thinking on the part of the researcher, with the aim of arriving at possible solutions to the
stated problem. Research problems can be stated in the form of either questions or
statements.

• The research problem should always be formulated in grammatically correct language and as
completely as possible. You should bear in mind the wording (expressions) you use.
Avoid meaningless words. There should be no doubt in the mind of the reader what your
intentions are.
• Demarcating the research field into manageable parts by dividing the main problem into
subproblems is of the utmost importance.

Systematic Approach

Step 1: Brainstorm possible research questions. Be creative. This is your chance to
experiment with a variety of different questions you might pursue.

A. Try using various brainstorming techniques
B. Reflect on your postgraduate dissertation, coursework, and seminar papers

Step 2: Do some preliminary research.

A. Search Digital Dissertations and read the abstracts to identify research questions
that have already been pursued in your field
B. Search an index in your field to find relevant articles and focus on the last
sections covering recommended further research
C. Search an index in the proposed field of research that includes conference
proceedings and review titles and abstracts

Step 3: Narrow down the ideas to the top two or three.

A. Be careful about research feasibility: If you are doing empirical research, your
research question must be appropriate for the subjects you have to work with
B. Be careful about broad questions: Your research question cannot be so broad
that you can't adequately cover it

Step 4: Discuss these with your supervisor, other faculty, and peers.

Step 5: Through this process, reach an agreement with your supervisor on an initial set of
research questions to guide the research. Note that as your research progresses, you
might refine your questions.

The purpose of a problem statement is to:

1. Introduce the reader to the importance of the topic being studied. The reader is
oriented to the significance of the study and the research questions or hypotheses to
follow.

2. Place the problem in a particular context that defines the parameters of what is
to be investigated.

3. Provide the framework for reporting the results, indicating what is probably
necessary to conduct the study and explaining how the findings will present this
information.

“So What!”
In the social sciences, the research problem establishes the means by which you must
answer the "so what?" question. The "so what?" question refers to a research problem
surviving the relevancy test. Note that this question requires a commitment on your part
to not only show that you have researched the material, but that you have thought about
its significance.
To survive the "so what" question, Hernon and Schwartz [Library & Information Science
Research 29 (2007): 307-309] noted that problem statements should possess the
following attributes:

• Clarity and precision (a well-written statement does not make sweeping
generalizations and irresponsible statements)
• Identification of what would be studied, while avoiding the use of value-laden
words and terms
• Identification of an overarching question and key factors or variables
• Identification of key concepts and terms
• Articulation of the study's boundaries or parameters
• Some generalizability
• Conveyance of the study's importance, benefits, and justification (regardless of
the type of research, it is important to address the “so what” question and to
demonstrate that the research is not trivial)
• No use of unnecessary jargon; and,
• Conveyance of more than the mere gathering of descriptive data providing a
snapshot.

Structure and Writing Style

There are two broad conceptualizations of a research problem in the social sciences:

1. Conventional -- a set of conditions needing discussion, a solution, and
information.
2. Technical -- implies the possibility of empirical investigation, that is, of data
collection and analysis.

Characteristics of a Good Research Problem


A good research problem is compelling.
The problem that you choose to explore must be important to you and to a larger
community you share. The problem chosen must be one that motivates you to address
it.

A good research problem must support multiple perspectives.
The problem must be phrased in a way that avoids dichotomies and instead supports the
generation and exploration of multiple perspectives. A general rule of thumb is that a
good research problem is one that would generate a variety of viewpoints from a
composite audience made up of reasonable people.

A good research problem must be researchable.
It seems a bit obvious, but you don't want to find yourself in the midst of investigating a
complex research project and realize that you don’t have much to draw on for your
research. Choose research problems that can be supported by the resources available
to you. Not sure? Seek out help from a librarian!
Avoid circular reasoning. Don’t state the research problem as simply the absence of
the thing you propose. For example, suppose you state, "The problem in this community
is that it has no hospital."
This only leads to a research problem where:

• The need is for a hospital
• The objective is to create a hospital
• The method is to plan for building a hospital, and
• The evaluation is to measure if there is a hospital or not.

This research problem fails the "so what?" test because it does not reveal
the relevance of why you are investigating the problem of having no hospital in the
community [maybe there's a hospital in the community ten miles away] and does not
elucidate the significance of why one should study the fact that no hospital exists in the
community [maybe it's because the hospital in the community ten miles away has no
emergency room].

Checklist for Testing the Feasibility of the Research Problem

1 Is the problem of current interest? Will the research results have social,
educational or scientific value?
2 Will it be possible to apply the results in practice?
3 Does the research contribute to the science of education?
4 Will the research open up new problems and lead to further research?
5 Is the research problem important? Will you be proud of the result?
6 Is there enough scope left within the area of research (field of research)?
7 Can you find an answer to the problem through research? Will you be able
to handle the research problem?
8 Will it be practically possible to undertake the research?
9 Will it be possible for another researcher to repeat the research?
10 Is the research free of any ethical problems and limitations?
11 Will it have any value?
12 Do you have the necessary knowledge and skills to do the research? Are
you qualified to undertake the research?
13 Is the problem important to you and are you motivated to undertake the
research?
14 Is the research viable in your situation? Do you have enough time and
energy to complete the project?
15 Do you have the necessary funds for the research?
16 Will you be able to complete the project within the time available?
17 Do you have access to the administrative, statistical and computer facilities
the research necessitates?

Some Guidelines for Writing the Research Problem Statement

1. First select your research topic, which is the issue or subject area that you intend to
investigate.

2. Describe the business or management problem based on your topic that you intend to
research. Do this right at the beginning of your research proposal or report as laid out in
the templates (remember to reference any facts that you are basing your research on).
This will set the scene for your Research Problem statement, so that you can write a
clear, stand-alone Research Problem.

3. A Research Problem is not the same as a business problem, i.e. it is not a “problem” in
the normal sense of the word; it is research jargon that happens to be a bit confusing.
You can think of your Research Problem as the unknown part of your business problem.

4. We prefer Research Problem statements to have an outcome-based verb at or near
the beginning. Some good outcome-based verbs are:

identify; define; relate; describe; review; justify; indicate; formulate;
explain; compare; contrast; suggest; interpret; analyse; assess;
construct; apply; demonstrate; illustrate; categorise; deduce; create;
resolve; debate; propose; differentiate; argue; derive; design; evaluate;
establish; conceptualise; integrate; compile; develop; challenge; consolidate;
clarify; criticise; ascertain; appraise; calculate; recommend.

5. Verbs such as “understand”, “explore”, “investigate”, “examine” and “discuss” are poor
verbs as they describe processes, not outcomes, e.g. you can discuss something
endlessly without ever having to make recommendations, draw conclusions or offer a
result. You might be exploring, examining or discussing as part of your process, but they
cannot be the end result of your research, which should be more tangible.

6. If your Research Problem contains two or more concepts / ideas, then break it down
into sub‐problems, so that each sub‐problem consists of one idea only. Each
sub‐problem should contain key words that you can use in your literature search (using
the electronic library databases and Google Scholar) on that sub‐problem.

7. Your Research Problem statement should be your sub‐problems added together – no
more and no less. Do not introduce any new ideas when you write your sub‐problems.

For example:

The main problem is to:
Analyse and evaluate the role of entrepreneurship in the establishment of small, medium and
micro enterprises (SMMEs) and ascertain the value of the economic contributions of these firms
in emerging markets.

Sub‐problem 1
Analyse and evaluate the role of entrepreneurship in establishing SMMEs in emerging markets.
(Here your key search terms for your literature review could be “entrepreneurship”, “SMME” and
“emerging markets”).

Sub‐problem 2
Evaluate the economic contribution of SMMEs to growth and development in emerging markets.
(Here your search terms could be “economic contribution”, “economic growth” and “emerging
market development”).
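
The search terms suggested for each sub‐problem can be combined into a search string before you turn to the electronic library databases or Google Scholar. The sketch below is only an illustration: the helper name `build_query` is hypothetical, and it assumes a database that accepts quoted phrases joined by AND. Search syntax varies between databases, so check the help pages of the one you use.

```python
# Hypothetical helper (not part of any library): turn the key terms of a
# sub-problem into one Boolean search string.
def build_query(terms):
    """Quote each term so it is searched as a phrase, then join with AND."""
    return " AND ".join(f'"{term}"' for term in terms)


# Key terms taken from the two sub-problems in the example above.
sub_problems = {
    "Sub-problem 1": ["entrepreneurship", "SMME", "emerging markets"],
    "Sub-problem 2": ["economic contribution", "economic growth",
                      "emerging market development"],
}

# One search string per sub-problem.
queries = {name: build_query(terms) for name, terms in sub_problems.items()}

for name, query in queries.items():
    print(f"{name}: {query}")
```

Keeping one query per sub‐problem mirrors guideline 6: each sub‐problem carries one idea, so each literature search stays focused on that idea.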

Basic research is what I am doing when I don't know what I am doing.
-- Wernher von Braun

Chapter VII

REVIEW OF LITERATURE

A review of literature is intended to summarize and synthesize the pertinent research
on a topic by professionals in the field. The "literature" of a literature review refers to any
collection of materials on a topic, not necessarily the great literary texts of the world.
"Literature" could be anything from a set of government pamphlets on British colonial
methods in Africa to scholarly articles on the treatment of a highly complicated
healthcare problem. A review does not necessarily mean that the research scholar is
expected to analyze and offer his/her own personal/critical opinion of the sources
reviewed.

The format of a review of literature may vary from discipline to discipline and from
assignment to assignment. A review may be a self-contained unit -- an end in itself -- or
a preface to and rationale for engaging in primary research. A review is a required part
of grant and research proposals and often a chapter in theses. Generally, the purpose of
a review is to analyze critically a segment of a published body of knowledge through
summary, classification, and comparison of prior research studies, reviews of literature,
and theoretical articles.

Introduction

A literature review can stand alone or appear as part of a longer work, such as a
research proposal. The quality of the literature review is dependent upon (a) the
thoroughness of the writer's search, (b) the quality and reliability of the writer's sources,
(c) the ability of the writer to relate research studies to one another and to the writer's
own thesis or purpose, (d) the objectivity of the writer in selecting, interpreting,
organizing, and summarizing the research he or she has reviewed.

A literature review discusses published information in a particular subject area, and
sometimes information in a particular subject area within a certain time period. A
literature review can be just a simple summary of the sources, but it usually has an
organizational pattern and combines both summary and synthesis. A summary is a
recap of the important information of the source, but a synthesis is a re-organization, or
a reshuffling, of that information. It might give a new interpretation of old material or
combine new with old interpretations. Or it might trace the intellectual progression of the
field, including major debates. And depending on the situation, the literature review may
evaluate the sources and advise the reader on the most pertinent or relevant.

Literature reviews provide the research scholar with a handy guide to a particular topic.
When a scholar is short of time, literature reviews can give an overview of the field or
act as a stepping stone. For professionals, they are useful
reports that keep them up to date with what is current in the field. For scholars, the depth
and breadth of the literature review emphasize the credibility of the writer in his or her
field. Literature reviews also provide a solid background for a research paper's
investigation. Comprehensive knowledge of the literature of the field is essential to most
research papers.

In most cases it is better to seek guidelines or clarifications from one’s supervisor
regarding the comprehensiveness of the review, for example on the points listed below:

• Roughly how many sources should you include?
• What types of sources (books, journal articles, websites)?
• Should you summarize, synthesize, or critique your sources by discussing a common
theme or issue?
• Should you evaluate your sources?
• Should you provide subheadings and other background information, such as definitions
and/or a history?

Look for other literature reviews in your area of interest or in the discipline and read them
to get a sense of the types of themes you might want to look for in your own research or
ways to organize your final review. You can simply put the word "review" in your search
engine along with your other topic terms to find articles of this type on the Internet or in
an electronic database. The bibliographies or reference sections of sources you've already
read are also excellent entry points into your own research.

There are hundreds or even thousands of articles and books on most areas of study.
The narrower one’s topic, the easier it will be to limit the number of sources one needs
to read in order to get a good survey of the material. Your research supervisor will probably
not expect you to read everything that's out there on the topic, and it is always better to
limit the scope of the research at this stage.

Some disciplines require that one uses information that is as current as possible. In the
sciences, for instance, treatments for medical problems are constantly changing
according to the latest studies. Information even two years old could be obsolete.
However, if a researcher is writing a review in the humanities, history, or social sciences,
a survey of the history of the literature may be what is needed, because what is
important is how perspectives have changed through the years or within a certain time
period. Try sorting through some other current bibliographies or literature reviews in the
field to get a sense of what one’s discipline expects. Also it is important to consider what
is currently of interest to scholars in this field and what is not.

A literature review is usually organized around ideas, not the sources themselves as an
annotated bibliography would be organized. This means that one will not simply list
the sources and go into detail about each one of them, one at a time. As the
scholar reads widely but selectively in the chosen topic area, he/she should consider
what themes or issues connect the sources together. Do they present one or
different solutions? Is there an aspect of the field that is missing? How well do they
present the material and do they portray it according to an appropriate theory? Do they
reveal a trend in the field? Identify and settle on one of these themes to focus the
organization of the review.

Organizing the Review

You've got a focus, and you've narrowed it down to a thesis statement. Now what is the
most effective way of presenting the information? What are the most important topics,
subtopics, etc., that your review needs to include? And in what order should you present
them? Develop an organization for your review at both a global and local level.

Basic categories

Just like most academic papers, literature reviews also must contain at least three basic
elements: an introduction or background information section; the body of the review
containing the discussion of sources; and, finally, a conclusion and/or future scope for
potential research scholars.

Introduction

In the introduction, a research scholar would:

Define or identify the general topic, issue, or area of concern, thus providing an
appropriate context for reviewing the literature.

Point out overall trends in what has been published about the topic; or conflicts in theory,
methodology, evidence, and conclusions; or gaps in research and scholarship; or a
single problem or new perspective of immediate interest.

Establish the writer's reason (point of view) for reviewing the literature; explain the
criteria to be used in analyzing and comparing literature and the organization of the
review (sequence); and, when necessary, state why certain literature is or is not included
(scope).

Writing the Body

In the body, a research scholar would:

Group research studies and other types of literature (reviews, theoretical articles, case
studies, etc.) according to common denominators such as qualitative versus quantitative
approaches, conclusions of authors, specific purpose or objective, chronology, etc.

Summarize individual studies or articles with as much or as little detail as each merits
according to its comparative importance in the literature, remembering that space
(length) denotes significance.

Provide the reader with strong "umbrella" sentences at beginnings of paragraphs,
"signposts" throughout, and brief "so what" summary sentences at intermediate points in
the review to aid in understanding comparisons and analyses.

Organizing the body

Once the basic categories are in place, you must decide the order and manner in which
the sources themselves are arranged within the body of the review.

The following illustration gives a feel for the problem and shows how three typical ways
of organizing the sources might be applied.

You've decided to focus your literature review on materials dealing with sperm whales.
This is because you've just finished reading Moby Dick, and you wonder if that whale's
portrayal is really real. You start with some articles about the physiology of sperm whales
in biology journals written in the 1980's. But these articles refer to some British biological
studies performed on whales in the early 18th century. So you check those out. Then you
look up a book written in 1968 with information on how sperm whales have been portrayed
in other forms of art, such as in Alaskan poetry, in French painting, or on whale bone, as
the whale hunters in the late 19th century used to do. This makes you wonder about
American whaling methods during the time portrayed in Moby Dick, so you find some
academic articles published in the last five years on how accurately Herman Melville
portrayed the whaling scene in his novel.

Chronological

If your review follows the chronological method, you could write about the materials
above according to when they were published. For instance, first you would talk about
the British biological studies of the 18th century, then about Moby Dick, published in
1851, then the book on sperm whales in other art (1968), and finally the biology articles
(1980s) and the recent articles on American whaling of the 19th century. But there is
relatively little continuity among subjects here. And notice that even though the sources on
sperm whales in other art and on American whaling were written recently, they are about
other subjects/objects that were created much earlier. Thus, the review loses its
chronological focus.

By publication

Order your sources by publication chronology, then, only if the order demonstrates a
more important trend. For instance, you could order a review of literature on biological
studies of sperm whales if the progression revealed a change in dissection practices of
the researchers who wrote and/or conducted the studies.

By trend

A better way to organize the above sources chronologically is to examine the sources
under another trend, such as the history of whaling. Then your review would have
subsections according to eras within this period. For instance, the review might examine
whaling from pre-1600-1699, 1700-1799, and 1800-1899. Under this method, you would
combine the recent studies on American whaling in the 19th century with Moby Dick
itself in the 1800-1899 category, even though the authors wrote a century apart.
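
The difference between ordering by publication date and ordering by trend can be sketched in a few lines of code. Everything in the block below is illustrative: the exact years are placeholders chosen only to match the centuries mentioned in the sperm whale example, the "1900 onwards" bucket is an added assumption for sources that fall outside the three whaling eras, and the function name `era` is hypothetical.

```python
from collections import defaultdict

# Placeholder records: (description, publication year, year of the subject matter).
# The specific years are assumptions made for illustration only.
sources = [
    ("British biological studies of whales", 1720, 1720),
    ("Moby Dick", 1851, 1851),
    ("Book on sperm whales in other art", 1968, 1890),
    ("Biology articles on sperm whale physiology", 1985, 1985),
    ("Recent articles on American whaling", 2010, 1850),
]


def era(year):
    """Map a year onto the whaling eras used in the text."""
    if year < 1700:
        return "pre-1600-1699"
    if year < 1800:
        return "1700-1799"
    if year < 1900:
        return "1800-1899"
    return "1900 onwards"  # assumed extra bucket, not in the original eras


# Chronological method: order purely by publication year.
by_publication = [title for title, published, _ in
                  sorted(sources, key=lambda s: s[1])]

# By trend: group by the era each source is about, whenever it was written.
by_era = defaultdict(list)
for title, _, subject_year in sources:
    by_era[era(subject_year)].append(title)

print(by_publication)
print(dict(by_era))
```

In the grouped version the recent whaling articles sit beside Moby Dick in the 1800-1899 group, which is the combination the trend-based method is meant to produce.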

Thematic

Thematic reviews of literature are organized around a topic or issue, rather than the
progression of time. However, progression of time may still be an important factor in a
thematic review. For instance, the sperm whale review could focus on the development
of the harpoon for whale hunting. While the study focuses on one topic, harpoon
technology, it will still be organized chronologically. The only difference here between a
"chronological" and a "thematic" approach is what is emphasized the most: the
development of the harpoon or the harpoon technology.

Note: But more authentic thematic reviews tend to break away from chronological order. For
instance, a thematic review of material on sperm whales might examine how they are portrayed
as "evil" in cultural documents. The subsections might include how they are personified, how their
proportions are exaggerated, and their behaviors misunderstood. A review organized in this
manner would shift between time periods within each section according to the point made.

Methodological

A methodological approach differs from the two above in that the focusing factor usually
does not have to do with the content of the material. Instead, it focuses on the "methods"
of the researcher or writer. For the sperm whale project, one methodological approach
would be to look at cultural differences between the portrayal of whales in American,
British, and French art work. Or the review might focus on the economic impact of
whaling on a community. A methodological scope will influence either the types of
documents in the review or the way in which these documents are discussed.

Once you've decided on the organizational method for the body of the review, the
sections you need to include in the paper should be easy to figure out. They should arise
out of your organizational strategy. In other words, a chronological review would have
subsections for each vital time period. A thematic review would have subtopics based
upon factors that relate to the theme or issue.

Writing the Conclusion

In the conclusion, a research scholar would:


Summarize major contributions of significant studies and articles to the body of
knowledge under review, maintaining the focus established in the introduction.

Evaluate the current "state of the art" for the body of knowledge reviewed, pointing out
major methodological flaws or gaps in research, inconsistencies in theory and findings,
and areas or issues pertinent to future study.

Conclude by providing some insight into the relationship between the central topic of the
literature review and a larger area of study such as a discipline, a scientific endeavor, or
a profession.

Meticulous Revision

Now it is time for a thorough revision. Spending a lot of time revising is a wise idea,
because your main objective is to present the material, not the argument. So check over
your review again to make sure it follows the assignment and/or your outline. Then, just
as you would for most other academic forms of writing, rewrite or rework the language of
your review so that you've presented your information in the most concise manner
possible. Be sure to use terminology familiar to your audience; get rid of unnecessary
jargon or slang. Finally, double check that you've documented your sources and
formatted the review appropriately for your discipline.

Length of “Literature Review”

Quantity is not relevant. Quality is! Always remember the axiom:

"I have seen great literature reviews with 100 pages and 100 references and I have seen
poor literature reviews with 100 pages and 100 references. I have seen great literature
reviews with 20 pages and 20 references and I have seen poor literature reviews with 20
pages and 20 references."

This endeavor will be a significant part of your research study. You will learn a great deal
from this exercise, not only about your topic, but also about how to learn from a serious
research study. I doubt that there are many topics that you could choose that would
have fewer than 50-100 recently published articles. Most have several hundred or
thousands. It is your task to find the relevant articles and make sense of them.

References

Anson, Chris M. and Robert A. Schwegler (2000) The Longman Handbook for Writers and
Readers. Second edition. New York: Longman.

Jones, Robert, Patrick Bizzaro, and Cynthia Selfe (1997) The Harcourt Brace Guide to Writing in
the Disciplines. New York: Harcourt Brace.

Lamb, Sandra E.(1998) How to Write It: A Complete Guide to Everything You'll Ever Write.
Berkeley, Calif.: Ten Speed Press.

Rosen, Leonard J. and Laurence Behrens (2000) The Allyn and Bacon Handbook. Fourth edition.
Boston: Allyn and Bacon.

Troyka, Lynn Quitman (2002) Simon and Schuster Handbook for Writers. Upper Saddle River,
N.J.: Prentice Hall.

Step-by-step guidelines for writing a Literature Review

These guidelines are adapted primarily from Galvan (2006). Galvan outlines a very
clear, step-by-step approach that is very useful to use as you write your review. I have
integrated some other tips within this guide, particularly in suggesting different
technology tools that you might want to consider in helping you organize your review. In
the sections from Step 6-9 what I have included is the outline of those steps exactly as
described by Galvan. I also provide links at the end of this guide to resources that you
should use in order to search the literature and as you write your review.

In addition to using the step-by-step guide that I have provided below, I also recommend
that you (a) locate examples of literature reviews in your field of study and skim over
these to get a feel for what a literature review is and how these are written (I have also
provided links to a couple of examples at the end of these guidelines), and (b) read over
other guides to writing literature reviews so that you see different perspectives and
approaches.

Step 1: Review APA guidelines

Read APA guidelines so that you become familiar with the common core elements of
how to write in APA style: in particular, pay attention to general document guidelines
(e.g. font, margins, and spacing), title page, abstract, body, text citations, and
quotations.

Step 2: Decide on a topic

It will help you considerably if your topic for your literature review is the one on which
you intend to do your research study, or is in some way related to the topic of your
research paper. However, you may pick any scholarly topic.

Step 3: Identify the literature that you will review:

1. Familiarize yourself with online databases, identifying relevant databases in your
field of study.
2. Search for literature sources using the relevant databases, Google Scholar,
and Furl (search all sources, including the Furl accounts of
other Furl members). Some tips for identifying suitable literature and narrowing
your search:
(a) Start with a general descriptor from the database thesaurus or one that you know
is already a well defined descriptor based on past work that you have done in this
field. You will need to experiment with different searches, such as limiting your
search to descriptors that appear only in the document titles, or in both the
document title and in the abstract.
(b) Redefine your topic if needed: as you search you will quickly find out if the topic
that you are reviewing is too broad. Try to narrow it to a specific area of interest
within the broad area that you have chosen. It is a good idea, as part of your
literature search, to look for existing literature reviews that have already been
written on this topic.
(c) As part of your search, be sure to identify landmark or classic studies and
theorists as these provide you with a framework/context for your study.
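
The effect of tip (a), limiting a descriptor to document titles versus titles and abstracts, can be tried out on a small set of records before running the real search. The sketch below uses made-up records and a hypothetical `search` function; it does not call any real database, and real database interfaces offer this restriction through their own field options.

```python
# Made-up records for illustration; a real database would return many more fields.
records = [
    {"title": "Entrepreneurship in emerging markets",
     "abstract": "A survey of SMME formation."},
    {"title": "Small firms and growth",
     "abstract": "Examines entrepreneurship and job creation."},
    {"title": "Whaling in literature",
     "abstract": "On the portrayal of whaling in a novel."},
]


def search(records, descriptor, fields):
    """Return titles of records whose chosen fields contain the descriptor."""
    needle = descriptor.lower()
    return [r["title"] for r in records
            if any(needle in r[field].lower() for field in fields)]


# Narrow search: descriptor must appear in the title.
title_only = search(records, "entrepreneurship", ["title"])

# Broader search: descriptor may appear in the title or the abstract.
title_or_abstract = search(records, "entrepreneurship", ["title", "abstract"])

print(title_only)
print(title_or_abstract)
```

Comparing the two result lists shows how the choice of fields trades precision for coverage, which is the experiment tip (a) asks you to carry out.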

Step 4: Analyze the literature

Once you have identified and located the articles for your review, you need to analyze
them and organize them before you begin writing:

1. Overview the articles: Skim the articles to get an idea of the general purpose
and content of each (focus your reading here on the abstract, the introduction
and first few paragraphs, and the conclusion of each article). Tip: as you skim the
articles, you may want to record the notes that you take on each directly into
RefWorks in the box for User 1. You can take notes onto note cards or into a
word processing document instead of, or as well as, using RefWorks, but having
your notes in RefWorks makes it easy to organize your notes later.
2. Group the articles into categories (e.g. into topics and subtopics and
chronologically within each subtopic). Once again, it's useful to enter this
information into your RefWorks record. You can record the topics in the same
box as before (User 1) or use User 2 box for the topic(s) under which you have
chosen to place this article.
3. Take notes:
(a) Decide on the format in which you will take notes as you read the articles (as
mentioned above, you can do this in RefWorks). You can also do this using a
word processor, a concept mapping program like Inspiration, a database
program (e.g. Access or FileMaker Pro), an Excel spreadsheet, or the
"old-fashioned" way of using note cards. Be consistent in how you record notes.
(b) Define key terms: look for differences in the way key terms are defined (note
these differences).
(c) Note key statistics that you may want to use in the introduction to your review.
(d) Select useful quotes that you may want to include in your review. Important: If
you copy the exact words from an article, be sure to cite the page number as you
will need this should you decide to use the quote when you write your review (as
direct quotes must always be accompanied by page references). To ensure that
you have quoted accurately (and to save time in note taking), if you are
accessing the article in a format that allows this, you can copy and paste using
your computer "edit --> copy --> paste" functions. Note: although you may collect
a large number of quotes during the note taking phase of your review, when you
write the review, use quotes very sparingly. The rule I follow is to quote only
when some key meaning would be lost in translation if I were to paraphrase the
original author's words, or if using the original words adds special emphasis to a
point that I am making.
(e) Note emphases, strengths & weaknesses: Since different research studies focus
on different aspects of the issue being studied, each article that you read will
have different emphases, strengths, and weaknesses. Your role as a
reviewer is to evaluate what you read, so that your review is not a mere
description of different articles, but rather a critical analysis that makes sense of
the collection of articles that you are reviewing. Critique the research
methodologies used in the studies, and distinguish between assertions (the
author's opinion) and actual research findings (derived from empirical evidence).
(f) Identify major trends or patterns: As you read a range of articles on your topic,
you should make note of trends and patterns over time as reported in the
literature. This step requires you to synthesize and make sense of what you read,
since these patterns and trends may not be spelled out in the literature, but
rather become apparent to you as you review the big picture that has emerged
over time. Your analysis can make generalizations across a majority of studies,
but should also note inconsistencies across studies and over time.
(g) Identify gaps in the literature, and reflect on why these might exist (based on the
understandings that you have gained by reading literature in this field of study).
These gaps will be important for you to address as you plan and write your
review.
(h) Identify relationships among studies: note relationships among studies, such as
which studies were landmark ones that led to subsequent studies in the same
area. You may also note that studies fall into different categories (categories that
you see emerging or ones that are already discussed in the literature). When you
write your review, you should address these relationships and different
categories and discuss relevant studies using this as a framework.
(i) Keep your review focused on your topic: make sure that the articles you find are
relevant and directly related to your topic. As you take notes, record which
specific aspects of the article you are reading are relevant to your topic (as you
read you will come up with key descriptors that you can record in your notes that
will help you organize your findings when you come to write up your review). If
you are using an electronic form of note taking, you might note these descriptors
in a separate field (e.g. in RefWorks, put these under User 2 or User 3; in Excel
have a separate column for each descriptor; if you use Inspiration, you might
attach a separate note for key descriptors).
(j) Evaluate your references for currency and coverage: Although you can always
find more articles on your topic, you have to decide at what point you are finished
with collecting new resources so that you can focus on writing up your findings.
However, before you begin writing, you must evaluate your reference list to
ensure that it is up to date and has reported the most current work. Typically a
review will cover the last five years, but should also refer to any landmark studies
prior to this time if they have significance in shaping the direction of the field. If
you include studies prior to the past five years that are not landmark studies, you
should defend why you have chosen these rather than more current ones.
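
The currency check described in (j) can be sketched in a few lines of Python. This is only an illustration: the reference records, the `landmark` flag and the function name `needs_justification` are assumptions made for the example, while the five-year window is taken from the guideline above.

```python
from datetime import date


def needs_justification(references, current_year=None, window=5):
    """Return references older than the review window that are not landmark studies."""
    if current_year is None:
        current_year = date.today().year
    cutoff = current_year - window
    return [
        ref for ref in references
        if ref["year"] < cutoff and not ref.get("landmark", False)
    ]


# Hypothetical reference list, invented for illustration only.
references = [
    {"author": "Author A", "year": 2011, "landmark": False},
    {"author": "Author B", "year": 1998, "landmark": True},
    {"author": "Author C", "year": 2003, "landmark": False},
]

flagged = needs_justification(references, current_year=2012)
for ref in flagged:
    print(f"{ref['author']} ({ref['year']}): explain why this older study is included")
```

Any reference the function returns is one that the scholar should either replace with more current work or defend explicitly in the review.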

Step 5: Summarize the literature in table or concept map format

1. Galvan (2006) recommends building tables as a key way to help you overview,
organize, and summarize your findings, and suggests that including one or more
of the tables that you create may be helpful in your literature review. If
you do include tables as part of your review, each must be accompanied by an
analysis that summarizes, interprets and synthesizes the literature that you have
charted in the table. You can plan your table or do the entire summary chart of
your literature using a concept map.
(a) You can create the table using the table feature within Microsoft Word, or can
create it initially in Excel and then copy and paste/import the Excel sheet into
Word once you have completed the table in Excel. The advantage of using Excel
is that it enables you to sort your findings according to a variety of factors (e.g.
sort by date, and then by author; sort by methodology and then date).
(b) Examples of tables that may be relevant to your review:
(i) Definitions of key terms and concepts.
(ii) Research methods
(iii) Summary of research results
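
For scholars who keep their notes in a script rather than a spreadsheet, the same two-level sorting can be sketched in Python. The rows, column names and the helper `to_csv` below are hypothetical and serve only to illustrate the idea; the resulting CSV text can be opened in Excel or pasted into Word.

```python
import csv
import io

# Hypothetical summary-table rows, invented for illustration only.
rows = [
    {"author": "Author C", "year": 2010, "method": "survey"},
    {"author": "Author A", "year": 2010, "method": "case study"},
    {"author": "Author B", "year": 2008, "method": "survey"},
]

# Sort by date, and then by author.
by_date_then_author = sorted(rows, key=lambda r: (r["year"], r["author"]))

# Sort by methodology, and then by date.
by_method_then_date = sorted(rows, key=lambda r: (r["method"], r["year"]))


def to_csv(table):
    """Render the table as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["author", "year", "method"])
    writer.writeheader()
    writer.writerows(table)
    return buffer.getvalue()


print(to_csv(by_date_then_author))
```

Sorting on a tuple of keys is what gives the "first by this, then by that" ordering; swapping the order of the keys changes which factor dominates.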

Step 6: Synthesize the literature prior to writing your review

Using the notes that you have taken and your summary tables, develop an outline of your final
review. The following are the key steps as outlined by Galvan (2006: 71-79):

1. Consider your purpose and voice before beginning to write. Your initial purpose
is to provide an overview of the topic that is of interest to you, demonstrating your
understanding of key works and concepts within your chosen area of focus. You
are also developing skills in reviewing and writing, to provide a foundation for
your final thesis. In your final thesis, your literature review should demonstrate
your command of your field of study and/or establish the context for a study that
you have done.
2. Consider how you reassemble your notes: plan how you will organize your
findings into a unique analysis of the picture that you have captured in your
notes. Important: A literature review is not a series of annotations (like an annotated
bibliography). Galvan (2006:72) captures the difference between an annotated
bibliography and a literature review very well: "...in essence, like describing trees
when you really should be describing a forest. In the case of a literature review,
you are really creating a new forest, which you will build by using the trees you
found in the literature you read."
3. Create a topic outline that traces your argument: first explain to the reader your
line of argument (or thesis); the narrative that follows should then explain and
justify your line of argument.
4. Reorganize your notes according to the path of your argument
5. Within each topic heading, note differences among studies.
6. Within each topic heading, look for obvious gaps or areas needing more
research.
7. Plan to describe relevant theories.
8. Plan to discuss how individual studies relate to and advance theory
9. Plan to summarize periodically and again near the end of the review
10. Plan to present conclusions and implications
11. Plan to suggest specific directions for future research near the end of the review
12. Flesh out your outline with details from your analysis

Step 7: Writing the review (Galvan, 2006: 81-90)

1. Identify the broad problem area, but avoid global statements
2. Early in the review, indicate why the topic being reviewed is important
3. Distinguish between research findings and other sources of information
4. Indicate why certain studies are important
5. If you are commenting on the timeliness of a topic, be specific in describing the
time frame
6. If citing a classic or landmark study, identify it as such
7. If a landmark study was replicated, mention that and indicate the results of the
replication
8. Discuss other literature reviews on your topic
9. Refer the reader to other reviews on issues that you will not be discussing in
detail
10. Justify comments such as, "no studies were found."
11. Avoid long lists of nonspecific references
12. If the results of previous studies are inconsistent or widely varying, cite them
separately
13. Cite all relevant references in the review section of thesis, dissertation, or journal
article

Step 8: Developing a coherent essay (Galvan, 2006: 91-96)

1. If your review is long, provide an overview near the beginning of the review
2. Near the beginning of a review, state explicitly what will and will not be covered
3. Specify your point of view early in the review: this serves as the thesis statement
of the review.
4. Aim for a clear and cohesive essay that integrates the key details of the literature
and communicates your point of view (a literature review is not a series of annotated
articles).
5. Use subheadings, especially in long reviews
6. Use transitions to help trace your argument
7. If your topic reaches across disciplines, consider reviewing studies from each
discipline separately
8. Write a conclusion for the end of the review: Provide closure so that the path of
the argument ends with a conclusion of some kind. How you end the review,
however, will depend on your reason for writing it. If the review was written to
stand alone, as is the case of a term paper or a review article for publication, the
conclusion needs to make clear how the material in the body of the review has
supported the assertion or proposition presented in the introduction. On the other
hand, a review in a thesis, dissertation, or journal article presenting original
research usually leads to the research questions that will be addressed.
9. Check the flow of your argument for coherence.

Reference:

Galvan, J. (2006). Writing literature reviews: a guide for students of the behavioral sciences (3rd
ed.). Glendale, CA: Pyrczak Publishing.

Model: review of literature

Just like most academic papers, literature reviews also must contain at least three basic
elements: an Introduction or background information section; the Body of the review
containing the discussion of sources; and, finally, a Conclusion detailing the future
scope for potential research scholars.

Introduction
Define or identify the general topic, issue, or area of concern, thus providing an appropriate
context for reviewing the literature.

Point out overall trends in what has been published about the topic; or conflicts in theory,
methodology, evidence, and conclusions; or gaps in research and scholarship; or a single
problem or new perspective of immediate interest.

Establish the writer's reason (point of view) for reviewing the literature; explain the criteria to be
used in analyzing and comparing literature and the organization of the review (sequence); and,
when necessary, state why certain literature is or is not included (scope).

Body
Organize the body of the review in any ONE of three broad ways:
(a) Chronological [either by publication or by trend]
(b) Thematic
(c) Methodological

Conclusion
Summarize major contributions of significant studies and articles to the body of knowledge under
review, maintaining the focus established in the introduction.

Evaluate the current "state of the art" for the body of knowledge reviewed, pointing out major
methodological flaws or gaps in research, inconsistencies in theory and findings, and areas or
issues pertinent to future study.

Conclude by providing some insight into the relationship between the central topic of the literature
review and a larger area of study such as a discipline, a scientific endeavor, or a profession.

“Review your goals twice every day in order to be focused on achieving them.”
<< Les Brown

A2Z

PhD
Thesis

Reflections on Academic Research

Chapter VIII

Scope of Research Study

What should be in the scope of study? Scope is simply the boundary of the research. What
can be bounded are the coverage, the data, the analytical method and the applicability of the
output. Hence the scope of study should have at least four paragraphs, one for each of
these four types of research boundary. The scope of coverage, for example, defines which areas
around the subject matter the research covers and which it does not.

The scope of the study means the specific areas that the researcher wants to
cover in his/her study. Searching for resources on the researcher’s subject will also be more
effective if he/she has already defined the scope of the research. The questions to
consider when setting the research scope are:

(i) Does the research cover a particular time period?
(ii) Does the study cover a specific geographical area?
(iii) If the study involves people, what age group, gender and place of origin are to be included?
(iv) Are all dates of publication to be included?
(v) Is the research going to cover publications from other countries?
(vi) Will the research include other languages and scripts?
(vii) Are all perspectives to be considered? For example, philosophical, political, psychological,
etc.

This means that the scope of the study refers to the specific elements and
content that the researcher wants to explore in his/her study. He/she is responsible
for setting the scope realistically: too broad a scope will make the study take longer,
while too narrow a scope can make the study not worthwhile. The scope of study is
usually one of the first sections of the thesis. It sets out the scope of the work and its
limitations. The research scholar should add as much detail as possible when describing
what is under research, why the research is being done and how it is being done. The
examiners will want to know why a particular area falls under the research topic and what the
scholar proposes to find out or expects to discover. This sets out the purpose of the
research and gives the reader the scholar’s expectations of the thesis.

Difference between limitation and scope

What are the scope and the limitations of a study, and what is the difference between
them? Walonick (2005) explained that all research studies
have limitations and a finite scope. Limitations are the restraints or
obstacles that the study may face while it is being carried out. In most
research, limitations are imposed by time and budget constraints; these two are
the most common restraining factors and can have a significant impact on
the study. If the limitations are not clearly defined and handled carefully, the study may end
up being invalid. The researcher should list the limitations of the study precisely and
describe the extent to which he/she believes they degrade the quality of the
research. This will help the researcher and the audience understand the actual situation
that the study encountered.

We can conclude that the differences between the limitations and the scope of the research are:
(a) The research scope is the specific area that the researcher wants to cover in his/her study,
while the research limitations are the constraints and obstacles that the researcher expects to
encounter during the study; and
(b) The research limitations are beyond the researcher’s control, since they involve external factors
outside the researcher’s authority, while the research scope is under the researcher’s control and
manageable by him/her. The researcher determines what is to be covered and what is to
be left out.

“Academic success depends on research and publications.”
<< Philip Zimbardo

Chapter IX

LIMITATIONS

Introduction

Without exception, all research is limited in several ways. There are internal or formal
limitations, such as the materials and procedures used, the ways in which critical terms
are defined, the scope of the problem explored and of the applicability of the results. And
there are external limitations as well, governed by constraints upon one’s time or
pocketbook; the inability to travel to special collections, museums, or libraries, or to
speak or read other languages; or to consider an evolving political situation beyond a
certain date. These limitations should be acknowledged; indeed, identifying them may
help the scholar to focus the topic. However, problems such as time and money
difficulties do not relieve the scholar of the responsibility of designing a study that can
adequately test the hypothesis and measure its results. Proposals that include no
mention of limitations suggest that the scholar has not really gone beyond a superficial
consideration of the subject. This section of the thesis, therefore, will require
considerable thought. But close attention now to these and related questions will save
the scholar much time and discomfort in later stages of research and writing.

There is no ‘one best way’ to structure the Research Limitations section of the thesis.
However, a structure based on three ‘moves’ is suggested: [a] the announcing, [b]
reflecting, and [c] forward looking move. The announcing move immediately allows the
scholar to identify the limitations of the thesis and explain how important each of these
limitations is. The reflecting move provides greater depth, helping to explain the nature
of the limitations and justify the choices that the scholar made during the research
process. Finally, the forward looking move enables the scholar to suggest how such
limitations could be overcome in future. The collective aim of these three moves is to
help the scholar walk the reader through the Research Limitations section in a succinct
and structured way. This will make it clear to the reader that the scholar (a) recognises
the limitations of his/her own research, (b) understands why such factors are limitations,
and (c) can point to ways of combating these limitations if future research were
carried out.

[a] Announcing Move
Identifying limitations: What they are and how important they are

Overall, the announcing move should be around 10-15% of the total word count of
the Research Limitations section.

There are many possible limitations that the research may have faced. Four
main types of research limitation are:

(a) An inability to answer your research questions
(b) Theoretical and conceptual problems
(c) Limitations of your research strategy
(d) Problems of research quality

Even though there may be a large number of limitations in any thesis, it is not necessary to
discuss all of these limitations in the Research Limitations section. After all, the scholar is not
writing a 2000 word critical review of the limitations of the thesis, just a 400-700 word critique
that is just one section long (i.e. the Research Limitations section within
your Conclusions chapter). Therefore, in this first announcing move, it is recommended that
the scholar identifies only those limitations that had the greatest potential impact on:

(a) the quality of the findings
(b) the ability to effectively answer the research questions and/or hypotheses

We use the word potential when we talk about the impact that these research limitations
could have had on the thesis because we often do not know the degree to which
different factors limited the findings or our ability to effectively answer the research
questions and/or hypotheses.

For example, we know that when adopting a quantitative research design, a failure to
use a probability sampling technique significantly limits our ability to make
broader generalisations from our results (i.e. our ability to make statistical
inferences from our sample to the population being studied). However, the degree to
which this reduces the quality of our findings is a matter of debate. Also, whilst the lack
of a probability sampling technique when using a quantitative research design is a very
obvious example of a research limitation, other limitations are far less clear.

Therefore, the key point is to focus on those limitations that you feel had the greatest
impact on your findings, as well as your ability to effectively answer your research
questions and/or hypotheses.

[b] Reflecting Move
Explaining the nature of the limitations and justifying the choices you made

Having identified the most important limitations to the thesis in the announcing move, the
reflecting move focuses on explaining the nature of these limitations and justifying the
choices that the scholar made during the research process. This part should be around
60-70% of the total word count of the Research Limitations section.

It is important to remember at this stage that all research suffers from limitations,
whether it is performed by undergraduate or postgraduate dissertation students
or by seasoned academics. Acknowledging such limitations should not be viewed as
highlighting a weakness to the person evaluating the thesis. Instead, the reader
is more likely to accept that the scholar recognises the limitations of his/her own research if
the scholar writes a high quality reflecting move. This is because explaining the
limitations of the research and justifying the choices made during the
research process demonstrates the command that the scholar had over the research.

We talk about explaining the nature of the limitations in the thesis because such
limitations are highly research specific. Let’s take the example of potential limitations to
the sampling strategy.

Whilst the scholar may have a number of potential limitations in sampling strategy, let’s
focus on the lack of probability sampling; that is, of all the different types of sampling
technique that one could have used, the scholar chose not to use a probability
sampling technique (e.g. simple random sampling, systematic random sampling,
stratified random sampling). As mentioned, if the scholar used a quantitative research
design in the thesis, the lack of probability sampling is an important, obvious limitation to
one’s research. This is because it prevents the scholar from making generalisations
about the population under study (e.g. Facebook usage at a single university of 20,000
students) from the data the scholar has collected (e.g. a survey of 400 students at the
same university). Since an important component of quantitative research is such
generalisation, this is a clear limitation. However, the lack of a probability sampling
technique is not viewed as a limitation if one used a qualitative research design. In
qualitative research designs, a non-probability sampling technique is typically selected
over a probability sampling technique.

And the problem is not yet over, but ...

Even if the scholar used a quantitative research design, but failed to employ a probability
sampling technique, there are still many perfectly justifiable reasons why he/she could have
made such a choice. For example, it may have been impossible (or nearly impossible)
to get a list of the population one was studying (e.g. a list of all the 20,000 students at
the single university one was interested in). Since probability sampling is only
possible when we have such a list, the lack of such a list or inability to attain such a list is
a perfectly justifiable reason for not using a probability sampling technique; even if such
a technique is the ideal.
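
To make the role of the population list concrete, the three probability sampling techniques named above can be sketched in Python. The sketch assumes a hypothetical sampling frame of 20,000 student ID numbers and a sample of 400, echoing the example in the text; the function names and the four strata (ID modulo 4, standing in for something like year of study) are invented for illustration.

```python
import random


def simple_random_sample(frame, n, rng):
    """Every unit in the frame has the same chance of selection."""
    return rng.sample(frame, n)


def systematic_random_sample(frame, n, rng):
    """Pick a random start, then take every k-th unit from the list."""
    step = len(frame) // n
    start = rng.randrange(step)
    return [frame[start + i * step] for i in range(n)]


def stratified_random_sample(frame, stratum_of, n, rng):
    """Sample each stratum in proportion to its share of the frame.

    Rounding the shares can make the total differ slightly from n
    when the strata are uneven.
    """
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, share))
    return sample


rng = random.Random(42)           # fixed seed so the sketch is repeatable
frame = list(range(1, 20001))     # hypothetical list of all 20,000 student IDs

srs = simple_random_sample(frame, 400, rng)
systematic = systematic_random_sample(frame, 400, rng)
stratified = stratified_random_sample(frame, lambda sid: sid % 4, 400, rng)
print(len(srs), len(systematic), len(stratified))
```

Each function takes the complete list of the population as its first argument, which is why none of these techniques can be used when such a list cannot be obtained.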

[c] Forward looking Move
Suggesting how such limitations could be overcome in future

Finally, the forward looking move builds on the reflecting move by suggesting how the
limitations that have been discussed could be overcome through future research. Whilst
a lot could be written in this part of the Research Limitations section, we would
recommend that it is only around 10-20% of the total word count for this section.
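
The suggested proportions for the three moves can be turned into rough word counts. The following Python sketch simply applies the percentages given in this chapter (10-15%, 60-70% and 10-20%) to a section length of the scholar's choosing; the function name `move_word_budget` is an assumption made for the example, and the figures are guidance rather than fixed rules.

```python
# Percentage ranges for each move, as suggested in this chapter.
MOVE_SHARES = {
    "announcing": (10, 15),
    "reflecting": (60, 70),
    "forward looking": (10, 20),
}


def move_word_budget(section_words):
    """Return the (minimum, maximum) word count for each move."""
    return {
        move: (section_words * low // 100, section_words * high // 100)
        for move, (low, high) in MOVE_SHARES.items()
    }


# The Research Limitations section is described as a 400-700 word critique.
for total in (400, 700):
    print(total, move_word_budget(total))
```

For a 400-word section this gives 40-60 words for the announcing move, 240-280 for the reflecting move and 40-80 for the forward looking move.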

Every study, no matter how well it is conducted, has some limitations.
This is why it does not seem reasonable to use the words "prove" and "disprove" with
respect to research findings. It is always possible that future research may cast doubt
on the validity of any hypothesis or the conclusions from a study.

Limitations of Case Studies

Case studies may be viewed as having the most limitations. You cannot draw
causal conclusions from case studies, because alternative explanations cannot be ruled
out. The generality of the findings of a case study is also always unclear. A case study
involves the behavior of one person, and the behavior of one
person may not reflect the behavior of most people. Thus, we do not know how others
may behave.

Limitations of Correlational Studies

Correlational research has the same limitations as case studies. Correlational
research merely demonstrates that we can predict one variable from another, that is,
that two variables are associated. However, two variables can be
associated without there being a causal relationship between the variables. We cannot
make causal conclusions from correlational findings because we cannot rule out all
alternative explanations for correlational findings. Thus, making causal conclusions from
correlational findings is a logical error. If we find that A is associated with B, it could
mean that A caused B, B caused A, or some third variable caused both A and B without
there being any causal relationship between A and B. Even if we could rule out one of
the possible relationships (e.g., B caused A), we cannot rule out all alternative
explanations from correlational studies. For every correlational study, there is the
possibility that some third variable caused the two variables without there being a causal
relationship between the variables. Correlational research may also have limitations with
respect to the generality of the findings. Perhaps the study involved a specific group of
people, or the relation between the variables was only investigated in some situations.
Thus, it may be uncertain whether the correlational findings may generalize to other
people or situations.
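
The third-variable problem can be illustrated with a small simulation. In the Python sketch below, the data are entirely artificial: a variable Z is generated first, and A and B are each built from Z plus independent noise, so that neither A nor B has any causal effect on the other. The variable names, sample size and noise levels are assumptions chosen for the illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000

# Z is the third variable; A and B both depend on Z but not on each other.
z = [rng.gauss(0, 1) for _ in range(n)]
a = [zi + rng.gauss(0, 0.5) for zi in z]
b = [zi + rng.gauss(0, 0.5) for zi in z]

r_ab = pearson_r(a, b)

# Remove the known contribution of Z and correlate what is left.
a_rest = [ai - zi for ai, zi in zip(a, z)]
b_rest = [bi - zi for bi, zi in zip(b, z)]
r_rest = pearson_r(a_rest, b_rest)

print(f"correlation of A and B: {r_ab:.2f}")
print(f"correlation after removing Z: {r_rest:.2f}")
```

A and B turn out to be strongly associated even though the association is produced entirely by Z; once the contribution of Z is removed, almost nothing remains. In a real correlational study the third variable is usually unmeasured, which is why this alternative explanation cannot be ruled out.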

Limitations of Randomized Experiments

Experiments involving the random assignment of participants to conditions may allow us
to make causal conclusions if the variables that are manipulated are not confounded
with other variables. However, there still may be limitations with respect to the generality
of the findings. The experiment may have involved a specific group of people, certain
situations, and only some of the possible conceptualizations of variables. Thus, we may
not know whether the findings will generalize to other people, situations, or
conceptualizations of the variables.
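
Random assignment itself is simple to carry out. The Python sketch below shuffles a hypothetical list of twenty participant codes and deals them alternately into two conditions; the participant codes, the condition names and the function name `randomly_assign` are all invented for the example.

```python
import random


def randomly_assign(participants, conditions, rng):
    """Shuffle the participants, then deal them out to the conditions in turn."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Hypothetical participant codes P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, ["treatment", "control"], random.Random(7))

for condition, members in groups.items():
    print(condition, sorted(members))
```

Because chance alone decides who lands in which condition, pre-existing differences between participants are not systematically tied to the treatment. The sketch says nothing, however, about whether these twenty participants resemble the wider population, which is the generality limitation described above.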

Regardless of whether the proposal is intended to secure funding for research or
approval from a doctoral student's dissertation committee, it is usually expected that at
some point the proposal will include a section to make explicit what the researcher does
not intend to accomplish (or what the design of the study inherently will not allow). Like
other sections of the proposal, such a statement is as much for the benefit of the writer
as it is for the benefit of the reader.

Once a statement of limitations has been prepared, the question about where in the
proposal to place it arises. A logical place is near the end of the problem statement
section, somewhere after the statement of purpose. Elsewhere in the proposal, the
researcher may have repeated a general statement of purpose, "the purpose of this
project is...", which presents another opportunity for including the limitations and
delimitations of the study. Again, that may have been at the end of the problem
statement, preceding a justification for selecting the problem in the first place. Another
juncture may have occurred somewhere in the proximity of the section devoted to the
conceptual framework (design of the study).

“Once we accept our limits, we go beyond them.”
<< Albert Einstein

Chapter X

OBJECTIVES

Introduction

Objectives are specific, observable, and measurable learning outcomes. Objectives
emphasize major points and reduce non-essential material. Objectives simplify note
taking and cue the scholars to emphasize major points. Objectives assist scholars in
organizing and studying content material. They guide the scholars to what is expected
from them and help them to study important information. The objectives of a research
project summarize what is to be achieved by the study. These objectives should be
closely related to the research problem. The general objective of a study states what
researchers expect to achieve by the study in general terms. It is possible (and
advisable) to break down a general objective into smaller, logically connected parts.
These are normally referred to as specific objectives. Specific objectives should
systematically address the various research questions. They should specify what the
scholar will do in his/her study, where and for what purpose.

Objectives of the Objectives

The objectives of the research should be closely related to the research study of the
thesis. The main purpose of the research objectives is to focus on the research problem,
avoid the collection of unnecessary data and provide direction to the research study.
Research is related to the aspiration, and objectives are related to the battle-plan.
Objectives should be specific, measurable, achievable, realistic and timely, so that
the research problem can be explored effectively.

Specific: Objectives should be clear and well defined. This helps to specify the research problem
and provides a proper guideline for finding its solution (Alexander, 2008). Specific
objectives identify the methods of collecting the necessary information related to the research
problem.

Measurable: Objectives should be measurable. This improves the quality and quantity of the
research study in achieving its goal. Measurable research objectives provide guidelines for the
improvement of the research design. Measurability is an important element in achieving the research objectives.

Achievable: Objectives should also be achievable within the available time, and should yield accurate
results from the use of sufficient resources in the specific time frame. This is related to the effective
measurement of the research problem (Atkinson, 2001). Achievable objectives ensure that every process
of the research is finished on time, which helps to achieve the goals.

Realistic: Objectives should be realistic, so that available resources such as men, money and
machines can be used effectively. Objectives are most useful when they accurately define the
problem and set out steps that can be implemented within a specific time period.

Timely: Objectives should be measurable and achievable within the time frame. The research should allow
enough time for finding the solution to the research problem. The timeline indicates when each objective will
be accomplished (Frey & Osterloh, 2002).

Formulating the Objectives

The formulation of objectives will help you to:

• Focus the study (narrowing it down to essentials);
• Avoid the collection of data which are not strictly necessary for understanding
and solving the problem you have identified;
• Organize the study in clearly defined parts or phases.

Properly formulated, specific objectives will facilitate the development of your research
methodology and will help to orient the collection, analysis, interpretation and utilization
of data.

It is important that the objectives, especially in a research study, are stated
well. Ensure that the objectives of the research study:

• Cover the different aspects of the problem and its contributing factors in a
coherent way and in a logical sequence;
• Are clearly phrased in operational terms, specifying exactly what you are going to
do, where, and for what purpose;
• Are realistic considering local conditions;
• Use action verbs that are specific enough to be evaluated (Examples of action
verbs are: to determine, to compare, to verify, to calculate, to describe, and to
establish). Avoid the use of vague non-action verbs (Examples of non-action
verbs: to appreciate, to understand, or to study).
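
A draft list of objectives can be screened against this advice with a short script. The Python sketch below uses only the example verbs listed above; the function name `classify_objective` and the three sample objectives are hypothetical, and a verb reported as "unlisted" is not necessarily a poor choice, only one that the scholar should judge for himself/herself.

```python
# Example verbs taken from the guidance above.
ACTION_VERBS = {"determine", "compare", "verify", "calculate", "describe", "establish"}
VAGUE_VERBS = {"appreciate", "understand", "study"}


def classify_objective(objective):
    """Classify an objective by its leading verb: action, vague, unlisted or empty."""
    words = objective.lower().split()
    if words and words[0] == "to":
        words = words[1:]
    if not words:
        return "empty"
    verb = words[0].strip(".,;:")
    if verb in ACTION_VERBS:
        return "action"
    if verb in VAGUE_VERBS:
        return "vague"
    return "unlisted"


# Hypothetical objectives, invented for illustration only.
draft_objectives = [
    "To compare job satisfaction across two departments",
    "To understand employee motivation",
    "To explore training needs",
]

for objective in draft_objectives:
    print(f"{classify_objective(objective):9s} {objective}")
```

An objective flagged as "vague" is a candidate for rewording with a verb whose outcome the evaluators can actually check.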

It is worth remembering that when the thesis is evaluated, the results will be
compared to the objectives. If the objectives have not been spelled out clearly, or have been stated
ambiguously, the thesis will suffer during the evaluation, leading in some cases to
rejection of the research thesis.

Designing the Objectives

There are four components of an objective: (a) the action verb, (b) conditions, (c)
standard, and (d) the intended audience (always the research scholars). The action verb
is the most important element of an objective and can never be omitted. The action verb
states precisely what the scholars will do following instruction. Verbs are categorized by
domains of learning and various hierarchies. The three domains of learning are the
cognitive domain that emphasizes thinking; the affective domain highlighting attitudes
and feelings; and the psychomotor domain featuring doing.

The cognitive domain is further divided into six levels or hierarchies. They are:

Cognitive (Thinking) Domain

Knowledge
Comprehension
Application
Analysis
Synthesis
Evaluation

Sometimes these six hierarchies or levels listed above are grouped into three
categories:

Level 1. Recall – Knowledge and Comprehension
Level 2. Interpretation – Application and Analysis
Level 3. Problem-Solving – Synthesis and Evaluation

Recall objectives are at the basic taxonomic level and involve recall or description of
information. Interpretation is a higher level of learning and involves application and
examination of knowledge. Problem-solving skills test the highest level of learning and
involve construction and assessment of knowledge.

Scholars should remember that the objectives of a research study form and define the
direction and path of the research journey. Greater and more meticulous attention given here
will put the scholars at ease in the later stages of the research study.

Some Guidelines for designing the Objectives:

Objectives should:

1. be presented concisely and briefly
2. be interrelated. The aim is what you want to achieve, and the objective describes
how you are going to achieve that aim i.e.:
a. make sure that each aim is matched with specific objectives
3. be realistic about what you can accomplish in the duration of the project and the
other commitments you have i.e.:
a. the scope of the research study must be consistent with the time frame
and level of effort available for research study.
4. provide the scholar and, at a later stage, the thesis evaluators with indicators of how
the scholar:
a. intends to approach the literature and theoretical issues related to
research
b. intends to access chosen research areas, respondents, units, goods or
services and develop a sampling frame and strategy or a rationale for
their selection
c. will develop a strategy and design for data collection and analysis
d. intends to deal with ethical and practical problems in the research undertaken

Objectives should not:

1. be too vague, ambitious or broad in scope:
   a. though aims are more general in nature than objectives, it is the viability
   and feasibility of the research study that the scholar has to demonstrate,
   and aims often present an over-optimistic picture of what the study can
   achieve
2. just repeat each other in different terms
3. just be a list of things related to the research topic
4. spend time discussing details of the research job or research site, i.e.:
   a. it is your research study that the thesis evaluators are interested in, and a
   scholar should keep this in mind at all times.
5. contradict methods, that is, they should not imply methodological goals or
standards of measurement, proof or generalizability of findings that the methods
cannot sustain.

“Failure comes only when we forget our ideals and objectives and principles.”
<< Jawaharlal Nehru

Chapter XI

Research Design

RESEARCH DESIGN

Introduction

The research design refers to the strategy a scholar chooses to integrate the different
components of the study in a cohesive and coherent way in order to address the
research problem; it constitutes the blueprint for the collection, measurement, and
analysis of data.

Note: The research problem determines the type of design one can use, not the other way around!

I - Action Research Design

Definition and Purpose

The essentials of action research design follow a characteristic cycle: initially an
exploratory stance is adopted, in which an understanding of a problem is developed and
plans are made for some form of interventionary strategy. Then the intervention is
carried out (the action in Action Research), during which time pertinent observations are
collected in various forms. New interventional strategies are then carried out, and the
cyclic process repeats, continuing until a sufficient understanding of (or implementable
solution for) the problem is achieved. The protocol is iterative or cyclical in nature and is
intended to foster deeper understanding of a given situation, starting with
conceptualizing and particularizing the problem and moving through several
interventions and evaluations.

Advantages

1. A collaborative and adaptive research design that lends itself to use in work or
community situations.
2. Design focuses on pragmatic and solution-driven research rather than testing theories.
3. When practitioners use action research it has the potential to increase the amount they
learn consciously from their experience. The action research cycle can also be regarded
as a learning cycle.
4. Action research studies often have direct and obvious relevance to practice.
5. There are no hidden controls or preemption of direction by the researcher.

Disadvantages

1. It is harder to do than conducting conventional studies because the researcher takes on
responsibilities for encouraging change as well as for research.
2. Action research is much harder to write up because you probably can’t use a standard
format to report your findings effectively.
3. Personal over-involvement of the researcher may bias research results.
4. The cyclic nature of action research to achieve its twin outcomes of action (e.g. change)
and research (e.g. understanding) is time-consuming and complex to conduct.

II - Case Study Design

Definition and Purpose

A case study is an in-depth study of a particular research problem rather than a
sweeping statistical survey. It is often used to narrow down a very broad field of research
into one or a few easily researchable examples. The case study research design is also
useful for testing whether a specific theory and model actually apply to phenomena in
the real world. It is a useful design when not much is known about a phenomenon.

Advantages

1. Approach excels at bringing us to an understanding of a complex issue through detailed
contextual analysis of a limited number of events or conditions and their relationships.
2. A researcher using a case study design can apply a variety of methodologies and rely on
a variety of sources to investigate a research problem.
3. Design can extend experience or add strength to what is already known through previous
research.
4. Social scientists, in particular, make wide use of this research design to examine
contemporary real-life situations and provide the basis for the application of concepts and
theories and extension of methods.
5. The design can provide detailed descriptions of specific and rare cases.

Disadvantages

1. A single case or a small number of cases offers little basis for establishing reliability or for
generalizing the findings to a wider population of people.
2. The intense exposure to study of the case may bias a researcher's interpretation of the
findings.
3. Design does not facilitate assessment of cause and effect relationships.
4. Vital information may be missing, making the case hard to interpret.
5. The case may not be representative or typical of the larger problem being investigated.
6. If a case is selected because it represents a very unusual or unique
phenomenon or problem for study, then your interpretation of the findings can only apply
to that particular case.

III - Causal Design

Definition and Purpose

Causality studies may be thought of as understanding a phenomenon in terms of
conditional statements in the form, “If X, then Y.” This type of research is used to
measure what impact a specific change will have on existing norms and assumptions.
Most social scientists seek causal explanations that reflect tests of hypotheses. Causal
effect (nomothetic perspective) occurs when variation in one phenomenon, an
independent variable, leads to or results, on average, in variation in another
phenomenon, the dependent variable.

Conditions necessary for determining causality:

• Empirical association: a valid conclusion is based on finding an association
between the independent variable and the dependent variable.
• Appropriate time order: to conclude that causation was involved, one must see
that cases were exposed to variation in the independent variable before variation in
the dependent variable.
• Nonspuriousness: a relationship between two variables that is not due to
variation in a third variable.

Advantages

1. Causality research designs help researchers understand why the world works the way it
does through the process of proving a causal link between variables and eliminating
other possibilities.
2. Replication is possible.
3. There is greater confidence the study has internal validity due to the systematic subject
selection and equity of groups being compared.

Disadvantages

1. Not all relationships are causal! The possibility always exists that, by sheer coincidence,
two unrelated events appear to be related [e.g., Punxsutawney Phil could accurately
predict the duration of winter for five consecutive years but, the fact remains, he's just a
big, furry rodent].
2. Conclusions about causal relationships are difficult to determine due to a variety of
extraneous and confounding variables that exist in a social environment. This means
causality can only be inferred, never proven.
3. If two variables are correlated, the cause must come before the effect. However, even
though two variables might be causally related, it can sometimes be difficult to determine
which variable comes first and therefore to establish which variable is the actual cause
and which is the actual effect.

IV - Cohort Design

Definition and Purpose

Often used in the medical sciences, but also found in the applied social sciences, a
cohort study generally refers to a study conducted over a period of time involving
members of a population which the subject or representative member comes from, and
who are united by some commonality or similarity. Using a quantitative framework, a
cohort study makes note of statistical occurrence within a specialized subgroup, united
by same or similar characteristics that are relevant to the research problem being
investigated, rather than studying statistical occurrence within the general population.
Using a qualitative framework, cohort studies generally gather data using methods of
observation. Cohorts can be either "open" or "closed."

• Open Cohort Studies [dynamic populations, such as the population of Los
Angeles] involve a population that is defined just by the state of being a part of the
study in question (and being monitored for the outcome). Date of entry and exit from
the study is individually defined; therefore, the size of the study population is not
constant. In open cohort studies, researchers can only calculate rate-based data, such
as incidence rates and variants thereof.
• Closed Cohort Studies [static populations, such as patients entered into a clinical
trial] involve participants who enter into the study at one defining point in time and
where it is presumed that no new participants can enter the cohort. Given this, the
number of study participants remains constant (or can only decrease).
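
As a rough illustration of the rate-based data mentioned for open cohorts, the sketch below computes an incidence rate as the number of new cases divided by the total person-time of follow-up. The participant records, the field names, and the function `incidence_rate` are hypothetical and chosen only for illustration; they are not drawn from any actual study.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """One member of a hypothetical open cohort."""
    entry_year: float        # when the person joined the study
    exit_year: float         # when follow-up ended (event, dropout, or study end)
    developed_outcome: bool  # whether the outcome occurred during follow-up


def incidence_rate(cohort):
    """Return new cases per person-year of follow-up."""
    person_years = sum(p.exit_year - p.entry_year for p in cohort)
    cases = sum(1 for p in cohort if p.developed_outcome)
    if person_years <= 0:
        raise ValueError("cohort has no follow-up time")
    return cases / person_years


# Illustrative data only: entry and exit dates differ per person,
# so the size of the cohort is not constant over the study period.
cohort = [
    Participant(2000.0, 2005.0, False),  # 5 person-years
    Participant(2001.0, 2003.0, True),   # 2 person-years, developed the outcome
    Participant(2002.0, 2005.0, False),  # 3 person-years
]

rate = incidence_rate(cohort)
print(f"{rate:.2f} cases per person-year")
```

Dividing by person-time rather than by a head count is what allows participants with different entry and exit dates to be combined in a single rate.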

Advantages

1. The use of cohorts is often mandatory because a randomized control study may be
unethical. For example, you cannot deliberately expose people to asbestos, you can only
study its effects on those who have already been exposed. Research that measures risk
factors often relies on cohort designs.
2. Because cohort studies measure potential causes before the outcome has occurred, they
can demonstrate that these “causes” preceded the outcome, thereby avoiding the debate
as to which is the cause and which is the effect.
3. Cohort analysis is highly flexible and can provide insight into effects over time and related
to a variety of different types of changes [e.g., social, cultural, political, economic, etc.].
4. Either original data or secondary data can be used in this design.

Disadvantages

1. In cases where a comparative analysis of two cohorts is made [e.g., studying the effects
of one group exposed to asbestos and one that has not], a researcher cannot control for
all other factors that might differ between the two groups. These factors are known as
confounding variables.
2. Cohort studies can end up taking a long time to complete if the researcher must wait for
the conditions of interest to develop within the group. This also increases the chance that
key variables change during the course of the study, potentially impacting the validity of
the findings.
3. Because of the lack of randomization in the cohort design, its external validity is lower
than that of study designs where the researcher randomly assigns participants.

V - Cross-Sectional Design

Definition and Purpose

Cross-sectional research designs have three distinctive features: no time dimension; a
reliance on existing differences rather than change following intervention; and groups
that are selected based on existing differences rather than random allocation. The
cross-sectional design can only measure differences between or from among a variety of
people, subjects, or phenomena rather than change. As such, researchers using this
design can only employ a relatively passive approach to making causal inferences based
on findings.

Advantages

1. Cross-sectional studies provide a 'snapshot' of the outcome and the characteristics
associated with it, at a specific point in time.
2. Unlike the experimental design where there is an active intervention by the researcher to
produce and measure change or to create differences, cross-sectional designs focus on
studying and drawing inferences from existing differences between people, subjects, or
phenomena.
3. Entails collecting data at and concerning one point in time. While longitudinal studies
involve taking multiple measures over an extended period of time, cross-sectional
research is focused on finding relationships between variables at one moment in time.
4. Groups identified for study are purposely selected based upon existing differences in the
sample rather than seeking random sampling.
5. Cross-sectional studies are capable of using data from a large number of subjects and,
unlike observational studies, are not geographically bound.
6. Can estimate prevalence of an outcome of interest because the sample is usually taken
from the whole population.
7. Because cross-sectional designs generally use survey techniques to gather data, they
are relatively inexpensive and take up little time to conduct.

Disadvantages

1. Finding people, subjects, or phenomena to study that are very similar except in one
specific variable can be difficult.
2. Results are static and time bound and, therefore, give no indication of a sequence of
events or reveal historical contexts.
3. Studies cannot be utilized to establish cause and effect relationships.
4. Provide only a snapshot of analysis so there is always the possibility that a study could
have differing results if another time-frame had been chosen.
5. There is no follow up to the findings.

VI - Descriptive Design

Definition and Purpose

Descriptive research designs help provide answers to the questions of who, what, when,
where, and how associated with a particular research problem; a descriptive study
cannot conclusively ascertain answers to why. Descriptive research is used to obtain
information concerning the current status of the phenomena and to describe "what
exists" with respect to variables or conditions in a situation.

Advantages

1. The subject is observed in a completely natural and unchanged
environment. True experiments, whilst giving analyzable data, often adversely influence
the normal behavior of the subject.
2. Descriptive research is often used as a precursor to more quantitative research
designs, the general overview giving some valuable pointers as to what variables are
worth testing quantitatively.
3. If its limitations are understood, a descriptive study can be a useful tool in developing a
more focused study.
4. Descriptive studies can yield rich data that lead to important recommendations.
5. Approach collects a large amount of data for detailed analysis.

Disadvantages

1. The results from descriptive research cannot be used to discover a definitive answer or
to disprove a hypothesis.
2. Because descriptive designs often utilize observational methods [as opposed to
quantitative methods], the results cannot be replicated.
3. The descriptive function of research is heavily dependent on instrumentation for
measurement and observation.

VII - Experimental Design

Definition and Purpose

An experimental design is a blueprint of the procedure that enables the researcher to maintain control over all
factors that may affect the result of an experiment. In doing this, the researcher attempts
to determine or predict what may occur. Experimental Research is often used where
there is time priority in a causal relationship (cause precedes effect), there is consistency
in a causal relationship (a cause will always lead to the same effect), and the magnitude
of the correlation is great. The classic experimental design specifies an experimental
group and a control group. The independent variable is administered to the experimental
group and not to the control group, and both groups are measured on the same
dependent variable. Subsequent experimental designs have used more groups and
more measurements over longer periods. True experiments must have control,
randomization, and manipulation.
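
The randomization step of the classic two-group design can be sketched in a few lines. The example below is a minimal illustration, assuming twenty hypothetical subject IDs and an even split; the function name `randomly_assign` and the seed value are assumptions made for the example, not part of any prescribed procedure.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle subjects and split them into experimental and control groups."""
    rng = random.Random(seed)   # a fixed seed makes the allocation reproducible
    shuffled = list(subjects)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 hypothetical subject IDs
experimental, control = randomly_assign(subjects, seed=42)

# Manipulation: only the experimental group receives the independent variable.
# Control: both groups are then measured on the same dependent variable.
print("Experimental:", experimental)
print("Control:", control)
```

Because chance alone decides group membership, pre-existing differences between subjects are expected to balance out across the two groups.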

Advantages

1. Experimental research allows the researcher to control the situation. In so doing, it allows
researchers to answer the question, “what causes something to occur?”
2. Permits the researcher to identify cause and effect relationships between variables and to
distinguish placebo effects from treatment effects.
3. Experimental research designs support the ability to limit alternative explanations and to
infer direct causal relationships in the study.
4. Approach provides the highest level of evidence for single studies.

Disadvantages

1. The design is artificial, and results may not generalize well to the real world.
2. The artificial settings of experiments may alter subject behaviors or responses.
3. Experimental designs can be costly if special equipment or facilities are needed.
4. Some research problems cannot be studied using an experiment because of ethical or
technical reasons.
5. It is difficult to apply ethnographic and other qualitative methods to experimentally
designed research studies.

VIII - Exploratory Design

Definition and Purpose

An exploratory design is conducted about a research problem when there are few or no
earlier studies to refer to. The focus is on gaining insights and familiarity for later
investigation, or it is undertaken when problems are in a preliminary stage of investigation.

The goals of exploratory research are intended to produce the following possible insights:

• Familiarity with basic details, settings and concerns.
• Well-grounded picture of the situation being developed.
• Generation of new ideas and assumptions, development of tentative theories or
hypotheses.
• Determination about whether a study is feasible in the future.
• Issues get refined for more systematic investigation and formulation of new
research questions.
• Direction for future research and techniques get developed.

Advantages

1. Design is a useful approach for gaining background information on a particular topic.
2. Exploratory research is flexible and can address research questions of all types (what,
why, how).
3. Provides an opportunity to define new terms and clarify existing concepts.
4. Exploratory research is often used to generate formal hypotheses and develop more
precise research problems.
5. Exploratory studies help establish research priorities.

Disadvantages

1. Exploratory research generally utilizes small sample sizes and, thus, findings are typically
not generalizable to the population at large.
2. The exploratory nature of the research inhibits an ability to make definitive conclusions
about the findings.
3. The research process underpinning exploratory studies is flexible but often unstructured,
leading to only tentative results that have limited value in decision-making.
4. Design lacks rigorous standards applied to methods of data gathering and analysis
because one of the areas for exploration could be to determine what method or
methodologies best fit the research problem.

IX - Historical Design

Definition and Purpose

The purpose of a historical research design is to collect, verify, and synthesize evidence
from the past to establish facts that defend or refute your hypothesis. It uses secondary
sources and a variety of primary documentary evidence, such as logs, diaries, official
records, reports, archives, and non-textual information [maps, pictures, audio and visual
recordings]. The limitation is that the sources must be both authentic and valid.

Advantages

1. The historical research design is unobtrusive; the act of research does not affect the
results of the study.

2. The historical approach is well suited for trend analysis.
3. Historical records can add important contextual background required to more fully
understand and interpret a research problem.
4. There is no possibility of researcher-subject interaction that could affect the findings.
5. Historical sources can be used over and over to study different research problems or to
replicate a previous study.

Disadvantages

1. The ability to fulfill the aims of your research is directly related to the amount and quality
of documentation available to understand the research problem.
2. Since historical research relies on data from the past, there is no way to manipulate it to
control for contemporary contexts.
3. Interpreting historical sources can be very time consuming.
4. The sources of historical materials must be archived consistently to ensure access.
5. Original authors bring their own perspectives and biases to the interpretation of past
events and these biases are more difficult to ascertain in historical resources.
6. Due to the lack of control over external variables, historical research is very weak with
regard to the demands of internal validity.
7. It is rare that the entirety of historical documentation needed to fully address a research
problem is available for interpretation; therefore, gaps need to be acknowledged.

X - Longitudinal Design

Definition and Purpose

A longitudinal study follows the same sample over time and makes repeated
observations. With longitudinal surveys, for example, the same group of people is
interviewed at regular intervals, enabling researchers to track changes over time and to
relate them to variables that might explain why the changes occur. Longitudinal research
designs describe patterns of change and help establish the direction and magnitude of
causal relationships. Measurements are taken on each variable over two or more distinct
time periods. This allows the researcher to measure change in variables over time. It is a
type of observational study and is sometimes referred to as a panel study.

Advantages

1. Longitudinal data allow the analysis of duration of a particular phenomenon.
2. Enables survey researchers to get close to the kinds of causal explanations usually
attainable only with experiments.
3. The design permits the measurement of differences or change in a variable from one
period to another [i.e., the description of patterns of change over time].
4. Longitudinal studies facilitate the prediction of future outcomes based upon earlier
factors.

Disadvantages

1. The data collection method may change over time.
2. Maintaining the integrity of the original sample can be difficult over an extended period of
time.
3. It can be difficult to show more than one variable at a time.
4. This design often needs qualitative research to explain fluctuations in the data.
5. A longitudinal research design assumes present trends will continue unchanged.
6. It can take a long period of time to gather results.
7. There is a need to have a large sample size and accurate sampling to reach
representativeness.

XI - Observational Design

Definition and Purpose

This type of research design draws a conclusion by comparing subjects against a control
group, in cases where the researcher has no control over the experiment. There are two
general types of observational designs. In direct observations, people know that you are
watching them. Unobtrusive measures involve any method for studying behavior where
individuals do not know they are being observed. An observational study allows a useful
insight into a phenomenon and avoids the ethical and practical difficulties of setting up a
large and cumbersome research project.

What do these studies tell you?

1. Observational studies are usually flexible and do not necessarily need to be structured
around a hypothesis about what you expect to observe (data is emergent rather than pre-
existing).
2. The researcher is able to collect a depth of information about a particular behavior.
3. Can reveal interrelationships among multifaceted dimensions of group interactions.
4. You can generalize your results to real life situations.
5. Observational research is useful for discovering what variables may be important before
applying other methods like experiments.
6. Observational research designs account for the complexity of group behaviors.

What don't these studies tell you?

1. Reliability of data is low because seeing behaviors occur over and over again may be a
time consuming task and difficult to replicate.
2. In observational research, findings may only reflect a unique sample population and,
thus, cannot be generalized to other groups.
3. There can be problems with bias as the researcher may only "see what they want to
see."
4. There is no possibility of determining “cause and effect” relationships since nothing is
manipulated.
5. Sources or subjects may not all be equally credible.

6. Any group that is studied is altered to some degree by the very presence of the
researcher, thereby skewing to some degree any data collected (the Heisenberg
Uncertainty Principle).

XII - Philosophical Design

Definition and Purpose

Understood more as a broad approach to examining a research problem than a
methodological design, philosophical analysis and argumentation are intended to
challenge deeply embedded, often intractable, assumptions underpinning an area of
study. This approach uses the tools of argumentation derived from philosophical
traditions, concepts, models, and theories to critically explore and challenge, for
example, the relevance of logic and evidence in academic debates, to analyze
arguments about fundamental issues, or to discuss the root of existing discourse about a
research problem. These overarching tools of analysis can be framed in three ways:

• Ontology: the study that describes the nature of reality; for example, what is
real and what is not, what is fundamental and what is derivative?
• Epistemology: the study that explores the nature of knowledge; for example, on
what do knowledge and understanding depend, and how can we be certain of
what we know?
• Axiology: the study of values; for example, what values does an individual or
group hold and why? How are values related to interest, desire, will, experience, and
means-to-end? And, what is the difference between a matter of fact and a matter of
value?

Advantages

1. Can provide a basis for applying ethical decision-making to practice.
2. Functions as a means of gaining greater self-understanding and self-knowledge about
the purposes of research.
3. Brings clarity to general guiding practices and principles of an individual or group.
4. Philosophy informs methodology.
5. Refines concepts and theories that are invoked in relatively unreflective modes of thought
and discourse.
6. Beyond methodology, philosophy also informs critical thinking about epistemology and
the structure of reality (metaphysics).
7. Offers clarity and definition to the practical and theoretical uses of terms, concepts, and
ideas.

Disadvantages

1. Limited application to specific research problems [answering the "So What?" question in
social science research].
2. Analysis can be abstract, argumentative, and limited in its practical application to real-life
issues.
3. While a philosophical analysis may render problematic that which was once simple or
taken-for-granted, the writing can be dense and subject to unnecessary jargon,
overstatement, and/or excessive quotation and documentation.
4. There are limitations in the use of metaphor as a vehicle of philosophical analysis.
5. There can be analytical difficulties in moving from philosophy to advocacy and between
abstract thought and application to the phenomenal world.

XIII - Sequential Design

Definition and Purpose

Sequential research is that which is carried out in a deliberate, staged approach [i.e.
serially] where one stage will be completed, followed by another, then another, and so
on, with the aim that each stage will build upon the previous one until enough data is
gathered over an interval of time to test your hypothesis. The sample size is not
predetermined. After each sample is analyzed, the researcher can accept the null
hypothesis, accept the alternative hypothesis, or select another pool of subjects and
conduct the study once again. This means the researcher can obtain a limitless number
of subjects before finally making a decision whether to accept the null or alternative
hypothesis. Using a quantitative framework, a sequential study generally utilizes
sampling techniques to gather data and applies statistical methods to analyze the
data. Using a qualitative framework, sequential studies generally utilize samples of
individuals or groups of individuals [cohorts] and use qualitative methods, such as
interviews or observations, to gather information from each sample.
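
The stage-by-stage decision described above (accept the null hypothesis, accept the alternative, or keep sampling) can be made concrete with one well-known formal procedure, Wald's sequential probability ratio test. The text does not name this test; it is swapped in here purely as an illustration. The sketch assumes a yes/no outcome per subject, and the hypothesized success rates `p0` and `p1` and the error rates `alpha` and `beta` are assumed values.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.20):
    """Wald's sequential probability ratio test for a success probability.

    Returns (decision, n_used), where decision is "accept H0",
    "accept H1", or "continue sampling" if the data ran out first.
    """
    upper = math.log((1 - beta) / alpha)  # reaching this boundary favours H1
    lower = math.log(beta / (1 - alpha))  # reaching this boundary favours H0
    llr = 0.0                             # cumulative log-likelihood ratio
    n = 0
    for n, success in enumerate(observations, start=1):
        if success:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", n


# Hypothetical outcomes, examined one subject at a time (1 = success, 0 = failure).
outcomes = [1, 1, 0, 1, 1, 1, 1, 1, 1]
decision, n_used = sprt_bernoulli(outcomes, p0=0.5, p1=0.8)
print(decision, "after", n_used, "observations")
```

The sample size is not fixed in advance: the test stops as soon as the accumulated evidence crosses either boundary, which mirrors the description of a sample size that is not predetermined.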

Advantages

1. The researcher has a limitless option when it comes to sample size and the sampling
schedule.
2. Due to the repetitive nature of this research design, minor changes and adjustments can
be done during the initial parts of the study to correct and hone the research method.
Useful design for exploratory studies.
3. There is very little effort on the part of the researcher when performing this technique. It is
generally not expensive, time consuming, or workforce extensive.
4. Because the study is conducted serially, the results of one sample are known before the
next sample is taken and analyzed.

Disadvantages

1. The sampling method is not representative of the entire population. The only possibility of
approaching representativeness is when the researcher chooses to use a very large
sample size significant enough to represent a significant portion of the entire population.
In this case, moving on to study a second or subsequent sample can be difficult.
2. Because the sampling technique is not randomized, the design cannot be used to create
conclusions and interpretations that pertain to an entire population. Generalizability from
findings is limited.
3. Difficult to account for and interpret variation from one sample to another over time,
particularly when using qualitative methods of data collection.

Characteristics of Good Research Design

Throughout the design construction task, it is important to have in mind some endpoint,
some criteria which we should try to achieve before finally accepting a design strategy.
The criteria discussed below are only meant to be suggestive of the characteristics
found in good research design. It is worth noting that all of these criteria point to the
need to individually tailor research designs rather than accepting standard textbook
strategies as is.

1. Theory-Grounded. Good research strategies reflect the theories which are being
investigated. Where specific theoretical expectations can be hypothesized these are
incorporated into the design. For example, where theory predicts a specific treatment
effect on one measure but not on another, the inclusion of both in the design improves
discriminant validity and demonstrates the predictive power of the theory.
2. Situational. Good research designs reflect the settings of the investigation. This was
illustrated above where a particular need of teachers and administrators was explicitly
addressed in the design strategy. Similarly, intergroup rivalry, demoralization, and
competition might be assessed through the use of additional comparison groups who are
not in direct contact with the original group.
3. Feasible. Good designs can be implemented. The sequence and timing of events are
carefully thought out. Potential problems in measurement, adherence to assignment,
database construction and the like, are anticipated. Where needed, additional groups or
measurements are included in the design to explicitly correct for such problems.
4. Redundant. Good research designs have some flexibility built into them. Often, this
flexibility results from duplication of essential design features. For example, multiple
replications of a treatment help to ensure that failure to implement the treatment in one
setting will not invalidate the entire study.
5. Efficient. Good designs strike a balance between redundancy and the tendency to
overdesign. Where it is reasonable, other, less costly, strategies for ruling out potential
threats to validity are utilized.

This is by no means an exhaustive list of the criteria by which we can judge good
research design. Nevertheless, goals of this sort help to guide the researcher toward a
final design choice and emphasize important components which should be included.

“Design is what you do when you don't [yet] know what you are doing.” - George Stiny

Chapter XII

Sampling
SAMPLING
In the language of sampling:

-a population is the entire collection of people or things you are interested in;
-a census is a measurement of all the units in the population;
-a population parameter is a number that results from measuring all the units in
the population;
-a sampling frame is the specific data from which the sample is drawn, e.g., a
telephone book;
-a unit of analysis is the type of object of interest, e.g., arsons, fire departments,
firefighters;
-a sample is a subset of some of the units in the population;
-a statistic is a number that results from measuring all the units in the sample;
-statistics derived from samples are used to estimate population parameters.

For example, to find out the average age of all motor vehicles in the state in 2011:

Population=all motor vehicles in the state in 2011
Sampling frame=all motor vehicles registered with the DMV on 1 July 2011
Design=probability sampling
Unit of analysis=motor vehicle
Sample=300 motor vehicles
Data gathered=the age of each of the 300 motor vehicles selected in the sample
Statistic=the average age of the 300 motor vehicles in the sample
Parameter=the true average age of all motor vehicles in the state in 2011, which the statistic estimates

Why Sample?

Sometimes "measuring" or "testing" something destroys it. The government requires
automakers who want to sell cars in the U.S. to demonstrate that their cars can survive
certain crash tests. Obviously, the company can't be expected to crash every car, to see
if it survives! So the company crashes only a sample of cars. Another reason for
sampling is that not all units in the population can be identified, such as all the air
molecules in the LA basin. So to measure air pollution, you take a sample of air
molecules. Also, even if all those air molecules could be identified, it would be too
expensive and too time consuming to measure them all.

Types of Samples:
Non-probability (non-random) samples: These samples focus on volunteers, easily
available units, or those that just happen to be present when the research is done.
Non-probability samples are useful for quick and cheap studies, for case studies, for
qualitative research, for pilot studies, and for developing hypotheses for future research.

Convenience sample: also called an "accidental" sample or "man-in-the-street" sample.
The researcher selects units that are convenient, close at hand, easy to reach, etc.

Purposive sample: the researcher selects the units with some purpose in mind, for
example, students who live in dorms on campus, or experts on urban development.

Quota sample: the researcher constructs quotas for different types of units. For example,
to interview a fixed number of shoppers at a mall, half of whom are male and half of
whom are female.

Other samples that are usually constructed with non-probability methods include library
research, participant observation, marketing research, consulting with experts, and
comparing organizations, nations, or governments.

Probability-based (random) samples: These samples are based on probability theory.
Every unit of the population of interest must be identified, and all units must have a
known, non-zero chance of being selected into the sample.

Simple random sample: Each unit in the population is identified, and each unit has an
equal chance of being in the sample. The selection of each unit is independent of the
selection of every other unit. Selection of one unit does not affect the chances of any
other unit.

For example, to select a sample of 25 people who live in your college dorm, make a list of all the
250 people who live in the dorm. Assign each person a unique number, between 1 and 250. Then
refer to a table of random numbers. Starting at any point in the table, read across or down and
note every number that falls between 1 and 250. Use the numbers you have found to pull the
names from the list that correspond to the 25 numbers you found. These 25 people are your
sample. This is called the table of random numbers method.

Another way to select this simple random sample is to take 250 ping-pong balls and number them
from 1 to 250. Put them into a large barrel and mix them up, and then grab 25 balls. Read off the
numbers. Those are the 25 people in your sample. This is called the lottery method.
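
The lottery method above can also be carried out by a computer's random number generator. The following is a minimal sketch, assuming the 250 residents are simply numbered 1 to 250; the seed value is an arbitrary choice made only so that the draw can be repeated.

```python
import random

# Sampling frame: the 250 dorm residents, numbered 1..250 as in the example.
residents = list(range(1, 251))

# A fixed seed (arbitrary) makes the draw repeatable; omit it for a fresh draw.
rng = random.Random(2012)

# Draw 25 distinct residents; every resident has the same chance of selection.
sample_ids = sorted(rng.sample(residents, k=25))

print(sample_ids)
```

Because `random.sample` draws without replacement, no resident can be selected twice, which matches pulling balls from the barrel without putting them back.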

Systematic random sampling: Each unit in the population is identified, and each unit has
an equal chance of being in the sample.

For example, to select a sample of 25 dorm rooms in your college dorm, make a list of all the
room numbers in the dorm. Say there are 100 rooms. Divide the total number of rooms (100) by
the number of rooms you want in the sample (25). The answer is 4. This means that you are
going to select every fourth dorm room from the list. But you must first consult a table of random
numbers. Pick any point on the table, and read across or down until you come to a number
between 1 and 4. This is your random starting point. Say your random starting point is "3". This
means you select dorm room 3 as your first room, and then every fourth room down the list (3, 7,
11, 15, 19, etc.) until you have 25 rooms selected.
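
The same procedure can be sketched in a few lines of Python. This is only an illustration under the assumptions of the example (100 rooms numbered 1 to 100, a sample of 25); the function name `systematic_sample` and the seed are hypothetical choices.

```python
import random


def systematic_sample(units, sample_size, rng):
    """Pick a random start within the first interval, then every k-th unit."""
    interval = len(units) // sample_size   # 100 rooms / 25 rooms = every 4th room
    start = rng.randint(1, interval)       # random starting point between 1 and k
    # Positions are counted from 1 in the text, so subtract 1 for list indexing.
    picks = [units[pos - 1] for pos in range(start, len(units) + 1, interval)]
    return picks[:sample_size]


rooms = list(range(1, 101))                # room numbers 1..100
chosen = systematic_sample(rooms, 25, random.Random(7))
print(chosen)
```

With a starting point of 3 the function returns rooms 3, 7, 11, 15, 19 and so on, exactly as in the worked example.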

This method is useful for selecting large samples, say 100 or more. It is less
cumbersome than a simple random sample using either a table of random numbers or a
lottery method. For example, you might have to sample files in a large filing cabinet. It is
easier to select every 17th file than to pull out all the files and number them, etc.

However, you must be aware of problems that can arise in systematic random sampling.
If the selection interval matches some pattern in the list (e.g., each 4th dorm room is a
single unit, where all the others are doubles) you will introduce systematic bias into your
sample.

Stratified random sampling: Each unit in the population is identified, and each unit has a
known, non-zero chance of being in the sample. This is used when the researcher
knows that the population has sub-groups (strata) that are of interest.

For example, if you wanted to find out the attitudes of students on your campus about
immigration, you may want to be sure to sample students who are from every region of the
country as well as foreign students. Say your student body of 10,000 students is made up of
8,000 - West; 1,000 - East; 500 - Midwest; 300 - South; 200 - Foreign.

If you select a simple random sample of 500 students, you might not get any from the
Midwest, South, or Foreign. To make sure that you get some students from each group,
you can divide the students into these five groups, and then select the same percentage
of students from each group using a simple random sampling method. This is
proportional stratified random sampling.

However, you may still have too few of some types of students. Instead, you may divide
students into the five groups and then select the same number of students from each
group using a simple random sampling method. This is disproportionate stratified
random sampling. This allows you to have enough students in each sub-group so that
you can perform some meaningful statistical analyses of the attitudes of students in each
sub-group. In order to say something about the attitudes of the total student population
of the university, however, you will have to apply weights to the findings for each sub-
group, proportional to its presence in the total student body.
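
The two allocation rules and the weighting step can be sketched as follows. The stratum sizes come from the example above; the per-stratum survey results in `stratum_share` are invented purely to show how the weights are applied and are not findings of any study.

```python
strata_sizes = {"West": 8000, "East": 1000, "Midwest": 500, "South": 300, "Foreign": 200}
total = sum(strata_sizes.values())          # 10,000 students
sample_size = 500

# Proportional allocation: the same sampling fraction (5%) in every stratum.
proportional = {name: round(sample_size * size / total)
                for name, size in strata_sizes.items()}

# Disproportionate allocation: the same number of students from every stratum.
equal = {name: sample_size // len(strata_sizes) for name in strata_sizes}

# Weight of each stratum = its share of the total student body.
weights = {name: size / total for name, size in strata_sizes.items()}

# Hypothetical share of each stratum's sample holding a given attitude.
stratum_share = {"West": 0.60, "East": 0.50, "Midwest": 0.40, "South": 0.30, "Foreign": 0.70}

# Weighted estimate for the whole student population.
overall = sum(weights[name] * stratum_share[name] for name in strata_sizes)

print(proportional)
print(equal)
print(round(overall, 3))
```

Note that a plain average of the five hypothetical stratum shares would be 0.50, whereas the weighted estimate is pulled towards the West, which makes up 80% of the students.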

Cluster sampling: Cluster sampling views the units in a population not only as members of
the total population but also as members of naturally occurring clusters within the
population. For example, city residents are also residents of neighborhoods, blocks, and
housing structures. Cluster sampling is used in large geographic samples where no list is
available of all the units in the population but the population boundaries can be
well-defined. For example, to obtain information about the drug habits of all high school
students in a state, you could obtain a list of all the school districts in the state and
select a simple random sample of school districts. Then, within each selected school
district, list all the high schools and select a simple random sample of high schools.
Within each selected high school, list all high school classes, and select a simple random
sample of classes. Then use the high school students in those classes as your sample.
Cluster sampling must use a random sampling method at each stage. This may require a
somewhat larger sample than a simple random sampling method, but it saves time and money.
It is also cheaper to administer than a statewide simple random sample of high school
students, because there are many fewer sites to obtain information
from.
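
The staged selection described above can be sketched in Python. The frame below (20 districts, 5 schools per district, 8 classes per school) and the numbers drawn at each stage are hypothetical; only the structure of the procedure is taken from the text.

```python
import random

rng = random.Random(42)   # arbitrary seed so the illustration is repeatable

# Hypothetical frame of clusters. Only lists of clusters are needed;
# no statewide list of individual students is required.
districts = {
    f"district-{d}": {
        f"school-{d}-{s}": [f"class-{d}-{s}-{c}" for c in range(1, 9)]
        for s in range(1, 6)
    }
    for d in range(1, 21)
}

selected_classes = []
for district in rng.sample(sorted(districts), k=4):            # stage 1: districts
    schools = districts[district]
    for school in rng.sample(sorted(schools), k=2):            # stage 2: schools
        selected_classes += rng.sample(schools[school], k=3)   # stage 3: classes

# All students in the selected classes would then form the sample.
print(len(selected_classes), selected_classes[:3])
```

A simple random draw is used at every stage, which is the requirement stated above for cluster sampling.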

How Big a Sample Do I Need?

The size of the sample depends on the type of research design being used; the desired
level of confidence in the results; the amount of accuracy wanted; and the characteristics
of the population of interest. Sample size has little to do with the size of the population,
however.

Random sampling procedures are based on probability theory; this is why they are also
called probability sampling methods. Say we are interested in knowing what is the
average monthly income of all the full-time students at our university. There are 5 full-
time students each with a different monthly income as follows: Rs.500; Rs.650; Rs.400;
Rs.700; Rs.600. This is our population of students. Say we take a simple random
sample of 2 students and figure the average for the sample.

It is entirely possible that we could take a simple random sample of 2 students from the 5
students above and get an average as low as Rs.450 per month. It is equally possible
that we could take a different simple random sample of 2 students and get an average
as high as Rs.675 per month. Try it with the following figures. There are 10 possible
samples of two students:

(Rs.500 + Rs.650) / 2 = Rs.575
(Rs.500 + Rs.400) / 2 = Rs.450
(Rs.500 + Rs.700) / 2 = Rs.600
(Rs.500 + Rs.600) / 2 = Rs.550
(Rs.650 + Rs.400) / 2 = Rs.525
(Rs.650 + Rs.700) / 2 = Rs.675
(Rs.650 + Rs.600) / 2 = Rs.625
(Rs.400 + Rs.700) / 2 = Rs.550
(Rs.400 + Rs.600) / 2 = Rs.500
(Rs.700 + Rs.600) / 2 = Rs.650

We know from probability theory that if we took all possible combinations of samples of 2
full-time students from our population of 5, found the average monthly wage for all
possible samples, and took the average of all those averages, we would find the exact
average monthly income of all 5 students.

The average monthly wage of the 5 students in the population = Rs.570.
The average of the 10 samples of 2 students each = Rs.570.
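
This result can be checked by listing every possible sample mechanically. A minimal sketch using only the five incomes given above:

```python
from itertools import combinations
from statistics import mean

incomes = [500, 650, 400, 700, 600]   # monthly incomes (Rs.) of the 5 students

# Average income of every possible sample of 2 students.
sample_means = [mean(pair) for pair in combinations(incomes, 2)]

print(len(sample_means))                      # number of possible samples
print(min(sample_means), max(sample_means))   # lowest and highest sample average
print(mean(incomes), mean(sample_means))      # population average vs. average of averages
```

The individual sample averages range widely, but their average coincides with the population average.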

Now in this example, of course it would be easier to just find the average monthly wage
for all five students in the population. However, we can apply this same principle to much
larger populations, where it would be nearly impossible to measure every unit in the
population.

Say we wanted to find the average monthly wage of all 10,000 full-time students at our
university. We can take a simple random sample of 150 students, find the average
monthly wage for the 150 students in the sample, and then use that number (a sample
statistic) to estimate the average monthly wage for the entire population of students (a
population parameter).

We know from probability theory that if we took a very large number of simple random
samples of 150 students from our student population, and found the average monthly
wage for each sample, that those averages would tend to distribute themselves in the
pattern of a "bell-shaped" curve, also called "the normal curve." That curve has well-
established properties.

For example, approximately 68% of the sample averages would fall within plus or minus
one standard error (the standard deviation of the sample averages) of the true population
average. We also know that approximately 95% of the sample averages would fall within plus
or minus two standard errors of the true population average. And finally, we know that
approximately 99.7% of the sample averages would fall within plus or minus three standard
errors of the true population
average.
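
These percentages are properties of the normal curve itself and can be verified directly. The sketch below uses Python's standard library to compute the share of a normal distribution lying within one, two, and three standard deviations of its mean.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1

# Share of the distribution lying within k standard deviations of the mean.
coverage = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"within +/-{k} standard deviations: {share:.1%}")
```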

Using these established principles, we do not have to take repeated simple random
samples (fortunately!). Instead, we can use these principles to estimate how well our
sample statistic estimates the population parameter. We can also use these principles to
select an adequate sample size for our research.

Say we want to know what proportion of the students at our university support
the death penalty. To calculate sample size, we must make four decisions:

First, are we doing a true experimental design (e.g., control-group, pretest-posttest
design) or a non-experimental design (e.g., a cross-sectional survey)? The former can
use smaller sample sizes, while the latter require larger sample sizes. In this case we
are doing a survey.

Second, how sure do we want to be that we could get the same results if we did the
study multiple times? Do we want to be 50% sure, 90% sure, 95% sure, or 99% sure?
This is called the confidence level. The more sure we want to be, the larger the sample
size needs to be. In this case, we want a confidence level of 95%.

Third, how accurate do we want to be at estimating the population parameter? Will a
margin of error of (plus or minus) 5% be acceptable, or 4%, 3%, 2%, or 1%? The margin of
error is the half-width of the confidence interval. In this case, we want an accuracy of
plus or minus 4%. This means that if we find that 66% of the students oppose the death
penalty, we really mean that we have found that 66% plus or minus 4% oppose the death
penalty.

Fourth, how is the population distributed on the variable of interest? That is, in a yes/no
situation, how many do we think will say yes? How many will say no? The most
conservative way to approach this is to guess that the population is split 50/50 on the
question. In this case we guess that 50% of the students will support the death penalty,
and 50% will oppose it.

If we are doing a survey of a population, and are not interested in sub-samples within the
population, and will accept a 95% confidence level, and a 4% margin of error, and
assume a probability of .5 on the variable (.5 will say yes), then the formula for sample
size is as follows:

\[ \sqrt{n} = \sqrt{p(1-p)} \times \frac{z}{e} \]

where n is the sample size, p is the assumed proportion, z is the z value for the chosen
confidence level (1.96 for 95%), and e is the margin of error.

Solving for the sample size with p = .5, z = 1.96, and e = .04, we have

\[ \sqrt{n} = \sqrt{(.5)(1-.5)} \times \frac{1.96}{.04} = \sqrt{.25} \times 49 = .5 \times 49 = 24.5 \]

Squaring both sides, we have

\[ n = 24.5^{2} = 600.25 \quad \text{(round off to 600)} \]

As the margin of error decreases, the sample size will need to increase (and vice versa).
If we wanted to change the margin of error to plus or minus 3%, (keeping the confidence
level at 95%), the required sample size increases to 1,067. If we could afford to use a
margin of error of plus or minus 5%, the sample size would decrease to 384.

Similarly, if the confidence level increases, the sample size will need to increase. If we
increase the confidence level to 99%, the sample size increases to 1,036 (with the
margin of error remaining at 4%). If the confidence level decreases to 90%, the sample
size decreases to 423.

If you have a fixed sample size, you can increase the confidence level and decrease the
accuracy, or you can increase the accuracy and decrease the confidence level, but you
cannot increase both at once.

As the variability in the population on the variable of interest increases, the sample size
increases. A probability of 50/50 demonstrates the greatest variability in the population.
If the variability decreases to 60/40, or 70/30, then a smaller sample size will result.
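
The relationships described in this section can be explored with a short calculation. The sketch below assumes the usual large-population formula for a proportion, n = p(1 - p)(z / e)^2, with z = 1.96 for 95% confidence; the function name `sample_size` is a hypothetical choice.

```python
def sample_size(margin, z=1.96, p=0.5):
    """Sample size for estimating a proportion: n = p(1 - p) * (z / e)^2."""
    return p * (1 - p) * (z / margin) ** 2


# 95% confidence (z = 1.96), 50/50 split, margins of error from 1% to 5%.
by_margin = {m: round(sample_size(m / 100)) for m in (1, 2, 3, 4, 5)}
print(by_margin)

# Less variability in the population (60/40, 70/30) needs a smaller sample
# at the same 4% margin of error.
by_split = {p: round(sample_size(0.04, p=p)) for p in (0.5, 0.6, 0.7)}
print(by_split)
```

Passing a different `z` (for example the value for 90% or 99% confidence) shows how the required sample size moves with the confidence level.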

The following table summarizes the calculations for sample sizes for survey research,
assuming a probability of 50/50 on a dichotomous question, and no sub-populations.

Accuracy (+/-)           Confidence Level
(Margin of error, %)     90%       95%       99%

 1                       6,765     9,604     16,576
 2                       1,691     2,401      4,144
 3                         752     1,067      1,842
 4                         423       600      1,036
 5                         271       384        663
10                          68        96        166
20                          17        24         41

If the researcher wants to study sub-populations as well as the whole population, then
larger sample sizes will be needed. In addition, if more than one variable is being studied
at the same time, then the rule of thumb is to have a total of at least 10 cases per
variable.

If the research is to be a controlled experiment, then smaller sample sizes can be used.
However, it is recommended to use samples of no smaller than 30 for each group in the
experiment (e.g., experimental and control groups). Many common statistics are based
on sample sizes of a minimum of 30; for sample sizes of less than 30, other special
statistics must be used.

Sample Quality

Sampling error arises from two principal sources: random error, and non-random error.
Random error results from taking a sample from a population, instead of measuring the
entire population. Its size can be estimated using probability theory. It is the reason that sample
statistics provide only estimates of population parameters, although the likely amount of random
error can be calculated.

Non-random error results from bias being introduced into the sample from some flaw in
the design or implementation of the sample. For example, using a telephone book as the
sampling frame for all the residents of a city will result in some bias, because some
people are not listed in the directory or do not have telephones. People who refuse to
take part in a study (which is their right) also may introduce bias into the sample. Some
people may provide erroneous information, which also biases the results. Finally,
mistakes in computing the required sample size, in identifying the actual units to be
included in the sample, or other errors can introduce bias into the sample.

Adequate Sample?

To assess whether an adequate sample was used in a piece of research, ask the
following questions:

Size--was the size adequate for the purpose of the study, especially if there were many
sub-groups included in the analysis, or many variables used simultaneously?

Representativeness--was the sample selected randomly from the population, using
probability theory? Was the sampling frame adequate?

Implementation--was the sampling plan carried out carefully, was it adequately
supervised, was there some quality control plan, did it result in a good response rate?

Minimum Sample Size

Once the scholar has all the information, the following formula can be used to calculate
the minimum sample size:

\[ n = p\% \times q\% \times \left( \frac{z}{e\%} \right)^{2} \]

Where

n is the minimum sample size required
p% is the proportion belonging to the specified category
q% is the proportion not belonging to the specified category
z is the z value corresponding to the level of confidence required (see table)
e% is the margin of error required

Table: Levels of confidence and associated z values

Level of confidence     z value
90% certain             1.65
95% certain             1.96
99% certain             2.57

Where your population is less than 10,000, a smaller sample size can be used without
affecting the accuracy. This is called the adjusted minimum sample size. It is calculated
using the following formula

\[ n' = \frac{n}{1 + \frac{n}{N}} \]

Where
n' is the adjusted minimum sample size
n is the minimum sample size (as calculated above)
N is the total population
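
The two formulas can be applied one after the other, as in the following sketch. The inputs (a 50/50 split, 95% confidence, a 5% margin of error, and a population of 2,000) are illustrative assumptions, and the function names are hypothetical.

```python
def minimum_sample_size(p_pct, z, e_pct):
    """n = p% * q% * (z / e%)^2, where q% = 100 - p%."""
    q_pct = 100 - p_pct
    return p_pct * q_pct * (z / e_pct) ** 2


def adjusted_sample_size(n, population):
    """Adjusted minimum sample size for a small population: n' = n / (1 + n / N)."""
    return n / (1 + n / population)


# 95% confidence (z = 1.96), 5% margin of error, 50/50 split.
n = minimum_sample_size(p_pct=50, z=1.96, e_pct=5)

# N = 2,000 is a made-up population size used only for illustration.
n_adjusted = adjusted_sample_size(n, population=2000)

print(round(n, 2), round(n_adjusted, 2))
```

In practice the result is rounded up to the next whole respondent, since a fraction of a person cannot be surveyed.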

"It's a sampling phenomenon, rather than an accurate number." - Sean Snaith

Chapter XIII

Designing a Questionnaire

DESIGNING A QUESTIONNAIRE
fulcrum of research

Introduction

This is the information age. More information has been published in the last decade than in all
previous history. Everyone uses information to make decisions about the future. If our
information is accurate, we have a high probability of making a good decision. If our information
is inaccurate, our ability to make a correct decision is diminished. Better information usually
leads to better decisions. The most widely used instrument for collecting information is the
questionnaire. Ask yourself, why should I use a questionnaire? It is worth being
self-reflective when beginning to construct your own questionnaire, by writing down your
reasons for choosing such a research instrument rather than another (say interviews or
observation), for inventing your own rather than using one already available in the
literature, and for posing the sorts of questions you want to use. Such notes may be useful
when you come to write the ‘methods’ chapter/section of your research report. The
fundamental question that must then be asked is, what are you trying to find out? Every
questionnaire must have a purpose, i.e., it must draw from some underlying hypotheses about
what the important facts or opinions are, and even make some predictions about which facts
may be relevant in explaining the opinions expressed.

Questionnaire Design

Perhaps the most important stage of the survey process is the creation of questions that
accurately measure the opinions, experiences and behaviors of the public. Accurate random
sampling and high response rates will be wasted if the information gathered is built on a shaky
foundation of ambiguous or biased questions. Creating good measures involves both writing
good questions and organizing them to form the questionnaire.

Questionnaire design is a multiple-stage process that requires attention to many details at the
same time. Designing the questionnaire is a complicated process because surveys can ask
about topics in varying degrees of detail, questions can be asked in different ways, and
questions asked earlier in a survey may influence how people respond to later questions.
Researchers are also often interested in measuring change over time and therefore must be
attentive to how opinions or behaviors have been measured in prior surveys. Surveyors may
conduct pilot tests or focus groups in the early stages of questionnaire development in order to
better understand how people think about an issue or comprehend a question. Finally,
pretesting a survey to evaluate how people respond to the overall questionnaire and specific
questions is an essential step in the questionnaire design process.

Principles of Wording

The wording of a question is extremely important. Researchers strive for objectivity in surveys
and, therefore, must be careful not to lead the respondent into giving a desired answer.
Unfortunately, the effects of question wording are one of the least understood areas of
questionnaire research.

Many investigators have confirmed that slight changes in the way questions are worded can
have a significant impact on how people respond (Arndt and Crane, 1975; Belkin and
Lieverman, 1967; Cantril, 1944; Kalton, Collins, and Brook, 1978; Petty, Rennier and Cacioppo,
198; Rasinski, 1989; Schuman and Presser, 1977, 1981). Several authors have reported that
minor changes in question wording can produce more than a 25 percent difference in people's
opinions (Payne, 1951; Rasinski, 1989).

One important area of question wording is the effect of the interrogation and assertion question
formats. The interrogation format asks a question directly, whereas the assertion format asks
subjects to indicate their level of agreement or disagreement with a statement. Schuman and
Presser (1981) reported no significant differences between the two formats; however, other
researchers hypothesized that the interrogation format is more likely to encourage subjects to
think about their answers (Burnkrant and Howard, 1984; Petty, Cacioppo, and Heesacker, 1981;
Swasy and Munch, 1985; Zillman, 1972). Petty, Rennier and Cacioppo (1987) found that the
interrogation format caused greater polarization in subjects' responses, suggesting that it
prompted more thought than the assertion format.

Other investigators have looked at the effects of modifying adjectives and adverbs (Bradburn
and Miles, 1979; Hoyt, 1972; Schaeffer, 1991). Words like usually, often, sometimes,
occasionally, seldom, and rarely are commonly used in questionnaires, although it is clear
that they do not mean the same thing to all people. Simpson (1944), and a replication by
Hakel (1968), looked at twenty modifying adjectives and adverbs.
These researchers found that the precise meanings of these words varied widely between
subjects, and between the two studies. However, the correlation between the two studies with
respect to the relative ranking of the words was .99. Some adjectives have high variability and
others have low variability. The following adjectives have highly variable meanings and should
be avoided in surveys: a clear mandate, most, numerous, a substantial majority, a minority of, a
large proportion of, a significant number of, many, a considerable number of, and several. Other
adjectives produce less variability and generally have more shared meaning. These are: lots,
almost all, virtually all, nearly all, a majority of, a consensus of, a small number of, not very
many of, almost none, hardly any, a couple, and a few.

Characteristics of a Good Question

There are good and bad questions. The qualities of a good question are as follows:

1. Evokes the truth. Questions must be non-threatening. When a respondent is concerned about the
consequences of answering a question in a particular manner, there is a good possibility that the answer
will not be truthful. Anonymous questionnaires that contain no identifying information are more likely to
produce honest responses than those identifying the respondent. If your questionnaire does contain
sensitive items, be sure to clearly state your policy on confidentiality.

2. Asks for an answer on only one dimension. The purpose of a survey is to find out information. A
question that asks for a response on more than one dimension will not provide the information you are
seeking. For example, a researcher investigating a new food snack asks "Do you like the texture and
flavor of the snack?" If a respondent answers "no", then the researcher will not know if the respondent
dislikes the texture or the flavor, or both. Another questionnaire asks, "Were you satisfied with the quality
of our food and service?" Again, if the respondent answers "no", there is no way to know whether the
quality of the food, service, or both were unsatisfactory. A good question asks for only one "bit" of
information.

3. Can accommodate all possible answers. Multiple choice items are the most popular type of survey
questions because they are generally the easiest for a respondent to answer and the easiest to analyze.
Asking a question that does not accommodate all possible responses can confuse and frustrate the
respondent.

4. Has mutually exclusive options. A good question leaves no ambiguity in the mind of the respondent.
There should be only one correct or appropriate choice for the respondent to make.

5. Produces variability of responses. When a question produces no variability in responses, we are left
with considerable uncertainty about why we asked the question and what we learned from the
information. If a question does not produce variability in responses, it will not be possible to perform any
statistical analyses on the item.

6. Follows comfortably from the previous question. Writing a questionnaire is similar to writing anything
else. Transitions between questions should be smooth. Grouping questions that are similar will make the
questionnaire easier to complete, and the respondent will feel more comfortable. Questionnaires that
jump from one unrelated topic to another feel disjointed and are not likely to produce high response rates.

7. Does not presuppose a certain state of affairs. Among the most subtle mistakes in questionnaire
design are questions that make an unwarranted assumption.

8. Does not imply a desired answer. The wording of a question is extremely important. We are striving for
objectivity in our surveys and, therefore, must be careful not to lead the respondent into giving the answer
we would like to receive. Leading questions are usually easily spotted because they use negative
phraseology.

9. Does not use emotionally loaded or vaguely defined words. This is one of the areas overlooked by both
beginners and experienced researchers. Quantifying adjectives (e.g., most, least, majority) are frequently
used in questions. It is important to understand that these adjectives mean different things to different
people.

10. Does not use unfamiliar words or abbreviations. Remember who your audience is and write your
questionnaire for them. Do not use uncommon words or compound sentences. Write short sentences.
Abbreviations are okay if you are absolutely certain that every single respondent will understand their
meanings. If there is any doubt at all, do not use the abbreviation.

11. Is not dependent on responses to previous questions. Branching in written questionnaires should be
avoided. While branching can be used as an effective probing technique in telephone and face-to-face
interviews, it should not be used in written questionnaires because it sometimes confuses respondents.

12. Does not ask the respondent to order or rank a series of more than five items. Questions asking
respondents to rank items by importance should be avoided. This becomes increasingly difficult as the
number of items increases, and the answers become less reliable. This becomes especially problematic
when asking respondents to assign a percentage to a series of items. In order to successfully complete
this task, the respondent must mentally continue to re-adjust his answers until they total one hundred
percent. Limiting the number of items to five will make it easier for the respondent to answer.
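
Points 3 and 4 in the list above (options that accommodate all possible answers and are mutually exclusive) can be checked mechanically for numeric answer categories before a questionnaire is printed. The sketch below is one hypothetical way of doing so; the function name `check_brackets` and the age brackets are invented for illustration.

```python
def check_brackets(brackets, lowest, highest):
    """Return a list of problems found in closed integer ranges such as (18, 24)."""
    problems = []
    ordered = sorted(brackets)
    # Coverage at the ends: every possible answer needs an option.
    if ordered[0][0] > lowest:
        problems.append(f"no option for values below {ordered[0][0]}")
    if ordered[-1][1] < highest:
        problems.append(f"no option for values above {ordered[-1][1]}")
    # Neighbouring options must neither overlap nor leave a gap.
    for (lo1, hi1), (lo2, hi2) in zip(ordered, ordered[1:]):
        if lo2 <= hi1:
            problems.append(f"{lo1}-{hi1} overlaps {lo2}-{hi2}")
        elif lo2 > hi1 + 1:
            problems.append(f"gap between {hi1} and {lo2}")
    return problems


# Hypothetical age question for adult respondents aged 18 to 99.
flawed = [(18, 25), (25, 35), (36, 50), (55, 99)]
sound = [(18, 24), (25, 34), (35, 49), (50, 99)]

print(check_brackets(flawed, 18, 99))
print(check_brackets(sound, 18, 99))
```

In the flawed set a 25-year-old has two valid choices and a 52-year-old has none; the sound set gives every respondent exactly one option.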

Question Hierarchy

Items on a questionnaire should be grouped into logically coherent sections. Grouping
questions that are similar will make the questionnaire easier to complete, and the respondent
will feel more comfortable. Questions that use the same response formats, or those that cover a
specific topic, should appear together.

Each question should follow comfortably from the previous question. Writing a questionnaire is
similar to writing anything else. Transitions between questions should be smooth.
Questionnaires that jump from one unrelated topic to another feel disjointed and are not likely to
produce high response rates.

Most investigators have found that the order in which questions are presented can affect the
way that people respond. One study reported that questions in the latter half of a questionnaire
were more likely to be omitted, and contained fewer extreme responses. Some researchers
have suggested that it may be necessary to present general questions before specific ones in
order to avoid response contamination. Other researchers have reported that when specific
questions were asked before general
questions, respondents tended to exhibit greater interest in the general questions. It is not clear
whether or not question-order affects response. A few researchers have reported that question-
order does not affect responses, while others have reported that it does. Generally, it is believed
that question-order effects exist in interviews, but not in written surveys.

Types of Questionnaire

Questionnaires are commonly classified as structured or unstructured; a pictorial form is
also used. The design of a questionnaire differs according to how it is administered and, in
particular, the amount of contact the researcher has with respondents.

[A] Structured Questionnaire

These contain concrete, definite and preordained questions. Additional questions may be
thought of and asked only when some clarification is needed or additional information is
sought from the respondents. Answers to these questions are usually very precise, without any
vagueness or ambiguity. The structured questionnaire is divided into two categories:

[a] Closed-ended questionnaires: Questions are set in such a manner that leaves only a few alternative
answers. For example, yes or no, with a limited number of answers for a respondent to choose from.

[b] Open-ended questionnaires: Respondents have the choice of using their own style, diction, expression
of language, length and perception. The respondents are not restricted in their replies to the question and
their answers may be free and spontaneous. Though ample freedom is available to the respondents, it
creates problems of proper classification, tabulation and analysis.

[B] Unstructured Questionnaire


These contain a set of questions that are not structured in advance. It gives sufficient scope for
a variety of answers. It is used mainly for conducting interviews. Its merit is flexibility. It aims to
secure the maximum possible information from the respondents.

[C] Pictorial Questionnaire

In a pictorial questionnaire, alternative answers in the form of pictures are given and the
respondents are required to tick the picture concerned to indicate their selection. This type of
questionnaire is useful for illiterate and less knowledgeable respondents.

The Length of a Questionnaire

Generally, it has been observed that long questionnaires get less response than short
questionnaires. However, some studies have shown that the length of a questionnaire does not
necessarily affect response (Berdie, 1973; Champion and Sear, 1979; Childers and Ferrell,
1979; Duncan, 1979; Layne and Thompson, 1981; Mason, Dressel, and Bain, 1961). "Seemingly
more important than length is question content." (Berdie, Anderson, and Niebuhr, 1986, p. 53) A
subject is more likely to respond if they are involved and interested in the research topic (Bauer,
1947; Brown and Wilkins, 1978; Reid, 1942; Speer and Zold, 1971). Questions should be
meaningful and interesting to the respondent. Finally, simple, short questions are preferable to
long ones. As a rule of thumb, a question or a statement in the questionnaire should not exceed
20 words, or exceed one full line in print.
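
The 20-word rule of thumb is easy to apply mechanically to a draft. The sketch below assumes that words are separated by whitespace; the function name `overlong_questions` and the sample questions are illustrative only.

```python
def overlong_questions(questions, max_words=20):
    """Return (question, word_count) pairs for questions longer than max_words.

    Words are counted by splitting on whitespace, which is an assumption;
    the limit of 20 words follows the rule of thumb in the text.
    """
    flagged = []
    for question in questions:
        word_count = len(question.split())
        if word_count > max_words:
            flagged.append((question, word_count))
    return flagged


# Hypothetical draft items.
draft = [
    "How satisfied are you with the food in the school cafeteria?",
    "Thinking about all of the meals that you have eaten in the school cafeteria "
    "during the current term, how satisfied would you say that you are overall?",
]
for question, word_count in overlong_questions(draft):
    print(word_count, "words:", question)
```

Flagged items are candidates for shortening or for splitting into two questions.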

Characteristics of a Good Questionnaire

The physical appearance of a written survey may largely determine if the respondent will return
it (Levine and Gordon, 1958). Therefore, it is important to use professional production methods
for the questionnaire--either desktop publishing or typesetting and keylining (Robinson and
Agisim, 1951; Robinson, 1952; Sletto, 1940; Toops, 1937). Every questionnaire should have a
title that is short and meaningful to the respondent (Berdie, Anderson, and Niebuhr, 1986). The
rationale is that a questionnaire with a title will be perceived as more credible than one without
a title.

Well-designed questionnaires include clear and concise instructions on how they should be
completed. These must be very easy to understand, so use short sentences and basic
vocabulary. The questionnaire itself should have the return address printed on it since
questionnaires often get separated from the reply envelopes (Berdie, Anderson, and Niebuhr,
1986).

Questionnaires should use simple and direct language (Norton, 1930). The questions must be
clearly understood by the respondent, and have the same meaning that the researcher intended
(Freed, 1964; Huffman, 1948). The wording of a question should be simple, to the point, and
familiar to the target population (Freed, 1964; Moser and Kalton, 1971). Surprisingly, several
researchers (Blair et al., 1977; Laurent, 1972) have found that longer questions elicit more
information than shorter ones, and that the information tends to be more accurate. However, it is
generally accepted that questionnaire items should be simply stated and as brief as possible
(Payne, 1951). The rationale is that this will reduce misunderstandings and make the
questionnaire appear easier to complete. One way to eliminate misunderstandings is to
emphasize crucial words in each item by using bold, italics or underlining (Berdie, Anderson,
and Niebuhr, 1986).

Uncommon words, jargon, and abbreviations may be included in a questionnaire provided that
they are familiar to the population being investigated (Bartholomew, 1963). Slang is often
ambiguous, and should be excluded from all questionnaires (Payne, 1951). Questionnaires
should leave adequate space for respondents to make comments. One criticism of
questionnaires is their inability to retain the "flavor" of a response. Leaving space for comments
will provide valuable information not captured by the response categories. Leaving white space
also makes the questionnaire look easier and this might increase response (Berdie, Anderson,
and Niebuhr, 1986).

Researchers should design the questionnaire so it holds the respondent's interest. The goal is
to make the respondent want to complete the questionnaire. One way to keep a questionnaire
interesting is to provide variety in the type of items used. Varying the questioning format will also
prevent respondents from falling into "response sets". If a questionnaire is more than a few
pages and is held together by a staple, include some identifying data on each page (such as a
respondent ID number). Pages often accidentally separate (Berdie, Anderson, and Niebuhr,
1986).

Advantages of Written Questionnaires


Questionnaires are very cost effective when compared to face-to-face interviews. This is
especially true for studies involving large sample sizes and large geographic areas. Written
questionnaires become even more cost effective as the number of research questions
increases. Questionnaires are easy to analyze. Data entry and tabulation for nearly all surveys
can be easily done with many computer software packages. Questionnaires are familiar to most
people. Nearly everyone has had some experience completing questionnaires and they
generally do not make people apprehensive.
Questionnaires reduce bias. There is uniform question presentation and no middle-man bias.
The researcher's own opinions will not influence the respondent to answer questions in a certain
manner. There are no verbal or visual clues to influence the respondent. Questionnaires are
less intrusive than telephone or face-to-face surveys. When a respondent receives a
questionnaire in the mail, he is free to complete the questionnaire on his own time-table. Unlike
other research methods, the respondent is not interrupted by the research instrument.

Disadvantages of Written Questionnaires

One major disadvantage of written questionnaires is the possibility of low response rates. Low
response is the curse of statistical analysis. It can dramatically lower our confidence in the
results. Response rates vary widely from one questionnaire to another (10% - 90%); however,
well designed studies consistently produce high response rates. Another disadvantage of
questionnaires is the inability to probe responses.

Questionnaires are structured instruments. They allow little flexibility to the respondent with
respect to response format. In essence, they often lose the "flavor of the response" (i.e.,
respondents often want to qualify their answers). By allowing frequent space for comments, the
researcher can partially overcome this disadvantage. Comments are among the most helpful of
all the information on the questionnaire, and they usually provide insightful information that
would have otherwise been lost. Nearly ninety percent of all communication is visual. Gestures
and other visual cues are not available with written questionnaires. The lack of personal contact
will have different effects depending on the type of information being requested. A questionnaire
requesting factual information will probably not be affected by the lack of personal contact. A
questionnaire probing sensitive issues or attitudes may be severely affected. When returned
questionnaires arrive in the mail, it's natural to assume that the respondent is the same person
you sent the questionnaire to. This may not actually be the case. Many times business
questionnaires get handed to other employees for completion. Housewives sometimes respond
for their husbands. Kids respond as a prank. For a variety of reasons, the respondent may not
be who you think it is. It is a confounding error inherent in questionnaires. Finally,
questionnaires are simply not suited for some people. For example, a written survey to a group
of poorly educated people might not work because of reading skill problems. More frequently,
people are turned off by written questionnaires because of misuse.

Anonymity and Confidentiality

An anonymous study is one in which nobody (not even the researcher) can identify who
provided data. It is difficult to conduct an anonymous questionnaire through the mail because of
the need to follow up on non-responders. The only way to do a follow-up is to mail another
survey or reminder postcard to the entire sample. However, it is possible to guarantee
confidentiality, where those conducting the study promise not to reveal the information to
anyone. For the purpose of follow-up, identifying numbers on questionnaires are generally
preferred to using respondents' names. It is important, however, to explain why the number is
there and what it will be used for.

Some studies have shown that response rate is affected by the anonymity/confidentiality policy
of a study (Jones, 1979; Dickson et al., 1977; Epperson and Peck, 1977). Klein, Maher, and
Dunnington (1967) reported that responses became more distorted when subjects felt
threatened that their identities would become known. Others have found that
anonymity/confidentiality issues do not affect response rates or responses (Butler, 1973; Fuller,
1974; Futrell and Swan, 1977; Skinner and Childers, 1980; Watkins, 1978; Wildman, 1977).
One researcher reported that the lack of anonymity actually increased response (Fuller, 1974).

Pre-notification Letters

Many researchers have studied pre-notification letters to determine if they increase response
rate. A meta-analysis of these studies revealed an aggregate increase in response rate of 7.7
percent. Pre-notification letters might help to establish the legitimacy of a survey, thereby
contributing to a respondent's trust. Another possibility is that a pre-notification letter builds
expectation and reduces the possibility that a potential respondent might disregard the survey
when it arrives. Pre-letters are seldom used in marketing research surveys. They are an
excellent (but expensive) way to increase response. The researcher needs to weigh the
additional cost of sending out a pre-letter against the probability of a lower response rate. When
sample sizes are small, every response really counts and a pre-letter is highly recommended.

Cover Letters

The cover letter is an essential part of the survey. To a large degree, the cover letter will affect
whether or not the respondent completes the questionnaire. It is important to maintain a friendly
tone and keep it as short as possible. The importance of the cover letter should not be
underestimated. It provides an opportunity to persuade the respondent to complete the survey.
If the questionnaire can be completed in less than five minutes, the response rate can be
increased by mentioning this in the cover letter. Flattering the respondent in the cover letter
does not seem to affect response. Altruism or an appeal to the social utility of a study has
occasionally been found to increase response, but more often, it is not an effective motivator.

Signature on the Cover Letter

The signature of the person signing the cover letter has been investigated by several
researchers. Ethnic sounding names and the status of the researcher (professor or graduate
student) do not affect response (Friedman and Goldstein, 1975; Horowitz and Sedlacek, 1974).
One investigator found that a cover letter signed by the owner of a marina produced better
response than one signed by the sales manager (Labrecque, 1978). The literature is mixed
regarding whether a hand-written signature works better than one that is mimeographed. Two
researchers (Blumenfeld, 1973; Kawash and Aleamoni, 1971) reported that mimeographed
signatures worked as well as hand-written ones, while another reported that hand-written
signatures produced better response (Reeder, 1960). Another investigator (Smith, 1977) found
that cover letters signed with green ink increased response by over 10 percent.

Piloting the questionnaires

Even after the researcher has proceeded along the lines suggested, the draft questionnaire is a
product evolved by one or two minds only. Until it has actually been used in interviews and with
respondents, it is impossible to say whether it is going to achieve the desired results. For this
reason it is necessary to pre-test the questionnaire before it is used in a full-scale survey, to
identify any mistakes that need correcting.

The purpose of pretesting the questionnaire is to determine:

· whether the questions as they are worded will achieve the desired results

· whether the questions have been placed in the best order

· whether the questions are understood by all classes of respondent

· whether additional or specifying questions are needed or whether some questions should be eliminated

· whether the instructions to interviewers are adequate.

Usually a small number of respondents are selected for the pre-test. The respondents selected
for the pilot survey should be broadly representative of the type of respondent to be interviewed
in the main survey.

If the questionnaire has been subjected to a thorough pilot test, the questions and the
questionnaire will have evolved into their final form. All that remains to be done is the
mechanical process of laying out and setting up the questionnaire in its final form. This will
involve grouping and sequencing questions into an appropriate order, numbering questions, and
inserting interviewer instructions.

Response Rate

A common criticism of mail surveys is that they often have low response rates (Benson, 1946;
Phillips, 1941; Robinson, 1952). Low response is the curse of statistical analysis, and it can
dramatically lower confidence in the results. While response rates vary widely from one
questionnaire to another, well-designed studies consistently produce high response rates.

When returned questionnaires arrive in the mail, it's natural to assume that the respondent is
the same person you sent the questionnaire to. A number of researchers have reported that this
may not actually be the case (Clausen and Ford, 1947; Franzen and Lazarsfeld, 1945; Moser
and Kalton, 1971; Scott, 1961). Many times business questionnaires get handed to other
employees for completion. Housewives sometimes respond for their husbands. Kids respond as
a prank. For a variety of reasons, the respondent may not be who you think it is. In a summary
of five studies sponsored by the British Government, Scott (1961) reports that up to ten percent
of the returned questionnaires had been completed by someone other than the intended person.

Conclusion

While collecting primary data, selecting the right tools for collecting data is of the utmost
importance. For that, the researcher must have a clear understanding of the context in which
different tools are used. Not only is it important to address issues of wording and measurement
in questionnaire design, but it is also necessary to pay attention to how the questionnaire looks.
An attractive and neat questionnaire with an appropriate introduction and a well-arrayed set of
questions and response alternatives will make it easier for the respondents to answer them.
A questionnaire is especially useful and economical in situations where the geographical
dispersal of respondents is wide. Questionnaires also isolate respondents from external
influence. The respondents are totally free to express their views according to their knowledge,
views and attitudes in an unbiased manner. Data obtained without external influence is more
valid and reliable.

Some Guidelines for designing a Questionnaire

01] BEGIN WITH EASY, GENERAL QUESTIONS

By asking easy, non-threatening questions at the beginning of the questionnaire, you will put the
respondent at ease, establish interest and build rapport.

02] BE BRIEF

Long, complex questions can confuse respondents and produce inaccurate results. Generally,
the more words to a question, the more likely that the wording itself will influence the response.
Try breaking up a long question into two shorter ones.

03] USE LANGUAGE EVERYONE CAN UNDERSTAND

Word questions that everyone can understand easily. For example, some people have trouble
understanding double negatives ("Are you against not requiring tests?"). Be careful, however, not
to talk down to people.

04] DO NOT ASSUME KNOWLEDGE

If you were to ask the question, "Do you approve or disapprove of changing from letter grades
to a portfolio system of assessment?" some (or perhaps many) people will answer without really
understanding what that means. It doesn't help to give examples because people will probably
respond only to the examples you give them. It is better to ask about attitudes only toward
specific and clearly identified proposals.

05] AVOID "YES" AND "NO" ANSWERS

In many cases, two alternatives cannot adequately measure the range of opinion on a subject.
Instead of asking "Are you satisfied with the food in the school cafeteria?" ask "How satisfied
are you with the food in the school cafeteria -- very satisfied, fairly satisfied, not too satisfied or
not satisfied at all?" This makes it easier for respondents to answer and allows you to measure
the various gradations of opinion. You may sometimes want to use a scale to measure intensity
of opinion: "With a +3 being the highest ranking and -3 the lowest, how would you rate the
following...?"

06] DO YOU WANT TO ALLOW NEUTRAL GROUND OR NOT?

When you design your answer categories, whether in words or in numbers, there is no fixed rule
about whether you should allow people to choose from among four (forcing them to choose
whether they are more positive or more negative), or whether you should give them five choices
(providing them with neutral ground). Think about how you will use the results, and then provide
an even or odd number of choices, depending on those uses.

07] PLACE SENSITIVE QUESTIONS, SUCH AS ASKING ABOUT ACADEMIC ACHIEVEMENT OR FAMILY INCOME AT THE
END OF THE QUESTIONNAIRE.

Some people are uncomfortable being asked about their family income or how well they are
doing in school and other personal questions. By saving these questions until last, you have a
better chance of getting answers since the respondent feels more comfortable after answering
other questions. It is also helpful to explain that this information is asked for statistical purposes
only. This information is useful 1) in describing and understanding the characteristics of your
sample; 2) in examining responses by background categories such as gender or age; 3) in
determining whether those who completed the survey are representative of the sample as a
whole.

08] USE CAREFUL "SKIP LOGIC."

Some questionnaires include questions aimed at a specific group of respondents, such as
freshmen. In such cases, use "filter" or "screener" questions to sort the respondents into
appropriate groups. Those who answer "Yes" should be told to answer the next question as well,
but those who answer "No" should be told to "skip" ahead to the next question that is
appropriate for all respondents.
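
The routing described above can be written down explicitly before the questionnaire is printed or programmed, which helps to catch skips that lead nowhere. The following is a minimal sketch under stated assumptions: the question IDs, the question wording and the `route` function are hypothetical and serve only to illustrate the idea of a filter question.

```python
# Hypothetical questionnaire: each question names the question that follows it.
# A filter question maps each possible answer to its own next question.
QUESTIONS = {
    "Q1": {"text": "Are you a freshman?", "next": {"Yes": "Q2", "No": "Q3"}},
    "Q2": {"text": "Which orientation sessions did you attend?", "next": "Q3"},
    "Q3": {"text": "How satisfied are you with the library?", "next": None},
}


def route(answers, start="Q1"):
    """Return the ordered list of question IDs one respondent should see.

    `answers` must contain an answer for every filter question on the path.
    """
    path = []
    current = start
    while current is not None:
        path.append(current)
        following = QUESTIONS[current]["next"]
        if isinstance(following, dict):
            # Filter question: the next question depends on the answer given.
            following = following[answers[current]]
        current = following
    return path


print(route({"Q1": "Yes"}))  # ['Q1', 'Q2', 'Q3']
print(route({"Q1": "No"}))   # ['Q1', 'Q3']
```

Respondents who answer "No" to the filter question skip Q2 and rejoin at Q3, the first question that applies to everyone.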

09] USE "FILTER" QUESTIONS TO SEPARATE INFORMED FROM UNINFORMED OPINIONS ON COMPLEX SUBJECTS

Ask a question like "Have you heard or read about Plan X?" and ask follow-up questions only to
those who answer "Yes."

10] TRY NOT TO INCLUDE MANY "OPEN-ENDED" QUESTIONS

It's always tempting to ask "open-ended" questions; that is, instead of including a list of
responses from which respondents have to choose, the respondent is asked to explain his or
her position. However, you should try not to include many such questions, but try to limit
yourself to one or two. There are several reasons for this advice:

First, including too many such questions will seriously reduce your response rate. Answering
open-ended questions requires more time and thought than selecting answers from a pre-
existing list of alternatives. With many open-ended questions, more people will decide that it is
too much trouble to complete the interview.

Second, if your questionnaire is self-administered, including too many open-ended questions will change
the mix of people who complete the questionnaire. For example, you discourage those with poor
verbal skills, and you may encourage those with more free time.

Finally, processing the answers from open-ended questions is very time-consuming. Including
too many such questions transforms your survey from an interesting project into a major
enterprise.

11] AVOID LEADING QUESTIONS.

A leading question suggests an answer. For example, you might want to ask: "In order to
improve the quality of education, should teachers be paid higher salaries?" However, this
question presents a widely accepted goal (improving the quality of education) accompanied by
the assumption that the means suggested (raising teachers' salaries) will accomplish the goal --
thus influencing the respondent to answer "Yes."

12] AVOID "DOUBLE-BARRELLED" QUESTIONS

A "double-barrelled" question contains two or more distinct questions but allows only one
answer. For example, if you ask, "Should the school reduce paperwork required of teachers by
hiring more administrators?" you don't know whether a "Yes" answer means the respondent
favors reducing paperwork or hiring more administrators or both.

13] AVOID AMBIGUOUS QUESTIONS

A question containing ambiguous terms can be easily misunderstood and misinterpreted. For
example, if you ask, "Do you think equipment safety could be improved," you will not know
whether people interpret this to mean reduction of damage in transit and workshop or that more
care should be taken in operating the equipment in a factory.

14] AVOID VALUE-LADEN OR EMOTIONAL TERMS

When a question contains a value-laden term, respondents may answer emotionally, without
regard for the context in which the term is used. If you ask people's attitudes toward a specific
government social program and characterize it as "liberal" or "conservative," people are likely to
react to their feelings about "liberal" or "conservative," and not about the program itself.

124
15] AVOID QUESTIONS WITH "SOCIALLY ACCEPTABLE" ANSWERS

When people are asked about their participation in generally approved activities such as
attending classes, they tend to give socially acceptable answers that may or may not be true.
You should only ask questions such as these when you understand their limitations and when
you have a control to make the answers more meaningful. For example, if you ask students
about their attendance in class, you might ask them how many regularly scheduled classes they
missed last week for reasons other than illness.

16] TELL RESPONDENTS HOW SPECIFIC THEIR ANSWERS SHOULD BE; USE RANGES WHERE APPROPRIATE

If you ask, "How long have you lived in this community?" you may get answers ranging from "all
my life" to "13 weeks." To improve efficiency, provide ranges so people can know how specific
to make their answers. For example, you might ask, "How long have you lived in this community
-- less than one year, one to five years, six to ten years, or more than ten years?"

17] MAKE CERTAIN THAT ANSWER CATEGORIES DON'T OVERLAP

A common mistake in questionnaire design is setting up answer categories this way:

a. Under Rs.10,000 income
b. Rs.10,000-Rs.15,000 income
c. Rs.15,000-Rs.25,000 income
d. Rs.25,000-Rs.40,000 income
e. Over Rs.40,000 income

In this set of answers, someone with an income of Rs.10,000 could choose either of two
categories, as could those with incomes of Rs.15,000 and Rs.25,000. Set up the answers so
that they are separate: for example, Rs.10,001-Rs.15,000, Rs.15,001-Rs.25,000.
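
A draft set of answer categories can be checked for this mistake before the questionnaire is printed. The sketch below is illustrative only: the function name `find_overlaps` and the representation of each category as a (label, low, high) triple with inclusive bounds are assumptions. Following the reading in the text, "Under Rs.10,000" is treated here as including Rs.10,000, and "Over Rs.40,000" as starting at Rs.40,001.

```python
import math


def find_overlaps(categories):
    """Return (label_a, label_b, shared_value) for neighbouring categories that overlap.

    `categories` is a list of (label, low, high) with inclusive bounds;
    None means the category is open-ended on that side.
    Only neighbouring categories (after sorting by lower bound) are compared.
    """
    bounded = [
        (label, -math.inf if low is None else low, math.inf if high is None else high)
        for label, low, high in categories
    ]
    bounded.sort(key=lambda category: category[1])
    overlaps = []
    for (label_a, _, high_a), (label_b, low_b, _) in zip(bounded, bounded[1:]):
        if low_b <= high_a:
            overlaps.append((label_a, label_b, low_b))
    return overlaps


# The faulty set from the text (assumed inclusive bounds, see lead-in).
faulty = [
    ("a", None, 10000),
    ("b", 10000, 15000),
    ("c", 15000, 25000),
    ("d", 25000, 40000),
    ("e", 40001, None),
]
# The corrected set suggested in the text.
corrected = [
    ("a", None, 10000),
    ("b", 10001, 15000),
    ("c", 15001, 25000),
    ("d", 25001, 40000),
    ("e", 40001, None),
]
print(find_overlaps(faulty))     # [('a', 'b', 10000), ('b', 'c', 15000), ('c', 'd', 25000)]
print(find_overlaps(corrected))  # []
```

Each reported triple names two categories and an income value that a respondent could place in either of them.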

18] STATE BOTH SIDES OF THE ISSUE; GIVE REALISTIC ALTERNATIVES

Probably the most difficult question to frame is one that gives the respondent several
alternatives. It is difficult to find mutually exclusive alternatives or to provide enough to cover an
entire range of options. You must also be careful not to word the alternatives in such a way that
it makes one alternative appear better than others.

19] INCLUDE A FINAL OPEN-ENDED QUESTION

In many surveys, and especially in any self-administered questionnaire, it may be useful
sometimes to include a final question: "Is there anything you would like to tell us that we have
not asked you?" Anticipate a real grab bag here, but this final open-ended question can
compensate for a question in which the alternatives given may not have satisfied a respondent;
it allows those who feel the questionnaire "missed the real point" to tell you about it; it surfaces
new issues you may want to address at other times in other ways, etc. While you may want to
cluster these responses into groups, you will want to study them carefully for insights they give
you into the spectrum of issues in the population you are surveying.

20] IT IS OK TO "STEAL" QUESTIONS

Unlike many other professionals, survey researchers encourage plagiarism when it comes to
copying question wordings; however, they call it "replication." Indeed, the researcher considers
it a compliment when peers use his or her questions in their own studies and for very good
reason.

You will gain a number of benefits from this replication. First, if you use the exact same question
wording, then at least you do not have to worry if differing results have been caused by the
inconsistencies in the wordings of questions.

Secondly, reusing questions means you may be able to compare the opinions of your group
with another group -- a previous survey in the same community or perhaps a national group for
additional insight. So, by all means, build upon the efforts of others who have asked similar
questions.

Physical appearance of the questionnaire

The physical appearance of a questionnaire can have a significant effect upon both the quantity
and quality of marketing data obtained. The quantity of data is a function of the response rate.
Ill-designed questionnaires can give an impression of complexity, tedium and too big a time
commitment. Data quality can also be affected by the physical appearance of the questionnaire
with unnecessarily confusing layouts making it more difficult for interviewers, or respondents in
the case of self-completion questionnaires, to complete this task accurately. Attention to just a
few basic details can have a disproportionately advantageous impact on the data obtained
through a questionnaire.

Use of booklets: The use of booklets, in place of loose or stapled sheets of paper, makes it
easier for the interviewer or respondent to progress through the document. Moreover, fewer
pages tend to get lost.

Simple, clear formats: The clarity of questionnaire presentation can also help to improve the
ease with which interviewers or respondents are able to complete a questionnaire.

Creative use of space and typeface: In their anxiety to reduce the number of pages of a
questionnaire, there is a tendency to put too much information on a page. This is
counter-productive since it gives the questionnaire the appearance of being complicated.
Questionnaires that make use of blank space appear easier to use, enjoy higher response rates
and contain fewer errors when completed.

Use of colour coding: Colour coding can help in the administration of questionnaires. It is
often the case that several types of respondents are included within a single survey (e.g.
wholesalers and retailers). Printing the questionnaires on two different colours of paper can
make the handling easier.

Interviewer instructions: Interviewer instructions should be placed alongside the questions to
which they pertain. Instructions on where the interviewers should probe for more information
or how replies should be recorded are placed after the question.

In general it is best for a questionnaire to be as short as possible. A long questionnaire leads to
a long interview and this is open to the dangers of boredom on the part of the respondent (and
poorly considered, hurried answers), interruptions by third parties and greater costs in terms of
interviewing time and resources. In a rural situation an interview should not last longer than 30-45
minutes.

“It's sort of an unprecedented thing to have to send a questionnaire back.”


-- Patrick Leahy

Reflections on Academic Research

Data Collection
Chapter XIV

DATA COLLECTION
Introduction

Data collection is the process of gathering and measuring information on variables of
interest, in an established systematic fashion that enables one to answer stated
research questions, test hypotheses, and evaluate outcomes. The data collection
component of research is common to all fields of study including physical and social
sciences, humanities, business, etc. While methods vary by discipline, the emphasis on
ensuring accurate and honest collection remains the same. Data collection is an
important aspect of any type of research study. Inaccurate data collection can impact the
results of a study and ultimately lead to invalid results. Data collection methods for impact
evaluation vary along a continuum. At one end of this continuum are quantitative
methods and at the other end are qualitative methods for data
collection.

The importance of ensuring accurate and appropriate data collection

Regardless of the field of study or preference for defining data (quantitative, qualitative),
accurate data collection is essential to maintaining the integrity of research. Both the
selection of appropriate data collection instruments (existing, modified, or newly
developed) and clearly delineated instructions for their correct use reduce the likelihood
of errors occurring.

Consequences of improperly collected data include:

· inability to answer research questions accurately
· inability to repeat and validate the study
· distorted findings resulting in wasted resources
· misleading other researchers to pursue fruitless avenues of investigation
· compromising decisions for public policy
· causing harm to human participants and animal subjects

While the degree of impact from faulty data collection may vary by discipline and the
nature of investigation, there is the potential to cause disproportionate harm when these
research results are used to support public policy recommendations.

Issues related to maintaining integrity of data collection:

The primary rationale for preserving data integrity is to support the detection of errors in
the data collection process, whether they are made intentionally (deliberate falsifications)
or not (systematic or random errors).

Most, Craddick, Crawford, Redican, Rhodes, Rukenbrod, and Laws (2003) describe
‘quality assurance’ and ‘quality control’ as two approaches that can preserve data
integrity and ensure the scientific validity of study results. Each approach is implemented
at different points in the research timeline (Whitney, Lind, and Wahl, 1998):

1. Quality assurance - activities that take place before data collection begins
2. Quality control - activities that take place during and after data collection

Quality Assurance

Since quality assurance precedes data collection, its main focus is 'prevention' (i.e.,
forestalling problems with data collection). Prevention is the most cost-effective activity
to ensure the integrity of data collection. This proactive measure is best demonstrated by
the standardization of protocol developed in a comprehensive and detailed procedures
manual for data collection. Poorly written manuals increase the risk of failing to identify
problems and errors early in the research endeavor. These failures may be
demonstrated in a number of ways:

• Uncertainty about the timing, methods, and identity of the person(s) responsible for reviewing
data
• Partial listing of items to be collected
• Vague description of data collection instruments to be used in lieu of rigorous step-by-step
instructions on administering tests
• Failure to identify specific content and strategies for training or retraining staff members
responsible for data collection
• Obscure instructions for using, making adjustments to, and calibrating data collection
equipment (if appropriate)
• No identified mechanism to document changes in procedures that may evolve over the
course of the investigation

An important component of quality assurance is developing a rigorous and detailed
recruitment and training plan. Implicit in training is the need to effectively communicate
the value of accurate data collection to trainees (Knatterud, Rockhold, George, Barton,
Davis, Fairweather, Honohan, Mowery, O'Neill, 1998). The training aspect is particularly
important to address the potential problem of staff who may unintentionally deviate from
the original protocol. This phenomenon, known as ‘drift’, should be corrected with
additional training, a provision that should be specified in the procedures manual.

Given the range of qualitative research strategies (non-participant/participant
observation, interview, archival, field study, ethnography, content analysis, oral history,
biography, unobtrusive research), it is difficult to make generalized statements about how
one should establish a research protocol in order to facilitate quality assurance.
Certainly, researchers conducting non-participant/participant observation may have only
the broadest research questions to guide the initial research efforts. Since the
researcher is the main measurement device in such a study, there are often few or no
other data collection instruments. Indeed, instruments may need to be developed on the
spot to accommodate unanticipated findings.

Quality Control

While quality control activities (detection/monitoring and action) occur during and after
data collection, the details should be carefully documented in the procedures manual. A
clearly defined communication structure is a necessary pre-condition for establishing
monitoring systems. There should not be any uncertainty about the flow of information
between principal investigators and staff members following the detection of errors in
data collection. A poorly developed communication structure encourages lax monitoring
and limits opportunities for detecting errors.

Detection or monitoring can take the form of direct staff observation during site visits,
conference calls, or regular and frequent reviews of data reports to identify
inconsistencies, extreme values or invalid codes. While site visits may not be
appropriate for all disciplines, failure to regularly audit records, whether quantitative or
qualitative, will make it difficult for investigators to verify that data collection is
proceeding according to procedures established in the manual. In addition, if the
structure of communication is not clearly delineated in the procedures manual,
transmission of any change in procedures to staff members can be compromised.
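
A regular review of data reports for invalid codes and extreme values can be partly automated. The following is only a minimal sketch under stated assumptions: the field names (`sex`, `age`), the set of valid codes, and the plausible age range are hypothetical, and in a real study they would be taken from the procedures manual.

```python
# Hypothetical coding rules; a real study would take these from its procedures manual.
VALID_SEX_CODES = {1, 2, 9}      # assumed codes: 1 = male, 2 = female, 9 = not stated
AGE_RANGE = (18, 99)             # assumed plausible range for an adult survey


def review_records(records):
    """Return a list of (record_id, problem) tuples for records that need follow-up."""
    problems = []
    for rec in records:
        if rec["sex"] not in VALID_SEX_CODES:
            problems.append((rec["id"], "invalid code for sex"))
        low, high = AGE_RANGE
        if not low <= rec["age"] <= high:
            problems.append((rec["id"], "age outside plausible range"))
    return problems


# Made-up records for illustration only.
sample = [
    {"id": 1, "sex": 1, "age": 34},
    {"id": 2, "sex": 3, "age": 41},    # 3 is not one of the assumed valid codes
    {"id": 3, "sex": 2, "age": 140},   # extreme value
]
flagged = review_records(sample)
```

A flagged record is a prompt for follow-up with the staff member or site concerned, not proof of an error.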

Quality control also identifies the required responses, or ‘actions’, necessary to correct
faulty data collection practices and to minimize future occurrences. These actions are
less likely to occur if data collection procedures are vaguely written and the necessary
steps to minimize recurrence are not implemented through feedback and education
(Knatterud et al., 1998).

Examples of data collection problems that require prompt action include:

• errors in individual data items
• systematic errors
• violation of protocol
• problems with individual staff or site performance
• fraud or scientific misconduct

In the social/behavioral sciences where primary data collection involves human subjects,
researchers are taught to incorporate one or more secondary measures that can be
used to verify the quality of information being collected from the human subject. For
example, a researcher conducting a survey might be interested in gaining a better
insight into the occurrence of risky behaviors among young adults as well as the social
conditions that increase the likelihood and frequency of these risky behaviors.

To verify data quality, respondents might be asked for the same information at
different points of the survey and in a number of different ways. Measures of
‘social desirability’ might also be used to gauge the honesty of responses.
Two points need to be raised here: (a) cross-checks should be built into the data
collection process, and (b) data quality is as much an observation-level issue as it is
a complete data set issue. Thus, data quality should be addressed for each individual
measurement, for each individual observation, and for the entire data set.
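
As one illustration of a cross-check at the level of the individual observation, the sketch below compares two answers that should agree: a stated age and a stated year of birth. The question names, the survey year and the one-year tolerance are assumptions made for the example, not part of any particular instrument.

```python
SURVEY_YEAR = 2012   # assumed year of data collection (illustrative)
TOLERANCE = 1        # allow one year, since the birthday may not have passed yet


def inconsistent_ages(responses):
    """Return ids of respondents whose stated age disagrees with their birth year."""
    flagged = []
    for r in responses:
        implied_age = SURVEY_YEAR - r["birth_year"]
        if abs(implied_age - r["age"]) > TOLERANCE:
            flagged.append(r["id"])
    return flagged


# Made-up responses for illustration only.
responses = [
    {"id": "A", "age": 22, "birth_year": 1990},   # consistent: 2012 - 1990 = 22
    {"id": "B", "age": 19, "birth_year": 1985},   # inconsistent: implied age is 27
    {"id": "C", "age": 30, "birth_year": 1981},   # within tolerance: implied age is 31
]
to_check = inconsistent_ages(responses)
```

Summing such flags over all respondents gives a rough data-set-level indicator to accompany the observation-level check.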

Each field of study has its preferred set of data collection instruments. The hallmark of
laboratory sciences is the meticulous documentation of the lab notebook while social
sciences such as sociology and cultural anthropology may prefer the use of detailed field
notes. Regardless of the discipline, comprehensive documentation of the collection
process before, during and after the activity is essential to preserving data integrity.

Quantitative and Qualitative Data collection methods

Quantitative data collection methods rely on random sampling and structured data
collection instruments that fit diverse experiences into predetermined response
categories. They produce results that are easy to summarize, compare, and generalize.

Quantitative research is concerned with testing hypotheses derived from theory and/or
being able to estimate the size of a phenomenon of interest. Depending on the research
question, participants may be randomly assigned to different treatments. If this is not
feasible, the researcher may collect data on participant and situational characteristics in
order to statistically control for their influence on the dependent, or outcome, variable. If
the intent is to generalize from the research participants to a larger population, the
researcher will employ probability sampling to select participants.

Typical quantitative data gathering strategies include:

• Experiments/clinical trials.
• Observing and recording well-defined events (e.g., counting the number of patients
waiting in emergency at specified times of the day).
• Obtaining relevant data from management information systems.
• Administering surveys with closed-ended questions (e.g., face-to-face and telephone
interviews, questionnaires, etc.) (http://www.achrn.org/quantitative_methods.htm).

Interviews

In Quantitative research (survey research), interviews are more structured than in
Qualitative research (http://www.stat.ncsu.edu/info/srms/survpamphlet.html). In a
structured interview, the researcher asks a standard set of questions and nothing
more (Leedy and Ormrod, 2001).

Face-to-face interviews have the distinct advantage of enabling the researcher to
establish rapport with potential participants and therefore gain their cooperation. These
interviews yield the highest response rates in survey research. They also allow the
researcher to clarify ambiguous answers and, when appropriate, seek follow-up
information. Their disadvantages are that they are time consuming and expensive, and
therefore impractical when large samples are involved (Leedy and Ormrod, 2001).

Telephone interviews are less time consuming and less expensive, and the researcher
has ready access to anyone on the planet who has a telephone. A disadvantage is that
the response rate is not as high as for the face-to-face interview, although it is considerably
higher than for the mailed questionnaire. The sample may also be biased to the extent that
people without phones are part of the population about whom the researcher wants to draw
inferences.

Computer Assisted Personal Interviewing (CAPI) is a form of personal interviewing, but
instead of completing a questionnaire, the interviewer brings along a laptop or hand-held
computer to enter the information directly into the database. This method saves time
involved in processing the data, as well as saving the interviewer from carrying around
hundreds of questionnaires. However, this type of data collection method can be
expensive to set up and requires that interviewers have computer and typing skills.

Questionnaires

Paper-and-pencil questionnaires can be sent to a large number of people and save the
researcher time and money. People tend to be more truthful when responding to
questionnaires, particularly on controversial issues, because their
responses are anonymous. But questionnaires also have drawbacks. The majority of the people
who receive questionnaires don't return them, and those who do might not be representative
of the originally selected sample (Leedy and Ormrod, 2001).

Web-based questionnaires: A new and growing methodology is the use of
Internet-based research. The respondent receives an e-mail containing a link
to a secure website where the questionnaire can be filled in. This
type of research is often quicker and less detailed. Disadvantages of this method
include the exclusion of people who do not have a computer or are unable to access
one. The validity of such surveys is also in question, as people might be in a hurry
to complete them and so might not give accurate responses
(http://www.statcan.ca/english/edu/power/ch2/methods/methods.htm).

Questionnaires often make use of checklists and rating scales. These devices help
simplify and quantify people's behaviors and attitudes. A checklist is a list of
behaviors, characteristics, or other entities that the researcher is looking for. Either the
researcher or the survey participant simply checks whether each item on the list is observed,
present or true, or not. A rating scale is more useful when a behavior needs to be
evaluated on a continuum. Rating scales are often called Likert scales (Leedy and Ormrod,
2001).

Qualitative data collection methods play an important role in impact evaluation by
providing information useful to understand the processes behind observed results and
assess changes in people’s perceptions of their well-being. Furthermore, qualitative
methods can be used to improve the quality of survey-based quantitative evaluations by
helping generate evaluation hypotheses, strengthening the design of survey
questionnaires, and expanding or clarifying quantitative evaluation findings. These
methods are characterized by the following attributes:

• they tend to be open-ended and have less structured protocols (i.e., researchers may
change the data collection strategy by adding, refining, or dropping techniques or
informants)
• they rely more heavily on interactive interviews; respondents may be interviewed several
times to follow up on a particular issue, clarify concepts or check the reliability of data
• they use triangulation to increase the credibility of their findings (i.e., researchers rely on
multiple data collection methods to check the authenticity of their results)
• generally their findings are not generalizable to any specific population; rather, each case
study produces a single piece of evidence that can be used to seek general patterns
among different studies of the same issue

Regardless of the kinds of data involved, data collection in a qualitative study takes a
great deal of time. The researcher needs to record any potentially useful data
thoroughly, accurately, and systematically, using field
notes, sketches, audiotapes, photographs and other suitable means. The data collection
methods must observe the ethical principles of research.

Sources of Data

The sources of data may be classified into (a) primary sources and (b) secondary
sources.

Primary Sources

Primary sources are original sources from which the researcher directly collects data
that have not been previously collected, e.g., collection of data directly by the researcher
on brand awareness, brand preference, brand loyalty and other aspects of consumer
behaviour from a sample of consumers by interviewing them. Primary data are first-hand
information collected through various methods such as observation, interviewing, mailing,
etc.

Secondary Sources

These are sources containing data that have been collected and compiled for another
purpose. The secondary sources consist of readily available compendia and already
compiled statistical statements and reports whose data may be used by researchers for
their studies, e.g., census reports, annual reports and financial statements of companies,
statistical statements, reports of Government Departments, annual reports on
currency and finance published by the National Bank of Ethiopia, statistical statements
relating to Cooperatives, the Federal Cooperative Commission, Commercial Banks and
Micro Finance Credit Institutions published by the National Bank of Ethiopia, reports of
the National Sample Survey Organisation, reports of trade associations, publications of
international organisations such as UNO, IMF, World Bank, ILO, WHO, etc., trade and
financial journals, newspapers, etc.

Secondary sources consist of not only published records and reports, but also
unpublished records. The latter category includes various records and registers
maintained by firms and organisations, e.g., accounting and financial records, personnel
records, register of members, minutes of meetings, inventory records, etc.

Features of Secondary Sources:


Though secondary sources are diverse and consist of all sorts of materials, they have
certain common characteristics. First, they are readymade and readily available, and do
not require the trouble of constructing tools and administering them. Second, they
consist of data over whose collection and classification the researcher has no original
control. Others shape both the form and the content of secondary sources.
Clearly, this is a feature that can limit the research value of secondary sources.
Finally, secondary sources are not limited in time and space. That is, the researcher
using them need not have been present when and where they were gathered.

Conclusion

Data collection is to be considered an art as well as a science. It plays a very crucial
role in any research study. Selecting the correct method of data collection keeps a
research scholar on the right path and yields results of the desired quality. It is always
advisable for a scholar to have a detailed discussion with the supervisor before
deciding and finalizing the method of data collection required for the selected area of
research.

“It is a capital mistake to theorize before one has data.”


- Sir Arthur Conan Doyle

A2Z

PhD
Thesis
Reflections on Academic Research

Chapter XV

Statistical Tools for Research

STATISTICAL TOOLS FOR RESEARCH
Introduction

Scholars frequently use statistics to analyze their results. Why do researchers use
statistics? Statistics can help us understand a phenomenon by supporting or rejecting a
hypothesis. It is vital to how we acquire knowledge and to most scientific theories.

Statistical calculations

When analyzing data, your goal is simple: You wish to make the strongest possible
conclusion from limited amounts of data. To do this, you need to overcome two
problems:

• Important differences can be obscured by biological variability and experimental
imprecision. This makes it difficult to distinguish real differences from random variability.
• The human brain excels at finding patterns, even from random data. Our natural
inclination (especially with our own data) is to conclude that differences are real, and to
minimize the contribution of random variability. Statistical rigor prevents you from making
this mistake.

Statistical analyses are necessary when observed differences are small compared to
experimental imprecision and biological variability. When you work with experimental
systems with no biological variability and little experimental error, heed these aphorisms:

If you need statistics to analyze your experiment, then you've done the wrong experiment.

If your data speak for themselves, don't interrupt!

But in many fields, scientists can't avoid large amounts of variability, yet care about
relatively small differences. Statistical methods are necessary to draw valid conclusions
from such data.

Probability
What is probability? Probability can be a complex field of mathematics, but in its simplest
definition it is the number of ways an event can occur divided by the total number of
equally likely possibilities. For example, flipping a coin has two possibilities: heads or
tails. There is only one way for a coin to land on heads, so the answer to this probability
question is 1/2 (one chance in two).
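
The coin example can be written out as a short calculation. This is a minimal sketch that assumes every outcome is equally likely; the function name `probability` and the die example are illustrative additions.

```python
from fractions import Fraction


def probability(favourable, total):
    """Probability of an event when all outcomes are equally likely."""
    if total <= 0 or not 0 <= favourable <= total:
        raise ValueError("need total > 0 and 0 <= favourable <= total")
    return Fraction(favourable, total)


p_heads = probability(1, 2)   # one way to land on heads out of two outcomes
p_six = probability(1, 6)     # rolling a six with a fair die (added illustration)
```

Keeping the result as a fraction makes the "favourable over total" reading explicit; `float(p_heads)` gives the decimal form when needed.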

Why Is the Study of Probability Important?
Probability and statistics are useful to us in many ways. Knowing the likelihood of an
event happening is important information in decision making that is used in nearly every
field. For example, research studies use probability in determining whether or not a new
drug is worth putting on the market. Does the effectiveness of the drug outweigh the
harm it causes to a patient's body? Probability can help answer that question.

Review of Nonparametric Tests

Choosing the right test to compare measurements is a bit tricky, as you must choose
between two families of tests: parametric and nonparametric. Many statistical tests are
based upon the assumption that the data are sampled from a Gaussian distribution.
These tests are referred to as parametric tests. Commonly used parametric tests are
listed in the first column of the table and include the t test and analysis of variance.

Tests that do not make assumptions about the population distribution are referred to as
nonparametric tests. You've already learned a bit about nonparametric tests in previous
chapters. All commonly used nonparametric tests rank the outcome variable from low to
high and then analyze the ranks. These tests are listed in the second column of the table
and include the Wilcoxon, Mann-Whitney, and Kruskal-Wallis tests. These tests are
also called distribution-free tests.

Choosing between parametric and nonparametric tests: Easy Cases

Choosing between parametric and nonparametric tests is sometimes easy. You should
definitely choose a parametric test if you are sure that your data are sampled from a
population that follows a Gaussian distribution (at least approximately). You should
definitely select a nonparametric test in three situations:

• The outcome is a rank or a score and the population is clearly not Gaussian. Examples
include class ranking of students, the Apgar score for the health of newborn babies
(measured on a scale of 0 to 10 and where all scores are integers), the visual analogue
score for pain (measured on a continuous scale where 0 is no pain and 10 is unbearable
pain), and the star scale commonly used by movie and restaurant critics (* is OK, ***** is
fantastic).
• Some values are "off the scale," that is, too high or too low to measure. Even if the
population is Gaussian, it is impossible to analyze such data with a parametric test since
you don't know all of the values. Using a nonparametric test with these data is simple.
Assign values too low to measure an arbitrary very low value and assign values too high
to measure an arbitrary very high value. Then perform a nonparametric test. Since the
nonparametric test only knows about the relative ranks of the values, it won't matter that
you didn't know all the values exactly.
• You are sure that the population is not distributed in a Gaussian manner. If the data are
not sampled from a Gaussian distribution, consider whether you can transform the
values to make the distribution become Gaussian. For example, you might take the
logarithm or reciprocal of all values. There are often biological or chemical reasons (as
well as statistical ones) for performing a particular transform.
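
The point about "off the scale" values can be checked with a small calculation. The sketch below computes only the Mann-Whitney U statistic (not its P value) by direct pair counting, using made-up measurements. It is meant to show that the arbitrary placeholder chosen for an unmeasurably high value does not change the result, because only the ranks matter.

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U for group_a: number of (a, b) pairs with a > b, ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Made-up measurements; one treated value was too high to measure.
control = [3.1, 4.0, 4.4, 5.2, 6.0]
treated_known = [5.5, 6.3, 7.1, 8.0]

# Two different arbitrary "very high" placeholders for the off-scale value.
u_with_100 = mann_whitney_u(treated_known + [100.0], control)
u_with_1e9 = mann_whitney_u(treated_known + [1e9], control)
```

Both calls give the same U, whereas a mean or a t statistic computed from the same two lists would depend heavily on the placeholder.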

Choosing between parametric and nonparametric tests: Hard Cases

It is not always easy to decide whether a sample comes from a Gaussian population.
Consider these points:

• If you collect many data points (over a hundred or so), you can look at the distribution of
data and it will be fairly obvious whether the distribution is approximately bell shaped. A
formal statistical test (Kolmogorov-Smirnov test, not explained in this book) can be used
to test whether the distribution of the data differs significantly from a Gaussian
distribution. With few data points, it is difficult to tell whether the data are Gaussian by
inspection, and the formal test has little power to discriminate between Gaussian and
non-Gaussian distributions.
• You should look at previous data as well. Remember, what matters is the distribution of
the overall population, not the distribution of your sample. In deciding whether a
population is Gaussian, look at all available data, not just data in the current experiment.
• Consider the source of scatter. When the scatter comes from the sum of numerous
sources (with no one source contributing most of the scatter), you expect to find a roughly
Gaussian distribution. When in doubt, some people choose a parametric test (because
they aren't sure the Gaussian assumption is violated), and others choose a
nonparametric test (because they aren't sure the Gaussian assumption is met).
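
To make the idea of a formal check more concrete, the sketch below computes the Kolmogorov-Smirnov distance: the largest gap between the cumulative distribution of the sample and that of a Gaussian curve fitted to it. It stops at the distance and does not produce a P value, because the standard tables do not apply when the mean and SD are estimated from the same data. The two samples are deterministic, made-up illustrations.

```python
from math import exp
from statistics import NormalDist, mean, stdev


def ks_distance_from_normal(data):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


# Illustrative samples: evenly spaced Gaussian quantiles, and their exponentials.
n = 50
bell_shaped = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [exp(z) for z in bell_shaped]   # log-normal shape, clearly not Gaussian

d_bell = ks_distance_from_normal(bell_shaped)
d_skewed = ks_distance_from_normal(skewed)
```

The skewed sample gives a much larger distance than the bell-shaped one. With only a handful of data points the distance is far noisier, which is one way to see why the formal test has little power for small samples.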

Choosing between parametric and nonparametric tests: Does it matter?

Does it matter whether you choose a parametric or nonparametric test? The answer
depends on sample size. There are four cases to think about:

• Large sample. What happens when you use a parametric test with data from a
non-Gaussian population? The central limit theorem (discussed in Chapter 5) ensures that
parametric tests work well with large samples even if the population is non-Gaussian. In
other words, parametric tests are robust to deviations from Gaussian distributions, so
long as the samples are large. The snag is that it is impossible to say how large is large
enough, as it depends on the nature of the particular non-Gaussian distribution. Unless
the population distribution is really weird, you are probably safe choosing a parametric
test when there are at least two dozen data points in each group.
• Large sample. What happens when you use a nonparametric test with data from a
Gaussian population? Nonparametric tests work well with large samples from Gaussian
populations. The P values tend to be a bit too large, but the discrepancy is small. In other
words, nonparametric tests are only slightly less powerful than parametric tests with large
samples.
• Small samples. What happens when you use a parametric test with data from
non-Gaussian populations? You can't rely on the central limit theorem, so the P value may
be inaccurate.

• Small samples. When you use a nonparametric test with data from a Gaussian
population, the P values tend to be too high. The nonparametric tests lack statistical
power with small samples.

Thus, large data sets present no problems. It is usually easy to tell if the data come from
a Gaussian population, but it doesn't really matter because the nonparametric tests are
so powerful and the parametric tests are so robust. Small data sets present a dilemma. It
is difficult to tell if the data come from a Gaussian population, but it matters a lot. The
nonparametric tests are not powerful and the parametric tests are not robust.
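
The role of the central limit theorem in the large-sample cases can be illustrated by simulation. The sketch below assumes a strongly skewed (exponential) population, chosen only for illustration, and measures how skewed the distribution of sample means is for samples of 3 and of 24 values. The seed and the number of repeats are arbitrary.

```python
import random


def skewness(values):
    """Sample skewness (third standardized moment); about 0 for a symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skewness_of_sample_means(sample_size, repeats=5000, seed=1):
    """Draw many samples from an exponential population; return the skewness of their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return skewness(means)


skew_n3 = skewness_of_sample_means(3)     # means of very small samples stay clearly skewed
skew_n24 = skewness_of_sample_means(24)   # means of two dozen values are much closer to symmetric
```

The means of the larger samples are noticeably less skewed, which is the sense in which parametric tests become robust as samples grow. How large is large enough still depends on the population.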

One- or two-sided P value?

With many tests, you must choose whether you wish to calculate a one- or two-sided P
value (same as one- or two-tailed P value). Let's review the difference in the context of a
t test. The P value is calculated for the null hypothesis that the two population means are
equal, and any discrepancy between the two sample means is due to chance. If this null
hypothesis is true, the one-sided P value is the probability that two sample means would
differ as much as was observed (or further) in the direction specified by the hypothesis
just by chance, even though the means of the overall populations are actually equal. The
two-sided P value also includes the probability that the sample means would differ that
much in the opposite direction (i.e., the other group has the larger mean). For a symmetric
test such as the t test, the two-sided P value is twice the one-sided P value.
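
The relation between the two P values can be seen by direct counting. The sketch below swaps in an exact permutation test in place of the t test, because it needs nothing beyond counting. It assumes equal group sizes and integer scores, and the data are made up.

```python
from itertools import combinations


def permutation_p_values(group_a, group_b):
    """Exact one- and two-sided P values for the difference in group totals.

    Assumes equal group sizes, so comparing totals is equivalent to comparing
    means, and integer data, so the comparisons are free of rounding error.
    The one-sided value is for the hypothesis, stated in advance, that
    group_a is the larger.
    """
    if len(group_a) != len(group_b):
        raise ValueError("this sketch assumes equal group sizes")
    pooled = list(group_a) + list(group_b)
    total = sum(pooled)
    observed = sum(group_a) - sum(group_b)
    one_sided = two_sided = count = 0
    for chosen in combinations(range(len(pooled)), len(group_a)):
        sum_a = sum(pooled[i] for i in chosen)
        diff = sum_a - (total - sum_a)
        count += 1
        if diff >= observed:             # at least as large, in the predicted direction
            one_sided += 1
        if abs(diff) >= abs(observed):   # at least as large, in either direction
            two_sided += 1
    return one_sided / count, two_sided / count


# Made-up scores; the hypothesis stated in advance is that `treated` scores higher.
treated = [14, 15, 17, 19]
control = [10, 11, 13, 16]
p_one, p_two = permutation_p_values(treated, control)   # 4/70 and 8/70
```

Of the 70 ways of splitting the eight scores into two groups of four, 4 give a difference at least as large as observed in the predicted direction and 8 give one at least as large in either direction. The two-sided value is therefore exactly twice the one-sided value here.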

A one-sided P value is appropriate when you can state with certainty (and before
collecting any data) that there either will be no difference between the means or that the
difference will go in a direction you can specify in advance (i.e., you have specified
which group will have the larger mean). If you cannot specify the direction of any
difference before collecting data, then a two-sided P value is more appropriate. If in
doubt, select a two-sided P value.

If you select a one-sided test, you should do so before collecting any data and you need
to state the direction of your experimental hypothesis. If the data go the other way, you
must be willing to attribute that difference (or association or correlation) to chance, no
matter how striking the data. If you would be intrigued, even a little, by data that go in
the "wrong" direction, then you should use a two-sided P value. For reasons discussed
in Chapter 10, I recommend that you always calculate a two-sided P value.
Paired or Unpaired Test?

When comparing two groups, you need to decide whether to use a paired test. When
comparing three or more groups, the term paired is not apt and the term repeated
measures is used instead.

Use an unpaired test to compare groups when the individual values are not paired or
matched with one another. Select a paired or repeated-measures test when values
represent repeated measurements on one subject (before and after an intervention) or
measurements on matched subjects. The paired or repeated-measures tests are also
appropriate for repeated laboratory experiments run at different times, each with its own
control.

You should select a paired test when values in one group are more closely correlated
with a specific value in the other group than with random values in the other group. It is
only appropriate to select a paired test when the subjects were matched or paired before
the data were collected. You cannot base the pairing on the data you are analyzing.
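
The practical effect of pairing can be seen by computing both t statistics on the same made-up before-and-after readings. This is a sketch of the two standard formulas (pooled-variance unpaired t, and paired t on the within-subject differences). It stops at the t statistic and does not compute P values.

```python
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_t(a, b):
    """Student's t statistic for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def paired_t(a, b):
    """t statistic for paired data: analyse the within-pair differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Made-up readings for five subjects before and after an intervention.
before = [120, 135, 150, 165, 180]
after = [115, 131, 144, 160, 176]

t_unpaired = unpaired_t(before, after)   # about 0.32: subject-to-subject scatter dominates
t_paired = paired_t(before, after)       # about 12.8: each subject serves as their own control
```

Every subject improved by 4 to 6 units, which the paired analysis detects easily. The unpaired analysis buries the same change under the large differences between subjects.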

Fisher's Test or the Chi-square Test?

When analyzing contingency tables with two rows and two columns, you can use either
Fisher's exact test or the chi-square test. The Fisher's test is the best choice as it always
gives the exact P value. The chi-square test is simpler to calculate but yields only an
approximate P value. If a computer is doing the calculations, you should choose Fisher's
test unless you prefer the familiarity of the chi-square test. You should definitely avoid
the chi-square test when the numbers in the contingency table are very small (any
number less than about six). When the numbers are larger, the P values reported by the
chi-square and Fisher's test will be very similar.

The chi-square test calculates approximate P values, and the Yates' continuity correction
is designed to make the approximation better. Without the Yates' correction, the P
values are too low. However, the correction goes too far, and the resulting P value is too
high. Statisticians give different recommendations regarding Yates' correction. With large
sample sizes, the Yates' correction makes little difference. If you select Fisher's test, the
P value is exact and Yates' correction is not needed and is not available.
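
For a table with very small counts, the three P values can be compared directly. The sketch below uses a made-up 2x2 table. For the two-sided Fisher P value it adopts one common convention, assumed here: summing the probabilities of all tables that are no more likely than the observed one. Other conventions exist.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


def chi_square_p(a, b, c, d, yates=False):
    """Approximate P value from the chi-square test with 1 degree of freedom."""
    n = a + b + c + d
    delta = abs(a * d - b * c)
    if yates:
        delta = max(0.0, delta - n / 2)   # Yates' continuity correction
    chi2 = n * delta ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(chi2 / 2))           # upper tail of chi-square with 1 df


# Made-up table with small counts, where the approximation is least trustworthy.
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)    # 34/70, about 0.486
p_chi2 = chi_square_p(3, 1, 1, 3)                # about 0.157
p_yates = chi_square_p(3, 1, 1, 3, yates=True)   # about 0.480
```

In this example the uncorrected chi-square P value is far lower than the exact one, and the correction moves it most of the way back. With counts this small, the exact test is the safer choice.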

Regression or Correlation?

Linear regression and correlation are similar and easily confused. In some situations it
makes sense to perform both calculations. Calculate linear correlation if you measured
both X and Y in each subject and wish to quantify how well they are associated. Select
the Pearson (parametric) correlation coefficient if you can assume that both X and Y are
sampled from Gaussian populations. Otherwise choose the Spearman nonparametric
correlation coefficient. Don't calculate the correlation coefficient (or its confidence
interval) if you manipulated the X variable.

Calculate linear regressions only if one of the variables (X) is likely to precede or cause
the other variable (Y). Definitely choose linear regression if you manipulated the X
variable. It makes a big difference which variable is called X and which is called Y, as
linear regression calculations are not symmetrical with respect to X and Y. If you swap
the two variables, you will obtain a different regression line. In contrast, linear correlation
calculations are symmetrical with respect to X and Y. If you swap the labels X and Y,
you will still get the same correlation coefficient.
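
The symmetry of correlation and the asymmetry of regression can be verified numerically. The sketch below uses made-up dose and response values and the textbook formulas for the Pearson coefficient and the least-squares slope. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def regression_slope(x, y):
    """Least-squares slope of the line that predicts y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Made-up data: X is the manipulated dose, Y is the measured response.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = pearson_r(dose, response)
r_yx = pearson_r(response, dose)                  # identical to r_xy
slope_y_on_x = regression_slope(dose, response)   # 1.99
slope_x_on_y = regression_slope(response, dose)   # about 0.501, not 1 / 1.99
```

Swapping the labels leaves the correlation coefficient unchanged, but the slope for predicting X from Y is not the reciprocal of the slope for predicting Y from X. The two regression lines coincide only when the points fall exactly on a straight line.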

Selecting a Statistical Test

Type of Data

| Goal | Measurement (from Gaussian Population) | Rank, Score, or Measurement (from Non-Gaussian Population) | Binomial (Two Possible Outcomes) | Survival Time |
|------|------|------|------|------|
| Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan Meier survival curve |
| Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or Binomial test | |
| Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher's test (chi-square for large samples) | Log-rank test or Mantel-Haenszel |
| Compare two paired groups | Paired t test | Wilcoxon test | McNemar's test | Conditional proportional hazards regression |
| Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression |
| Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochran's Q | Conditional proportional hazards regression |
| Quantify association between two variables | Pearson correlation | Spearman correlation | Contingency coefficients | |
| Predict value from another measured variable | Simple linear regression or Nonlinear regression | Nonparametric regression | Simple logistic regression | Cox proportional hazard regression |
| Predict value from several measured or binomial variables | Multiple linear regression or Multiple nonlinear regression | | Multiple logistic regression | Cox proportional hazard regression |

SPSS

SPSS is the statistical package most widely used by social scientists. There are several
reasons:
1. Force of habit: SPSS has been around since the late 1960s.
2. Of the major packages, it seems to be the easiest to use for the most widely used statistical
techniques;
3. One can use it with either a Windows point-and-click approach or through syntax (i.e.,
writing out SPSS commands). Each has its own advantages, and the user can switch
between the approaches;
4. Many of the widely used social science data sets come with an easy method to translate
them into SPSS; this significantly reduces the preliminary work needed to explore new data.

Two important limitations:

1. SPSS users have less control over statistical output than, for example, Stata or Gauss
users. For novice users, this hardly causes a problem. But, once a researcher wants greater
control over the equations or the output, she or he will need to either choose another
package or learn techniques for working around SPSS’s limitations;

2. SPSS has problems with certain types of data manipulation, and it has some built-in quirks
that seem to reflect its early creation. The best-known limitation is its weak lag functions, that
is, how it transforms data across cases. For new users working from standard data sets, this
is rarely a problem. But once a researcher wants to alter data sets significantly, he
or she will have to either learn a new package or develop greater skills at manipulating
SPSS.

Overall, SPSS is a good first statistical package for people wanting to perform
quantitative research in social science because it is easy to use and because it can be a
good starting point to learn more advanced statistical packages.

Conclusion

It is the scholar’s primary responsibility to identify and use the statistical
tools that suit the nature of his or her research study. Once these are finalized, standard
statistical software (such as SPSS) can take care of the analysis and further processing. However, the
scholar is expected to have some preliminary knowledge of the salient features of
the software. It is not always safe to rely entirely on statisticians.

“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.”
<< Aaron Levenstein

Chapter XVI

Reliability & Validity

RELIABILITY AND VALIDITY

Measurement experts (and many educators) believe that every measurement device
should possess certain qualities. Perhaps the two most common technical concepts in
measurement are reliability and validity. Any kind of assessment, whether traditional or
"authentic," must be developed in a way that gives the assessor accurate information
about the performance of the individual.

A. Reliability:

Definition

• The degree of consistency between two measures of the same thing (Mehrens and Lehmann,
1987).

• The measure of how stable, dependable, trustworthy, and consistent a test is in measuring the
same thing each time (Worthen et al., 1993)

The idea behind reliability is that any significant results must be more than a one-off
finding and be inherently repeatable. Other researchers must be able to perform exactly
the same experiment, under the same conditions and generate the same results. This
will reinforce the findings and ensure that the wider scientific community will accept
the hypothesis. Without this replication of statistically significant results,
the experiment and research have not fulfilled all of the requirements of testability. This
prerequisite is essential to a hypothesis establishing itself as an accepted scientific truth.

For example, if you are performing a time-critical experiment, you will be using some
type of stopwatch. Generally, it is reasonable to assume that the instruments are
reliable and will keep true and accurate time. However, diligent scientists
take measurements many times, to minimize the chances of malfunction and maintain
validity and reliability. At the other extreme, any experiment that uses human judgment is
always going to come under question.

For example, if observers rate certain aspects, like in Bandura’s Bobo Doll Experiment,
then the reliability of the test is compromised. Human judgment can vary wildly
between observers, and the same individual may rate things differently depending upon
time of day and current mood.
This means that such experiments are more difficult to repeat and are inherently less
reliable. Reliability is a necessary ingredient for determining the overall validity of a
scientific experiment and enhancing the strength of the results.
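
One common way to put a number on "consistency between two measures of the same thing" is a test-retest correlation. The sketch below is only an illustration under stated assumptions: the two sets of scores are hypothetical, and `pearson_r` is a helper written for this example rather than a method prescribed by the authors cited above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores of the same five people on two sittings of one test.
first_sitting = [10, 12, 14, 16, 18]
second_sitting = [11, 12, 15, 15, 19]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests that the instrument ranks people in nearly the same way on both occasions; it says nothing yet about whether the instrument measures the intended concept.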

B. Validity

Definition:

• Truthfulness: Does the test measure what it purports to measure? Validity is the extent to which certain
inferences can be made from test scores or other measurements (Mehrens and Lehmann, 1987).

• The degree to which tests accomplish the purpose for which they are being used (Worthen et
al., 1993).

Validity encompasses the entire experimental concept and establishes whether the
results obtained meet all of the requirements of the scientific research method.

For example, there must have been randomization of the sample groups and appropriate
care and diligence shown in the allocation of controls. Internal validity dictates how an
experimental design is structured and encompasses all of the steps of the scientific
research method. Even if your results are great, sloppy and inconsistent design will
compromise your integrity in the eyes of the scientific community. Internal validity and
reliability are at the core of any experimental design. External validity is the process of
examining the results and questioning whether there are any other
possible causal relationships. Control groups and randomization will lessen external
validity problems, but no method can be completely successful. This is why the statistical
proof of a hypothesis is called significant, not absolute truth. Any scientific research
design only puts forward a possible cause for the studied effect. There is always the
chance that another unknown factor contributed to the results and findings. This
extraneous causal relationship may become more apparent, as techniques are refined
and honed.

Reliability & Validity

We often think of reliability and validity as separate ideas but, in fact, they're related to
each other. Here, the following example illustrates the point:

A favorite metaphor for the relationship between reliability and validity is that of the target. Think
of the center of the target as the concept that you are trying to measure. Imagine that for
each person you are measuring, you are taking a shot at the target. If you measure the
concept perfectly for a person, you are hitting the center of the target. If you don't, you
are missing the center. The more you are off for that person, the further you are from the
center.

The figure above shows four possible situations. In the first one, you are hitting the
target consistently, but you are missing the center of the target. That is, you are
consistently and systematically measuring the wrong value for all respondents. This
measure is reliable, but not valid (that is, it's consistent but wrong). The second shows
hits that are randomly spread across the target. You seldom hit the center of the target
but, on average, you are getting the right answer for the group (but not very well for
individuals). In this case, you get a valid group estimate, but you are inconsistent. Here,
you can clearly see that reliability is directly related to the variability of your measure.
The third scenario shows a case where your hits are spread across the target and you
are consistently missing the center. Your measure in this case is neither reliable nor
valid. Finally, we see the "Robin Hood" scenario -- you consistently hit the center of the
target. Your measure is both reliable and valid.
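
The four target scenarios can be sketched numerically by treating the average miss (bias) as the validity side and the spread of the shots as the reliability side. Everything in this sketch is an assumption made for illustration: the true value of 50, the tolerance of 2, the scores, and the helper `describe` are all hypothetical.

```python
import statistics

TRUE_VALUE = 50   # hypothetical "center of the target"
TOLERANCE = 2     # hypothetical cut-off for "close enough"

def describe(scores):
    """Return (reliable, valid_on_average) for a list of measurements."""
    bias = statistics.mean(scores) - TRUE_VALUE   # systematic miss
    spread = statistics.pstdev(scores)            # scatter of the shots
    return spread <= TOLERANCE, abs(bias) <= TOLERANCE

# Hypothetical measurements for the four situations described above.
scenarios = {
    "reliable, not valid": [59, 60, 61, 60, 60],
    "valid on average, not reliable": [35, 65, 42, 58, 50],
    "neither": [50, 80, 57, 73, 65],
    "reliable and valid": [49, 50, 51, 50, 50],
}

results = {name: describe(scores) for name, scores in scenarios.items()}
for name, (reliable, valid) in results.items():
    print(f"{name}: reliable={reliable}, valid on average={valid}")
```

In the second scenario the group mean lands on the center, so the group estimate is acceptable even though individual measurements are far off, which is the point made in the text.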

Conclusion

Always remember that your ability to answer your research question is only as good as
the instruments you develop or your data collection procedure. Well-trained and
motivated observers or a well-developed survey instrument will better provide you with
quality data with which to answer a question or solve a problem. Finally, be aware that
reliability is necessary but not sufficient for validity. That is, for something to be valid it
must be reliable but it must also measure what it is intended to measure.

“The only relevant test of the validity of a hypothesis is comparison of prediction with experience.”
<< Milton Friedman

Chapter XVII

Data Analysis

DATA ANALYSIS

Introduction

Before you decide what to wear in the morning, you collect a variety of data: the
season of the year, what the forecast says the weather is going to be like, which
clothes are clean and which are dirty, and what you will be doing during the day. You
then analyze that data. Perhaps you think, “It’s summer, so it’s usually warm.” That
analysis helps you determine the best course of action, and you base your apparel
decision on your interpretation of the information. You might choose a t-shirt and shorts
on a summer day when you know you’ll be outside, but bring a sweater with you if you
know you’ll be in an air-conditioned building.

Though this example may seem simplistic, it reflects the way scientists, or any
researchers for that matter, pursue data collection, analysis, and interpretation. Data
(the plural form of the word datum) are scientific observations and measurements that,
once analyzed and interpreted, can be developed into evidence to address a question.
Data lie at the heart of any research study, and all researchers collect data in one form
or another. The weather forecast that helped you decide what to wear, for example,
was an interpretation made by a meteorologist who analyzed data collected by
satellites. Data may take the form of the number of bacteria colonies growing in soup
broth, a series of drawings or photographs of the different layers of rock that form a
mountain range, a tally of lung cancer victims in populations of cigarette smokers and
non-smokers, or the changes in average annual temperature predicted by a model of
global climate. Scientific data collection involves more care than you might use in a
casual glance at the thermometer to see what you should wear. Because scientists
build on their own work and the work of others, it is important that they are systematic
and consistent in their data collection methods and make detailed records so that
others can see and use the data they collect. The thoughtful and systematic collection,
analysis, and interpretation of data allow them to be developed into evidence that supports
scientific ideas, arguments, and hypotheses.

Definition

“Data analysis is a body of methods that help to describe
facts, detect patterns, develop explanations, and test hypotheses.
It is used in all of the sciences. It is used in business, in
administration, and in policy.”

The numerical results provided by a data analysis are usually simple: It finds the number
that describes a typical value and it finds differences among numbers. Data analysis
finds averages, like the average income or the average temperature, and it finds
differences like the difference in income from group to group or the differences in
average temperature from year to year. Fundamentally, the numerical answers provided
by data analysis are that simple. But data analysis is not about numbers — it uses them.
Data analysis is about the world, asking, always asking, “How does it work?” And that’s
where data analysis gets tricky. Carefully study the following two examples:

Example:
Between 1790 and 1990 the population of the United States increased by 245 million people,
from 4 million to 249 million people. Those are the facts. But if I were to interpret those numbers
and report that the population grew at an average rate of 1.2 million people per year, 245 million
people divided by 200 years, the report would be wrong. The facts would be correct and the
arithmetic would be correct — 245 million people divided by 200 years is approximately 1.2
million people per year. But the interpretation “grew at an average rate of 1.2 million people per
year” would be wrong, dead wrong. The U.S. population did not grow that way, not even
approximately.

Example:
The average number of students per class at my university is 16. That is a fact. It is also a fact
that the average number of classmates a student will find in his or her classes is 37. That too is a
fact. The numerical results are correct in both cases; both 16 and 37 are correct even though one
number is more than twice the magnitude of the other (no tricks). But the two different numbers respond to
two subtly different questions about how the world (my university) works: subtly different questions
that lead to large differences in the result.
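
The two averages in the class-size example can be reproduced in spirit with a few lines of arithmetic. The class sizes below are hypothetical and do not reproduce the figures of 16 and 37 quoted above; the sketch only shows why averaging over classes and averaging over students answer different questions.

```python
# Hypothetical class sizes: four small classes and one large lecture.
class_sizes = [10, 10, 10, 10, 40]

# Question 1: how big is the average class? (each class counts once)
per_class_average = sum(class_sizes) / len(class_sizes)

# Question 2: how big is the class an average student sits in?
# (each class counts once per enrolled student, so large classes weigh more)
per_student_average = sum(n * n for n in class_sizes) / sum(class_sizes)

print(f"average class size:               {per_class_average:.1f}")
print(f"class size seen by avg. student:  {per_student_average:.1f}")
```

Half of the students sit in the one large class, so the student-level average is pulled well above the class-level average even though both numbers are arithmetically correct.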

Anatomy of Data Analysis

By the time you get to the analysis of your data, most of the really difficult work has been
done. It's much more difficult to: define the research problem; develop and implement a
sampling plan; conceptualize, operationalize and test your measures; and develop a
design structure. If you have done this work well, the analysis of the data is usually a
fairly straightforward affair.

In most social research the data analysis involves three major steps, done in roughly this
order:

• Data Preparation [Cleaning and organizing the data for analysis]
• Descriptive Statistics [Describing the data]
• Inferential Statistics [Testing Hypotheses and Models]

Data Preparation involves checking or logging the data in; checking the data for
accuracy; entering the data into the computer; transforming the data; and developing
and documenting a database structure that integrates the various measures.

Descriptive Statistics are used to describe the basic features of the data in a study. They
provide simple summaries about the sample and the measures. Together with simple
graphics analysis, they form the basis of virtually every quantitative analysis of data.
With descriptive statistics you are simply describing what is, what the data shows.
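
A minimal sketch of such simple summaries, using Python's standard `statistics` module, is given below. The income figures are hypothetical and serve only to show how a mean, a median, and a standard deviation describe the same small sample.

```python
import statistics

# Hypothetical monthly incomes (in thousands) of five respondents.
incomes = [20, 25, 30, 35, 90]

summary = {
    "n": len(incomes),
    "mean": statistics.mean(incomes),
    "median": statistics.median(incomes),
    "std_dev": statistics.stdev(incomes),   # sample standard deviation
    "minimum": min(incomes),
    "maximum": max(incomes),
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

The single high income pulls the mean above the median, which is one reason a write-up often reports both.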

Inferential Statistics investigate questions, models and hypotheses. In many cases, the
conclusions from inferential statistics extend beyond the immediate data alone. For
instance, we use inferential statistics to try to infer from the sample data what the
population thinks. Or, we use inferential statistics to make judgments of the probability
that an observed difference between groups is a dependable one or one that might have
happened by chance in this study. Thus, we use inferential statistics to make inferences
from our data to more general conditions; we use descriptive statistics simply to describe
what's going on in our data.
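
The question of whether an observed difference between groups "might have happened by chance" can be illustrated with an exact permutation test, shown here as a sketch under stated assumptions. The two groups of scores are hypothetical, `exact_permutation_p` is a helper written for this example, and this is one of several possible inferential procedures, not the one the text prescribes.

```python
from itertools import combinations

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    count = 0
    # Re-label the pooled scores in every possible way and count how often
    # the relabelled difference is at least as large as the observed one.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        if abs(mean_diff(sum_a)) >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count

# Hypothetical scores for two small groups.
treatment = [14, 15, 16, 17]
control = [10, 11, 12, 13]

p_value = exact_permutation_p(treatment, control)
print(f"p = {p_value:.3f}")
```

With these scores only 2 of the 70 possible re-labellings are as extreme as the observed split, so the p value is 2/70, which is below the conventional .05 level.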

In most research studies, the analysis section follows these three phases of analysis.
Descriptions of how the data were prepared tend to be brief and to focus on only the
aspects unique to your study, such as specific data transformations that were
performed. The descriptive statistics that you actually look at can be voluminous. In most
write-ups, these are carefully selected and organized into summary tables and graphs
that only show the most relevant or important information. Usually, the researcher links
each of the inferential analyses to specific research questions or hypotheses that were
raised in the introduction, or notes any models that were tested that emerged as part of
the analysis. In most analysis write-ups it's especially critical to not "miss the forest for
the trees." If you present too much detail, the reader may not be able to follow the
central line of the results. Often extensive analysis details are appropriately relegated to
appendices, reserving only the most critical analysis summaries for the body of the
report itself.

Rules of Data Analysis

[A] First Method

1. Look at the Data / Think About the Data / Think About the Problem / Ask What It Is You
Want to Know. Think about the data. Think about the problem. Think about what it is you
are trying to discover. That would seem obvious, “Think.” But, it is the most important
step and often omitted as if, somehow, human intervention in the processes of science
were a threat to its objectivity and to the solidity of the science. But, no, thinking is
required: You have to interpret evidence in terms of your experience. You have to
evaluate data in terms of your prior expectations (and you had better have some
expectations). You have to think about data in terms of concepts and theories, even
though the concepts and theories may turn out to be wrong.

2. Estimate the Central Tendency of the Data. The “central tendency” can be something
as simple as an average: The average weight of these people is 150 pounds. Or it can
be something more complicated like a rate: The rate of growth of the population is two
percent per annum. Or it can be something sophisticated, something based on a theory:
The orbit of this planet is an ellipse. And why would you have thought to estimate
something as specific as a rate of growth or the trace of an ellipse? Because you
thought about the data, about the problem, and about where you were going (Rule 1).

3. Look at the Exceptions to the Central Tendency. If you’ve measured a median, look at
the exceptions that lie above and below the median. If you’ve estimated a rate, look at
the data that are not described by the rate. The point is that there is always, or almost always,
variation: You may have measured the average but, almost always, some of the cases
are not average. You may have measured a rate of change but, almost always, some
numbers are large compared to the average rate, and some are small. And these
exceptions are not usually just the result of embarrassingly human error or regrettable
sloppiness: On the contrary, often the exceptions contain information about the process
that generated the data. And sometimes they tell you that the original idea (to which the
variations are the exception) is wrong, or in need of refinement. So, look at the
exceptions, which, as you can see, brings us back to Rule 1, except that this time the data
we look at are the exceptions. That circle of three rules describes one of the constant
practices of analysis, cycling between the central tendencies and the exceptions as you
revise the ideas that are guiding your analysis.

[B] Second Method

Trying to describe the Rules from another angle, another theme that organizes the rules
of evidence can be introduced by three key words: falsifiability, validity, and parsimony.

[a] Falsifiability

Falsifiability requires that there be some sort of evidence which, had it been found, your
conclusions would have had to be judged false. Even though it’s your theory and your
evidence, it’s up to you to go the additional step and formulate your ideas so they can be
tested, and falsified if they are false. Moreover, you yourself have to look for the
counterevidence. This is another way to describe one of the previous rules, which was “Look at
the Exceptions”.

[b] Validity

Validity, in the scientific sense, requires that conclusions be more than computationally
correct. Conclusions must also be “sensible” and true statements about the world. For
example, I noted earlier that it would be wrong to report that the population of the United
States had grown at an average rate of 1.2 million people per year. Wrong, even
though the population grew by 245 million people over an interval of 200 years. Wrong
even though 245 divided by 200 is (approximately) 1.2. Wrong because it is neither
sensible nor true that the American population of 4 million people in
1790 could have increased to 5.2 million people in just twelve months. That would have
been a thirty percent increase in one year, which is not likely
(and didn’t happen). It would be closer to the truth, more valid, to describe the annual
growth using a percentage, stating that the population increased by an average of about 2
percent per year: 2 percent per year when the population was 4 million (as it was in
1790), 2 percent per year when the population was 249 million (as it was in 1990). That’s
better.
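
The arithmetic behind both descriptions can be checked directly. The sketch below uses only the rounded figures quoted in the text (4 million in 1790, 249 million in 1990) and assumes constant compound growth, which is itself a simplification.

```python
start_population = 4.0    # millions in 1790, as quoted in the text
end_population = 249.0    # millions in 1990, as quoted in the text
years = 200

# "Linear" reading: the same number of people added every year.
linear_increase = (end_population - start_population) / years

# Share of the 1790 population that this linear increase would represent.
first_year_share = linear_increase / start_population

# "Percentage" reading: the same proportional growth every year.
annual_rate = (end_population / start_population) ** (1 / years) - 1

print(f"linear increase: {linear_increase:.3f} million per year")
print(f"share of the 1790 population: {first_year_share:.1%}")
print(f"compound growth rate: {annual_rate:.2%} per year")
```

The compound rate comes out at roughly two percent per year, while the linear figure would imply growth of about thirty percent in the first year alone.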

[c] Parsimony

Parsimony is the analyst’s version of the phrase “Keep It Simple.” It means getting the
job done with the simplest tools, provided that they work. In military terms you might
think about weapons that provide the maximum “bang for the buck”. In the sciences our
“weapons” are ideas and we favor simple ideas with maximum effect. This means that
when we choose among equations that predict something or use them to describe facts,
we choose the simplest equation that will do the job. When we construct explanations or
theories we choose the most general principles that can explain the detail of particular
events. That’s why sociologists are attracted to broad concepts like social class and why
economists are attracted to theories of rational individual behavior — except that a
simple explanation is no explanation at all unless it is also falsifiable and valid.

Conclusion

But make no mistake, it is these broad and not-well-specified principles that generate the
specific rules we follow: Think about the data. Look for the central tendency. Look for the
variation. Strive for falsifiability, validity, and parsimony. Perhaps the most powerful rule
is the first one, “Think”. The data are telling us something about the real world, but what?
Think about the world behind the numbers and let good sense and reason guide the
analysis.

Some Guidelines for Data Analysis:

The following tips and questions from Calhoun (1994), Mills (1999), and Padak and
Padak (1994) are helpful for assisting with data analysis:

• Continue to ask questions: who, what, where, when, why, and how?

• Determine what important points the data reveal.

• Identify or look for themes: Sort data into piles so that each pile shares a broad
characteristic. Write a summary statement for each pile.

• Determine what patterns or trends show up. Can they be explained?

• Determine how data from various sources (test scores, grades, surveys, interviews,
observations, and documents) compare or contrast.

• Develop a concept map.

• Review the data. Do any correlations seem important?

• Determine if the results are different from what was expected.

• Decide what actions are indicated.

Note: For guidance on SPSS, see Appendix II.

“The analysis of character is the highest human entertainment.”
<< Isaac Singer

Chapter XVIII

Findings of Research Study

FINDINGS OF RESEARCH STUDY

Once the literature review, data collection, and data analysis are complete, the findings
become the heart of the thesis. The value of a scholar’s thesis will stand or
fall on the validity and quality of its findings. Identifying and finalizing the findings is
therefore the most critical and significant stage of the thesis. The following steps
may be considered when the scholar takes up the findings section:

1. Understand what thesis findings are. Thesis findings consist of two broad categories. One is
the aggregate data you collect, such as totals from surveys in social science or
observations of plant populations in botany. The other is the results of data analysis, such
as statistics generated from the raw data. Thesis findings do not include interpretation of
the results to draw conclusions or formulate theoretical explanations. These are vital
parts of the thesis, but they are distinct and separate from thesis findings.
2. Operationalize the hypotheses. This is the process of devising a specific test or tests to
obtain data that either support or fail to support a hypothesis. Consider carefully whether
the test is valid (does it measure what you want it to measure?) and reliable (under the
test conditions, will you get consistent results?). Whenever possible, do preliminary runs to
verify the validity and reliability of the testing procedure before collecting the data.
3. Collect the data. The watchword here is quality. Adhere strictly to the procedures you've
established. If a deviation is unavoidable, record it. Take the time to be thorough and
meticulous. Careless execution of your observational procedures will result in invalid data
and can ruin a thesis.
4. Perform data analysis. You will need to organize your observations and compile them
into totals, percentages, and other basic information. Follow this up with the more
detailed data analysis, such as generating statistics like standard deviations and
regression analysis. Review your findings and look for gaps in your data. If you are doing
genuinely original research, your findings at this point will almost certainly bring out new
questions you need to answer.
5. Revisit the data collecting phase of your research if needed. Gather more data to address
questions brought out in your data analysis and repeat Steps 3 and 4 to process the
additional data.
6. Present your findings. Using tables, graphs and text, write up your findings. Remember
that thesis findings go in a section of your written thesis separate from your literature
review, discussion and other sections. Keep your writing clearly defined and focused.
You should prepare to present your findings to audiences in your department and at
conferences. Finally, discuss your findings with faculty and prepare to answer objections
and challenges to your work.
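
The data analysis step above (compiling totals and percentages, then moving on to regression) can be sketched as follows. The survey tallies, the hours and scores, and the helper names `percentages` and `least_squares` are all hypothetical; a statistical package would normally do this work and report standard errors as well.

```python
def percentages(counts):
    """Convert raw tallies into percentages of the total."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}

def least_squares(x, y):
    """Slope and intercept of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical tallies from a 60-respondent survey item.
survey_counts = {"agree": 30, "neutral": 12, "disagree": 18}
shares = percentages(survey_counts)

# Hypothetical paired observations: hours of training and test score.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
slope, intercept = least_squares(hours, scores)

print(shares)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

Both outputs count as findings in the sense described in Step 1; what they mean for the hypotheses belongs in the discussion, not in the findings section.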

Some Guidelines for writing Findings:

1. Describe the findings in a manner that allows the reader to gain a clear
understanding of the type of study that was involved in the research. It should be clear to
the reader whether the study was a case study, a correlational study, or an experiment.
It would be best to state the type of study when describing the findings. For example, if
it was an experiment, a sentence could start with the words, "In the experiment..."

2. If the findings are from a correlational study, the description of the findings could
involve a brief description of how the variables were measured. For example, if the study
addressed the relationship between empathy and helping behavior, the description of
the findings could involve a description of how empathy was measured in the study.

3. If the findings are from an experiment, the description of the study could involve a
description of the conditions in the experiment. For example, imagine that an experiment
addressed the influence of listening to music on productivity, and there were two
conditions: experimental condition with music and control condition without music. In
this example, it would be well to describe both the experimental condition and the control
condition.

4. If the study was an experiment, it is important to mention whether the participants
were randomly assigned to conditions. Random assignment allows the scholar to make
causal conclusions because he or she can rule out explanations based on personality and
individual differences.

5. It is important to mention whether the findings are statistically significant. If the
findings are not statistically significant, we would not conclude that there is a difference
between conditions in an experiment, or that variables are associated in a correlational
study. Generally, a finding of a study is considered statistically significant if the chance
probability is less than .05 (the p value for the finding is indicated as less than .05).

6. Causal conclusions should not be made from correlational findings. We cannot
make causal conclusions from correlational findings because we are not able to rule out
alternative explanations. Thus, causal language should not be used when describing
correlational findings. Words such as "effect," "cause," and "influence" should not be
used for correlational findings. However, it would be fine to use the words "relationship,"
"association," and "correlation" for correlational findings. For example, imagine that
empathy was found to be associated with helping behavior in a correlational study. In
this example, it would be acceptable to state that empathy was found to be associated with,
correlated with, or related to helping behavior. In this example, it would not be acceptable to
state that empathy influenced, caused, or had an effect on helping behavior.

“Life isn't about finding yourself. Life is about creating yourself.”
<< George Bernard Shaw

Chapter XIX

Structure of Thesis

STRUCTURE OF THESIS
Introduction

This chapter addresses the problems, issues, and difficulties involved in designing and
structuring a PhD thesis. The structure developed provides a starting point for
understanding what a PhD thesis should set out to achieve, and also provides a basis
for communication between a scholar and the supervisor. A thesis is the acquisition and
dissemination of new knowledge. It is important that “new” means not just new to
the scholar, but also new to the academic community. In the past, PhDs were sometimes
failed because another scholar had published a paper on the same work a few weeks earlier.
In the era of ICT, however, novelty, originality, new understanding, and the marshalling of
existing ideas in ways that provide new insights have become more
interesting and challenging for an academic research scholar. The foundations
of this structured model are the author’s own experience of supervising, examining, and
adjudicating conflicting examiners’ reports on many postgraduate dissertations and PhD
theses in management and related fields.

Characteristics of a PhD Thesis

Ideally, PhD research in management or a related field should:

• cover a field which fascinates the candidate sufficiently for him or her to endure
years of hard and solitary work;
• build on the candidate's previous studies, for example, his or her course work in
a Master's degree;
• be in an area of ‘warm’ research activity rather than in a ‘cold’, overworked area
or in a ‘hot’, too-competitive, soon-to-be extinguished area;
• be in an area near the main streams of a discipline and not at the margins of a
discipline or straddling two disciplines; being near the main streams makes it
easier to find thesis examiners, to gain academic positions, and to get
acceptance of journal articles about the research;
• be manageable, producing interesting results and a thesis in the shortest time
possible;
• have accessible sources of data;
• open into a program of research projects after the PhD is completed; and
• provide skills and information for obtaining a job in a non-research field.

Delimitations

This proposed approach may be limited to PhDs in management areas such as
marketing, human resources, strategic management, etc., which involve common
quantitative and qualitative methodologies. The structure may not be appropriate for
PhDs in other areas or for management PhDs using relatively unusual methodologies
such as historical research designs. Moreover, this approach is a starting point for
thinking about how to present a thesis rather than the only structure which can be
adopted, and so it is not meant to inhibit the creativity of PhD researchers. Adding
one or two chapters to the five presented here can be justified wherever
applicable or where the supervisor so suggests.

Another limitation of the approach is that it is restricted to presenting the final thesis.
This chapter does not address the techniques of actually writing a thesis. Moreover, the
approach discussed here does not refer to the actual sequence of writing the thesis, nor
is it meant to imply that the issues of each chapter have to be addressed by the scholar
in the order shown. For example, the hypotheses at the end of Chapter 2 are meant to
appear to be developed as the chapter progresses, but the scholar might have a good
idea of what they will be before he or she starts to write the chapter. And although the
methodology of Chapter 3 must appear to be selected because it was appropriate for
the research problem identified and carefully justified in Chapter 1, the candidate may
have actually selected a methodology very early in his or her candidature and then
developed an appropriate research problem and justified it.

Moreover, after a scholar has sketched out a draft table of contents for each chapter, he
or she should begin writing the ‘easiest parts’ of the thesis first, whatever those parts are;
introductions to chapters are usually the last to be
written. But it should be borne in mind that the research problem, limitations and
research gaps in the literature must be identified and written down before other parts of
the thesis can be written.

Flexibility of Structure

A five chapter structure can be used to effectively present a PhD Thesis, and the thesis
should have a unified structure.

• Chapter I: introduces the core research problem, then ‘sets the scene’ and outlines
the path which the examiner will travel towards the thesis's conclusion.
• Chapter II: the research problem and hypotheses arising from the body of knowledge
developed by the global research community.
• Chapter III: methods used in this research to collect data about the hypotheses.
• Chapter IV: results of applying those methods in this research.
• Chapter V: conclusions about the hypotheses and research problem based on the
results of Chapter 4, including their place in the body of knowledge outlined previously in
Chapter 2.

This five chapter structure can be justified. Firstly, the structure is unified and focussed,
and so addresses a major fault observed in postgraduate theses: the evaluators' difficulty
in discerning what the `thesis' of the thesis actually is. Supervisors need to emphasise
throughout the research process that the scholar is striving in the thesis to communicate
one big idea, and that this one big idea is the research problem stated in the earlier pages
of the thesis and explicitly solved in Chapter 5.
Easterby-Smith et al. (1991) also emphasise the importance of consistency in a PhD
thesis, and Phillips and Pugh (1987, p. 38) confirm that a thesis must have a thesis or a
`position'. The proposed structure is explicitly or implicitly followed by many writers of
articles in prestigious academic journals such as The Academy of Management Journal
and Strategic Management Journal (for example, Datta et al. 1992). Above all, the
proposed structure is akin to a standard proposal much like that which will be used by
the scholars later in their career, to apply for research grants (Krathwohl 1977; Poole
1993). Finally, by reducing time wasted on unnecessary tasks or on trying to demystify
the PhD process, the five chapter structure provides a mechanism to shorten the time
taken to complete a PhD, an aim becoming desired in many countries (Cude 1989).

A special and salient feature of this five chapter structure is its inherent flexibility:
it can be adopted as it stands or adapted. For example, a research scholar may find it
convenient to expand the number of chapters to six or seven because of unusual
characteristics of the analysis in his or her research study. A PhD might consist of two
stages: some qualitative research could be positioned in Chapters 3 and 4 of the thesis
described below, followed by some quantitative research to refine the initial findings,
which could be positioned in Chapters 5 and 6; the Chapter 5 described below would then
become Chapter 7. The ultimate aim is that PhD research must remain an essentially creative
exercise. However, the five chapter structure proposed herein functions as a solid starting
point for understanding what a PhD thesis should set out to achieve, and also provides a
basis for communication between a research scholar and others, namely his or her
supervisor, the examiners and the wider academic fraternity. Academic research scholars,
and business researchers for that matter, may find this flexible structure most suitable
and may save a lot of time and energy in their research study, which in turn helps to
shorten the time taken to complete a PhD (Cude 1989).

Links among Chapters

As any PhD thesis should have a unified structure, great care should be taken to ensure
that all the chapters [whether five, six or seven] stand alone and, at the same time, are
seamlessly linked without compromising the quality and merit of each chapter. Each
chapter (except the first) should have an introductory section linking the chapter to the
main idea of the previous chapter and outlining the aim and the organisation of the
chapter. The introductory section of Chapter 5 (that is, section 5.1) will be longer than
those of other chapters, for it will summarise all earlier parts of the thesis prior to making
conclusions about the research described in those earlier parts; that is, section 5.1 will
repeat the research problem and the research questions/hypotheses. Each chapter
should also have a concluding summary section which outlines major themes
established in the chapter, without introducing new material. The five chapters may have
these respective percentages of the thesis' words: 5, 30, 15, 25 and 25 percent. [For six
chapters, these may be 5, 20, 15, 15, 20 and 25; for seven chapters, these may be 5, 20,
10, 15, 15, 15 and 20.] [Note: The percentages are highly flexible and vary depending upon
the area, scope, depth and nature of the research study undertaken.]
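
As a rough illustration of the indicative percentages quoted above, the short Python sketch below converts a total word count into a per-chapter word budget. The total of 80,000 words and the helper name chapter_word_budget are assumptions made purely for illustration; the word limit prescribed by the university concerned should be used instead.

```python
# Indicative percentage shares per chapter, as quoted in the text above.
SHARES = {
    5: [5, 30, 15, 25, 25],
    6: [5, 20, 15, 15, 20, 25],
    7: [5, 20, 10, 15, 15, 15, 20],
}


def chapter_word_budget(total_words, chapters=5):
    """Split a total word count across chapters (hypothetical helper)."""
    shares = SHARES[chapters]
    # Each set of shares is expected to add up to 100 percent.
    assert sum(shares) == 100
    return [total_words * share // 100 for share in shares]


# Assumed total of 80,000 words, for illustration only.
for number, words in enumerate(chapter_word_budget(80000, 5), start=1):
    print(f"Chapter {number}: about {words} words")
```

The resulting figures are only a planning aid; as the note above says, the split varies with the area, scope, depth and nature of the study.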

Style

Within each chapter of the thesis, the chosen conventions for spelling, style and format
should be followed scrupulously. The scholar should use consistent styles from the first
draft and throughout the thesis for matters such as bold type, underlining and italics,
indenting quotations, single and double inverted commas, making references, spaces
before and after side headings and lists, and gender conventions. Moreover, using the
authoritative APA Style Manual provides a defensive shield against an examiner who
may criticise the thesis from the viewpoint of his or her own idiosyncratic style. A PhD
thesis has some style rules of its own. Chapter 1 is usually written in the present tense
with references to literature in the past tense; the rest of the thesis is written in the past
tense as it concerns the research after it has been done, except for the findings in
Chapter 5 which are presented in the present tense. More precisely for Chapters 2 and
3, schools of thought and procedural steps are written of in the present tense and
published researchers and the candidate's own actions are written of in the past tense.

Further, value judgements and value-laden words should not be used in the objective
pursuit of truth that a thesis reports. For example, `it is unfortunate', `it is
interesting', `it is believed', and `it is welcome' are inappropriate. Although first
person words such as `I' and `my' are now acceptable in a PhD thesis, their use should be
meticulously controlled or, preferably, avoided altogether. If the research scholar
considers an authority's opinion to be `wrong', it could instead be described as
`misleading'. In short, the research scholar should always try to communicate with the
examiners in an easily followed way.

Another important point is that the word `etc' is too imprecise to be used in a thesis.
Furthermore, words such as `this', `these', `those' and `it' should not be left dangling;
they should always refer to an object. For example, `This rule should be followed' is
preferred to `This should be followed'. Brackets should also be used rarely or, if
possible, avoided altogether. Paragraphs should be short: as a rule of thumb, two to
three paragraphs should start on each page, assuming the preferred line spacing of 1.5
and Arial 12 point font, so that each page shows adequate structure and complexity of
thought. Margins should be those suggested by the university.
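
Some of the style rules above can be checked mechanically on a draft. The following Python sketch is only an illustration under stated assumptions: the function name style_warnings, the phrase list (taken from the examples in the text) and the simple pattern for a dangling `this', `these' or `those' are all hypothetical choices, and the check will both miss cases and raise false alarms. It does not replace careful proofreading.

```python
import re

# Value-laden phrases taken from the examples given in the text.
VALUE_PHRASES = [
    "it is unfortunate",
    "it is interesting",
    "it is believed",
    "it is welcome",
]

# 'etc' as a whole word, in any letter case.
ETC = re.compile(r"\betc\b", re.IGNORECASE)

# Assumed heuristic: 'this/these/those' followed directly by a verb,
# i.e. with no noun for the word to refer to.
DANGLING = re.compile(
    r"\b(this|these|those)\s+(should|is|are|was|were|will|can|may|must)\b",
    re.IGNORECASE,
)


def style_warnings(sentence):
    """Return a list of style warnings for one sentence (hypothetical helper)."""
    warnings = []
    lowered = sentence.lower()
    for phrase in VALUE_PHRASES:
        if phrase in lowered:
            warnings.append(f"value judgement: '{phrase}'")
    if ETC.search(sentence):
        warnings.append("imprecise word: 'etc'")
    if DANGLING.search(sentence):
        warnings.append("possibly dangling demonstrative")
    return warnings


print(style_warnings("This should be followed, etc."))
print(style_warnings("This rule should be followed."))
```

The first call flags both the `etc' and the dangling `This'; the second sentence, which follows the preferred wording, produces no warnings.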

The above observations about structure and style imply that a PhD thesis,
with its readership of two or three examiners, is different from a book which has a very wide
readership (Derricourt 1992), and from shorter conference papers and journal articles
which do not require the burden of proof and references to broader bodies of knowledge
required in PhD theses. Candidates should be aware of these differences and could
therefore consider concentrating on completing the thesis before adapting parts of it for
other purposes.

The thesis will have to go through many drafts (Zuber-Skerritt & Knight 1986). The first
draft will be started early in the research process, after initial mindmapping and a
tentative table of contents for each chapter and section. It will be crafted through the
`right', creative side of the brain and will emphasise basic ideas without much concern
for detail or precise language. By facilitating the creative first drafts of sections, the
relatively visible and structured `process' of the structure described in this chapter
allows the candidate to be more creative and rigorous with the `content' of the thesis
than he or she would otherwise be. After the first rough drafts, later drafts will be
increasingly crafted through the `left', analytical side of the brain and will emphasise
fine tuning of arguments, justification of positions and further evidence gathering from
other research literature.

Suggested Sequence of a PhD Thesis [FIVE Chapters]

Title page
Abstract (with keywords)
Table of contents
List of tables
List of figures
Abbreviations
Statement of original authorship
Acknowledgments

1 Introduction [maximum 8 sections, each having 2/3 subsections]

1.1 Background to the research
1.2 Research problem and hypotheses
1.3 Justification for the research
1.4 Methodology
1.5 Outline of the report
1.6 Definitions
1.7 Delimitations of scope and key assumptions
1.8 Conclusion

2 Literature review [maximum 8 sections, each having 3/5 subsections]

2.1 Introduction
2.2 Parent disciplines and classification models
2.3 Developing and Current Literature
2.4 Earlier Literature
2.5 Immediate discipline and analytical models
2.6 Research Gap in the available Literature
2.7 Area identified for the Research Study
2.8 Conclusion

3 Methodology [maximum 5 sections, each having 2/3 subsections]

3.1 Introduction
3.2 Justification of Methodology
3.3 Details of Research Procedures
3.4 Ethical considerations
3.5 Conclusion

4 Analysis of Data [maximum 5 sections, each having 2/6 subsections]

4.1 Introduction
4.2 Statistical Tools used
4.3 Data about subjects
4.4 Detailed Pattern of data
4.5 Conclusion

5 Conclusions [maximum 8 sections, each having 2/4 subsections]

5.1 Introduction
5.2 Discussion about each research question
5.3 Discussion about the research problem
5.4 Implications for theory
5.5 Delimitations
5.6 Major suggestions
5.7 Major recommendations
5.8 Scope for Further Research

Bibliography

Appendices

Note: This five chapter model forms the basis of the structure of the thesis and, if necessary,
could be extended to six or seven chapters depending upon the circumstances, for example in
the following cases:

[a] Mixed research, incorporating both quantitative and qualitative methods;
[b] Case study method;
[c] Action research;
[d] Any research study whose data analysis has unusual characteristics.

In such cases, the number of chapters, number of sections, number of subsections and the order
of sections may vary significantly. However, this five chapter model remains a very strong
foundation.

“Study without desire spoils the memory, and it retains nothing that it takes in.”
<< Leonardo da Vinci

Chapter XX

A2Z
PhD
Thesis
Reflections on Academic Research

RESEARCH DISCUSSION
Introduction

A Discussion section should not simply be a summary of the results the scholar has
found; at this stage he/she will have to demonstrate original thinking. First, the
scholar should highlight and discuss how the research has reinforced what is already
known about the area. Many research scholars make the mistake of thinking that they
should have found something new; in fact, very few research studies have findings that
are unique. Instead, the scholar is likely to have a number of findings that reinforce what
is already known about the field, and the scholar needs to highlight these.

Second, the research scholar may have discovered something different and, if this is the
case, he/she will have plenty to discuss. The scholar should outline what is new and how
this compares to what is already known, and should also attempt to explain why the
research identified these differences. Third, the scholar needs to consider how the
results extend knowledge about the field. Even if there are similarities between the
results and the existing work of others, the research extends knowledge of the area by
reinforcing current thinking. It is important that this section is comprehensive and well
structured, making clear links back to the literature reviewed earlier in the project.
This will allow the scholar to demonstrate the value of the research study, and it is
therefore very important to discuss the research work thoroughly.

Definition

The discussion section explains your interpretation of the findings as they relate to the
research problem you have investigated. This section is comprised of all new information
and focuses on the implications of your findings in relation to the overall scope of other
research that has taken place. The significance of the research findings should be
clearly described.

Importance of a Good Discussion

This section is often considered the most important part of a research paper because it
most effectively demonstrates your ability as a researcher to think critically about an
issue, to develop creative solutions to problems based on the findings, and to formulate
a deeper, more profound understanding of the issues you are studying.

The discussion section is where you explore the underlying meaning of your research, its
possible implications for other areas of study, and the possible improvements that can
be made in order to further develop the concerns of your research. This is also the
section where you need to present the importance of your study and how it may be able
to contribute to the field.

This part of the paper is not strictly governed by objective reporting of information;
rather, it is where you can engage in creative thinking about issues through
evidence-based interpretation of findings.

Contents

1. Explanation of results: comment on whether or not the results were expected and
present explanations for the results; go into greater depth when explaining
findings that were unexpected or especially profound.
2. References to previous research: compare your results with those reported in the
literature, or use the literature to support a claim. This can include re-visiting
key studies already cited in your literature review section, or citing studies saved
specifically for the discussion section.
3. Deduction: a claim for how the results can be applied more generally. For
example, describing lessons learned or proposing recommendations that can
help improve a situation.
4. Hypothesis: a more general claim or possible conclusion arising from the results
[which may be proved or disproved in subsequent research].

Structure of the Discussion Section

[a] Reiterate the Research Problem/State the Major Findings

Briefly reiterate for your readers the research problem or problems you are investigating
and the methods you used to investigate them, and then move quickly to describe the
major findings of the study. You should write a direct, declarative, and succinct
statement of the study results.

[b] Explain the Meaning of the Findings

No one has thought as long and hard about your study as you have. Systematically
explain the meaning of the findings and why you believe they are important. After
reading the discussion section, you want the reader to think about the results and ask
“why hadn’t I thought of that?”. You don’t want to force the reader to go through the paper
multiple times to figure out what it all means.

[c] Relate the Findings to Similar Studies

No study is so novel or possesses such a restricted focus that it has absolutely no
relation to other previously published research. The discussion section should relate the
research study findings to those of other studies, particularly if questions raised by
previous studies served as the motivation for the research study. Show where the findings of
other studies support the findings [which strengthens the importance of the research study
results], and/or point out how the current study differs from other similar studies.

[d] Consider Alternative Explanations of the Findings

It is important to remember that the purpose of research is to discover and not to prove.
When writing the discussion section, the scholar should carefully consider all possible
explanations for the study results, rather than just those that fit the scholar's prior
assumptions or biases.

[e] Acknowledge the Study’s Limitations

It is far better for you to identify and acknowledge your study’s limitations than to have
them pointed out by the supervisor. Describe the generalizability of the results to other
situations, if applicable to the method chosen, and then describe in detail any problems
encountered with the method(s) used to gather information. Note any
unanswered questions or issues the study did not address.

[f] Make Suggestions for Further Research

Although the research study may offer important insights about the research problem,
other questions related to the problem likely remain unanswered. Moreover, some
unanswered questions may have become more focused because of the current study.
The scholar should make suggestions for further research in the discussion section.
[Note: Recommendations for further research can be included in your conclusion instead, but don't repeat in
both.]

Some Tips regarding the language to be used


The term discussion has a variety of meanings in English. In academic writing, however,
it usually refers to two types of activity: (a) considering both sides of an issue or
question, and (b) considering the results of research and the implications of these.
Discussion sections in theses and research articles/papers are probably the most complex
in terms of their elements. The most common elements and some of the language that is
typically associated with them are listed below:

Background information
A strong relationship between X and Y has been reported in the literature.
Prior studies that have noted the importance of ......
In reviewing the literature, no data was found on the association between X and Y.
As mentioned in the literature review, ......
Very little was found in the literature on the question of .....
This study set out with the aim of assessing the importance of X in ......
The third question in this research was ......
It was hypothesized that participants with a history of ......
The present study was designed to determine the effect of ......

Statements of result
The results of this study show/indicate that .......
This experiment did not detect any evidence for ......
On the question of X, this study found that ......
The current study found that ......
The most interesting finding was that ......
Another important finding was that .....
The results of this study did not show that ....../did not show any significant increase in ......
In the current study, comparing X with Y showed that the mean degree of ......
In this study, Xs were found to cause .....
X provided the largest set of significant clusters of ......
It is interesting to note that in all seven cases of this study......

Unexpected outcome
Surprisingly, X was found to .......
Surprisingly, no differences were found in ......
One unanticipated finding was that .....
It is somewhat surprising that no X was noted in this condition ......
What is surprising is that ......
Contrary to expectations, this study did not find a significant difference between .......
However, the observed difference between X and Y in this study was not significant.
However, the ANOVA (one way) showed that these results were not statistically significant.
This finding was unexpected and suggests that ......

[A] Reference to previous research


This study produced results which corroborate the findings of a great deal of the previous work in
this field.
The findings of the current study are consistent with those of Smith and Jones (2001) who found
......
This finding supports previous research into this brain area which links X and Y.
This study confirms that X is associated with ......
This finding corroborates the ideas of Smith and Jones (2008), who suggested that ......
This finding is in agreement with Smith's (1999) findings which showed .......
It is encouraging to compare this figure with that found by Jones (1993) who found that .....
There are similarities between the attitudes expressed by X in this study and those described by
Smith (1987, 1995) and Jones (1986).
These findings further support the idea of .....
Increased activation in the PCC in this study corroborates these earlier findings.
These results are consistent with those of other studies and suggest that ......
The present findings seem to be consistent with other research which found ......
This also accords with our earlier observations, which showed that ......

[B] Reference to previous research [where the findings of the current study do not support it]

This study has been unable to demonstrate that ......
However, this result has not previously been described.
In contrast to earlier findings, however, no evidence of X was detected.
Although these results differ from some published studies (Smith, 1992; Jones, 1996), they are
consistent with those of ......
These results differ from X's 2003 estimate of Y, but they are broadly consistent with earlier .....

Explanations for results:


There are several possible explanations for this result.
These differences can be explained in part by the proximity of X and Y.
A possible explanation for this might be that .....
Another possible explanation for this is that ......
This result may be explained by the fact that ...../ by a number of different factors.
It is difficult to explain this result, but it might be related to ......
It seems possible that these results are due to ......
The reason for this is not clear but it may have something to do with ......
It may be that these students benefitted from ......
This inconsistency/discrepancy may be due to ......
This rather contradictory result may be due to ......
These factors may explain the relatively good correlation between X and Y.
There are, however, other possible explanations.
The possible interference of X cannot be ruled out.
The observed increase in X could be attributed to .....
The observed correlation between X and Y might be explained in this way. .....
Some authors have speculated that ......
Since this difference has not been found elsewhere it is probably not due to ......
A possible explanation for some of our results may be the lack of adequate ......

Advising cautious interpretation


These data must be interpreted with caution because ......
These results therefore need to be interpreted with caution.
However, with a small sample size, caution must be applied, as the findings might not be
transferable to ..
These findings cannot be extrapolated to all patients.
Although exclusion of X did not reduce the effect on X, these results should be interpreted with
caution.

Suggesting general hypotheses


The value of X suggests that a weak link may exist between .....
It is therefore likely that such connections exist between .....
It can thus be suggested that ......
It is possible to hypothesise that these conditions are less likely to occur in ......
It is possible/likely/probable therefore that ......
Hence, it could conceivably be hypothesised that ......
These findings suggest that ......
It may be the case therefore that these variations ......
In general, therefore, it seems that ......
It is possible, therefore, that ......
Therefore, X could be a major factor, if not the only one, causing ......
It can therefore be assumed that the ......
This finding, while preliminary, suggests that……

Noting implications
This finding has important implications for developing .....
An implication of this is the possibility that ......
One of the issues that emerges from these findings is ......
Some of the issues emerging from this finding relate specifically to ......
This combination of findings provides some support for the conceptual premise that .....

Commenting on findings
However, these results were not very encouraging.
These findings are rather disappointing.
The test was successful as it was able to identify students who ......
The present results are significant in at least two major respects.
The results of this study do not explain the occurrence of these adverse events.

Suggestions for future work


However, more research on this topic needs to be undertaken before the association between X
and Y is more clearly understood.
Further research should be done to investigate the ......
Research questions that could be asked include .....
Future studies on the current topic are therefore recommended.
A further study with more focus on X is therefore suggested.
Further studies, which take these variables into account, will need to be undertaken.
Further work is required to establish this.
In future investigations it might be possible to use a different X in which ......
This is an important issue for future research.

Conclusion
Besides the literature review section, the preponderance of references to sources in the
research paper should be in the discussion section. A few historical references may be
helpful for perspective, but most of the references should be relatively recent and
included to aid in the interpretation of the results or to link them to similar studies. If
a study that has already been cited disagrees with the findings, it should not be ignored;
instead, the scholar should clearly explain why that study's findings differ from those of
the present research study.

“Discussion is just a tool. You have to aim; the final goal must be a decision.”
<< Harri Holkeri

Chapter XXI

WRITING A THESIS

Research scholars encounter many pitfalls when writing a thesis. A well-written thesis is
essentially a sustained analysis of a research topic and even the most careful scholar
can succumb to commonly made mistakes in a work of this magnitude. The primary
problems that research scholars encounter when writing a thesis are related to the
matters of clarity and organization. In an analysis of this length and breadth, it is easy to
lose focus and direction. Because of the substantial research that goes into producing a
thesis, one can veer off track and lose stamina. This chapter discusses the planning of
the writing process, the difficulties encountered and, mainly, the mistakes commonly made
in writing a thesis, such as the danger of disorganization, the problem of writing a worthy
conclusion and the problem of writing an analytical literature review, and offers some
strategies to overcome them.

The Essence of a Draft Thesis

• A thesis is a hypothesis or conjecture.

• A PhD thesis is a lengthy and formal document.

• Two important adjectives used to describe a thesis are ``original'' and ``substantial.'' The
research performed to support a thesis must be both.

• The scientific method means starting with a hypothesis and then collecting evidence to
support or deny it. Before one can write a thesis, one must collect evidence that supports
it. Thus, the most difficult aspect of writing a thesis consists of organizing the evidence
and associated discussions into a coherent form.

• The essence of a thesis is critical thinking, not experimental data. Analysis and concepts
form the heart of the work.

• A thesis concentrates on principles: it states the lessons learned, and not merely the
facts behind them.

• In general, every statement in a thesis must be supported either by a reference to
published scientific literature or by original work. Moreover, a thesis does not repeat the
details of critical thinking and analysis found in published sources; it uses the results as
fact and refers the reader to the source for further details.

• Each sentence in a thesis must be complete and correct in a grammatical sense.
Moreover, a thesis must satisfy the stringent rules of formal grammar (e.g., no
contractions, no colloquialisms, no slurs, no undefined technical jargon, no hidden jokes,
and no slang, even when such terms or phrases are in common use in the spoken
language). Indeed, the writing in a thesis must be crystal clear. Shades of meaning
matter; the terminology and prose must make fine distinctions. The words must convey
exactly the meaning intended, nothing more and nothing less.

• Each statement in a thesis must be correct and defensible in a logical and scientific
sense. Moreover, the discussions in a thesis must satisfy the most stringent rules of logic
applied to mathematics and science.

Commonly made Mistakes

One of the main problems of writing a thesis is maintaining organized trains of thought. It
is all too easy to fail to define concepts clearly and to waste time and energy on only
marginally related topics. A good thesis defines important concepts clearly and concisely
and uses the same terminology and its attendant definitions consistently throughout the
entire thesis. Do not make the mistake of using different words for the same
term, and do not define a term in one way in one instance and in a
different way in another. It is important that the writer be consistent with definitions.
Otherwise, the reader will not be able to follow the definitions presented in the
thesis.

Another strategy to prevent disorganization is to write a table of contents before actually
starting on a thesis. This way, the scholar can decide which sections to work on first
and how to organize the thesis. It also provides an overview of the
thesis and a broad picture of the links between main ideas and concepts. Taking notes
and creating a bibliography are organizational nightmares. Some research scholars
prefer the aid of software to organize their notes and bibliography. Thus, by defining
concepts the same way throughout the thesis, writing a table of contents before starting
the thesis and using note-taking and bibliographic software, research scholars can better
organize their thesis.

The literature review is an important part of any thesis, and research
scholars can all too easily fall into the trap of writing summaries of articles. This method
prevents the research scholar from showcasing his/her critical thinking and analytical
skills and may also cause the reader to lose interest in the research topic. The literature
review is essentially a chance for the research scholar to demonstrate his/her knowledge
of the research topic and to provide evidence for the arguments presented in the thesis.
Because the thesis demands a substantial amount of research, the scholar, being tired
and frustrated, may resort to writing summaries of the research materials. This is a
mistake. The purpose of the literature review is to provide a perspective on the research
topic, to introduce and discuss important theoretical frameworks, to define key concepts
and to point out connections between main ideas. In short, it provides the reader with a
model of what is going on in the thesis. By organizing the literature review by categories
of analysis, the scholar can avoid organizing it by summary.

By the time the scholar reaches the point of writing a conclusion,
he/she is often drained of energy and willpower. It is important to remember that the
conclusion matters to the reader because it ties together all of the ideas and
concepts analyzed in the thesis. It reaffirms what the reader has learned
from the thesis and explains key inferences. It essentially brings home what the thesis is
all about. The writer should avoid merely repeating the thesis and should concentrate on
explaining what can be inferred from the evidence presented, and should
also discuss the implications of the points made. A good conclusion leaves the reader
with the feeling that he/she has grasped the main ideas of the thesis.

In conclusion, common mistakes in writing a thesis are disorganization and
poorly constructed literature reviews and conclusions. Because the thesis is a lengthy
work, writers should heed the advice of their colleagues and write every day. With time,
practice and effort, one can correct these mistakes and improve the quality of his/her
thesis. Writing a thesis is an arduous task and it is not hard to lose clarity and
organization while engaged in this process. These strategies are meant as suggestions
to overcome commonly made mistakes and aim to alleviate feelings of bewilderment that
scholars encounter as they write their thesis.

Avoiding some Terms and Phrases

Terms / Phrases                   Comments

adverbs                           Mostly, they are very often overly used. Use strong words
                                  instead. For example, one could say, "Writers abuse adverbs."

jokes or puns                     They have no place in a formal document.

"bad", "good", "nice",            A scientific dissertation does not make moral judgements.
"terrible", "stupid"              Use "incorrect/correct" to refer to factual correctness or
                                  errors. Use precise words or phrases to assess quality
                                  (e.g., "method A requires less computation than method B").
                                  In general, one should avoid all qualitative judgements.

"true", "pure"                    In the sense of "good" (it is judgemental)

"perfect"                         Nothing is.

"an ideal solution"               You're judging again.

"today", "modern times"           Today is tomorrow's yesterday.

"soon"                            How soon? Later tonight? Next decade?

"we were surprised to learn..."   Even if you were, so what?

"seems", "seemingly",             It doesn't matter how something appears; all that matters
"would seem to show"              are the facts.

"in terms of"                     usually vague

"based on", "X-based",            careful; can be vague
"as the basis of"

"different"                       Does not mean "various"; different than what?

"in light of", "due to"           colloquial

"lots of", "kind of",             vague & colloquial
"type of", "something like",
"just about"

"number of"                       vague; do you mean "some", "many", or "most"? A
                                  quantitative statement is preferable

"probably"                        only if you know the statistical probability (if you do,
                                  state it quantitatively)

"obviously, clearly"              be careful: obvious/clear to everyone?

"simple"                          Can have a negative connotation, as in "simpleton"

"along with"                      Just use "with"

"actually, really"                define terms precisely to eliminate the need to clarify

"the fact that"                   makes it a meta-sentence; rephrase

"this", "that"                    As in "This causes concern." Reason: "this" can refer to the
                                  subject of the previous sentence, the entire previous
                                  sentence, the entire previous paragraph, the entire previous
                                  section, etc. More important, it can be interpreted in the
                                  concrete sense or in the meta-sense. For example, in: "X does
                                  Y. This means ..." the reader can assume "this" refers to Y or
                                  to the fact that X does it. Even when restricted (e.g., "this
                                  computation..."), the phrase is weak and often ambiguous.

"You will read about..."          The second person has no place in a formal thesis.

"I will describe..."              The first person has no place in a formal dissertation. If
                                  self-reference is essential, phrase it as "Section 8
                                  describes..."

"we" as in "we see that"          A trap to avoid. Reason: almost any sentence can be written
                                  to begin with "we" because "we" can refer to: the reader and
                                  author, the author and advisor, the author and research team,
                                  experimental computer scientists, the entire computer science
                                  community, the science community, or some other unspecified
                                  group.

"...a famous researcher..."       It doesn't matter who said it or who did it. In fact, such
                                  statements prejudice the reader.

"few, most, all, any, every"      A thesis is precise. If a sentence says "Most computer
                                  systems contain X", you must be able to defend it. Are you
                                  sure you really know the facts? How many computers were built
                                  and sold yesterday?

"must", "always"                  Absolutely?

"should"                          Who says so?

"proof", "prove"                  Would a mathematician agree that it's a proof?

"show"                            Used in the sense of "prove". To "show" something, you need
                                  to provide a formal proof.

"can/may"                         Common sense should tell you the difference.

Some Guidelines for Writing a Thesis

1. Write up a preliminary version of the background section first. This will serve as
the basis for the introduction in your final paper.

2. As you collect data, write up the methods section. It is much easier to do this
right after you have collected the data. Be sure to include a description of the
research equipment and relevant calibration plots.

3. When you have some data, start making plots and tables of the data. These will
help you to visualize the data and to see gaps in your data collection. If time
permits, you should go back and fill in the gaps. You are finished when you have
a set of plots that show a definite trend (or lack of a trend). Be sure to make
adequate statistical tests of your results.

4. Once you have a complete set of plots and statistical tests, arrange the plots and
tables in a logical order. Write figure captions for the plots and tables. As much
as possible, the captions should stand alone in explaining the plots and tables.
Many scientists read only the abstract, figures, figure captions, tables, table
captions, and conclusions of a paper. Be sure that your figures, tables and
captions are well labeled and well documented.

5. Once your plots and tables are complete, write the results section. Writing this
section requires extreme discipline. You must describe your results, but you must
NOT interpret them. (If good ideas occur to you at this time, save them at the
bottom of the page for the discussion section.) Be factual and orderly in this
section, but try not to be too dry.

6. Once you have written the results section, you can move on to the discussion
section. This is usually fun to write, because now you can talk about your ideas
about the data. If you can come up with a good cartoon/schematic showing your
ideas, do so. Many papers are cited in the literature because they have a good
cartoon that subsequent authors would like to use or modify.

7. In writing the discussion section, be sure to adequately discuss the work of other
authors who collected data on the same or related scientific questions. Be sure to
discuss how their work is relevant to your work. If there were flaws in their
methodology, this is the place to discuss them.

8. After you have discussed the data, you can write the conclusions section. In this
section, you take the ideas that were mentioned in the discussion section and try
to come to some closure. If some hypothesis can be ruled out as a result of your
work, say so. If more work is needed for a definitive answer, say that.

9. The final section in the paper is a recommendation section. This is really the end
of the conclusion section in a scientific paper. Make recommendations for further
research or policy actions in this section. If you can make predictions about what
will be found if X is true, then do so. You will get credit from later researchers for
this.

10. After you have finished the recommendation section, look back at your original
introduction. Your introduction should set the stage for the conclusions of the
paper by laying out the ideas that you will test in the paper. Now that you know
where the paper is leading, you will probably need to rewrite the introduction.

11. You must write your abstract last.

“We do not write because we want to; we write because we have to.”
<< Somerset Maugham

A2Z

PhD
Thesis
Reflections on Academic Research

Chapter XXII

ANATOMY OF AN ABSTRACT
A dissertation, or detailed discourse, is a document that presents the author's research
and findings in a particular field and is submitted in support of candidature for a degree
or professional qualification. The abstract conveys the central thought or theory of the
dissertation and tells the reader what the dissertation contains. It must be written so
that the reader finds it interesting and is encouraged to read the whole dissertation.
Arousing interest at the outset is important if the dissertation is to be well received
afterwards. For this, it is crucial to have a well-crafted thesis abstract, written in the
form that your institution expects.

Many people struggle to write a good abstract because they know that a poor abstract
can wreck their whole dissertation. Even if the rest of the dissertation is excellent, a
mediocre abstract will turn the reader away from the whole thesis. Thus you should put
maximum effort into writing a strong thesis abstract, so that readers respond with
encouragement and useful suggestions.

What is an abstract?
An abstract is a condensed version of a longer piece of writing that highlights the major
points covered, concisely describes the content and scope of the writing, and reviews
the writing's contents in abbreviated form.

What types of abstracts are typically used?


Two types of abstracts are typically used:

1. Descriptive Abstracts
o tell readers what information the report, article, or paper contains.
o include the purpose, methods, and scope of the report, article, or paper.
o do not provide results, conclusions, or recommendations.
o are always very short, usually under 100 words.
o introduce the subject to readers, who must then read the report, article, or paper
to find out the author's results, conclusions, or recommendations.
2. Informative Abstracts
o communicate specific information from the report, article, or paper.
o include the purpose, methods, and scope of the report, article, or paper.
o provide the report, article, or paper's results, conclusions, and recommendations.
o are short -- from a paragraph to a page or two, depending upon the length of the
original work being abstracted. Usually informative abstracts are 10% or less of
the length of the original piece.
o allow readers to decide whether they want to read the report, article, or paper.

Why are abstracts so important?


The practice of using key words in an abstract is vital because of today's electronic
information retrieval systems. Titles and abstracts are filed electronically, and key words
are put in electronic storage. When people search for information, they enter key words
related to the subject, and the computer prints out the titles of articles, papers, and
reports containing those key words. Thus, an abstract must contain key words about
what is essential in an article, paper, or report so that someone else can retrieve
information from it.

Qualities of a Good Abstract


An effective abstract has the following qualities:

• uses one or more well developed paragraphs: these are unified, coherent, concise, and
able to stand alone.
• uses an introduction/body/conclusion structure which presents the article, paper, or
report's purpose, results, conclusions, and recommendations in that order.
• follows strictly the chronology of the article, paper, or report.
• provides logical connections (or transitions) between the information included.
• adds no new information, but simply summarizes the report.
• is understandable to a wide audience.
• oftentimes uses passive verbs to downplay the author and emphasize the information.

Steps for Writing Effective Abstracts


To write an effective abstract, follow these steps:

• Reread the article, paper, or report with the goal of abstracting in mind.
o Look specifically for these main parts of the article, paper, or report: purpose,
methods, scope, results, conclusions, and recommendations.
o Use the headings, outline heads, and table of contents as a guide to writing your
abstract.
o If you're writing an abstract about another person's article, paper, or report, the
introduction and the summary are good places to begin. These areas generally
cover what the article emphasizes.
• After you've finished rereading the article, paper, or report, write a rough draft without
looking back at what you're abstracting.
o Don't merely copy key sentences from the article, paper, or report: you'll put in
too much or too little information.
o Don't rely on the way material was phrased in the article, paper, or report:
summarize information in a new way.
• Revise your rough draft to
o correct weaknesses in organization.
o improve transitions from point to point.
o drop unnecessary information.
o add important information you left out.
o eliminate wordiness.
o fix errors in grammar, spelling, and punctuation.
• Print your final copy and read it again to catch any remaining glitches.

Some Guidelines for Writing an Abstract

• A good abstract explains in one line why the paper is important. It then goes on
to give a summary of your major results, preferably couched in numbers with
error limits. The final sentences explain the major implications of your work. A
good abstract is concise, readable, and quantitative.
• Length should be about 2-3 paragraphs, approximately 150-250 words [1 page in A4].
• Abstracts generally do not have citations.
• Information in the title should not be repeated.
• Be explicit.
• Use numbers where appropriate.
• Answers to these questions should be found in the abstract:
1. What did you do?
2. Why did you do it? What question were you trying to answer?
3. How did you do it? State methods.
4. What did you learn? State major results.
5. Why does it matter? Point out at least one significant implication.
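
Some of the guidelines above (length, use of numbers, absence of citations) can be checked mechanically before submission. The following is a minimal sketch in Python, offered only as an illustration: the function name check_abstract, the default limits of 150 and 250 words, and the simple citation pattern are assumptions, not part of any institutional standard. Such a check cannot judge whether the abstract is explicit or answers the five questions; that remains the scholar's task.

```python
import re

# Hypothetical pattern for in-text citations such as "(Smith, 2010)" or "[12]".
CITATION_PATTERN = re.compile(r"\([A-Z][A-Za-z]+(?: et al\.)?,? \d{4}\)|\[\d+\]")


def check_abstract(abstract, min_words=150, max_words=250):
    """Return a list of warnings for a draft abstract (illustrative helper)."""
    warnings = []

    # Length guideline: approximately 150-250 words.
    word_count = len(abstract.split())
    if word_count < min_words:
        warnings.append(f"too short: {word_count} words")
    elif word_count > max_words:
        warnings.append(f"too long: {word_count} words")

    # "Use numbers where appropriate": flag a draft with no digits at all.
    if not re.search(r"\d", abstract):
        warnings.append("no numbers found")

    # Abstracts generally do not have citations.
    if CITATION_PATTERN.search(abstract):
        warnings.append("possible citation found")

    return warnings


if __name__ == "__main__":
    draft = "This thesis studies soil."
    for warning in check_abstract(draft):
        print(warning)
```

An empty list means only that none of these three mechanical checks fired; the word limits should be replaced by whatever the scholar's own university prescribes.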

“When I examine myself and my methods of thought, I come to the conclusion that the gift of
fantasy has meant more to me than any talent for abstract, positive thinking.” << Albert
Einstein

Chapter XXIII

ENDNOTES
Introduction

Millions of researchers, scholarly writers, students, and librarians use EndNote to search
online bibliographic databases, organize their references, images and PDFs in any
language, and create bibliographies and figure lists instantly. Instead of spending hours
typing bibliographies, or using index cards to organize their references, they do it the
easy way, by using EndNote.

An endnote, in the sense used in this chapter, is a source citation that refers the reader to a
specific place at the end of the paper where they can find the source of the
information or words quoted or mentioned in the paper. Endnotes are used: (1) to cite
the source of statements quoted or closely paraphrased in the text, (2) to make
additional comments about some point of the text, or (3) to acknowledge someone else
for an idea or argument.

Endnotes serve one main purpose: to allow the reader to access with ease and
confidence the source that you have used. Any citation form that does this well is
appropriate, but most disciplines insist on their own particular way of citing information,
and you must follow those preferences. There is nothing magical about these forms --
they all do the same thing -- but you should get used to the fact that different disciplines
require different citation forms. In the case of either footnotes or endnotes, the only
indication that goes in the text itself is the note number, the small superscript number after
the text you wish to reference.

An Illustration

When using endnotes, your quoted or paraphrased sentence or summarized material is
followed by a superscript number.

Example:

Let's say that you have quoted a sentence from Lloyd Eastman's history of Chinese
social life. You have written this sentence:

According to Eastman, "The family was the central core of the Chinese social system."¹

Analysis of the example:

Notice that there is a superscript number after the quotation. You insert the number by
using your word-processor's "insert reference" (or citation) function.

The superscript number corresponds to a note placed at the end of the paper (which is
called an endnote). Your word-processor will create a note number and a space at the
end of your paper, where you then fill in the citation. This endnote lets the reader know
where you found your information.

Note numbers are sequential: first note in your paper is numbered 1, the second note is
2 (even if you are quoting the same source as in #1), etc.

AGAIN, even if you are repeating a reference to the same source, your numbers must
continue in sequence (1, 2, 3, 4, 5). You must use "Arabic" numbers (1, 2, 3...), not
Roman numerals (i, ii, iii...)!

What do I put in the endnote (the part that appears at the end of the paper) the first time I
refer to a source?

The first time you have a citation to a particular source, the note at the end of the paper
must include the following information in the following order:

Author’s first name then last name, Title of Book (City of publication: Publishing
company’s name, Date of Publication), Page Number of quoted, paraphrased, or
summarized material.

Example:

You have written this sentence:

According to Eastman, "The family was the central core of the Chinese social system."¹

At the end of the paper (in the space set aside for this note by your word-processing
software), you would put the following information in the following order:

1 Lloyd E. Eastman, Family, Field, and Ancestors: Constancy and Change in China's
Social and Economic History, 1550-1949 (New York: Oxford University Press, 1988),
53.

In addition to including this information at the end of the paper, this source of information
should also be included in the Bibliography at the end of the book.

What if I cite the same source again in my paper?

If you cite the same source again in your paper, use a short form for all subsequent
citations to that source:

Author's last name, First Words of Book Title, page number.

OR

Author's last name, page number.

Example:

You have already cited the Eastman book, but then you cite it again in note #3:

3 Eastman, Family, Field, and Ancestors, 54.
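
The two note forms described above (a full note the first time a source is cited, a short form afterwards, with numbers running in sequence) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a replacement for a reference manager or for the citation style your discipline requires: the names Book, Endnotes and cite are hypothetical, and only single-author books are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """A single-author book; the field names are illustrative assumptions."""
    first_name: str
    last_name: str
    title: str
    short_title: str
    city: str
    publisher: str
    year: int


class Endnotes:
    """Collects notes in order, numbering them 1, 2, 3, ... (Arabic numbers)."""

    def __init__(self):
        self.notes = []
        self._seen = set()

    def cite(self, book, page):
        """Add a note for `book` and return the number to place in the text."""
        number = len(self.notes) + 1
        if book in self._seen:
            # Subsequent citation: last name, first words of title, page.
            text = f"{book.last_name}, {book.short_title}, {page}."
        else:
            # First citation: full details in the order given above.
            text = (f"{book.first_name} {book.last_name}, {book.title} "
                    f"({book.city}: {book.publisher}, {book.year}), {page}.")
            self._seen.add(book)
        self.notes.append(f"{number} {text}")
        return number


if __name__ == "__main__":
    eastman = Book(
        "Lloyd E.", "Eastman",
        "Family, Field, and Ancestors: Constancy and Change in China's "
        "Social and Economic History, 1550-1949",
        "Family, Field, and Ancestors",
        "New York", "Oxford University Press", 1988,
    )
    notes = Endnotes()
    notes.cite(eastman, 53)
    notes.cite(eastman, 54)
    print("\n".join(notes.notes))
```

In this sketch the second Eastman citation becomes note 2 because no other source intervenes; in the example above it is note 3 because another note comes between the two.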

Conclusion

The endnote is a very useful tool for academic researchers, who should make
use of it to enrich the depth and quality of their research study and to
gain ample knowledge and experience in their own fields of
speciality. The quantity and quality of citations noted in a research thesis reflect
the seriousness and curiosity of research scholars. Evaluators or examiners
will take note of the scholarship shown and appreciate the
effort put in by the scholars.

“In my end is my beginning”. << TS Eliot

Chapter XXIV

RESEARCH CONCLUSION

Introduction

An effective concluding paragraph should provide closure for a paper, leaving the reader
feeling satisfied that the thesis has been fully explained. Probably the shortest paragraph
of an essay, the conclusion should be brief and to the point. The conclusion should
provide a restatement of the thesis, a summary of the author's conclusions, and perhaps
a solution to the problem, if this is the writer's intent. However, a good writer avoids a
blatant repetition of the thesis statement, which can leave a reader feeling annoyed and
disappointed after reading an otherwise interesting paper. Repeating the thesis, word
for word, in the conclusion seems lazy and is not very interesting. It is best to restate the
ideas using different language, perhaps even to create a sort of dramatic effect that
comes from repetition. Good conclusions might have a dramatic quality -- rather like a
grand finale. The conclusion should leave the reader with an overall sense of how the
writer feels about the subject. Concluding statements which refer back to the
introductory paragraph are appropriate here. Frequently, the ideas in the body of an
essay lead to some significant conclusion that can be stated and explained in this final
paragraph. Finally, this is not the place to introduce ideas you forgot to mention in the
body of the paper! For most essays, one well-developed paragraph is sufficient for a
conclusion, although in some cases, a two-or-three paragraph conclusion may be
required.

Definition

The conclusion is intended to help the reader understand why your research should
matter to them after they have finished reading the paper. A conclusion is not merely a
summary of your points or a re-statement of your research problem but a synthesis of
key points.

Importance of a Good Conclusion

A well-written conclusion provides you with several important opportunities to
demonstrate your overall understanding of the research problem to the reader. These
include:
1. Presenting the last word on the issues you raised in your paper. Just as the
introduction gives a first impression to your reader, the conclusion offers a chance to
leave a lasting impression. Do this, for example, by highlighting key points in your
analysis or findings.
2. Summarizing your thoughts and conveying the larger implications of your study.
The conclusion is an opportunity to succinctly answer the "so what?" question by placing
the study within the context of past research about the topic you've investigated.
3. Demonstrating the importance of your ideas. Don't be shy. The conclusion offers you
a chance to elaborate on the significance of your findings.
4. Introducing possible new or expanded ways of thinking about the research
problem. This does not refer to introducing new information [which should be avoided],
but to offering new insight and creative approaches for framing/contextualizing the research
problem based on the results of your study.

Significance of a Conclusion

When writing the conclusion to your paper, follow these general rules:

• State your conclusions in clear, simple language.
• Do not simply reiterate your results or the discussion.
• Indicate opportunities for future research, as long as you haven't already done so in the
discussion section of your paper.

The function of your paper's conclusion is to restate the main argument. It reminds the
reader of the strengths of your main argument(s) and reiterates the most important
evidence supporting the argument(s). Make sure, however, that your conclusion is not
simply a repetitive summary of the findings because this reduces the impact of the
argument(s) you have developed in your essay. Consider the following points to help
ensure your conclusion is appropriate:

1. If the argument or point of your paper is complex, you may need to summarize the
argument for your reader.
2. If, prior to your conclusion, you have not yet explained the significance of your findings or
if you are proceeding inductively, use the end of your paper to describe your main points
and explain their significance.
3. Move from a detailed to a general level of consideration that returns the topic to the
context provided by the introduction or within a new context that emerges from the data.
4. Suggest what aspects of this topic need further research.

The conclusion also provides a place for you to persuasively and succinctly restate your
research problem given that the reader has now been presented with all the information
about the topic. Depending on the discipline you are writing in, the concluding paragraph
may contain your reflections on the evidence presented, or on the essay's central
research problem. However, the nature of being introspective about the research you
have done will depend on your topic and whether your professor wants you to express
your observations in this way.

Some Strategies
1. If your essay deals with a contemporary problem, warn readers of the possible
consequences of not attending to the problem.
2. Recommend a specific course or courses of action.
3. Use a relevant quotation or expert opinion to lend authority to the conclusion you have
reached [a good place to look is research from your literature review].
4. Restate a key statistic, fact, or visual image to drive home the ultimate point of your
paper.
5. If your discipline encourages personal reflection, illustrate your concluding point with a
relevant narrative drawn from your own life experiences.
6. Return to an anecdote, example, or quotation that you introduced in your introduction, but
add further insight that is derived from the findings of your study; use your interpretation
of results to reframe it in new ways.
7. Provide a "take-home" message in the form of a strong, succinct statement that you want
the reader to remember about your study.

Common Problems

Failure to be concise
The conclusion section should be concise and to the point. Conclusions that are too long
often have unnecessary detail. The conclusion section is not the place for details about
your methodology or results. Although you should give a summary of what was learned
from your research, this summary should be relatively brief, since the emphasis in the
conclusion section is on the implications, evaluations, insights, etc. that you make.

Failure to comment on larger, more significant issues


Whereas in the Introduction your task was to move from general (the field of study) to
specific (your research problem), in the concluding section, your task is to move from
specific (your research problem) back to general (your field, how your research
contributes new understanding). In other words, in the conclusion you should place your
research within a larger context.

Failure to reveal the complexities of a conclusion or situation


Negative aspects of your research should never be ignored. Problems, drawbacks,
challenges, etc. encountered during your research study can be included as a way of
qualifying your conclusions.

Failure to provide a concise summary of what was learned


In order to be able to discuss how your research fits back into your field of study (and the
world at large), you need to summarize it very briefly. Often this element of your
conclusion is only a few sentences.

Failure to match the objectives of your research


Often research objectives change while the research is being carried out. This is not a
problem unless you forget to go back and refine your original objectives in your
introduction. As these changes emerge, they must be documented so that the stated
objectives accurately reflect what you were trying to accomplish in your research [not
what you thought you might accomplish when you began].

Resist the urge to apologize


If you've immersed yourself in studying the research problem, you now know a good
deal about it, perhaps even more than your professor! Nevertheless, by the time you
have finished writing, you may be having some doubts about what you have produced.
Repress those doubts! Don't undermine your authority by saying things like, "This is just
one approach to examining this problem; there may be other, much better approaches...."

Conclusion

Finally, the conclusion of a thesis should close by summarizing everything that has
come before, explaining in simple terms the way in which the research study ended,
relating it to the greater environment of the world at large, and leaving the reader with
the ability to draw his or her own conclusions from what you have described.

“The only possible conclusion the social sciences can draw is: some do, some don't.”
<< Ernest Rutherford

Chapter XXV

EDITING AND PROOFREADING

Definition
Proofreading is the act of searching for errors before you hand in your final research
thesis. Errors can be both grammatical and typographical in nature, but proofreading can
also be used to identify problems with the flow of your paper [i.e., the logical sequence of
thoughts and ideas] and to find any word processing errors [e.g., different font types,
indented paragraphs, line spacing, etc.].

Strategies for Proofreading

Getting started

• Be sure you've revised the larger aspects of your text. Don't make corrections at
the sentence and word level if you still need to work on the focus, development, and
arrangement of the whole paper, of sections, or of paragraphs.
• Set your text aside for a while between writing and proofreading. Some distance
between writing your paper and proofreading it will help you identify mistakes more
easily.
• Eliminate unnecessary words before looking for mistakes. Throughout your paper,
you should try to avoid using inflated diction if a simpler phrase works equally well.
Simpler, more precise language is easier to proofread than overly complex sentence
construction and vocabulary.
• Know what to look for. Based upon the comments of your professors on previous
drafts of your paper, make a list of mistakes you need to watch for.

Identifying the Errors:

1. Work from a printout, not a computer screen. Besides sparing your eyes the strain
of glaring at a computer screen, proofreading from a printout allows you to easily
skip around to where errors might have been repeated in multiple places
throughout the research paper.
2. Read out loud. This is especially helpful for spotting run-on sentences, but you'll
also hear other problems that you may not pick up when reading silently.
Reading your paper out loud also helps you play the role of the reader, thereby
encouraging you to understand the paper as your audience might.
3. Use a blank sheet of paper to cover up the lines below the one you're reading. This
technique keeps you from skipping ahead of possible mistakes.
4. Use the search function of the computer to find mistakes you're likely to make. For
example, search for "it" if you confuse "its" and "it's"; search for "-ing" if
dangling modifiers are a problem; search for opening parentheses or quote
marks if you tend to leave out the closing ones.
5. If you tend to make many mistakes, check separately for each kind of error,
moving from the most to the least important, and following whatever technique
works best for you to identify that kind of mistake. For instance, read through
once (backwards, sentence by sentence) to check for fragments; read through
again (forward) to be sure subjects and verbs agree, and again (perhaps using a
computer search for "this," "it," and "they") to trace pronouns to antecedents.
6. End by using a computer spell checker or reading backwards word by word. But
remember that a spelling checker won't catch mistakes with homonyms (e.g.,
"they're," "their," "there") or certain typos (like "he" for "the").
7. Leave yourself enough time. Since many errors are made and overlooked by
speeding through writing and proofreading, taking the time to look carefully
over your writing will help you catch errors you might otherwise miss. Always
read through your writing slowly. If you read through the paper at a normal
speed, you won't give your eyes sufficient time to spot errors.
8. Ask a friend to read your paper. Offer to proofread a friend's paper if they will
review yours. A second set of eyes will often spot errors that
you would otherwise have missed.
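
The computer search suggested in point 4 can also be scripted, so that every candidate is listed with its line number in one pass. The following Python sketch is only an illustration: the names PATTERNS, find_candidates and count_unclosed are hypothetical, and the patterns are deliberately crude. They list places to look at, not confirmed errors, so each hit still has to be judged by the writer.

```python
import re

# Candidate patterns for mistakes a writer is personally prone to (assumed list).
PATTERNS = {
    "its/it's": re.compile(r"\bit['’]?s\b", re.IGNORECASE),
    "-ing word": re.compile(r"\b\w+ing\b", re.IGNORECASE),
}


def find_candidates(text):
    """Return (line number, label, matched text) for every pattern hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((line_no, label, match.group(0)))
    return hits


def count_unclosed(text, opener="(", closer=")"):
    """Return how many more openers than closers the text contains."""
    return text.count(opener) - text.count(closer)


if __name__ == "__main__":
    sample = "Its color faded.\nIt's a test (unfinished.\nRunning fast, the bus left."
    for line_no, label, found in find_candidates(sample):
        print(f"line {line_no}: {label}: {found}")
    print("unclosed parentheses:", count_unclosed(sample))
```

A list of patterns kept in one place is easy to extend with the individual weaknesses a scholar discovers over successive drafts, which is the point of checking separately for each kind of error.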

Individualize the Proofreading


In addition to following the suggestions above, individualizing your proofreading process
to match weaknesses in your writing will help you proofread more efficiently and
effectively. For example, I still tend to make subject-verb agreement errors. Accept the
fact that you likely won't be able to check for everything, so be introspective about what
your typical problem areas are and look for each type of error individually. Here's how:

• Find out what errors you typically make. Review instructors' comments about your
writing and/or review your paper with a tutor.
• Learn how to fix those errors. Talk with your professor about helping you
understand why you make the errors you do make so that you can learn to avoid
them.
• Use specific strategies. Use the strategies detailed below to find and correct your
particular errors in usage, sentence structure, and spelling and punctuation.

Avoid These Common Grammar Mistakes!

Given the rules and the multiple exceptions to every rule that characterize the English
language, there are many, many sites on the web that discuss avoiding grammar
mistakes. Listed below are the most common and, thus, the ones you should focus on
locating and removing while proofreading your research paper.

1. Affect / effect -- welcome to what I consider to be the most confusing aspect in
the English language. "Effect" is most often a noun and generally means "a
result." However, "effect" can be used as a verb that essentially means "to
bring about," or "to accomplish." "Affect" is almost always a verb and generally
means "to influence." However, "affect" can be used as a noun when you're
talking about the mood that someone appears to have. [Ugh!]
2. Apostrophes -- the position of an apostrophe depends on whether the noun is singular
or plural. For singular words, add an apostrophe and an "s" to the end, even if the
final letter is an "s." For plural words that already end in "s," add only an
apostrophe. For contractions, replace missing letters with an apostrophe; but remember
that it is where the letters no longer are, which is not always where the words
are joined [e.g., "is not" and "isn't"].
3. Capitalization -- a person’s title is capitalized when it precedes the name and is,
thus, seen as part of the name [e.g., President Zachary Taylor]; once the title
occurs, further references to the person holding the title appear in lowercase
[e.g., the president]. For groups or organizations, the name is capitalized when
it is the full name [e.g., the Department of Justice]; further references should be
written in lowercase [e.g., the department]. Note that, in general, the use of
capital letters should be minimized as much as possible.
4. Colorless verbs and bland adjectives -- relying on the passive voice or on forms of
the verb "to be" is a lost opportunity to use a more interesting and accurate verb.
Adjectives can also be used very specifically to add to the sentence. Try to
avoid generic or bland adjectives and be specific. Use adjectives that add to
the meaning of the sentence.
5. Comma splices -- a comma splice is the incorrect use of a comma to connect
two independent clauses (an independent clause is a group of words that is
grammatically and conceptually complete: that is, it can stand on its own as a
sentence). To correct the comma splice, you can: replace the comma with a
period, forming two sentences; replace the comma with a semicolon; or, join
the two clauses with a conjunction such as "and," "because," "but," etc.
6. Compared with vs. compared to -- compare to is to point out or imply
resemblances between objects regarded as essentially of a different order;
compare with is mainly to point out differences between objects regarded as
essentially of the same order [e.g., life has been compared to a journey;
Congress may be compared with the British Parliament].
7. Confusing singular possessive and plural nouns -- singular possessive nouns
take an apostrophe, with few exceptions, and simple plural nouns do not.
Omitting an apostrophe or adding one where it does not belong
makes the sentence unclear.
8. Coordinating conjunctions -- words such as "but," "and," and "yet" join grammatically
similar elements (i.e., two nouns, two verbs, two modifiers, two independent
clauses). Be sure that the elements they join are equal in importance and in
structure.
9. Dangling participle -- a participial phrase at the beginning of a sentence must
refer to the grammatical subject of the sentence.
10. Dropped commas around clauses -- place commas around words, phrases, or
clauses that interrupt a sentence. Do not use commas around restrictive
clauses, which provide essential information about the subject of the sentence.
11. The Existential "this" -- always include a referent with "this," such as "this
theory..." or "this approach to understanding the...." With no referent, "this" can
confuse the reader.
12. The Existential "it" -- the "existential it" gives no reference for what "it" is. Be
specific!

13. Its / it's -- "its" is the possessive form of "it." "It's" is the contraction of "it is"
or "it has." They are not interchangeable.
14. Interrupting clause -- this is a clause or phrase, such as "however," that interrupts
a sentence. Place a comma on either side of the interrupting clause.
15. Know your non-restrictive clauses -- this clause or phrase modifies the subject
of the sentence but is not essential to understanding the sentence. The word
“which” is the relative pronoun usually used to introduce the nonrestrictive
clause.
16. Know your restrictive clauses -- this clause limits the meaning of the nouns it
modifies. The restrictive clause introduces information that is essential to
understanding the meaning of the sentence. The word “that” is the relative
pronoun normally used to introduce this clause. Without this clause or phrase,
the meaning of the sentence changes.
17. Lonely quotes -- quotes cannot stand on their own as a sentence. Integrate
them into a sentence.
18. Misuse and abuse of semicolons -- semicolons are used to separate two
related independent clauses or to separate items in a list that contains
commas. Do not abuse semicolons by using them often. They are best used
sparingly.
19. Overuse of unspecific determinates -- words such as "super" [as in super
strong] or "very" [as in very strong] are unspecific determinates. How
many/much is "very"? How big is "super"? If you ask ten people how cold "very
cold" is, you would get ten different answers. Academic writing should be
precise, so eliminate as many unspecific determinates as possible.
20. Sentence fragments -- these occur when a dependent clause is punctuated as
a complete sentence. Dependent clauses must be used together with an
independent clause.
21. Singular words that sound plural -- when using words like "each," "every,"
"everybody," "nobody," or "anybody" in a sentence, we're likely thinking about
more than one person or thing. But all these words are grammatically singular:
they refer to just one person or thing at a time. And unfortunately, if you change
the verb to correct the grammar, you create a pedantic phrase like "he or she"
or "his or her."
22. Split Infinitive -- an infinitive is the form of a verb that begins with "to." Splitting
an infinitive means placing another word or words between the "to" and the
infinitive verb. This is considered bad by purists, but it is nowadays considered
a matter of style and not bad grammar.
23. Subject/pronoun disagreement -- there are two types of subject/pronoun
disagreement. Shifts in number refer to the shifting between singular and plural
in the same sentence. Be consistent. Shifts in person occur when the person
shifts within the sentence from first to second person, from second to third
person, etc.
24. That vs. which -- that clauses (called restrictive) are essential to the meaning of
the sentence; which clauses (called nonrestrictive) merely add additional
information. In general, most nonrestrictive clauses in academic writing are
incorrect or superfluous. While proofreading, go on a "which" hunt and turn
most of them into restrictive clauses!
25. Verb Tense Agreement -- do not switch verbs from present to past or from past
to present without a good reason.

26. Who / whom -- who is used as the subject of the clause it introduces; whom is
used as the object of a preposition, as a direct object, or as an indirect object.
A key to remembering which word to use is to simply substitute who or whom
with a pronoun. If you can substitute he, she, we, or they in the clause, and it
still sounds okay, then you know that who is the correct word to use. If,
however, him, her, us, or them sounds more appropriate, then whom is the
correct choice for the sentence.

Source: http://libguides.usc.edu/writingguide

“I was working on the proof of one of my poems all the morning,
and took out a comma. In the afternoon I put it back again.”
<< Oscar Wilde

A2Z

PhD
Thesis
Reflections on Academic Research
Chapter XXVI

Writing an Annotated Bibliography

WRITING AN ANNOTATED BIBLIOGRAPHY
Definition

An annotated bibliography is a list of citations related to a particular subject area or
theme, each followed by a brief descriptive or evaluative summary, usually of not more
than 150 words. The annotated bibliography can be arranged chronologically by date of
publication or alphabetically by author, with citations to print and/or digital materials,
such as books, newspaper articles, journal articles, dissertations, government
documents, pamphlets, and web sites, and/or multimedia sources like films and audio
recordings. The purpose of the annotation is to inform the reader of the relevance,
accuracy, and quality of the sources cited.

Importance of a Good Annotated Bibliography

In lieu of a formal research thesis, your supervisor may ask you to create an
annotated bibliography. You may be assigned this for a number of reasons, including
to show that you understand the literature underpinning your research problem, to
demonstrate that you can conduct an effective review of pertinent literature, or to share
sources among your peer-scholars so that, collectively, everyone obtains a
comprehensive understanding of key research on the subject. Think of an annotated
bibliography as a more deliberate, in-depth review of the literature than what is normally
conducted for a research paper.

On a broader level, writing an annotated bibliography can be excellent preparation for
conducting a larger research project by allowing you to see what research has already
been done and where your proposed study may fit within it. By reading and responding
to a variety of sources associated with a research problem, you can begin to see what
the issues are and gain a better perspective on what scholars are saying about your
topic. As a result, you are better prepared to develop your own point of view and
contributions to the literature.

Scope of an annotated bibliography

To learn about your topic: Writing an annotated bibliography is excellent preparation for a
research project. Just collecting sources for a bibliography is useful, but when you have
to write annotations for each source, you're forced to read each source more carefully.
You begin to read more critically instead of just collecting information. At the professional
level, annotated bibliographies allow you to see what has been done in the literature and
where your own research or scholarship can fit.

To help you formulate a thesis: Every
good research paper is an argument. The purpose of research is to state and support a
thesis. So a very important part of research is developing a thesis that is debatable,
interesting, and current. Writing an annotated bibliography can help you gain a good
perspective on what is being said about your topic. By reading and responding to a
variety of sources on a topic, you'll start to see what the issues are, what people are
arguing about, and you'll then be able to develop your own point of view.

To help other researchers: Extensive and scholarly annotated bibliographies are
sometimes published. They provide a comprehensive overview of everything important
that has been and is being said about that topic. You may not ever get your annotated
bibliography published, but as a researcher, you might want to look for one that has
been published about your topic.

The Process

Creating an annotated bibliography calls for the application of a variety of intellectual
skills: concise exposition, succinct analysis, and informed library research. First,
locate and record citations to books, periodicals, and documents that may contain
useful information and ideas on your topic. Briefly examine and review the actual items.
Then choose those works that provide a variety of perspectives on your topic. Cite the
book, article, or document using the appropriate style. Write a concise annotation that
summarizes the central theme and scope of the work. Include one or more
sentences that (a) evaluate the authority or background of the author, (b) comment on
the intended audience, (c) compare or contrast this work with another you have cited,
or (d) explain how this work illuminates your bibliography topic.

Structure and Writing Style


[A] Types

1. Descriptive: This type of annotation describes the source without summarizing the actual
argument, hypothesis, or message in the content. Like an abstract, it describes what the
source addresses, what issues are investigated, and any special features, such as
appendices or bibliographies that supplement the main text. What it does not include is
any evaluation or criticism of the content. This type of annotation seeks to answer the
question: Does this source cover or address the topic one is researching?

2. Informative/Summative: This type of annotation summarizes what the content,
message, or argument of the source is. It generally contains the hypothesis,
methodology, and conclusion or findings, but like the descriptive type, you are not
offering your own evaluative comments about such content. This type of annotation
seeks to answer these types of questions: What are the author's main arguments? What
conclusions did the author draw?

3. Evaluative/Critical/Analytical: This type of annotation includes your evaluative
statements about the content of a source and is the most common type of annotation
your supervisor will ask you to write. It might critique the strengths and weaknesses of
the source or the applicability of the conclusions to the research problem you are
studying. This type of annotation seeks to answer these types of questions: Is the
reasoning sound? Is the methodology sound? Does this source address all the relevant
issues? How does this source compare to other sources on this topic?

[B] Choosing Sources for Your Bibliography


Appropriate sources to include can be anything that has value with regard to
understanding the research problem, including non-textual sources, such as films,
maps, photographs, and audio recordings, or archival materials and primary documents,
such as diaries, government documents, collections of personal correspondence,
meeting minutes, or official memorandums.

Your method for selecting which sources to annotate depends upon the purpose of the
assignment and the research problem you select. For example, if the research problem
is to compare the social factors that led to protests in Egypt with the social factors that
led to protests against the government of the Philippines in the 1980s, you will have to
include non-U.S. and historical sources in your bibliography.

[C] Strategies to Define the Scope of your Bibliography


It is important that the sources cited and described in your bibliography are well-defined
and sufficiently narrow in scope to ensure that you're not overwhelmed by the volume of
items you could possibly include. Many of the general strategies you can use to narrow a
research topic are the same you can use to define what to include in your bibliography.
These are:

• Aspect--choose one lens through which to view your topic, or look at just one facet of your topic
(e.g., rather than writing a bibliography of sources about the role of food in religious rituals, create a
bibliography on the role of food in Hindu ceremonies).
• Time--the shorter the time period, the narrower the focus.
• Geography--the smaller the area of analysis, the narrower the focus (e.g., rather than cite
sources about trade relations in West Africa, include only sources that examine trade relations between
Niger and Cameroon).
• Relationship--review sources that examine how two or more different topics relate to one another
(e.g., cause/effect, compare/contrast, etc.).
• Type--focus your bibliography on a specific type or class of people or things (e.g.,
research on health care provided to elderly men in Japan).
• Source--limit your bibliography to specific types of materials (e.g., only books, only scholarly
journal articles, only films, etc.).
• Combination--use two or more of the above strategies to focus your bibliography very narrowly or
broaden coverage of a very specific research problem.

[D] Format and Content


The format of an annotated bibliography can differ depending on its purpose and the
nature of the assignment. It may be arranged alphabetically by author or chronologically
by publication date. Ask your supervisor for specific guidelines in terms of length, focus,
and type of annotation, as described above.
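
If you keep your entries in electronic form, the two arrangements mentioned above (alphabetical by author, chronological by publication date) are easy to switch between. The following is a minimal sketch in Python under stated assumptions: the Entry fields and the arrange function are hypothetical names chosen for illustration, and the sample entries are placeholders, not real sources.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    author: str      # first author's surname, used for alphabetical order
    year: int        # publication year, used for chronological order
    citation: str    # citation already formatted in your chosen style
    annotation: str  # the annotation text


def arrange(entries, order="author"):
    """Return a new list sorted alphabetically by author or by publication year."""
    if order == "author":
        return sorted(entries, key=lambda e: (e.author.lower(), e.year))
    if order == "date":
        return sorted(entries, key=lambda e: (e.year, e.author.lower()))
    raise ValueError("order must be 'author' or 'date'")


# Placeholder entries for illustration only; these are not real references.
entries = [
    Entry("Surname B", 2005, "Surname B (2005). Placeholder title.", "..."),
    Entry("Surname A", 2010, "Surname A (2010). Placeholder title.", "..."),
    Entry("Surname C", 1999, "Surname C (1999). Placeholder title.", "..."),
]

for entry in arrange(entries, order="date"):
    print(entry.year, entry.author)
```

Ties are broken by the other field, so two works by the same author appear oldest first in the alphabetical arrangement.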

Introduction
Your bibliography should include a brief introduction that explains the rationale for selecting the
sources that you did and notes, if appropriate, what sources were excluded and the reasons why.

Citation
This first part of your entry contains the bibliographic information written in a standard
documentation style, such as MLA, Chicago, or APA. Be consistent!

Annotation
The second part should summarize, in paragraph form, the material contained in the source.
What you say about the source is dictated by the type of annotation you are asked to write (see
above). In most cases, your annotation should provide critical commentary that evaluates the
source and its usefulness for your topic and for your paper. Things to think about when writing
include: Does the essay offer a good introduction on the issue? Does the source deal with a
particular aspect of the issue? Would novices find the piece accessible or is it intended for an
audience already familiar with the topic? What limitations, if any, does the source have [reading
level, timeliness, reliability, etc.]? What is your overall reaction to the source?

Length
Annotations can vary significantly in length, from a couple of sentences to a couple of pages.
However, they are normally about 150 words. The length will depend on the purpose. If you're
just writing summaries of your sources, the annotations may not be very long. However, if you are
writing an extensive analysis of each source, you'll need to devote more space.
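
A quick word count can tell you which annotations have drifted beyond the usual length. This is a minimal sketch in Python: the function names and the default limit of 150 words are assumptions taken from the guideline above, and counting whitespace-separated words is only an approximation of how a word processor counts.

```python
def word_count(annotation):
    """Count whitespace-separated words in one annotation."""
    return len(annotation.split())


def over_limit(annotations, limit=150):
    """Return the positions (starting at 0) of annotations longer than the limit."""
    return [
        index
        for index, text in enumerate(annotations)
        if word_count(text) > limit
    ]


# Made-up annotations for illustration only.
drafts = ["A short descriptive note.", "word " * 200]
print([word_count(text) for text in drafts])
print(over_limit(drafts))
```

If your supervisor asks for a different length, pass it as the limit instead of changing the function.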

Sample Annotated Bibliography entry for a Journal Article

The following example uses the APA format for the journal citation.

Waite, L. J., Goldscheider, F. K., & Witsberger, C. (1986). Nonfamily living and the erosion
of traditional family orientations among young adults. American Sociological
Review, 51(4), 541-554.

The authors, researchers at the Rand Corporation and Brown University, use data from
the National Longitudinal Surveys of Young Women and Young Men to test their
hypothesis that nonfamily living by young adults alters their attitudes, values, plans, and
expectations, moving them away from their belief in traditional sex roles. They find their
hypothesis strongly supported in young females, while the effects were fewer in studies
of young males. Increasing the time away from parents before marrying increased
individualism, self-sufficiency, and changes in attitudes about families. In contrast, an
earlier study by Williams cited below shows no significant gender differences in sex role
attitudes as a result of nonfamily living.

Some Guidelines for Writing an Annotated Bibliography

o Write a brief and interesting introduction, as described above, to the annotated bibliography.

o Give bibliographical information in APA style.

o Be consistent in citation style throughout the text.

o Read the book, paper, article, etc. well and purposefully to ensure a correct and
proper understanding.

o Write concise evaluative statements covering the content, coverage, applicability
to the research study, and other relevance.

o Ensure the narration is precise and stays within about 150 words.

“Some books are to be tasted, others to be swallowed, and some few to be chewed and
digested.”
<< Sir Francis Bacon


Chapter XXVII

Research Results
RESEARCH RESULTS

Definition

The results section of the research paper is where you report the findings of the
research study based upon the information gathered as a result of the methodology [or
methodologies] applied in the research study. The results section should simply state
the findings, without bias or interpretation, arranged in a logical sequence. The
results section should always be written in the past tense. A section describing results is
particularly necessary if the research includes data generated from the study.

Characteristics of a Good Results Section

In formulating the results section, it is useful to note that the results of a study do not
prove anything. Research results can only confirm or reject the research problem
underpinning the research study. However, articulating the results helps the scholar to
understand the problem from within, to break it into pieces, and to view the research
problem from various perspectives.

The page length of this section is set by the amount and types of data to be reported. Be
concise, using non-textual elements, such as figures and tables, if appropriate, to
present results more effectively. In deciding what data to describe in the results section,
one must clearly distinguish material that would normally be included in a research study
from any raw data or other material that could be included as an appendix. In fact, raw
data should not be included at all unless requested to do so by the supervisor.

Avoid providing data that is not critical to answering the research question. The
background information already described in the Introduction section should provide the
reader with any additional context or explanation needed to understand the results. A
good rule is to always re-read the background section of the study paper after the
scholar has written the results section to ensure that the reader has enough context to
understand the results.

Structure and Writing Style

For most research paper formats, there are two ways of presenting and organizing the
results. The first method is to present the results followed by a short explanation of the
findings. For example, you may have noticed an unusual correlation between two
variables during the analysis of your findings. It is correct to point this out in the results
section. However, speculating why this correlation exists, and offering a hypothesis about
what may be happening, belongs in the discussion section.

The other approach is to present a section and then discuss it, before presenting the
next section then discussing it, and so on. This is more common in longer papers
because it helps the reader to better understand each finding. In this model, it can be
helpful to provide a brief conclusion in the results section that ties each of the findings
together and links to the discussion. Note that the discussion part of your paper will
generally follow the same structure.

The content of the results section:

1. An introductory context for understanding the results by restating the research
problem that underpins the purpose of your study.
2. A summary of your key findings arranged in a logical sequence that generally
follows your methodology section.
3. Non-textual elements, such as figures, charts, photos, maps, tables, etc., to
further illustrate the findings, if appropriate.
4. In the text, a systematic description of your results, highlighting for the reader
observations that are most relevant to the topic under investigation [remember
that not all results that emerge from the methodology that you used to gather the
data may be relevant].
5. Use of the past tense when referring to your results, with everything presented
in logical order.

Using Non-textual Elements

• Either place figures, tables, charts, etc. within the text of the results, or include
them in the back of the report--do one or the other but never do both.
• In the text, refer to each non-textual element in numbered order [e.g., table 1,
table 2; chart 1, chart 2].
• If you place non-textual elements at the end of the report, make sure they are
clearly distinguished from any attached appendix materials, such as raw data.
• Regardless of placement, each non-textual element must be numbered
consecutively and complete with a caption [caption goes under the figure, table, chart,
etc.].
• Each non-textual element must be titled, numbered consecutively, and complete
with a heading [title with description goes above the figure, table, chart, etc.].
• In proofreading your results section, be sure that each non-textual element is
sufficiently complete so that it could stand on its own, separate from the text.
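
The rule about referring to non-textual elements in numbered order can be checked mechanically while proofreading. The following is a minimal sketch in Python, assuming that references in your text look like "Table 1" or "Figure 2"; the pattern, the function name and the sample sentence are hypothetical and should be adapted to the labels your thesis actually uses.

```python
import re

# Assumed label style: the words "table", "figure" or "chart" followed by a number.
REFERENCE = re.compile(r"\b(table|figure|chart)\s+(\d+)", re.IGNORECASE)


def out_of_order_references(text):
    """Return (kind, number, expected) for each first mention that skips ahead."""
    next_expected = {}
    problems = []
    for match in REFERENCE.finditer(text):
        kind = match.group(1).lower()
        number = int(match.group(2))
        expected = next_expected.get(kind, 1)
        if number < expected:
            continue  # a repeat mention of an element already introduced
        if number > expected:
            problems.append((kind, number, expected))
        next_expected[kind] = number + 1
    return problems


# Made-up results text for illustration only.
draft = "Table 1 lists the responses. Figure 2 shows the trend. Table 2 repeats Table 1."
print(out_of_order_references(draft))
```

Here the draft mentions Figure 2 before any Figure 1, so that reference is reported; tables, figures and charts are tracked separately.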

Avoid the following:

1. Discussing or interpreting your results. Save all this for the next section of the
thesis.

2. Reporting background information or attempting to explain your findings; this
should have been done in your Introduction section, but don't panic! Often the
results of a study point to the need to provide additional background information
or to explain the topic further, so don't think you did something wrong. Revise
your introduction as needed.

3. Ignoring negative results. If some of your results fail to support your hypothesis, do
not ignore them. Document them, and then state in your discussion section why
you believe they emerged from your study. Note that negative results, and how
you handle them, often provide you with the makings of a great discussion
section, so don't be afraid to highlight them.

4. Including raw data or intermediate calculations. Ask your supervisor if you
need to include any raw data generated by your study, such as transcripts from
interviews or data files. If raw data is to be included, place it in an appendix or set
of appendices.

5. Presenting the same data or repeating the same information more than once. If you feel
the need to highlight something, you will have a chance to do that in the
discussion section.

6. Confusing figures with tables. Be sure to properly label any non-textual elements in
your paper. If you are not sure, look up the term in a dictionary.

“A new idea must not be judged by its immediate results.”
<< Nikola Tesla


Chapter XXVIII

Defending a Thesis

DEFENDING A THESIS
presentation with flair

Introduction

The thesis defense or viva voce is like an oral examination in some ways. It is different in
many ways, however. The chief difference is that the candidate usually knows more
about the syllabus than do the examiners.

Nowadays, PhD defenses are generally public, with academicians, research scholars,
and other interested people invited to attend. The trick of the trade in giving a great
presentation is to be prepared, know your subject, and practise your presentation until
you feel completely natural standing up in front of an audience. Perhaps your first
presentation will be in an informal setting with other members of your lab during a
weekly or monthly group meeting. Or you may be asked to give a talk to the entire
department.

Be ready

You're nearly ready for the final act. After several years of research and the hard work of
writing up your results, you have submitted your magnum opus to your viva committee
and now face the final step. Ready or not, it's time to put yourself and your work under
critical examination in the public spotlight.

A variety of formal procedures and regulations, which vary by institution, dictate how and
where your thesis defense is conducted. Usually, a scholar is interrogated endlessly by a
committee of experts, and there is a small but finite chance the candidate will fail. You
want to perform well and bring your Ph.D. studies to the best possible conclusion. To
ensure a successful thesis defense, you need to do three things: prepare, prepare, and
prepare.

Most doctoral programs provide opportunities to attend the defense meetings,
also known as public viva voce, of other doctoral candidates. Make it a point to attend a
few of these meetings. Chances are that you'll be pleasantly surprised by a tone that is
much more supportive and respectful than what you had imagined.

The defense consists of four stages:

• A brief self-introduction giving your academic and/or industrial background.
• A presentation, usually with PPT slides, covering the major aspects, viz., research
questions, importance of the study, findings and conclusion, usefulness to the
public, etc.
• A critical question-answer session.
• Private discussion of the presentation by the committee members.

Full Coverage
Your presentation (and thesis) needs to address the following:

• What is the problem you are studying?
• Why is it important?
• What results have you achieved?

Time Management

Generally, the whole defense will not take more than two hours, and it should take
considerably less time. Part of the challenge of a defense is to convince the committee
that you can summarize the important points of your work in a very limited time.

Keep to your allotted time. If you've been given 20 minutes for your talk, then talk for 20
minutes. Fifteen minutes is even better so that you can allow some time at the end of
your presentation for questions and/or discussion. For many people, the question-and-
answer session is the most nerve-racking part of the presentation. After all, you have no
control over the questions asked, so you can't really prepare the answers. Or can you? A
good exercise is to try to anticipate the questions you may be asked and prepare the
answers in advance.

Meticulous Preparation

In the week or two before your thesis defense, read your thesis all the way through
with a critical eye and a highlighter in hand to refresh your memory about experimental
details, protocols, results, and your conclusions. Years have passed since you did some
of that work, so it's important to remind yourself of the fine points.

As you read, put yourself in the role of an examiner. What would you ask the author of
this thesis? Where are the trouble spots, the unresolved issues, the shaky conclusions?
If you can predict some of the questions and prepare the answers, you will be in much
better shape during the defense itself. Even if you don't get those questions, the exercise
will give you confidence and reacquaint you with the fundamentals.

Judge the Committee!

It would be a mistake, however, to underestimate the examiners' knowledge of your
subject. Moreover, in the formal setting of a thesis defense, you have one truly big
disadvantage: Your examiners can prepare questions beforehand, but you have to reply
to them on the spot. Some examiners are very good at finding awkward or controversial
issues, and they will certainly question you about those aspects of your work.

As you stand in the spotlight, you may even realize, to your discomfort, that it's been
quite some time since you thought about those thorny issues. Bear in mind, too, that some
questions will be sincere: the examiner asks because he/she doesn't know
and expects that the candidate will be able to rectify this. Scholars often expect
questions to be difficult and attacking, and answer them accordingly. Often the questions
will be much simpler than they expect.

Think of your defense as a high-level professional conversation about a topic of interest.
No doctoral dissertation committee worthy of the name assembles for the sole purpose
of publicly humiliating a candidate; faculty members are in the business of supporting
successful program completion whenever possible. Have some confidence in yourself!

Discuss your planning with your supervisor

This is a crucial stage of preparation. This meeting might feel
more like a negotiation than a discussion. At one end of the table is you, the hard-
working Ph.D. scholar who wants to wrap up the work in a reasonable amount of time. At
the other end of the table is your supervisor, possibly hungry for more research results
that can be used in a future presentation or publication. Hopefully, some disagreements
about how much work you need to do can be curtailed by giving your supervisor your
proposed table of contents and your countdown list in advance of the meeting. These
documents will show your progress and demonstrate how much you already have
accomplished.

If there is a major disagreement, try not to get angry. Instead, summarize the issues you
don't agree about and ask for some time to reflect on your supervisor's point of view.
Sometimes these planning discussions can't be finished in a single meeting and may
extend to two or three meetings; it's worth doing properly. You will save yourself quite
a bit of thesis-preparation stress if you can structure a countdown plan on which all
parties can agree.

Rehearsal of Presentation

You've structured your talk and made your slides. Now for the fun part: It's time to
rehearse your presentation out loud. First do a self-presentation (this will feel funny at
first, but it is very effective for putting yourself at ease and for getting used to the sound
of your voice in a quiet room). Then practise your talk in front of a few fellow students or
other trusted colleagues. Use these practice sessions to rehearse the pacing of your talk
and to master the effective use of visual aids. Ask your colleagues for their comments
and honest assessment of your performance at the end of the presentation. Productive
criticism from friends is useful for making improvements, and it's better to hear it from
them than from your examiners.

Be ready for a 'free kick'. It is relatively common that a panel will ask one (or more)
questions that, whatever the actual wording may be, are essentially an invitation to you
to tell them (briefly) what is important, new and good in your thesis. This should be
your comfort zone, so rehearse for it. You should be able to produce on
demand (say) a one-minute speech and a five-minute speech, and be prepared to
extend them if invited by further questions. Do not try to recite your abstract: written and
spoken styles should be rather different. Rather, rehearse answers to the questions:

"What is your thesis about, what are the major contributions and what have you done
that merits a PhD?"

Remember, the German philosopher Goethe's advice of "Do not hurry; do not wait" is
doubly applicable.

Anticipate the Questions

Some questions may be just too difficult to answer right away, or you may be caught
off-guard. You may be tempted to bluff your way through, but a better solution is to admit
that you don't know and then discuss the issues raised by the query intelligently. Examiners
will recognize the distinction between a candidate who prevaricates and one who makes
a real attempt to address the question, even if there's no complete answer.

Don't be surprised if some examiner asks you to get more specific with a question such
as, "If you had this study to do over again, what would you do differently?" or "Is this a
line of research you care to pursue beyond the dissertation and, if so, how?"

Question-Answer Session

Read the following and understand the intricacies involved in each:

1. Listen to the question carefully. Too often, Ph.D. scholars stop listening
halfway through because they believe they know what the question is about, or
they are so nervous they start preparing the answer in their heads while the
question is still being asked. But sometimes the real question comes only at the
very end of a long exposé (in which the examiner may be trying to show off), and
it may not be the question you anticipate. So listen
attentively the whole time the examiner is speaking. To help you maintain your
concentration, you might want to take simple notes or jot down key words to
remind you what was said. Just don't let the note-taking distract you from careful
listening.

2. Begin your answer by rephrasing the question succinctly and politely:
"Professor Mishra, or Dr Gopal (or whatever address the formalities require),
your question on the research described in Chapter 2 addresses a problematic
issue from an interesting perspective. If I understand your query correctly, you
wonder why ..." This rephrasing establishes whether you have understood the
question properly, and it gives you a moment to collect your thoughts and
prepare the best possible answer.

3. Finally, answer the question. This might seem obvious; but too often the
candidate will make no serious attempt to answer the question properly,
launching instead into a related or unrelated tangent or long-winded explication
that--it is hoped--seems like an answer, but isn't.

Tips for a perfect presentation

Public speaking is an art. Some people are great at it, others less so. The good news is
that many of the necessary skills can be learned. Everyone loves to listen to a great
speaker, so aim to be the kind of speaker whose talks you have enjoyed. During your
presentation, pay close attention to your voice, facial expressions, and body language,
which are your most important assets:

• Be conscious of how you use your voice. Note that it's not just what you say that
counts; it's how you say it. Speak clearly and be audible to the last row. Don't
rush. Use a natural pace, but don't be conversational. Speaking in a monotone is
boring and will put people to sleep, so be sure to vary the speed and pitch of your
voice.

• Pause at key points to allow the audience to absorb your words.

• Look at the audience throughout your talk. You will create a rapport with the
audience by establishing eye contact with as many people as possible. At the
same time, be aware of your facial expressions. If you look bored, the audience
will be bored. If you are animated and alert, the audience will be interested in
what you have to say.

• Be receptive to the audience. Pay attention to the audience's body language and
nonverbal reactions to your remarks. Know when to stop and when to leave out
part of your presentation if you begin to sense that people in the audience are
losing their ability to pay attention.

Ensure that you avoid:

• blocking the screen with your body;
• gesturing excessively with your hands or fidgeting;
• mumbling and turning your back to the audience; and
• reading from your slides word for word.

Conclusion

After all is said and done, a thesis is an elaborate exercise. It is a demonstration that you
are capable of conceptualizing, conducting, and reporting research in a reasonably
independent way. Only a tiny fraction of the theses written are published, and even then,
they require extensive editing. Why? First of all, theses are written by novices. As a
result, they initiate your career as a scholar rather than define it.

Transforming a thesis into a publishable piece is a major overhaul because the
manuscripts written by students and the articles published in scholarly journals have a
different scope, purpose, audience, and style. The real contribution of most theses is
that they lead to conferral of the degree, open up new career options, help you to mature
as a scholar, and socialize you into the scholarly norms of your field.

"You can tell whether a man is clever by his answers.
You can tell whether a man is wise by his questions."
- Naguib Mahfouz

Chapter XXIX

READING A RESEARCH PAPER

Introduction

Reading research papers effectively is challenging. These papers are
very often written in a highly condensed style because of page limits and the
expected audience, who usually know the area well. Moreover, the reasons for writing
the paper may differ from the reasons the paper has been assigned, meaning you
have to work harder to find the content that you are interested in. Finally, your time is
limited, so you may not be able to read every word of the paper or read it several
times to extract all the nuances. For all these reasons, reading a research paper often
requires a special approach as well as skill.

To develop an effective reading style for research papers, it can help to know what you
want to get out of the paper, and where that information is located in the paper. Typically,
the introduction will state not only the motivations behind the work, but also outline the
solution. Often this may be all that a reader needs from the paper. The body of the paper
states the authors' solution to the problem in detail, and should also describe a detailed
evaluation of the solution in terms of arguments or an empirical evaluation, such as a case
study or experiment. Finally, the paper should conclude with a recapitulation, including a
discussion of the primary contributions. A paper should also discuss related work to
some degree. Papers are often repetitive because they present information at different
levels of detail and from different perspectives. As a result, it may be desirable to read
the paper out of order or to skip certain sections.

Parameters

Before learning how to read a paper, one needs to start with a few
preliminaries:

[A] How are papers organized?
[B] Do all papers conform to one single standardized format?
[C] How does one prepare to read a paper, particularly in an area not so familiar?
[D] What difficulties can one expect?

[A] How are papers organized?

In most scientific or other journals, papers follow a more or less standard format. They are
divided into several sections, and each section serves a specific purpose in the paper.
Let us first briefly describe the standard format.

A paper begins with a short Abstract or Summary. Generally, it gives a brief
background to the title of the paper; describes concisely the major findings of the paper;
and relates these findings to the field of study.

The next section of the paper is the Introduction. As its name implies, this section
presents the background knowledge necessary for the reader to understand why the
findings of the paper are an advance on the knowledge in the field. Typically, the
Introduction describes first the accepted state of knowledge in a specialized field; then it
focuses more specifically on a particular aspect, usually describing a finding or set of
findings that led directly to the work described in the paper. If the authors are testing a
hypothesis, the source of that hypothesis is spelled out, and findings are given. Papers
more descriptive or comparative in nature may begin with an introduction to an area
which interests the authors, or the need for a broader database.

The next section in most papers is the Materials and Methods or Research
Methodology. In some journals this section is the last one. Its purpose is to describe the
materials used in the experiments and the methods by which the experiments were
carried out. In principle, this description should be detailed enough to allow other
researchers to replicate the work. In practice, these descriptions are
often highly compressed, and they often refer back to previous papers by the authors.

The third section is usually Results. This section describes the experiments and the
reasons they were done. Generally, the logic of the Results section follows directly from
that of the Introduction. That is, the Introduction poses the questions addressed in the
early part of Results. Beyond this point, the organization of Results differs from one
paper to another. In some papers, the results are presented without extensive
discussion, which is reserved for the following section. This is appropriate when the data
in the early parts do not need to be interpreted extensively to understand why the later
experiments were done. In other papers, results are given, and then they are interpreted,
perhaps taken together with other findings not in the paper, so as to give the logical
basis for later experiments.

The fourth section is the Discussion. This section serves multiple purposes. First, the
data in the paper are interpreted; any limitations to the interpretations should be
acknowledged, and fact should clearly be separated from speculation. Second, the
findings of the paper are related to other findings in the field. This serves to show how
the findings contribute to knowledge, or correct the errors of previous work(s). As stated,
some of these logical arguments are often found in the Results when it is necessary to
clarify why later experiments were carried out.

Finally, papers usually have a short Acknowledgements section, in which various
contributions of other workers are recognized, followed by a Reference list
giving references to papers and other works cited in the text.

Papers also contain several Figures and Tables. These contain data described in the
paper. The figures and tables also have legends, whose purpose is to give details of the
particular experiment or experiments shown there.

[B] Do all papers conform to one single standardized format?

In most journals, the above format is followed. Occasionally, the Results and
Discussion are combined, in cases in which the data need extensive discussion to allow
the reader to follow the train of logic developed in the course of the research. In certain
older papers, the Summary was given at the end of the paper.

The formats for two widely-read scientific journals, Science and Nature, differ markedly
from the above outline. These journals reach a wide audience, and many
authors wish to publish in them; accordingly, the space limitations on the papers are
severe, and the prose is usually highly compressed. In both journals, there are no
discrete sections, except for a short abstract and a reference list. In Science, the
abstract is self-contained; in Nature, the abstract also serves as a brief introduction to
the paper. Experimental details are usually given either in endnotes (for Science) or in
Figure and Table legends and a short Methods section (in Nature). Authors often try to
circumvent length limitations by putting as much material as possible in these places. In
addition, a common practice is to put a substantial fraction of the less-important material,
and much of the methodology, into Supplemental Data that can be accessed online at
any later time.

In response to the pressure to keep the paper short and concise, most
authors choose to condense or, more typically, omit the logical connections that would
make the flow of the paper easy. In addition, much of the background that would make
the paper accessible to a wider audience is condensed or omitted, so that the less-
informed reader has to consult a review article or previous papers to make sense of
what the issues are and why they are important.

[C] How does one prepare to read a paper?

Though it is highly tempting to read the paper straight through as one would with
most text, it is more efficient to organize the way one reads. Generally, one first
reads the Abstract in order to understand the major points of the work. The extent of
background assumed by different authors, and allowed by the journal, also varies as just
indicated above.

One extremely useful habit in reading a paper is to read the Title and the
Abstract and, before going on, review in one’s mind what one knows about the topic.
This serves several useful purposes. First, it clarifies whether one in fact knows enough
background to appreciate the paper. If not, one might choose to read the background in
a review or textbook, as appropriate.

Second, it refreshes one’s memory about the topic. Third, and possibly most importantly,
it helps one integrate the new information into one’s previous knowledge
about the topic. That is, it is part of the self-education process that any
professional must continue throughout his/her career.

If one is very familiar with the field, the Introduction can be skimmed or skipped. As
stated above, the logical flow of most papers goes straight from the Introduction to
Results; accordingly, the paper should be read in that way as well, skipping Materials
and Methods or Research Methodology and referring back to this section as needed to
clarify what was actually done. A reader familiar with the field who is interested in a
particular point given in the Abstract often skips directly to the relevant section of the
Results, and from there to the Discussion for interpretation of the findings. This is easy
to do if the paper is properly organized.

Many papers contain code phrases such as ‘data not shown’, ‘unpublished data’, and
‘preliminary data’, which have connotations that are generally not explicit. In
many papers, not all the experimental data are shown, but referred to by "(data not
shown)". This is often for reasons of space; the practice is accepted when the authors
have documented their competence to do the experiments properly (usually in previous
papers). The other two phrases are "unpublished data" and "preliminary data". The
former can either mean that the data are not of publishable quality or that the work is
part of a larger story that will one day be published. The latter means different things to
different people, but one connotation is that the experiment was done only once.

[D] What difficulties can one expect?

Several difficulties confront the reader, particularly one who is not familiar with the field.
As discussed above, it may be necessary to bring oneself up to speed before beginning a
paper, no matter how well written it is. Although some problems may lie in the reader,
many are the fault of the writer.

One major problem is that many papers are poorly written. Some scientists
are poor writers. Many others do not enjoy writing, and do not take the time or effort to
ensure that the prose is clear and logical. Also, the author is typically so familiar with the
material that it is difficult to step back and see it from the point of view of a reader not
familiar with the topic and for whom the paper is just another of a large stack of papers
that need to be read.

Bad writing has several consequences for the reader. First, the logical connections are
often left out. Instead of saying why an experiment was done, or what ideas were being
tested, the experiment is simply described. Second, papers are often cluttered with a
great deal of jargon. Third, the authors often do not provide a clear road-map through
the paper; side issues and fine points are given equal air time with the main logical
thread, and the reader loses this thread. In better writing, these side issues are relegated
to Figure legends, Materials and Methods, or online Supplemental Material, or else
clearly identified as side issues, so as not to distract the reader.

Another major difficulty arises when the reader seeks to understand just what the
experiment was. All too often, authors refer back to previous papers; these refer in turn
to previous papers in a long chain. Often that chain ends in a paper that describes
several methods, and it is unclear which was used. Or the chain ends in a journal with
severe space limitations, and the description is so compressed as to be unclear. More
often, the descriptions are simply not well-written, so that it is ambiguous what was
done.

Other typical difficulties arise when the authors are uncritical about their experiments; if
they firmly believe a particular model, they may not be open-minded about other
possibilities. These may not be tested experimentally, and may even go unmentioned in
the Discussion. Still another related problem is that many authors do not clearly
distinguish between fact and speculation, especially in the Discussion. This makes it
difficult for the reader to know how well-established the “facts” under discussion are.

One final problem arises from the sociology of science. Many authors are overambitious
and wish to publish in trendy journals. As a consequence, they overstate the
importance of their findings, or put a speculation into the title in a way that makes it
sound like a well-established finding. Another example of this approach is the "Assertive
Sentence Title", which presents a major conclusion of the paper as a declarative
sentence. In recent times, this trend has become prevalent. It's not so bad when the
assertive sentence is well-documented; but quite often such an assertive sentence is
nothing more than a speculation, and the hasty reader may well conclude that the issue
is settled when it isn't. This practice should be avoided as far as possible.

These last factors represent the public relations side of a competitive field. This behavior
is understandable, if not praiseworthy. But when the authors mislead the reader as to
what is firmly established and what is speculation, it is hard, especially for the novice, to
know what is settled and what is not.

Conclusion

The aim of reading a research paper varies depending upon the needs of the
reader. But it has to be borne in mind, whatever the type of reader, that one has to
be familiar with the standard format of a research paper and has to keep the right
perspective while reading it. At the end of the day, the author serves the community
provided the reader gets what he/she wants.

“If one cannot enjoy reading a book over and over again, there is no use in reading it at all.”
- Oscar Wilde

Chapter XXX

EVALUATING A RESEARCH PAPER

Introduction

Good research reflects a sincere desire to determine what is overall true, based on all
available information; as opposed to bad research that starts with a conclusion and
identifies supporting factoids (individual facts taken out of context). A good research
document empowers readers to reach their own conclusions by including:

• A well-defined question.
• Description of the context and existing information about an issue.
• Consideration of various perspectives.
• Presentation of evidence, with data and analysis in a format that can be replicated by others.
• Discussion of critical assumptions, contrary findings, and alternative interpretations.
• Cautious conclusions and discussion of their implications.
• Adequate references, including original sources, alternative perspectives, and criticism.

Criteria for Evaluation

A thorough understanding and evaluation of a paper involves answering several
questions:

a. What questions does the paper address?
b. What are the main conclusions of the paper?
c. What evidence supports those conclusions?
d. Do the data actually support the conclusions?
e. What is the quality of the evidence?
f. Why are the conclusions important?

[A] What questions does the paper address?

Before addressing this question, one needs to be aware that research in Consumer
Behaviour and Customer Relationship Management can be of several different types:

Type of research    Question asked
Descriptive         What is there? What do we see?
Comparative         How does it compare to other organisms? Are our findings general?
Analytical          How does it work? What is the mechanism?

Descriptive research often takes place in the early stages of our understanding of a
system. We can't formulate hypotheses about how a system works, or what its
interconnections are, until we know what is there. A typical descriptive approach in
Consumer Behaviour is to describe the behavioural pattern of a consumer and the reasoning
behind it. In Customer Relationship Management, one could regard sustaining and managing
the relationship with a customer as a descriptive endeavor.

Comparative research often takes place when we are asking how general a finding is.
Is it specific to one particular country, or is it broadly applicable? A typical comparative
approach would be comparing the behavioural pattern of a consumer in respect of a
particular product from one country with that from the other countries in which that
product is found. One example of this is the observation that the responses of a
European consumer and an African consumer to a particular product are similar in
some respects and different in others.

Analytical research generally takes place when we know enough to begin formulating
hypotheses about how a system works, about how the parts are interconnected, and
what the causal connections are. A typical analytical approach would be to devise two
(or more) alternative hypotheses about how a system operates. These hypotheses
would all be consistent with current knowledge about the system. Ideally, the approach
would devise a set of experiments to distinguish among these hypotheses.

Being aware that not all papers have the same approach can orient a reader towards
recognizing the major questions that a paper addresses.

What are these questions?

In a well-written paper, as described above, the Introduction generally goes from the
general to the specific, eventually framing a question or set of questions. This is a good
place to start. In addition, the results of experiments usually raise additional
questions, which the authors may attempt to answer. These questions usually become
evident only in the Results section.

[B] What are the main conclusions of the paper?

This question can often be answered in a preliminary way by studying the abstract of the
paper. Here the authors highlight what they think are the key points. This is not
enough, because abstracts often have severe space constraints, but it can serve as a
starting point. Still, one needs to read the full paper with this question in mind.

[C] What evidence supports those conclusions?

Generally, one can get a reasonably good idea about this from the Results section. The
description of the findings points to the relevant tables and figures. This is easiest when
there is one primary experiment to support a point. However, it is often the case that
several different experiments or approaches combine to support a particular conclusion.
For example, the first experiment might have several possible interpretations, and the
later ones are designed to distinguish among these.

In the ideal case, the Discussion begins with a section of the form "Three lines of
evidence provide support for the conclusion that... First, ...Second,... etc." However,
difficulties can arise when the paper is poorly written. The author(s) often do not present
a concise summary of this type, leaving the reader to make it himself/herself. It is always
possible to argue that in such cases the logical structure of the argument is weak and is
deliberately omitted. In any case, one needs to be sure that one does understand the
relationship between the data and the conclusions.

[D] Do the data actually support the conclusions?

One major advantage of doing this is that it helps the reader to evaluate whether the
conclusions are sound. If it is assumed for the moment that the data are believable, it
still might be the case that the data do not actually support the conclusion the authors
wish to reach. There are at least two different ways this can happen:

a) The logical connection between the data and the interpretation is not sound.

b) There may be other interpretations that are consistent with the data.

One important aspect to look for is whether the authors take multiple approaches to
answering a question. Do they have multiple lines of evidence, from different directions,
supporting their conclusions? If there is only one line of evidence, it is more likely that it
could be interpreted in a different way; multiple approaches make the argument more
persuasive.

Another thing to look for is implicit or hidden assumptions used by the authors in
interpreting their data. This can be hard to do, unless you understand the field
thoroughly. Often only an expert in the particular field is able to judge.

[E] What is the quality of that evidence?

This is the hardest nut to crack, for novices and experts alike. At the same time, it is
one of the most important skills to learn as a young
research scholar. It involves a major reorientation from being a relatively passive
consumer of information and ideas to an active producer and critical evaluator of them.
This is not easy and takes years to master. Beginning scientists often wonder, "Who am
I to question these authorities? After all, the paper was published in a top journal, so the
authors must have a high standing, and the work must have received a critical review by
experts." Unfortunately, that's not always the case. In any case, developing one’s ability
to evaluate evidence is one of the hardest and most important aspects of learning.

Here are some steps by which one can evaluate the evidence:

First Step

One has to understand thoroughly the methods used in the experiments. Often these
are described poorly or not at all. The details are often missing, but more importantly the
authors usually assume that the reader has a general knowledge of common methods in
the field. If one lacks this knowledge, one has to make the extra effort to inform
oneself about the basic methodology before one can evaluate the data. Sometimes one
has to trace back the details of the methods if they are important. The increasing
availability of journals on the Web has made this easier by obviating the need to find a
hard-copy issue, e.g. in the library.

Second Step

One has to have a reasonable knowledge of the limitations of the methodology.
Every method has limitations, and if the experiments are not done correctly they can't be
interpreted.

Third Step

Here one has to distinguish between what the data show and what the authors say they
show. The latter is really an interpretation on the authors' part, though it is generally not
stated to be an interpretation. Papers usually state something like "the data in Fig. x
show that ...". This is the authors' interpretation of the data. One need not interpret it the
same way. One has to look carefully at the data to ensure that they really do show what
the authors say they do. One can only do this effectively if one understands the methods
and their limitations.

Fourth Step

It is always helpful to look at the original journal, or its electronic counterpart, instead of a
photocopy. Particularly for half-tone figures, the contrast is distorted, usually increased,
by photocopying, so that the data are misrepresented.

Fifth Step

One should ask whether the proper controls are present. Controls tell the reader
that nature is behaving the way we expect it to under the conditions of the experiment. If
the controls are missing, it is harder to be confident that the results really show what is
happening in the experiment. One should try to develop the habit of asking "where are
the controls?" and looking for them.

[F] Why are the conclusions important?

Do the conclusions make a significant advance in our knowledge? Do they lead to new
insights, or even new research directions? Again, answering these questions requires
a thorough understanding of the field and the critical ability to analyze the intricate issues
presented in the paper.

Some Guidelines

Research quality is an epistemological issue (related to the study of knowledge). It is
important to librarians (who manage information resources), scientists and analysts (who
create reliable information), decision-makers (who apply information), jurists (who judge
people on evidence) and journalists (who disseminate information to a broad audience).

These fields have professional guidance to help maintain quality research. This has
become increasingly important as the Internet makes unfiltered information more easily
available to a general audience. Guidelines for good research are provided below:

I Some of the Desirable Practices recommended


An ideal research paper:-
1. Attempts to fairly present all perspectives.
2. Provides context information suitable for the intended audience. This can be done with a
literature review that summarizes current knowledge, or by referencing relevant
documents or websites that offer a comprehensive and balanced overview.
3. Carefully defines research questions and their links to broader issues.
4. Provides data and analysis in a format that can be accessed and replicated by others.
Quantitative data should be presented in tables and graphs, and available in database or
spreadsheet form on request.
5. Discusses critical assumptions made in the analysis, such as why a particular data set or
analysis method is used or rejected. Indicates how results change with different data and
analysis. Identifies contrary findings.
6. Presents results in ways that highlight critical findings. Graphs and examples are
particularly helpful for this.
7. Discusses the logical links between research results, conclusions and implications.
Discusses alternative interpretations, including those with which the researcher disagrees.
8. Describes analysis limitations and cautions. Does not exaggerate implications.
9. Is respectful to people with other perspectives.
10. Provides adequate references.
11. Indicates funding sources, particularly any that may benefit from research results.

II Some of the Undesirable Practices to be avoided

1. Issues are defined in ideological terms. “Straw men” reflecting exaggerated or extreme
perspectives are used to characterize a debate.
2. Research questions are designed to reach a particular conclusion.
3. Alternative perspectives or contrary findings are ignored or suppressed.
4. Data and analysis methods are biased.
5. Conclusions are based on faulty logic.
6. Limitations of analysis are ignored and the implications of results are exaggerated.
7. Key data and analysis details are unavailable for review by others.
8. Researchers are unqualified and unfamiliar with specialized issues.
9. People with differing perspectives are insulted and ridiculed.
10. Citations are primarily from special interest groups or popular media, rather than from
peer reviewed professional and academic organizations.

Conclusion
While research papers contribute to the community in general, a well-judged and
well-balanced evaluation ensures the quality of a paper and enhances its value and
utility.

“The greatest sin is judgment without knowledge”

“One test of the correctness of educational procedure is the happiness of the child.”
<< Maria Montessori

A2Z

PhD
Thesis
Reflections on Academic Research

Chapter XXXI

Journal Impact Factor

JOURNAL IMPACT FACTORS

Introduction

It has become mandatory for academic research scholars to publish a minimum number
of research papers in reputed, peer-reviewed national or international journals having a
reasonable Impact Factor [IF], also known as the Journal Impact Factor [JIF]. The Journal
Impact Factor comes from the Journal Citation Reports (JCR), a product of Thomson ISI
(Institute for Scientific Information). JCR provides quantitative tools for evaluating
journals. The impact factor is one of these; it is a measure of the frequency with which
the "average article" in a journal has been cited in a given period of time.

Historical Background of Citation Indexing

The concept behind citation indexing is fundamentally simple. If the value of
information is determined by those who use it, what better way to measure the
quality of a work than by measuring the impact it makes on the community at large?
The widest possible population within the scholarly community (i.e. anyone who uses or
cites the source material) determines the influence or impact of the idea and its
originator on our body of knowledge. Because of its simplicity, one tends to forget that
citation indexing is actually a fairly recent form of information management and retrieval.

There were three factors that led to the development of citation indexing back in the
1950s. With the huge influx of government dollars into research and development
following World War II, the research community naturally began to publicly document its
findings through the accepted channel of published scientific journal literature. The
subsequent burgeoning of the literature created a need for a method of indexing and
retrieval that would be more cost effective and efficient than the then-current model of
human indexing of materials for subject-specific indices. While the subtle judgements
made by subject specialists were valuable in giving depth to a subject index, manual
indexing was both time-consuming and labor-intensive. Its costs
increased in proportion to the growth of material to be indexed. So the need for a better
way of managing information was the first factor.

The second factor was the growing dissatisfaction with the capacity of subject indexing
to meet the needs of the active researcher. At that time, subject indexes could
suffer excessive lag times in adding new materials; months could
pass before researchers in one field would learn of published findings in some other field
that had relevance to their own study. Furthermore, there were limitations to the subject
indexing in terms of retrieval. Terminology appropriate to one specific discipline would
not necessarily have meaning to researchers in another, perhaps overlapping, discipline.
At the same time, scientists were recognizing that they had to be aware of, if not
completely familiar with, work in a number of different subject disciplines in order to be
confident that they had properly grounded the research through an appropriate review of
the literature.

Along with this need was the hope that automation might hold the answers, the third and
final factor in the development of citation indexing. Computerization in the 1950s was far
removed from the desktop environment of today, but there was tremendous excitement
over potential benefits to be derived from the application of machines to the generation
and compilation of data. The U.S. government hoped that automation could mitigate or
even eliminate completely the difficulties of manual indexing. A number of projects were
launched by the United States with the intention of investigating these possibilities.

Dr. Eugene Garfield, founder and now Chairman Emeritus of ISI® (now Thomson
Reuters), was deeply involved in the research relating to machine generated indexes in
the mid-1950s and early 1960s. One of his earliest points of involvement was a project
sponsored by the Armed Forces Medical Library (predecessor to the current National
Library of Medicine). The Welch Medical Library Indexing project, as it was called, was
to investigate the role of automation in the organization and retrieval of medical
literature. The hope was that the problems associated with subjective human judgement
in selection of descriptors and indexing terms could be eliminated. By removing the
human element, one might thereby increase the speed with which information was
incorporated into the indexes. It might also increase the cost-effectiveness of the
indexes. Garfield grasped early on that review articles in the journal literature were
heavily reliant on the bibliographic citations that referred the reader to the original
published source for the notable idea or concept. By capturing those citations, Garfield
believed, the researcher could immediately get a view of the approach taken by another
scientist to support an idea or methodology based on the sources that the published
writer had consulted and cited as pertinent in the bibliography. As retrieval terms,

citations could function as well as keywords and descriptors that were thoughtfully
assigned by a professional indexer.

In the early 1960s, Eugene Garfield and Associates developed two pilot projects that
would test the viability and efficiency of citation indexing. The first project involved the
creation of a database that would index the citations of 5,000 chemical patents held by
two private pharmaceutical companies. The referenced citations in this instance were to
prior patents, the documentation sources that the government patent examiners were
using to support a decision to grant or deny a patent. The connections that the patent
citation index made were then analyzed with two comparable classifications and
indexing systems that were currently being used by the participants. Based on this
investigation and analysis, the project sponsors determined that citation indexing
permitted the retrieval of relevant literature across arbitrary classifications in a way that
subject-oriented indexing could not.

A second pilot project in 1962 involved Garfield's recently incorporated enterprise, the
Institute for Scientific Information (now Thomson Reuters), with the United States
National Institutes of Health in building an index to the published literature on genetics.
This project was far more complex in nature than the patents index. Three databases
were built to cover the literature over 1 year, 5 years and 14 years with a varying number
of source publications indexed in each. While this project was to test the feasibility and
utility of a narrow, discipline-oriented citation index, at completion, it was concluded that
the database with the most broadly based set of source publications formed the most
comprehensive and useful guide to the published literature in the field of genetics. The
database for the single-year term had drawn not just on journals that were primarily
devoted to the field of genetics research but had drawn as well from a large pool of
journals that published genetics papers on a more peripheral or occasional basis.
Additionally, while the automated system required a certain level of effort in
standardizing the entries from a wide variety of published materials, the project
demonstrated the cost-effectiveness of citation indexing as opposed to the expense of
traditional subject indexing processes.

While, at the time of the project's completion, the government sponsors chose not to
subsidize the development of a national citation database, Eugene Garfield was
encouraged to move ahead with the private publication of his multidisciplinary citation

index as the first edition of the Science Citation Index® (SCI®). Available for purchase
since 1963, the SCI then and now represents the most comprehensive citation index to
the scientific journal literature. Today, the Web-based version of that index covers 5,600
journals across more than 150 scientific disciplines.

Garfield's achievement lay in establishing the utility and objectivity of a citation index in
pulling up related papers in published literature that at first glance might not have
seemed pertinent to the researcher's inquiry. Today, it is considered to be one of the
most reliable of resources in tracing the development of an idea across the multitude of
disciplines that are part of our body of scientific knowledge.

What is Impact Factor [IF]?

The JCR provides quantitative tools for ranking, evaluating, categorizing, and comparing
journals. The impact factor is one of these; it is a measure of the frequency with which
the "average article" in a journal has been cited in a particular year or period. The
annual impact factor is a ratio between citations and recent citable items published.
Thus, the impact factor of a journal is calculated by dividing the number of current-year
citations to the source items published in that journal during the previous two years by
the number of those source items. The journal Impact Factor is the average number of
times articles from the journal published in the past two years have been cited in the
JCR year. The Impact Factor is calculated by dividing the number of citations in the JCR
year by the total number of articles published in the two previous years. An Impact Factor
of 1.0 means that, on average, the articles published one or two years ago have been cited
one time. An Impact Factor of 2.5 means that, on average, the articles published one or
two years ago have been cited two and a half times. Citing articles may be from the same
journal; most citing articles
are from different journals. The impact factor is useful in clarifying the significance of
absolute (or total) citation frequencies. It eliminates some of the bias of such counts
which favor large journals over small ones, or frequently issued journals over less
frequently issued ones, and of older journals over newer ones. Particularly in the latter
case such journals have a larger citable body of literature than smaller or younger
journals. All things being equal, the larger the number of previously published articles,
the more often a journal will be cited.

Example [A]:

Impact Factor [IF] is calculated as follows:

No. of times articles published in 2010 & 2011 were cited in indexed Journals during the
year 2012 = A

No. of articles, reviews, or notes published in 2010 & 2011 = B

Impact Factor for the year 2012 = A/B


If A = 150, B = 125, then IF [A/B] = 150 / 125 = 1.2

Example [B]:

Impact Factor [IF] for 5 years is calculated as follows:

No. of times articles published in 2007, 2008, 2009, 2010 & 2011 [ie, last 5 years] were
cited in indexed Journals during the year 2012 = A

No. of articles, reviews, or notes published in these 5 years as above = B

Impact Factor for the year 2012 = A/B

If A = 750, B = 625, then IF [A/B] = 750 / 625 = 1.2

[Note: The 5-year Impact Factor is available only in JCR 2007 and subsequent years.]
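
The arithmetic in Examples [A] and [B] can be checked with a few lines of Python. This is only an illustrative sketch: the function name impact_factor and its argument names are assumptions made for this example and are not part of JCR.

```python
def impact_factor(citations: int, citable_items: int) -> float:
    """Citations received in the JCR year (A) divided by the number of
    articles, reviews, or notes published in the preceding window (B).
    The window is two years for the standard IF, five for the 5-year IF."""
    if citable_items <= 0:
        raise ValueError("citable_items must be a positive number")
    return citations / citable_items


# Example [A]: items published in 2010 & 2011, cited during 2012
if_two_year = impact_factor(citations=150, citable_items=125)

# Example [B]: items published in 2007-2011, cited during 2012
if_five_year = impact_factor(citations=750, citable_items=625)

print(round(if_two_year, 2), round(if_five_year, 2))  # 1.2 1.2
```

The same function serves both cases because only the publication window behind A and B changes, not the division itself.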

Aggregate Impact Factors

This is a fine-tuned type of Impact Factor: the aggregate Impact Factor is
meant for a subject category. It is calculated in the same way as the Impact Factor of a
Journal. But here, the number of citations to all journals in a particular category [for
example, market research] and the number of articles from all these journals in that
category are taken into account. An aggregate Impact Factor of 2.0 means that, on
average, the articles in a particular subject category [e.g. market research] published one
or two years ago have been cited twice.

Example [C]:

Aggregate Impact Factor [IF] is calculated as follows:

No. of times articles published in 2010 & 2011 were cited in indexed Journals in a
particular category [market research] during the year 2012 = A

No. of articles, reviews, or notes published in these Journals for 2 years as above = B

Aggregate Impact Factor for the year 2012 = A/B

If A = 2250, B = 1950, then Aggregate IF [A/B] = 2250 / 1950 = 1.15
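
Example [C] can be sketched in Python as a pooled ratio over all journals in the category. The per-journal figures below are hypothetical, chosen only so that they add up to the totals used in Example [C] (A = 2250, B = 1950); the journal names and the function name are likewise assumptions.

```python
# Hypothetical journals in one subject category:
# name -> (2012 citations to 2010 & 2011 items, items published in 2010 & 2011)
category = {
    "Journal X": (900, 800),
    "Journal Y": (750, 650),
    "Journal Z": (600, 500),
}


def aggregate_impact_factor(journals: dict) -> float:
    """Pool citations and items over every journal in the category,
    then divide once (A / B)."""
    total_citations = sum(cites for cites, _ in journals.values())
    total_items = sum(items for _, items in journals.values())
    return total_citations / total_items


aggregate_if = aggregate_impact_factor(category)
print(round(aggregate_if, 2))  # 1.15
```

Note that the pooled ratio is generally not the same as the simple average of the individual journals' Impact Factors, because larger journals carry more weight in the totals.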

Cited-only Journals in the JCR

Some of the journals listed in the JCR are not citing journals, but are cited-only journals.
This is significant when comparing journals by impact factor because the self-citations
from a cited-only journal are not included in its impact factor calculation. Self-citations
often represent about 13% of the citations that a journal receives. Users can identify
cited-only journals by checking the JCR Citing Journal Listing. Cited-only journals may
be ceased journals, suspended journals, or superseded titles. Any journal that appears
elsewhere in JCR, but not in the Citing Journal Listing, is a cited-only journal.
Furthermore, users can establish analogous impact factors (excluding self-citations) for
the journals they are evaluating, using the data given in the Citing Journal Listing.

Calculation of Impact Factor revised to exclude self-citations:

A = citations in 2012 to articles published in 2010 & 2011


B = 2012 self-citations to articles published in 2010 & 2011
C = A – B = Total citations excluding self-citations to recent articles
D = No. of articles published in 2010 & 2011
E = Revised Impact Factor = C / D
[Earlier Impact Factor = A / D]

Example [D]:
Calculation of Impact Factors without self-citations.

Name of the    JCR Impact   Cites in 2012     Self-cites in 2012   Cites minus    Articles published   Revised Impact
Journal        Factor       to 2010 & 2011    to 2010 & 2011       self-cites     in 2010 & 2011       Factor
               [A / D]      articles (A)      articles (B)         C = (A - B)    (D)                  E = C / D

Journal I      1.40         525               -                    525            375                  1.40
Journal II     1.48         775               55                   720            525                  1.37
Journal III    1.23         1075              95                   980            875                  1.12
Journal IV     1.15         2250              165                  2085           1950                 1.07
Journal V      1.10         675               85                   590            615                  0.96
Journal VI     1.20         1975              215                  1760           1650                 1.07

A comparison of the JCR Impact Factors and the Revised Impact Factors of these six sample
journals highlights the difference that self-citations make and shows how the revised IF
fine-tunes the measure. The revised values alone should be considered when
self-citations are to be excluded.
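
The figures in Example [D] can be recomputed with a short Python sketch. The values of A, B and D are taken from the table; treating the "-" entry for Journal I as zero self-citations is an assumption, and the function name is chosen only for illustration.

```python
# (journal, A = cites in 2012, B = self-cites in 2012, D = articles 2010 & 2011)
ROWS = [
    ("Journal I", 525, 0, 375),   # "-" in the table is assumed to mean 0
    ("Journal II", 775, 55, 525),
    ("Journal III", 1075, 95, 875),
    ("Journal IV", 2250, 165, 1950),
    ("Journal V", 675, 85, 615),
    ("Journal VI", 1975, 215, 1650),
]


def revised_impact_factor(cites: int, self_cites: int, articles: int) -> float:
    """E = C / D, where C = A - B (citations excluding self-citations)."""
    return (cites - self_cites) / articles


for name, a, b, d in ROWS:
    jcr_if = a / d                               # earlier Impact Factor, A / D
    revised_if = revised_impact_factor(a, b, d)  # revised Impact Factor, C / D
    print(f"{name:<12} JCR IF = {jcr_if:.2f}   Revised IF = {revised_if:.2f}")
```

Running the sketch reproduces both Impact Factor columns of the table to two decimal places.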

Title Change

A user's knowledge of the content and history of the journal studied is very important for
appropriate interpretation of impact factors. Situations such as those mentioned above
and others such as title change are very important, and often misunderstood.

A title change affects the impact factor for two years after the change is made. The old
and new titles are not unified unless the titles are in the same position alphabetically. In
the first year after the title change, the impact factor is not available for the new title unless the
data for old and new can be unified. In the second year, the impact factor is split. The
new title may rank lower than expected and the old title may rank higher than expected
because only one year of source data is included in its calculation. Title changes for the
current year and the previous year are listed in the JCR guide.

Calculation of Unified 2012 Impact Factor for Title Change

P = 2012 citations to articles published in 2010 & 2011 [P1 + P2]
P1 = for New Title
P2 = for Superseded Title

Q = No. of articles published in 2010 & 2011 [Q1 + Q2]
Q1 = for New Title
Q2 = for Superseded Title

R = Unified Impact Factor = P / Q
R1 = P1 / Q1 = JCR Impact Factor for the New Title
R2 = P2 / Q2 = JCR Impact Factor for the Superseded Title
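
A minimal Python sketch of the unified calculation follows. The citation and article counts are hypothetical numbers invented for illustration; only the formulas R = P / Q, R1 = P1 / Q1 and R2 = P2 / Q2 come from the text above.

```python
def unified_impact_factor(p1: int, q1: int, p2: int, q2: int) -> float:
    """R = (P1 + P2) / (Q1 + Q2): pool citations and articles of the
    new title (1) and the superseded title (2) before dividing."""
    return (p1 + p2) / (q1 + q2)


# Hypothetical 2012 data for a journal that changed its title
p1, q1 = 60, 50     # new title: citations, articles
p2, q2 = 90, 100    # superseded title: citations, articles

r1 = p1 / q1        # JCR Impact Factor listed for the new title
r2 = p2 / q2        # JCR Impact Factor listed for the superseded title
r = unified_impact_factor(p1, q1, p2, q2)

print(round(r1, 2), round(r2, 2), round(r, 2))  # 1.2 0.9 1.0
```

The unified value lies between the two split values, since it is their average weighted by the number of articles under each title.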

Applications of Impact Factors

There have been many innovative applications of journal impact factors. The most
common involve market research for publishers and others. But, primarily, JCR provides
librarians, academic research scholars as well as professional researchers with a tool for
the management of library journal collections. In market research, the impact factor
provides quantitative evidence for editors and publishers for positioning their journals in
relation to the competition—especially others in the same subject category, in a vertical
rather than a horizontal or intradisciplinary comparison. JCR data may also serve
advertisers interested in evaluating the potential of a specific journal.

Perhaps the most important and recent use of the impact factor is in the process of academic
evaluation. The impact factor can be used to provide a gross approximation of the
prestige of journals in which individuals have been published. This is best done in
conjunction with other considerations such as peer review, productivity, and subject
specialty citation rates. As a tool for management of library journal collections, the
impact factor supplies the library administrator with information about journals already in
the collection and journals under consideration for acquisition. These data must also be
combined with cost and circulation data to make rational decisions about purchases of
journals.

The impact factor can be useful in all of these applications, provided the data are used
sensibly. It is important to note that subjective methods can be used in evaluating
journals as, for example, by interviews or questionnaires. In general, there is good
agreement on the relative value of journals in the appropriate categories. However,
the JCR makes possible the realization that many journals do not fit easily into
established categories. Often, the only differentiation possible between two or three
small journals of average impact is price or subjective judgments such as peer review.

Conclusion
Though the impact factor is found to be a very useful tool by the academic fraternity for
the evaluation of journals, it must be used cautiously and discreetly. Considerations
include the amount of review or other types of material published in a journal, variations
between disciplines, and item-by-item impact. For junior research scholars, however, the
Impact Factor of a journal may be very useful in their research study.

Following Sources are duly acknowledged:

http://www.sciencegateway.org/impact/
http://admin-apps.webofknowledge.com/JCR/help/h_impfact.htm
http://thomsonreuters.com/products_services/science/free/essays/impact_factor/

Faith is the first factor in a life devoted to service. Without it, nothing is possible.
With it, nothing is impossible. << Martin Luther

Chapter XXXII

Publishing a
Research Paper

PUBLISHING A RESEARCH PAPER

Identifying peer-reviewed journals

Peer-reviewed journals are an important medium for reporting research output from
universities. A definition of peer review is:

The process by which a learned journal passes a paper received for publication to outside
experts for their comments on its suitability and worth.

Publishing your work in a peer-reviewed journal is an indication of quality. Intending
researchers need to submit their articles for review by experts in the field before the
article can be approved for publication in a peer-reviewed journal. Many databases allow
you to restrict your search to peer-reviewed journals. Journals indexed in Web of
Knowledge are all peer-reviewed. To check if a journal is peer-reviewed you can search
for it in Ulrich's Periodicals Directory or refer to the ISI Master Journal List.

Journal Impact Factors

Looking at the impact factor of a journal is a further way of measuring its quality.
The journal impact factor is the average number of times that articles published in a
specific journal in the two previous years (e.g. 2010-2011) were cited in a particular year
(e.g. 2012). The calculation is determined as follows:

A = total cites in 2012
B = 2012 cites to articles published in 2010-2011 (this is a subset of A)
C = number of articles published in 2010-2011
D = B/C = 2012 impact factor

There are variations between disciplines. One should view journals in the context of their
specific field. Some disciplines work on a five-year impact factor.

Journal impact factors should not be used solely to evaluate journals. Other criteria
should also be considered, such as peer review and scope.

Obtaining Journal Impact Factors

Scholars can consult ISI Journal Citation Reports on the Web (JCR Web) to find
the Impact Factor for a single journal title or a range of titles in a subject category. JCR
Web draws citation data from over 7,000 scholarly journals worldwide in the sciences
and social sciences.

JCR Web also depicts the Impact Factors of a journal over the last five years in
the Impact Factor Trend Graph. Another feature is the Immediacy Index which measures
how quickly the average article from a journal is cited within the year of publication. This
number is useful for evaluating journals that publish cutting-edge research.

JCR Web is available via Web of Knowledge and is linked to Web of Science searches.

Selecting journals in which to publish

Once the scholar has identified likely journals, the aims and scope of each are to be
checked to determine whether the paper at hand is suitable for that journal.

• Does the journal focus on primary research or review articles?
• Does it publish qualitative or quantitative research methods?
• What are the principal fields covered by the journal?

If your work meets the aims and scope of the journal you have selected, submit your
manuscript in the appropriate format for the journal. This format is usually indicated
under "instructions to authors".

Instructions to authors

Instructions to authors (also called advice to authors or authors' guide) detail what is and
isn't acceptable to a particular publisher. Generally these guidelines include layout,
referencing style, how to submit, submitting tables and figures in text, the audience,
review process and publication.

You can find instructions to authors:

• in the print journal
• on the journal home page
• by writing to the editor or publisher of the journal. Email addresses and websites can be
found in Ulrich's Periodicals Directory.

Schematic View of Publishing a Paper

[Flowchart reproduced with permission from authors of UNISA website]

Some Useful Advice

Experienced journal editors and authors are willing to pass on their secrets of success.
Here is their best advice.

Have a focus and a vision


Angela M. Neal-Barnett, PhD, of Kent State University, author of the book "Bad
Nerves" (Simon & Schuster, 2003) as well as numerous papers in multiple journals,
believes that the key to successfully publishing an article is to "get a vision"--a reason
and purpose for writing. That concept isn't always familiar to academicians who often
write because they have to for tenure or promotion, she says. But, she advises, while
"academic wisdom [says] 'publish or perish,' ancient wisdom says 'without vision, the
people will perish.'" Once you have a vision, says Neal-Barnett, write it down and keep it
in constant view to remind you of your mission.

Write clearly
"There is no substitute for a good idea, for excellent research or for good, clean, clear
writing," says Nora S. Newcombe, PhD, of Temple University, former editor of
APA's Journal of Experimental Psychology: General. Newcombe endorses the advice of
Cornell University's Daryl J. Bem, PhD, who in Psychological Bulletin (Vol. 118, No. 2)
wrote that a review article should tell "a straightforward tale of a circumscribed question
in want of an answer. It is not a novel with subplots and flashbacks, but a short story with
a single, linear narrative line. Let this line stand out in bold relief." Newcombe also
admits that neatness counts. Though she tries not to get in a "bad mood" about grammar
mistakes or gross violations of APA style, she says, such mistakes do "give the
impression that you're not so careful."

Get a pre-review
Don't send the manuscript to an editor until you have it reviewed with a fresh eye, warns
Newcombe. Recruit two objective colleagues: one who is familiar with the research area,
another who knows little or nothing about it. The former can provide technical advice,
while the latter can determine whether your ideas are being communicated clearly.

Send your manuscript to the right journal


Many rejections are the result of manuscript-journal mismatch--a discrepancy between
the submitted paper and the journal's scope or mission. Nora S. Newcombe, PhD, of
Temple University, former editor of APA's Journal of Experimental Psychology: General,
advises authors to consider the "theoretical bent" of the papers that regularly appear in
the journal before they submit a paper to it. A major faux pas is submitting your
manuscript simply to get it reviewed, says Newcombe. She's heard authors say, "This is

a small experiment that I know would never get published in that journal, but I would like
to get some feedback." Not a good idea, Newcombe says, because it wastes editors'
and reviewers' time, and those who reject it from the journal may also be the ones who
have to review the paper when it's submitted to a different journal. "It's a small
community out there. Don't use up your reviewers," she says.

Importance of cover letter


Many authors don't realize the usefulness of cover letters, Newcombe says. In addition
to stating "here it is" and that the paper conforms to ethical standards, Newcombe says
the letter can contain the author's rationale for choosing the editor's journal--especially if
it's not immediately apparent. The letter can also suggest reviewers for your manuscript,
she says, especially in the case of a field that an editor isn't well-versed in. The flip side
is also acceptable: Authors can suggest that certain people not review the manuscript for
fear of potential bias. In both cases, authors can't expect the editor to follow the
recommendations, says Newcombe. In fact, the editor may not follow any of them or
may use all of them.

Be Calm
The overwhelming majority of journal manuscripts are rejected on first submission. "Remember,
to get a lot of publications, you also will need to get lots of rejections," says Edward
Diener, PhD, editor of APA's Journal of Personality and Social Psychology: Personality
Processes and Individual Differences. Only a small proportion--5 to 10 percent--are
accepted the first time they are submitted, and usually they are only accepted subject to
revision. Since most papers are rejected from the start, says Newcombe, the key is
whether the journal editors invite you to revise it.

Read the reviews carefully


In fact, anything aside from simply "reject," Neal-Barnett reminds, is a positive review.
These include:
• Accept: "Which almost nobody gets," she says.
• Accept with revision: "Just make some minor changes."
• Revise and resubmit: "They're still interested in you!"
• Reject and resubmit: Though not as good as revise and resubmit, "they still want the
paper!"

Some reviewers may recommend submitting your work to a different journal. "They're
not saying the article is hopeless," says Neal-Barnett, "they're just saying that it may not

be right for that journal." If revision isn't invited following the initial rejection, many new
authors may toss the manuscript and vow never to write again or to change programs.
Newcombe's advice, though, is to read the reviews carefully and determine why that
decision was made. If the research needs more studies or if the methodology needs to
be changed somehow, "if you have a sincere interest in the area, do these things," says
Newcombe. You can resubmit it as a new paper, noting the differences in the cover
letter. Also keep in mind that "quite often, unfortunately, a journal will reject an article
because it's novel or new for its time," says Newcombe. "If you feel that it is valid and
good, then by all means, send it off to another journal."

Do the revisions
If you are invited to revise, "Do it, do it fast and don't procrastinate," says Newcombe.
Also, she warns that because reviewers can at times ask for too much, authors should
take each suggestion into consideration, but decide themselves which to implement.

Be diplomatic
What if reviewers disagree? "There is a wrong and a right way" to address dissension
among reviewers, says Newcombe. She quotes from Daryl Bem's Psychological
Bulletin article:

Wrong: "I have left the section on the animal studies unchanged. If reviewers A and C
can't even agree on what the animals have developed, I must be doing something right."

Right: "You will recall that reviewer A thought the animal studies should be described
more fully whereas reviewer C thought they should be omitted. Other psychologists in
my department agree with reviewer C that the animals cannot be a valid analogue to the
human studies. So, I have dropped them from the text and have attached it as a footnote
on page six."

Ultimately, it's good to keep in mind that the road to being published isn't a lonely one:
"All authors get lots of rejections--including senior authors such as me," says Diener.
"The challenge," he says, "is to persevere, and improve one's papers over time."

Some Guidelines for Writing and Publishing a Research Paper

Planning the Manuscript

1. The research paper topic should be unique and there should be a logical reason to study it.

2. Do your homework. Make sure you know what investigators in your field and other fields have
published about your topic (or similar topics). There is no substitute for a good literature review
before jumping into a new project.

3. Take the time to plan your experimental design. As a general rule, more time should be devoted to
planning your study than to actually performing the experiments (though there are some exceptions,
such as time-course studies with lengthy time points). Rushing into the hands-on work without
properly designing the study is a common mistake made by young researchers.

4. When designing your experiment, choose your materials wisely. Look to the literature to see what
others have used. Similar products from different companies do not all work the same way. In fact,
some do not work at all.

5. Get help. If you are performing research techniques for the first time, be sure to consult an
experienced friend or colleague. Rookie mistakes are commonplace in academic research and lead
to wasted time and resources.

6. Know what you want to study, WHY you want to study it, and how your results will contribute to the
current pool of knowledge for the subject.

7. Be able to clearly state a hypothesis before starting your work. Focus your efforts on researching
this hypothesis. All too often people start a project and are taken adrift by new ideas that come along
the way. While ideas are good to note, be sure to keep your focus.

8. Along with keeping focus, know your experimental endpoints. Sometimes data collection goes
smoothly and you want to dig deeper and deeper into the subject. If you want to keep digging
deeper, do it with a follow-up study.

9. Keep in mind where you might like to publish your study. If you are aiming for a high-impact journal,
you may need to do extensive research and data collection. If your goal is to publish in a lower-tier
journal, your research plan may be very different.

10. If your study requires approval by a review board or ethics committee, be sure to get the
documentation as needed. Journals will often require that you provide such information.

11. If your study involves patients or patient samples, explicit permissions are generally required from
the participant or donor, respectively. Journals may ask for copies of the corresponding
documentation.

General

12. Read and follow ALL of the guidelines for manuscript preparation listed for an individual journal.
Most journals have very specific formatting and style guidelines for the text body, abstract, images,
tables, and references.

13. HYPOTHESIS: be sure to have one and state it clearly. This is, after all, why you are doing the
research.

14. Write as though your work is meaningful and important. If you don’t, people will not perceive it as
meaningful and important.

15. Use an external peer review service (available through JournalPrep.com) to get your manuscript
reviewed prior to submission. Rapid and expert peer reviews, before you submit, may significantly
increase your odds of getting your manuscript accepted for publication.

16. Critique your own work. Look for areas that reviewers might spot as weaknesses and either correct
these areas or comment on them in your manuscript, leaving reviewers with fewer options for
negative criticisms.

17. Always present the study as a finished piece of work (although you may suggest future directions).
Otherwise, you can be sure reviewers will suggest additional research.

18. Be painstaking. Be thorough and patient with several rounds of editing of your work while
considering all the tiny details of the specifications requested by the journal. It will pay off in the end.

19. Focus. If you have a hypothesis to develop, be consistent to the end. Have substantial and
convincing evidence to prove your theories. Brainstorm your ideas and have a definite direction
mapped out before beginning to write an article.

20. Write in a precise and accurate way. Avoid long sentences; the reader may find them difficult to
follow.

21. Team-like spirit is an important attribute that contributes to successful publishing. Welcome advice
from those around you with potentially valuable input. No matter how competent you feel, having
your work seen through a different lens may help to spot flaws that you were unable to identify.

22. As a final step, after completing your research paper, edit, edit, edit. You need to identify and correct
any and all mistakes that you may have made.

23. Short papers are more likely to be read than long ones.

24. Select a descriptive title. Flash and puns are rarely as appealing as they may seem at first. You are
better off going simple and descriptive. This will also help you get cited.

25. Focus on the information the readers require when following your experiment, modeling description,
or data analysis instead of overloading them with details that might have been important during the
study but are irrelevant for them.

26. Your paper should advance a particular line of research. It does not need to answer every remaining
question about the topic.

27. If you present your work at an academic conference prior to submitting it for publication, get
constructive criticisms from as many potential reviewers as possible.

28. Make sure your paper reads well. A bunch of choppy, simple sentences, while grammatically correct,
is unpleasant to read.

29. Clear, concise, and grammatically correct English. Period.

30. Non-native English speakers should ALWAYS try to arrange for a review by a native speaker. If you
know someone with excellent proofreading skills and a general knowledge about your research
discipline (e.g., biological sciences), ask them to help you out. If you don’t know someone who meets
these criteria, use a professional editing service such as that offered at JournalPrep.com. You will
save yourself from a great deal of frustration and lost time.

31. Show friends and colleagues your work, including those in different fields of research. Get as much
feedback as you can before you submit.

32. The body of the paper supports the central idea and must show a thoughtful, comprehensive study
of the research topic; it should be clearly written and easy to follow. It generally includes three main
parts: 1) Methodology, 2) Results & Data Analysis, and 3) Discussion.

33. When referencing other papers, do not simply reference work in the same way other papers have. If
paper X says that paper Y showed a specific result, check for yourself to ensure that this is true
before saying the same thing in your own manuscript. The number of reputable authors who
misunderstand their colleagues’ findings is shocking.

34. If you are in the process of running a follow-up experiment, write your manuscript in such a way that
it begs for that experiment. When reviewers respond and request it you will already have it
completed.

Introduction

35. Start your article with a comprehensive yet concise literature review of your exact subject and
highlight in which way your paper will make a new contribution to the field.

36. Throughout your introduction use the past tense. One exception to this is when you are speaking
about generally accepted facts and figures (e.g., heart disease is the leading cause of death…).

37. Avoid using new acronyms. They will simply confuse the readers.

38. The introduction of a research paper is extremely important. It generally presents a brief literature
review, the problem and the purpose of your research work. It should be powerful, simple, realistic,
and logical to entice the reader to read the full paper.

39. Avoid unnecessarily long paragraphs. Break up your paragraphs into smaller, useful units.

40. Do not be afraid to use headings in your introduction (and discussion).

Materials & Methods

41. Do not over-explain common scientific procedures. For example, you do not need to explain how
PCR or Western Blotting work, just that you used the techniques. If you are using a novel technique,
then you need to explain the steps involved.

42. Use the third-person passive voice. For example, “RNA was extracted from the cells.” Compare this
with, “We extracted RNA from the cells.”

43. Be sure to mention from which companies you purchased any significant reagents for your
experiments.

44. When in doubt about how to report your materials and methods, look to papers published in
recognized journals that use similar methods and/or materials.

45. Do not mention sources of typical labware (beakers, stripettes, pipet tips, cell culture flasks, etc.).

Results

46. Make sure your graphs and tables can speak for themselves. A lot of people skim over academic
papers.

47. The Results section should contain only results, no discussion.

48. Do not repeat in words everything that your tables and graphs convey. You can, however, point out
key findings and offer some text that complements the findings.

49. Be sure to number your figures and tables according to journal guidelines and refer to them in the
text in the manner specified by the journal.

50. Easy-to-read graphs are essential. Do not overload graphs with data. Make sure axis labels
are not too small.

Discussion

51. Your discussion section should answer WHY you obtained the observed results. Do not simply
restate the results. Also address WHY your results are important (i.e. how do they advance the
understanding of the topic).

52. If multiple explanations for your results exist, be sure to address each one. You can favor one
explanation but be sure to mention alternative explanations, if some exist. If you don’t, your
reviewers will.

53. If your research findings are suggestive or supportive rather than decisive then make sure to indicate
so. NEVER overstate the importance of your research findings. Rather, clearly point to their true
significance.

54. Understand the message of your paper. You may discover what the message is only after a
literature search, as is occasionally the case for some manuscript types such as case reports.

55. Highlight how your research contributes to the current knowledge in the field and mention the next
steps or what remains. Feel free to explain why your results falsify current theories if that is the case.

56. Make sure that your discussion is concise and informative. If you ramble and include a great deal of
unnecessary information, your paper will likely get rejected or at least be looked upon less favorably.

Conclusions & References

57. The importance of the conclusions section should not be overlooked. It includes a brief restatement
of the other parts of the research paper, such as the methodology, data analysis and results, and
concludes the overall discussion. It should be brief, concise, and worth remembering.

58. Reference page: All references used as sources of information in your research paper should be
mentioned to strengthen your paper and also to avoid your work being considered plagiarized.

59. Failure to include every obscure reference to a topic will NOT prevent publication. What WILL prevent
publication is procrastination by insisting on including such references.

60. Use bibliographic software such as EndNote or RefWorks. This will help you format your references
section readily when you make changes throughout your paper after getting suggestions from friends,
colleagues or reviewers.

Abstract

61. In your abstract, limit the amount of background information you provide. Try to give only what is
necessary in a couple of sentences or less.

62. Never refer to figures or tables in your abstract.

63. When writing an abstract, always use the past tense since you are giving a summary of what was
done. One exception is if you mention future directions in your concluding statement.

64. Write a clear and concise abstract. The reader has to understand the study rationale, the methods
used, and the study findings. Many researchers will only ever read the abstract of your paper so it
must contain the most pertinent information.

65. Be sure to check journal guidelines for abstract length. Many journals will not accept abstracts longer
than 200-250 words.

66. Feel free to hook readers with a “big picture” statement to open the abstract. Remember, many
action editors will know very little about your topic area and, in some cases, your abstract will be the
only thing that dictates whether or not you get through triage.

Journal Selection

67. The most common mistake is not knowing the body of research into which an article fits.
Choosing the wrong journal invites outright rejection: even an article built on sound and rigorous
scholarly work will not succeed at a journal whose scope it does not match.

68. Look at journals that have published articles on your topic previously. This is an encouraging sign
that your work may appeal to the journal editors.

69. Look at journal impact factors. This will give you an idea of the quality of the journal and how difficult
it will be to get your paper accepted.

70. Look at journal acceptance/rejection rates. These are sometimes, but not always, inversely
correlated with impact factor values.

71. Look at average time to publication as well as average time to acceptance/rejection notification. If
you want your work published fast then make sure you choose a journal that offers rapid processing.
Some journals will highlight their rapid processing times as an impetus for authors to submit their
work to those particular journals.

72. Some journals charge fees for manuscript processing or color figure reproduction for accepted
manuscripts. Make sure you are familiar with the costs associated with publication before you submit
your work.

Manuscript Submission

73. Look at papers recently published in your journal of interest. Ask yourself if your paper is of equal or
higher caliber. If not, submit your work to a different journal.

74. Identify the journals related to your field of study and their individual focuses, and then select a
journal with a focus similar to the content of your manuscript. Many journals will clearly describe their
focus and scope on their website.

75. Consider your field of study. Every field of study has several different journals publishing information
pertaining to that field. Knowing the names of those journals narrows your prospective playing field.

76. Select two or three journals with a focus similar to the content of your manuscript. While you are only
going to be published in one, preparing multiple choices keeps you from having to duplicate the
selection process immediately following your possible rejection.

77. Locate the contact information for each journal and any information pertaining to submissions. Make
sure you get the most recent information, as the names of editors and submission policies can
change over time and without warning.

78. Go over your manuscript to ensure it is formatted according to the submission guidelines, paying
special attention to the references/bibliography, text formatting, and citation style.

79. Create your cover letter. This should include the name of the editor to whom you are sending your
work, if available. While you want to be personable, you should avoid being too personal. This is a
business communication, not a letter to your friend. Be sure to keep it professional. Include your
contact information in case the editor wishes to speak with you about your work.

80. Get your cover letter professionally edited. Cover letters are often the first thing that a journal editor
will read. Your letter needs to be strong and impressive, as it can set the tone for the subsequent
review process.

81. Submit your work. This could be done physically or electronically, depending on the submission
guidelines of your selected journal. In the case of electronic submissions, some journals will accept
attachments; others will not. Be sure to send your work in the correct format. If you are sending it
physically, include a self-addressed, stamped envelope, either large enough to return your work in or
just large enough for them to send you a letter.

82. Aim high but not too high. Aiming for top tier journals with research findings that are not
groundbreaking will leave you with a lot of rejections and lost time.

83. Do NOT submit your article to more than one journal at a time. This is unethical and you will
eventually get caught.

84. When uploading text, table and image files electronically, many submission systems will dynamically
assemble your files into a single PDF document for easier handling. Be sure to review your PDF
after it is generated to ensure that it looks correct and that all information has been included.

85. Respect word length. Many journals have specific requirements for word length for different
document types (original articles, short reports, case reports, review papers, etc.). If the journal says
the word limit is 6000 then do not send a paper with 6100 words.

86. If a journal allows you to suggest reviewers for your manuscript, do so. This can work to your
advantage. Suggest reviewers who know your field well and who might be interested in the results
presented in your paper.

87. If a journal allows you to suggest reviewers who you do not want to review your paper, take
advantage of this to make sure your work is not sent to someone in your field who may not see eye
to eye with you, your supervisor, your lab, or your research in general.

88. If you definitely do not want your paper reviewed by specific individuals in your field, do not submit a
paper to a journal where these individuals have published recently. Editors often look to people who
have recently published on a similar topic in their journal to serve as reviewers.

89. If you think specific reviewers may look favorably upon your work, look to journals where they have
recently published and submit your work there, if it is within scope. In doing so, be sure to reference
these individuals in your manuscript whenever credit is due. There is nothing that angers peer
reviewers more than reviewing an article in which their own work should be cited and is not.

90. Read the mission statement for the journal to which you will submit your work. If your paper is highly
theoretical and the journal clearly states that it does not publish purely theoretical work, find a new
journal.

91. Email the editor to see if your manuscript topic is appropriate. Most will happily direct you elsewhere
if it is inappropriate for their journal.

92. Look for journals that have issued calls for papers. They are more likely to look upon any work
favorably.

Post-submission

93. When you get initial peer reviews, consider them carefully. In your resubmission cover letter,
respond to each point made by each reviewer. Highlight the points you followed and the ones you
did not (and indicate why).

94. When you are asked to perform additional studies, do them quickly and resubmit your manuscript as
soon as possible.

95. If reviewers suggest changes/additional studies before the article can be published, respond to the
editor indicating that you will address these suggestions so that they know your intentions.

96. Do not respond to reviewer comments in an argumentative tone. Be polite but straightforward. Feel
free to disagree but be sure to have hard evidence to support your claims.

97. If accepted, be sure to carefully check page proofs and do so quickly. A 24-48 hour turnaround
request is typical.

98. In responding to reviewer comments, it is a good idea to copy and paste the reviewers’ comments
verbatim in one color (e.g. black) and add your responses in another color (e.g. blue). You should
also copy and paste any relevant sections from your revised manuscript into your cover letter.
Ideally, a reviewer should be able to tell how adequately you have addressed their comments
without having to read your revised manuscript.

99. Well-organized, well-written response letters can help a manuscript circumvent re-review. The editor
will see the changes that you have made and may accept it outright.

100. Remember to select as many “Key Words” as possible. Many people do key word searches when
performing literature reviews. This will increase the likelihood of your manuscript being read.

“Publishing a volume of verse is like dropping a rose petal
down the Grand Canyon and waiting for the echo.”
- Don Marquis

Reflections on Academic Research


Chapter XXXIII

Plagiarism

PLAGIARISM

What is plagiarism?

Plagiarism is the act of taking another person's writing, conversation, song, or even
idea and passing it off as one’s own. This includes information from web pages, books,
songs, television shows, email messages, interviews, articles, artworks or any other
medium. Whenever you paraphrase, summarize, or take words, phrases, or sentences
from another person's work, it is necessary to indicate the source of the
information within your paper using an internal citation. It is not enough to just list the
source in a bibliography at the end of your paper. Failing to properly quote, cite or
acknowledge someone else's words or ideas with an internal citation is plagiarism.

Plagiarism is deception in a literal sense. Plagiarism is copying the work of others and
claiming it as your own. Whether you copy from a published essay, an encyclopedia
article, or a paper from a fraternity's files, you are plagiarizing. If you do so, you run a
terrible risk. You could be punished, suspended, or even expelled. There is also another
kind of plagiarism, known as accidental plagiarism. This happens when a scholar does
not intend to plagiarize, but fails to cite the sources completely and correctly. Careful
notetaking and a clear understanding of the rules for quoting, paraphrasing, and
summarizing sources can help prevent this.

Tips for avoiding accidental plagiarism

• Cite every piece of information that is not a) the result of your own research, or b)
common knowledge. This includes opinions, arguments, and speculations as well
as facts, details, figures, and statistics.
• Use quotation marks every time you use the author's words. (For longer quotes,
indenting the whole quotation has the same effect as quotation marks.)
• At the beginning of the first sentence in which you quote, paraphrase, or
summarize, make it clear that what comes next is someone else's idea:
    ◦ According to Smith...
    ◦ Jones says...
    ◦ In his 1987 study, Robinson proved...
• At the end of the last sentence containing quoted, paraphrased, or summarized
material, insert a parenthetical citation to show where the material came from:

The St. Martin's Handbook defines plagiarism as "the use of someone else's
words or ideas as [the writer's] own without crediting the other person" (Lunsford
and Connors 602).
(Notice the use of brackets to mark a change in the wording of the original.)

The following discussions and examples are cited from:


Lunsford, Andrea, and Robert Connors. St. Martin's Handbook. 3rd ed. New York: St. Martin's
Press, 1995.

Avoiding two common forms of accidental plagiarism

1. Paraphrases with no citation

Because a paraphrase is supposed to contain all of the author's information and none of
your own commentary, a paraphrase with no citation is an example of plagiarism.
The St. Martin's Handbook defines an appropriate paraphrase as follows:

A paraphrase accurately states all the relevant information from a passage in your own words
and phrasing, without any additional comments or elaborations ... [it] always restates all the main
points of the passage in the same order and in about the same number of words. (Lunsford and
Connors 596)

Lunsford and Connors go on to give two examples of unacceptable paraphrases: one that uses
the author's words, and one that uses the author's sentence structures (597).

Lunsford and Connors also state that "even for acceptable paraphrases you must include a
citation in your essay identifying the source of the information" (597). This point is crucial: without
the information about the source, an appropriate paraphrase becomes plagiarism.

Even if you have avoided using the author's words, sentence structure, or style, an unattributed
paraphrase is plagiarism because it presents the same information in the same order.

2. Misplaced citations

If you use a paraphrase or direct quotation, it is important to place the reference at the
very end of all the material cited. Any quoted, paraphrased, or summarized material that
comes after the reference is plagiarized: it looks like it is supposed to be your own idea.

This is one reason why accurate notetaking is so important; it is possible to forget which
words are yours and which are the original writer's.

Original source:

Paraphrasing material helps you digest a passage, because chances are you can't restate the
passage in your own words unless you grasp its full meaning. When you incorporate an accurate
paraphrase into your essay, you show your readers that you understand that source. (Lunsford
and Connors 596)

Plagiarism (misplaced citation):

Lunsford and Connors say that paraphrasing is useful because "[p]araphrasing material helps
you digest a passage, because chances are you can't restate the passage in your own words
unless you grasp its full meaning" (596). When you incorporate an accurate paraphrase into your
essay, you show your readers your understanding of that source.

The reader would logically assume that the sentence following the citation is your own
comment on the quotation, when it is actually part of the original quote.

Finally, a point about multiple citations from the same source: cite them all individually. It
is not adequate to give one citation at the end of the paragraph for a bunch of individual
points abstracted from a source.

Parenthetical citations are intended to make citing your sources easy to do; don't be shy
about using them.

Example of acceptable paraphrase: putting the idea in your own words

Taken from Lunsford and Connors 597-98. Key words and phrases in the original are
in boldface. The changes in wording and sentence structure in the paraphrase are
underlined.

Original

But Frida's outlook was vastly different from that of the Surrealists. Her art was not
the product of a disillusioned European culture searching for an escape from the limits of
logic by plumbing the subconscious. Instead, her fantasy was a product of her
temperament, life, and place; it was a way of coming to terms with reality, not of passing
beyond reality into another realm.

Hayden Herrera, Frida: A Biography of Frida Kahlo (258)

Paraphrase

As Herrera explains, Frida's surrealistic vision was unlike that of the European Surrealists.
While their art grew out of their disenchantment with society and their desire to explore the
subconscious mind as a refuge from rational thinking, Frida's vision was an outgrowth of her own
personality and life experiences in Mexico. She used her surrealistic images to understand better
her actual life, not to create a dreamworld (258).

The following discussions and examples are cited from:


http://www.indiana.edu/~wts/pamphlets/plagiarism.shtml

How to Recognize Unacceptable and Acceptable Paraphrases

Here’s the ORIGINAL text, from page 1 of Lizzie Borden: A Case Book of Family and
Crime in the 1890s by Joyce Williams et al.:

The rise of industry, the growth of cities, and the expansion of the population were the three great
developments of late nineteenth century American history. As new, larger, steam-powered
factories became a feature of the American landscape in the East, they transformed farm hands
into industrial laborers, and provided jobs for a rising tide of immigrants. With industry came
urbanization--the growth of large cities (like Fall River, Massachusetts, where the Bordens lived)
which became the centers of production as well as of commerce and trade.

Here’s an UNACCEPTABLE paraphrase that is plagiarism:

The increase of industry, the growth of cities, and the explosion of the population were three large
factors of nineteenth century America. As steam-driven companies became more visible in the
eastern part of the country, they changed farm hands into factory workers and provided jobs for
the large wave of immigrants. With industry came the growth of large cities like Fall River where
the Bordens lived which turned into centers of commerce and trade as well as production.

What makes this passage plagiarism?

The preceding passage is considered plagiarism for two reasons:

• the writer has only changed around a few words and phrases, or changed the
order of the original’s sentences;
• the writer has failed to cite a source for any of the ideas or facts.

If you do either or both of these things, you are plagiarizing.

NOTE: This paragraph is also problematic because it changes the sense of several sentences
(for example, "steam-driven companies" in sentence two misses the original’s emphasis on
factories).

Here’s an ACCEPTABLE paraphrase:

Fall River, where the Borden family lived, was typical of northeastern industrial cities of the
nineteenth century. Steam-powered production had shifted labor from agriculture to
manufacturing, and as immigrants arrived in the US, they found work in these new factories. As a
result, populations grew, and large urban areas arose. Fall River was one of these manufacturing
and commercial centers (Williams 1).

Why is this passage acceptable?

This is acceptable paraphrasing because the writer:

• accurately relays the information in the original;
• uses her own words;
• lets her reader know the source of her information.

Here’s an example of quotation and paraphrase used together, which is also ACCEPTABLE:

Fall River, where the Borden family lived, was typical of northeastern industrial cities of the
nineteenth century. As steam-powered production shifted labor from agriculture to manufacturing,
the demand for workers "transformed farm hands into industrial laborers," and created jobs for
immigrants. In turn, growing populations increased the size of urban areas. Fall River was one of
these hubs "which became the centers of production as well as of commerce and trade" (Williams
1).

Why is this passage acceptable?

This is acceptable paraphrasing because the writer:

• records the information in the original passage accurately;
• gives credit for the ideas in this passage;
• indicates which part is taken directly from her source by putting the passage in
quotation marks and citing the page number.

Note that if the writer had used these phrases or sentences in her own paper without
putting quotation marks around them, she would be PLAGIARIZING. Using another
person’s phrases or sentences without putting quotation marks around them is
considered plagiarism even if the writer cites in her own text the source of the phrases or
sentences she has quoted.

Strategies for Avoiding Plagiarism

1. Put quotation marks around everything that comes directly from the text, especially when taking
notes.

2. Paraphrase, but be sure you are not just rearranging or replacing a few words.

Instead, read over what you want to paraphrase carefully; cover up the text with your
hand, or close the text so you can’t see any of it (and so aren’t tempted to use the text
as a “guide”). Write out the idea in your own words without peeking.

3. Check your paraphrase against the original text to be sure you have not accidentally
used the same phrases or words, and that the information is accurate.
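
The check in step 3 can be partly automated. What follows is only a minimal sketch, not a substitute for reading the two texts side by side: the function names word_ngrams and shared_phrases and the choice of three-word phrases are assumptions made for illustration, and the sample sentences are the opening sentences of the Lizzie Borden passages quoted earlier in this chapter.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(original, paraphrase, n=3):
    """List the n-word phrases that occur in both texts, sorted alphabetically."""
    common = word_ngrams(original, n) & word_ngrams(paraphrase, n)
    return sorted(" ".join(gram) for gram in common)


# Opening sentences of the original, the unacceptable paraphrase,
# and the acceptable paraphrase from the Lizzie Borden example.
original = ("The rise of industry, the growth of cities, and the expansion "
            "of the population were the three great developments of late "
            "nineteenth century American history.")
unacceptable = ("The increase of industry, the growth of cities, and the "
                "explosion of the population were three large factors of "
                "nineteenth century America.")
acceptable = ("Fall River, where the Borden family lived, was typical of "
              "northeastern industrial cities of the nineteenth century.")

print(shared_phrases(original, unacceptable))  # several shared phrases
print(shared_phrases(original, acceptable))    # no shared phrases
```

A non-empty result points to wording that should be rephrased or placed in quotation marks. An empty result does not make a paraphrase acceptable on its own: the information must still be accurate, and the source must still be cited.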

Terms You Need to Know (or What is Common Knowledge?)

Common knowledge: facts that can be found in numerous places and are likely to be
known by a lot of people.

Example: John F. Kennedy was elected President of the United States in 1960.

This is generally known information. You do not need to document this fact.

However, you must document facts that are not generally known and ideas that interpret
facts.

Example: According to the American Family Leave Coalition’s new book, Family Issues
and Congress, President Bush’s relationship with Congress has hindered family leave
legislation (6).

The idea that “Bush’s relationship with Congress has hindered family leave legislation” is
not a fact but an interpretation; consequently, you need to cite your source.

Quotation: using someone’s words. When you quote, place the passage you are using
in quotation marks, and document the source according to a standard documentation
style.

The following example uses the Modern Language Association’s style:

Example: According to Peter S. Pritchard in USA Today, “Public schools need reform but
they’re irreplaceable in teaching all the nation’s young” (14).

Paraphrase: using someone’s ideas, but putting them in your own words. This is
probably the skill you will use most when incorporating sources into your writing.
Although you use your own words to paraphrase, you must still acknowledge the source
of the information.

The following discussions and examples are cited from:


http://www.ccc.commnet.edu/mla/plagiarism.shtml

Some More Examples

The original text from Elaine Tyler May's "Myths and Realities of the American Family"
reads as follows:

Because women's wages often continue to reflect the fiction that men earn the family wage,
single mothers rarely earn enough to support themselves and their children adequately. And
because work is still organized around the assumption that mothers stay home with children,
even though few mothers can afford to do so, child-care facilities in the United States remain
woefully inadequate.

Here are some possible uses of this text. As you read through each version, try to
decide if it is a legitimate use of May's text or a plagiarism.

Version A:

Since women's wages often continue to reflect the mistaken notion that men are the main wage
earners in the family, single mothers rarely make enough to support themselves and their children
very well. Also, because work is still based on the assumption that mothers stay home with
children, facilities for child care remain woefully inadequate in the United States.

Plagiarism: In Version A there is too much direct borrowing of sentence structure and
wording. The writer changes some words, drops one phrase, and adds some new
language, but the overall text closely resembles May's. Even with a citation, the writer is
still plagiarizing because the lack of quotation marks indicates that Version A is a
paraphrase, and should thus be in the writer's own language.

Version B:

As Elaine Tyler May points out, "women's wages often continue to reflect the fiction that men earn
the family wage" (588). Thus many single mothers cannot support themselves and their children
adequately. Furthermore, since work is based on the assumption that mothers stay home with
children, facilities for day care in this country are still "woefully inadequate." (May 589).

Plagiarism: The writer now cites May, so we're closer to telling the truth about the
relationship of our text to the source, but this text continues to borrow too much
language.

Version C:

By and large, our economy still operates on the mistaken notion that men are the main
breadwinners in the family. Thus, women continue to earn lower wages than men. This means, in
effect, that many single mothers cannot earn a decent living. Furthermore, adequate day care is
not available in the United States because of the mistaken assumption that mothers remain at
home with their children.

Plagiarism: Version C shows good paraphrasing of wording and sentence structure, but
May's original ideas are not acknowledged. Some of May's points are common
knowledge (women earn less than men, many single mothers live in poverty), but May
uses this common knowledge to make a specific and original point and her original
conception of this idea is not acknowledged.

Version D:

Women today still earn less than men — so much less that many single mothers and their
children live near or below the poverty line. Elaine Tyler May argues that this situation stems in
part from "the fiction that men earn the family wage" (588). May further suggests that the
American workplace still operates on the assumption that mothers with children stay home to
care for them (589).

This assumption, in my opinion, does not have the force it once did. More and more
businesses offer in-house day-care facilities. . . .

No Plagiarism: The writer makes use of the common knowledge in May's work, but
acknowledges May's original conclusion and does not try to pass it off as his or her own.
The quotation is properly cited, as is a later paraphrase of another of May's ideas.

Strategies for Detection

There are some tell-tale signs that a passage, paper, article, etc., has been plagiarized.
These include:

• mixed citation styles
• no references or quotations
• missing references
• bibliography entries that have not been cited
• strange formatting
• anachronisms
• datedness
• sharp shifts in style

Run papers, passages, etc., that display any of these features through one of the
plagiarism detection services listed here:

[a] http://www.plagiarismchecker.com/

[b] http://bedfordstmartins.com/technotes/techtiparchive/ttip102401.htm

[c] http://www.duplichecker.com/

[d] http://www.dustball.com/cs/plagiarism.checker/

[e] http://teaching.berkeley.edu/bgd/prevent.html

[f] http://www.virtualsalt.com/antiplag.htm

[g] http://www.plagiarism.com

[h] http://www.plagiarism.org

[i] http://www.plagiarisma.net

“The important thing is not to stop questioning.” (Albert Einstein)

A2Z

PhD
Thesis
Reflections on Academic Research

Glossary
GLOSSARY

A priori contrasts
A special class of tests used in conjunction with the F test, specifically designed to test the
hypotheses of the experiment or study (in comparison to post hoc or unplanned tests).

ABAB design
Multiple intervention design in which the experimental manipulation occurs at least twice with an
intervening period in which to observe the effect of the withdrawal of the initial manipulation, also
called a reversal design.

Abstract
A brief summary of the research study. [#] Brief summary that appears at the beginning of most
social research reports; can be retrieved by an abstracting service.

Acceptance criterion
The maximum number of defective items that can be found in the sample and still allow
acceptance of the lot.

Acceptance sampling
A statistical procedure in which the number of defective items found in a sample is used to
determine whether a lot should be accepted or rejected.

Accuracy
The degree to which bias is absent from the sample: the underestimates and the overestimates are balanced
among members of the sample (i.e., no systematic variance).

Action research
A methodology with brainstorming followed by sequential trial and error to discover the most
effective solution to a problem; succeeding solutions are tried until the desired results are achieved;
used with complex problems about which little is known.

Active factors
Those independent variables (IV) the researcher can manipulate by causing the subject to
receive one treatment level or another.

Activity analysis >>>> Process Analysis.

Adjusted multiple coefficient of determination
A measure of the goodness of fit of the estimated multiple regression equation that adjusts for
the number of independent variables in the model and thus avoids overestimating the impact of
adding more independent variables.

Administrative question
A measurement question that identifies the participant, interviewer, and interview location (nominal
data).

Aggregate
A group of persons that have certain traits or characteristics in common without necessarily
having any direct social connection with one another. For example, "all female physicians" is an
aggregate; Gross National Income is an aggregation of data about individual incomes.

Aggregate-level data
Based on grouped data using a spatial or temporal unit of analysis.

Alpha level
The significance level. Specifically, alpha is the probability of a Type I error, that is, the probability of concluding that
there is a treatment effect when in reality there is not.

Alpha problem
Difficulty of deciding whether to reject the null hypothesis when a few statistically significant
results are produced by many inferential statistical tests.

Alpha
In tests of statistical significance, the alpha level indicates the probability of committing a Type I
error; in estimates of internal consistency, a reliability coefficient, as in Cronbach's alpha. [#]
Probability of wrongly rejecting a null hypothesis; usually set by the researcher before the study (by
consensus .05 unless otherwise indicated).

Alternative hypothesis (H1)
The hypothesis that a difference exists between the sample statistic and the population parameter to which it is
compared; the logical opposite of the null hypothesis used in significance testing. [#] The
hypothesis concluded to be true if the null hypothesis is rejected.
[#] A specific statement of prediction that states what you expect will happen in your study.

Ambiguities
A projective technique (imagination exercise) where participants imagine a brand applied to a different product (e.g., a Tide dog food or
Marlboro cereal), and then describe its attributes and position.

Analysis of covariance
Statistical procedure for adjusting posttest scores for pretest group differences.

Analysis of variance (ANOVA)
Tests the null hypothesis that the means of several independent populations are equal; the test
statistic is the F ratio; used when you need k-independent-samples tests. [#] A statistical test for
comparing mean scores among three or more groups.
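
The F ratio mentioned in this entry can be sketched as follows. This is a minimal illustration rather than a full ANOVA routine: the function name one_way_anova_f and the sample scores are hypothetical, and the sketch only computes the ratio of the between-groups mean square to the within-groups mean square (it does not look up a p-value).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F ratio, df between, df within) for a list of independent samples."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # total number of observations
    grand_mean = mean(x for g in groups for x in g)  # mean of all observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three groups.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, df_b, df_w = one_way_anova_f(scores)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A large F ratio suggests that the group means differ by more than chance variation within the groups would explain; the ratio is then compared with the F distribution for the two degrees of freedom.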

Analysis
The process of synthesizing data to answer the research question

ANCOVA (Analysis of Covariance)
An analysis that estimates the difference between the groups on the posttest after adjusting for
differences on the pretest.

Anonymity
The assurance that no one, including the researchers, will be able to link data to a specific
individual. [#] A research condition in which no one, including the researcher, knows the
identities of research participants.

ANOVA (Analysis of variance)
An analysis that estimates the difference between groups on a posttest. The ANOVA could
estimate the difference between a treatment and control group (thus being equivalent to the
t-test) or can examine both main and interaction effects in a factorial design.

ANOVA table
A table used to summarize the analysis of variance computations and results. It contains columns
showing the source of variation, the sum of squares, the degrees of freedom, the mean square,
and the F value.

Applied research
Research that addresses existing problems or opportunities. [#] Research done for the express
purpose of solving an identified problem. [#] Research undertaken with the intention of applying
the results to some specific problem, such as studying the effects of different methods of law
enforcement on crime rates. One of the biggest differences between applied and basic research is
that in applied work the research questions are most often determined, not by researchers, but by
policy makers or others who want help. Types of applied research include evaluation research and
action research.

Arbitrary scales
Universal practice of ad hoc scale development used by instrument designers to create scales
that are highly specific to the practice or object being studied.

Archives
Ongoing records kept by institutions of society.

Area chart
A graphical presentation that displays total frequency, group frequency, and time series data; also
called a surface chart.

Area random sampling >>> cluster random sampling

Area sampling
A cluster sampling technique applied to a population with well-defined political or natural
boundaries: the population is divided into homogeneous clusters from which a single-stage or
multistage sample is drawn.

Argument
Statement that explains, interprets, defends, challenges, or explores meaning.

Artifact correlation
Where distinct subgroups in the data combine to give the impression of one.

Assent
Agreement by an individual not competent to give legally valid informed consent (e.g., a child or
cognitively impaired person) to participate in research.

Assignable cause
Variations in process outputs that are due to factors such as machine tools wearing out, incorrect
machine settings, poor-quality raw materials, operator error and so on. Corrective action should
be taken when assignable causes of output variation are detected.

Association
The process used to recognize and understand patterns in data and then used to understand and
exploit natural patterns.

Asymmetrical relationship
In which we postulate that change in one variable (IV) is responsible for change in another
variable (DV).

Atomistic fallacy
The fallacy one commits when making inferences about groups or
aggregates from individuals (see Ecological Fallacy).

Attenuation
Effect of measurement error or unreliability in reducing the apparent magnitude of association of
two variables.

Attitude
A learned, stable predisposition to respond to oneself, other persons, objects, or issues in a consistently
favorable or unfavorable way.

Attribute
A specific value of a variable. For instance, the variable sex or gender has two attributes: male
and female.

Attrition
Loss of subjects from a study over time. [#] Loss of study participants during a study. Attrition
can be a threat to the internal validity of a study, and it can change the composition of the study
sample.

Audience
Characteristics and background of the people or groups for whom the secondary source was
created: one of the five factors used to evaluate the value of a secondary source.

Authority
The level of data and the credibility of a source as indicated by the credentials of the author and
publisher: one of five factors used to evaluate the value of a secondary source.

Authority figure
A projective technique (imagination exercise) where participants are asked to imagine that the
brand or product is an authority figure and to describe the attributes of the figure.

Autocorrelation
Correlation in the errors that arises when the error terms at successive points in time are related.

Automatic interaction detection (AID)
A data partitioning procedure that searches up to 300 variables for the single best predictor of a
dependent variable.

Autonomic system
Portion of human nervous system including two subsystems, the sympathetic and the
parasympathetic, the former of which controls certain bodily responses indicating emotion.

Autonomy
The personal capacity participants should possess in research conditions to consider alternatives,
make choices, and act without undue influence or interference of others.

Average linkage
A method that evaluates the distance between two clusters by first finding the geometric center of each
cluster and then computing the distance between the two centers.

Backward elimination
Sequentially removing the variable from a regression model that changes $R^2$ the least. See >>>
Forward selection and Stepwise selection.

Balanced rating
Has an equal number of categories above and below the midpoint or an equal number of
favorable/unfavorable response choices.

Band >>> Prediction and Confidence Bands

Bar chart
A graphical presentation technique that represents frequency data as horizontal or vertical bars:
vertical bars are most often used for time series and quantitative classifications (histograms,
stacked bar, and multiple-variable charts are specialized bar charts).

Bar code
Technology employing labels containing electronically read vertical bar data codes.

Bar graph
A graphical device for depicting data that have been summarized in a frequency distribution,
relative frequency distribution or percent frequency distribution.

Basic requirements of probability
Two requirements that restrict the manner in which probability assignments can be made: (1) for
each experimental outcome $E_i$ we must have $0 \le P(E_i) \le 1$; (2) considering all $n$ experimental
outcomes, we must have $P(E_1) + P(E_2) + \cdots + P(E_n) = 1$.
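
The two requirements in this entry can be checked mechanically. The sketch below is only an illustration: the function name is_valid_assignment is hypothetical, and math.isclose is used so that small floating-point rounding in the sum does not cause a false failure.

```python
from math import isclose


def is_valid_assignment(probabilities):
    """Check both basic requirements for a list of outcome probabilities."""
    # Requirement 1: every probability lies between 0 and 1 inclusive.
    each_in_range = all(0 <= p <= 1 for p in probabilities)
    # Requirement 2: the probabilities of all outcomes sum to 1.
    sums_to_one = isclose(sum(probabilities), 1.0)
    return each_in_range and sums_to_one


print(is_valid_assignment([0.2, 0.3, 0.5]))
print(is_valid_assignment([0.6, 0.6]))
```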

Basic research >>> Pure Research.

Bayesian statistics
Uses subjective probability estimates based on general experience rather than on data collected.

Bell curve
Smoothed histogram or bar graph describing the expected frequency for each value of a variable.
The name comes from the fact that such a distribution often has the shape of a bell.

Beneficence
An ethical principle that requires an obligation to protect research participants from
harm. The principle of beneficence can be expressed in two general rules: (1) do not harm; and
(2) protect from harm by maximizing possible benefits and minimizing possible risks of harm.

Benefit
A valued or desired outcome; an advantage.

Benefit chain >>> Laddering.

Beta weights
Standardized regression coefficient where the size of the number reflects the level of influence X
exerts on Y.

Between–subjects design
Experimental design in which the contrast between differently treated groups measures the
treatment effect.

Bias
It is a loss of balance and accuracy in the use of research methods. It can creep into research via
the sampling frame, random sampling, or non-response. It can also occur at other stages in
research, such as while interviewing, in the design of questions, or in the way data are analyzed
and presented. Bias means that the research findings will not be representative of, or
generalisable to, a wider population. [#] That part of the deviation of the observed score from the
true value of the construct being measured that is unchanging or tends in one direction (as
distinct from randomly varying error which sums to zero over enough cases)

Biased Sample
A sample that is not representative of the population from which it was drawn
(>>> Representative Sample).

Bibliography (bibliographic database)
A secondary source that helps locate a book, article, photograph, etc.

Binomial probability distribution
A probability distribution showing the probabilities in a binomial experiment.

Binomial probability function
The function used to compute probabilities in a binomial experiment.
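
The glossary does not state the function itself; the standard form is $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, where n is the number of trials, p the probability of success on each trial, and x the number of successes. A minimal sketch follows; the function name binomial_pmf and the example values are hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical example: number of heads in three tosses of a fair coin.
for successes in range(4):
    print(successes, binomial_pmf(successes, n=3, p=0.5))
```

Listing the function's value for every x from 0 to n gives the binomial probability distribution defined in the preceding entry.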

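Binomial experiment
A probability experiment having the following four properties: it consists of n identical trials; two
outcomes, success and failure, are possible on each trial; the probability of success does not change
from trial to trial; and the trials are independent.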

Biographical Research
A narrative approach to research that is primarily qualitative and includes gathering and using data in the
form of diaries, stories, and life histories.

Bivariate Analysis
Pertaining to two variables only.

Bivariate association
Association between two variables.

Bivariate correlation analysis
A statistical technique to assess the relationship of two continuous variables measured on an
interval or ratio scale.

Bivariate normal distribution
Data are from a random sample where two variables are normally distributed in a joint manner.

Blind
Technique of avoiding experimenter expectancy by concealing the assignment of subjects from the
researcher, or of avoiding demand characteristics by concealing the assignment of subjects from the
subjects themselves. When both subject and experimenter are blind to the assignment, the study is called
“double blind”. [#] When participants do not know if they are being exposed to the experimental
treatment.

Blocking
The process of using the same or similar experimental units for all treatments. The purpose of
blocking is to remove a source of variation from the error term and hence provide a more
powerful test for a difference in population or treatment means. [#] Dividing subjects into groups
based on a measured independent variable.

Boolean operators
Connecting words such as "and" or "or" that can identify overlapping or non-overlapping sets of
information.

Bound on the sampling error
A number added to and subtracted from a point estimate to create an approximate 95%
confidence interval. It is given by two times the standard error of the point estimator.
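
As a rough illustration of this entry, the sketch below computes the bound for a sample mean. It assumes the point estimator is the sample mean, whose standard error is the sample standard deviation divided by the square root of the sample size; the function name approx_95_interval and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def approx_95_interval(sample):
    """Return (lower, upper) using a bound of two standard errors of the mean."""
    point_estimate = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    bound = 2 * standard_error  # bound on the sampling error
    return point_estimate - bound, point_estimate + bound


# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]
lower, upper = approx_95_interval(data)
print(f"approximate 95% interval: {lower:.2f} to {upper:.2f}")
```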

Box plot
A graphical summary of data. A box, drawn from the first to the third quartiles, shows the location
of the middle 50% of the data. Dashed lines, called whiskers, extending from the ends of the box
show the location of data values greater than the third quartile and data values less than the first
quartile.

Bayes’ theorem
A method used to compute posterior probabilities.
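
A minimal sketch of how posterior probabilities follow from prior probabilities and conditional probabilities (likelihoods): each posterior is the prior times the likelihood, divided by the sum of those products over all events. The function name posterior_probabilities and the two-supplier figures are hypothetical and serve only as an illustration.

```python
def posterior_probabilities(priors, likelihoods):
    """Apply Bayes' theorem to mutually exclusive, exhaustive events."""
    # Joint probability of each event together with the observed evidence.
    joints = [prior * like for prior, like in zip(priors, likelihoods)]
    evidence = sum(joints)  # total probability of the observed evidence
    return [joint / evidence for joint in joints]


# Hypothetical example: two suppliers provide 65% and 35% of parts,
# with defect rates of 2% and 5%. A defective part has been found.
priors = [0.65, 0.35]
defect_rates = [0.02, 0.05]
for supplier, prob in zip(("Supplier 1", "Supplier 2"),
                          posterior_probabilities(priors, defect_rates)):
    print(f"{supplier}: {prob:.4f}")
```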

Branch
Technique that skips irrelevant questions and directs the interviewee to the next appropriate item.

Branched question
A measurement question sequence determined by the participant’s previous answer(s): the
answer to one question assumes other questions have been asked or answered and directs the
participant to answer specific questions that follow and skip other questions: branched questions
determine question sequencing.

Brand mapping
A projective technique (type of semantic mapping) where participants are presented with different
brands and asked to talk about their perception, usually in relation to several criteria. They may
also be asked to spatially place each brand on one or more semantic maps.

Briefing
A short presentation to a small group, where statistics constitute much of the content.

Buffer question
A neutral measurement question designed chiefly to establish rapport with the participant (usually
nominal data).

Business Intelligence System (BIS)
A system of ongoing information collection about events and trends in the technological,
economic, political and legal, demographic, cultural, social, and competitive arenas.

Business research
A systematic inquiry that provides information to guide business decisions; the process of
determining, acquiring, analyzing and synthesizing, and disseminating relevant business data,
information, and insights to decision makers in ways that mobilize the organization to take
appropriate action that, in turn, maximizes organizational performance.

(Classic) Controlled Experiment
An experimental design with two or more randomly selected groups (an experimental
group and control group) in which the researcher controls or introduces the independent
variable and measures the dependent variable at least two times (pre- and post-test
measurements).

Creativity session
Qualitative technique where an individual activity exercise is followed by a sharing/discussion
session, where participants build on one another’s creative ideas; often used with children; may
be conducted before or during IDIs or group interviews; usually consists of drawing, visual
compilation, or writing exercises.

Call number
Identifying code numbers and letters by which an item can be located in a library.

Callback
Procedure involving repeated attempts to make contact with a targeted participant to ensure that
the targeted participant is reached and motivated to participate in the study.

Canned experimenter
Standardization of experimental procedure by use of tape-recorded instructions.

Cartoons or empty balloons
A projective technique where participants are asked to write the dialog for a cartoonlike picture.

Case
The entity or thing the hypothesis talks about. [#] Unit of analysis, usually individual subject for
whom measures are collected on each variable.

Case study (case history)
A methodology that combines individual and (sometimes) group interviews with record analysis
and observation; used to understand events and their ramifications and processes; emphasizes
the full contextual analysis of a few events or conditions and their interrelations for a single
participant; a type of pre-experimental design (one-shot case study). [#] A research strategy that
focuses on one case (an individual, a group, an organization, etc.) within its social context
during one time period. [#] The presentation of data about a single setting or event. It is not a
method of research as such because the data being offered can have been gathered using a variety
of different methods (questionnaire, observation, and so forth). It is predominantly a description,
and is usually based on a qualitative data set, though statistics such as survey findings may be
incorporated.

Categorization
For this scale type, participants put themselves or property indicants in groups or categories;
also, a process for grouping data for any variable into a limited number of categories.

Causal forecasting methods
Forecasting methods that relate a time series to other variables that are believed to explain or
cause its behavior.

Causal Hypothesis
A statement hypothesizing that the independent variable affects the dependent variable in some
way.

Causal Relationship
A relationship where an independent variable affects a dependent variable in some way. [#] A
cause-effect relationship. For example, when you evaluate whether your treatment or program
causes an outcome to occur, you are examining a causal relationship.

Causal study
Research that attempts to reveal a causal relationship between variables (A produces B or
causes B to occur).

Causal
Pertaining to a cause-effect relationship. [#] Pertaining to the generation of an effect.

Causation
Situation where one variable leads to a specified effect on the other variable.

Cause construct
Your abstract idea or theory of what the cause is in the cause-effect relationship you are
investigating.

Cell
In a cross-tabulation, subgroup of the data created by the value intersection of two (or more)
variables: each cell contains the count of cases as well as the percentage of the joint
classification.

Census
A count of all the elements in a population. [#] Survey of the entire population.

Central index
Database of publications searchable by the references or citations included in the articles.

Central limit theorem
A theorem that enables one to use the normal probability distribution to approximate the sampling
distribution of the sample mean and sample proportion whenever the sample size is large. [#] The
sample means of repeatedly drawn samples will be distributed around the population mean; for
sufficiently large samples (i.e., n ≥ 30), the distribution of sample means approximates a normal
distribution. [#] Principle that the sampling distribution approaches normality as the sample size
increases.
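
The theorem can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not a proof: it draws repeated samples of size 30 from a skewed (exponential) population whose mean and standard deviation are both 1, and summarizes the resulting sample means. The function name simulate_sample_means, the choice of population, and the seed are all hypothetical.

```python
import random
from statistics import mean, stdev


def simulate_sample_means(sample_size=30, n_samples=2000, seed=42):
    """Draw repeated samples from a skewed population and return their means."""
    rng = random.Random(seed)
    return [
        mean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


sample_means = simulate_sample_means()
print(f"mean of sample means: {mean(sample_means):.3f}")
print(f"standard deviation of sample means: {stdev(sample_means):.3f}")
```

The sample means should cluster around the population mean of 1, with a spread close to the population standard deviation divided by the square root of the sample size (about 0.18 here), even though the population itself is far from normal.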

Central tendency
A measure of location, most commonly the mean, median, and mode. [#] An estimate of the
center of a distribution of values. The most usual measures of central tendency are the mean,
median and mode. [#] In descriptive statistics, the value or score best representing a group of
scores (for example, the mean).

Centroid
A term used for the multivariate mean scores in MANOVA.

Chebyshev’s theorem
A theorem applying to any data set that can be used to make statements about the proportion of
items that must be within a specified number of standard deviations of the mean.
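
The glossary does not give the bound itself; in its standard form the theorem states that at least $1 - 1/z^2$ of the items lie within z standard deviations of the mean, for any z greater than 1. A minimal sketch, with the hypothetical function name chebyshev_min_proportion:

```python
def chebyshev_min_proportion(z):
    """Minimum proportion of items within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z greater than 1")
    return 1 - 1 / z**2


for z in (2, 3, 4):
    print(f"within {z} standard deviations: at least {chebyshev_min_proportion(z):.1%}")
```

Because the theorem applies to any data set, these are guaranteed minimums; data that are roughly bell-shaped typically have much higher proportions within the same distances.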

Checklist
A measurement question that poses numerous alternatives and encourages multiple unordered
responses: see multiple-choice, multiple-response scale.

Children
Persons who have not yet attained the legal age for consent to treatment or procedures involved
in the research, as determined under the applicable law of the jurisdiction in which the research
will be conducted.

Children’s panel
A series of focus group sessions where the same child may participate in up to three groups in
one year, with each experience several months apart.

Chi-square-based measures
Tests to detect the strength of the relationship between the variables tested with a chi-square test:
phi, Cramér’s V, and the contingency coefficient C.

Claim
A statement, similar to a hypothesis, which is made in response to the research question at
hand, and that is backed up with evidence based on research.

Class midpoint
The point in each class that is halfway between the lower and upper class limits.

Classical method
A method of assigning probabilities that assumes that experimental outcomes are equally likely.

Classification question
A measurement question that provides sociological-demographic variables for use in grouping
participants’ answers (nominal, ordinal, interval, or ratio data).

Closed-ended questions
Items that can be answered from a few predetermined options.

Closed question / response
A measurement question that presents the participant with a fixed set of choices (nominal,
ordinal, or interval data).

Closed-ended Questions
Survey questions that can only be answered in predetermined ways (for example, a scale of one
to five measuring satisfaction with something).

Cluster analysis
Identifies homogeneous subgroups and then draws a sample from each subgroup, a single-stage
or multistage design.

Cluster random sampling
A sampling method that involves dividing the population into groups called clusters, randomly
selecting clusters, and then sampling each element in the selected clusters. This method is useful
when sampling a population that is spread across a wide area geographically.
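
The steps in this entry (divide into clusters, randomly select clusters, take every element in the selected clusters) can be sketched as follows. The function name cluster_random_sample and the village data are hypothetical; the population is assumed to be already grouped into named clusters.

```python
import random


def cluster_random_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every element in each one."""
    rng = random.Random(seed)
    # Randomly choose which clusters enter the sample.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Single-stage design: every element of a chosen cluster is included.
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population of households grouped by village.
villages = {
    "Village A": ["a1", "a2", "a3"],
    "Village B": ["b1", "b2"],
    "Village C": ["c1", "c2", "c3", "c4"],
    "Village D": ["d1", "d2"],
}
sample = cluster_random_sample(villages, n_clusters=2, seed=1)
for village, households in sample.items():
    print(village, households)
```

For a two-stage design, the last step would instead draw a random subset of elements from each chosen cluster.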

Cluster Sample
A probability sample that is determined by randomly selecting clusters of people from
a population and subsequently selecting every person in each cluster for inclusion in the sample.

Cluster
Sample unit consisting of a group of elements, for example, a college or city.

Clustering
A technique that assigns each data record to a group or segment automatically by clustering
algorithms that identify the similar characteristics in the data set and then partition them into
groups.

Cluster sampling
A probability sampling method in which the population is first divided into clusters and then one
or more clusters are selected for sampling. In single-stage cluster sampling, every element in
each selected cluster is sampled; in two-stage cluster sampling, a sample of the elements in each
selected cluster is collected.

Collinearity
When two independent variables are highly correlated: causes estimated regression coefficients
to fluctuate widely, making interpretation difficult.

Code book
A written description of the data that describes each variable and indicates where and how it can
be accessed.

Code of ethics
An organization’s codified set of norms or standards of behavior that guide moral choices about
research behavior; effective codes are regulative, protect the public interest, are behavior-
specific, and are enforceable.

Codebook
The coding rules for assigning numbers or other symbols to each variable; a.k.a. coding scheme.
[#] Index that names the variables and specifies their location in the data set.

Coded Data
Refers to a way of recording material at data collection, either manually or on computer, for
analysis. The data are put into groups or categories, such as age groups, and each category is
given a code number. Data are usually coded for convenience, speed, computer storage space
and to permit statistical analysis.

Codes
Numbers given to indicate specific data items as part of the process of preparing quantitative data
for analysis. The Code Book sets out and labels all the codes in use in a particular piece of
research. While this may be a separate document, prepared as part of the process of getting
data ready for analysis, it may also be incorporated into the questionnaire itself or in the computer
analysis process.

Coding
Assigning numbers or other symbols to responses so that they can be tallied and grouped into a
limited number of categories. [#] The process of categorizing qualitative data.

Coefficient alpha
Reliability coefficient of length-adjusted, inter-item or within-test consistency appropriate for tests
with items with three or more answer options (the KR-20 statistic substitutes when items have two
answer options).

Coefficient of determination
A measure of the goodness of fit of the estimated regression equation. It can be interpreted as
the proportion of the variation in the dependent variable y that is explained by the estimated
regression equation. [#] Transformation of r by squaring ($r^2$), which expresses a relationship in PRE
(proportional reduction in error) terms, that is, the percentage of the variance explained.

Coefficient of variation
A measure of relative variability for a data set, found by dividing the standard deviation by the
mean and multiplying by 100.
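
A minimal sketch of this calculation follows. It assumes the sample standard deviation is wanted (the population version could be substituted), and the data values are made up for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return stdev(values) / mean(values) * 100


# Hypothetical data set.
cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
print(round(cv, 2))
```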

Coefficients of determination (r²)
The amount of common variance in X and Y, two variables in regression: one minus the ratio of the
line of best fit’s error over that incurred by using the mean value of Y.
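
One common way to obtain r² is to fit the least-squares line and compare its squared error with the squared error of simply predicting the mean of Y. The sketch below assumes simple linear regression with one predictor; the function name and data are illustrative.

```python
def r_squared(x, y):
    """r² = 1 - (error of the line of best fit) / (error of using the mean of y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Squared error left over after fitting the line.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Squared error incurred by predicting the mean of y for every case.
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


print(r_squared([1, 2, 3], [1, 3, 2]))
```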

Cognitively Impaired
Having either a psychiatric disorder (e.g., psychosis, neurosis, personality or behavior disorders,
or dementia) or a developmental disorder (e.g., mental retardation) that affects cognitive or
emotional functions to the extent that capacity for judgment and reasoning is significantly
diminished. Capacity for autonomy and voluntary participation is thus impaired. Others, including
people under the influence of or dependent on drugs or alcohol, those suffering from
degenerative diseases affecting the brain, terminally ill patients, and persons with severely
disabling physical handicaps, may also be compromised in their ability to make decisions in their
best interests.

Cohort Study
A specific kind of trend study involving the study of a cohort over time. [#] Type of trend survey
in which fresh samples are drawn and interviewed from the same subpopulation, known as a
cohort and usually defined by birth year, as it ages.

Cohort
A group of people born within a given time frame or experiencing a life event at approximately
the same time.

Command
In the context of an online catalog search, the part of the user’s instructions that tells the
computer the desired action, for example FIND.

Common causes
Normal or natural variations in process outputs that are due purely to chance. No corrective
action is necessary when output variations are due to common causes.

Communication approach
Involving questioning or surveying people (by personal interview, telephone, mail, computer, or
some combination of these) and recording their responses for analysis.

Communication study
The researcher questions the participants and collects their responses by personal means.

Comparative scale
A scale where the participant evaluates an object against a standard using a numerical, graphical,
or verbal scale.

Comparisonwise Type I error rate
The probability of a Type I error associated with a single pairwise comparison.

Compensation
Payment or medical care provided to participants injured in research; does not refer to payment
for participation in research.

Compensatory contamination
Problem of control subjects acquiring the experimental treatment through rivalry or diffusion; has
the effect of reducing the difference between experimental and control conditions.

Compensatory equalization of treatment
A social threat to internal validity that occurs when the control group is given a program or
treatment (usually by a well-meaning third party) designed to make up for or “compensate” for
the treatment the program group gets. This threat diminishes the researcher’s ability to evaluate
the program effect by equalizing the group’s experiences.

Compensatory program
A program given to only those who need it on the basis of some screening mechanism.

Compensatory rivalry
A social threat to internal validity that occurs when one group knows the program another group
is getting and, because of that, develops a competitive attitude with the other group. Often it is the
comparison group knowing that the program group is receiving a desirable program (e.g., new
computers) that generates the rivalry.

Competence
Used as a legal term to indicate a person’s capacity to act on one’s own behalf; a person’s ability
to understand information presented, to realize the consequences of acting (or not acting) on that
information, and to make a choice.

Complementary inference
This is when the results of two strands of a mixed methods study provide two different but non-
conflicting conclusions or interpretations.

Complete within-subjects design
Multiple intervention design in which each subject receives all possible orders of the experimental
manipulations.

Completely randomized design
An experimental design in which the treatments are randomly assigned to the experimental units.
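
Random assignment of this kind can be sketched as follows. The helper name, the unit identifiers, and the treatment labels are hypothetical; equal group sizes are assumed when the number of units is a multiple of the number of treatments.

```python
import random


def randomize(units, treatments, seed=None):
    """Shuffle the experimental units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}


# Hypothetical example: 12 units assigned to three treatments.
assignment = randomize(range(12), ["A", "B", "C"], seed=1)
print(assignment)
```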

Completion rate
Proportion of the sample that is successfully contacted and interviewed.

Component sorts
A projective technique where participants are presented with flash cards containing
component features and asked to create new combinations.

Compound item
Question that consists of two or more components.

Computer-assisted interview (CAI)
Technique in which the interviewer reads questions from a computer and inputs the answers
directly to the computer; either for telephone interviewing (CATI) or personal interviewing (CAPI).

Computer-administered telephone survey
A telephone survey via voice-synthesized computer questions; data are tallied
continuously.

Computer-assisted personal interview (CAPI)
A personal, face-to-face interview (IDI) with computer-sequenced questions, employing
visualization techniques; real-time data entry is possible.

Computer-assisted self-interview (CASI)
Computer-delivered survey that is self-administered by the participant.

Concealment
A technique in an observation study where the observer is shielded from the participant to avoid
error caused by the observer’s presence; this is accomplished by one-way mirrors, hidden cameras,
hidden microphones, etc.

Concept
A bundle of meanings or characteristics associated with certain concrete, unambiguous events,
objects, conditions, or situations.

Concept maps
Two-dimensional graphs of a group’s ideas where ideas that are judged more similar are located closer
together and those judged less similar are more distant. Concept maps are often used by a
group to develop a conceptual framework for a research project.

Conceptual (or Inferential) Consistency
This refers to the degree to which the inferences are consistent with each other and with the
known state of knowledge and theory.

Conceptual Framework
This is a consistent and comprehensive theoretical framework emerging from an inductive
integration of previous literature, theories, and other pertinent information. Conceptual framework
is usually the basis for reframing the research questions and for formulating hypotheses or
making informal tentative predictions about the possible outcome of the study.

Conceptual scheme
The interrelationship between concepts and constructs.

Conceptual utilization
Evaluation use in which the research provides background information or clarification but does
not actually guide the policy choices.

Conclusion validity
The degree to which conclusions you reach about relationships in your data are reasonable.

Concordant
When a participant who ranks higher on one ordinal variable also ranks higher on another
variable, the pairs of variables are concordant.
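
To make the idea concrete, the sketch below counts concordant and discordant pairs of participants on two ordinal variables. Tied pairs are simply ignored here, which is an assumption; the ranks are hypothetical.

```python
from itertools import combinations


def count_pairs(x, y):
    """Return (concordant, discordant) counts over all pairs of participants."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # same ordering on both variables
            concordant += 1
        elif product < 0:    # opposite ordering on the two variables
            discordant += 1
    return concordant, discordant


# Hypothetical ranks of three participants on two ordinal variables.
pairs = count_pairs([1, 2, 3], [1, 3, 2])
print(pairs)
```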

Concurrent Mixed Method Design
This is a multistrand design in which both QUAL and QUAN data are collected and analyzed to
answer a single type of research question (either QUAL or QUAN). The final inferences are
based on both data analysis results. The two types of data are collected independently at the
same time or with a time lag.

Concurrent Mixed Model Design
This is a multistrand mixed design in which there are two relatively independent strands/phases:
one with QUAL questions and data collection and analysis techniques and the other with QUAN
questions and data collection and analysis techniques. The inferences made on the basis of the
results of each strand are pulled together to form meta-inferences at the end of the study. See
also rules of integration.

Concurrent Nested Design
This is a concurrent mixed model design classified on the basis of (conceptual or paradigmatic)
dominance or priority of the study. In this design, a quantitative strand/phase is embedded within
a predominantly qualitative study (quan + QUAL) or vice versa (QUAN + qual). QUAL and QUAN
approaches are used to “confirm, cross-validate, or corroborate findings within a single study”
(Creswell, Plano Clark, Gutmann, & Hanson, 2003).

Concurrent Triangulation Design
This is a concurrent mixed model design classified on the basis of purpose of the study. In this
design, QUAL and QUAN approaches are used to “confirm, cross-validate, or corroborate
findings within a single study” (Creswell et al., 2003).

Concurrent validity
An operationalization’s ability to distinguish between groups that it should theoretically be able to
distinguish between.

Conditioning factor
Variable that affects the relationship between two other variables and may explain conflicts in
literature reviews.

Confidence coefficients
The confidence level expressed as a decimal value. For example, 0.95 is the confidence
coefficient for a 95% confidence level.

Confidence interval estimate
The interval estimate of the mean value of y for a given value of x.

Confidence interval
Technically, 1-alpha. The confidence interval is the probability of correctly concluding that there is
no treatment effect. [#] Range of values around the sample estimate within which we can expect
the population value to fall at some probability level. [#] The confidence associated with an
interval estimate. For example, if an interval estimation procedure provides intervals such that
95% of the intervals formed using the procedure will include the population parameter, the
interval estimate is said to be constructed at the 95% confidence level.
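
A rough sketch of an interval estimate for a population mean is given below. It assumes a normal approximation (for small samples a t-based interval is usually preferred), and the function name and sample values are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Interval around the sample mean, using the normal (z) approximation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    centre = mean(sample)
    margin = z * stdev(sample) / sqrt(len(sample))
    return centre - margin, centre + margin


# Hypothetical sample.
low, high = mean_confidence_interval([4, 5, 6, 5, 4, 6, 5, 5])
print(round(low, 2), round(high, 2))
```

Raising the confidence level (say, from 0.95 to 0.99) widens the interval.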

Confidentiality
A research condition in which no one except the researcher(s) knows the identities of the
research participants. The treatment of information that a participant has disclosed to the
researcher in a relationship of trust and with the expectation that it will not be revealed to others
in ways that violate the original agreement, unless permission is granted by the participant. [#] A
privacy guarantee to retain validity of the research as well as to protect participants. [#] An
assurance made to study participants that identifying information about them acquired through
the study will not be released to anyone outside of the study.

Confirmatory data analysis
An analytical process guided by classical statistical inference in its use of significance and
confidence.

Confirmatory research
Data collection and analysis aimed at testing prior hypotheses.

Confounding Factor
Any factor that might serve as an alternative explanation for a study’s result; confounding factors
include non-randomized samples, selection bias, and any arbitrary differences between people
that are being compared. [#] In a case of spuriousness, the “third” variable, which actually causes
the other two variables and makes them appear connected.

Conjoint analysis
Measures complex decision making that requires multiattribute judgments; uses input from
nonmetric independent variables to secure part-worths that represent the importance of each aspect
of the participant’s overall assessment; produces a scale value for each attribute or property.

Consensus scaling
Scale development by a panel of experts evaluating instrument items based on topical relevance
and lack of ambiguity.

Consistency
A property of a point estimator that is present whenever larger sample sizes tend to provide point
estimates closer to the population parameter.

Constant dollars
Monetary expression of costs or benefits adjusted for inflation.

Constant-sum scale
The participant allocates points to more than one attribute or property indicant, such that they
total 100 or 10; a.k.a. fixed-sum scale.

Construct
Something that exists theoretically but is not directly observable. [#] A concept developed
(constructed) for describing relations among phenomena or for other research purposes. [#] A
theoretical definition in which concepts are defined in terms of other concepts. For example,
intelligence cannot be directly observed or measured; it is a construct. [#] A definition specifically
invented to represent an abstract phenomenon for a given research project.

Construct validity
Approach to measurement validity that assesses the extent to which the measure reflects the
intended construct with different methods focusing on the relations among observed variables or
on the fit of observed associations with theory. [#] The degree to which inferences can
legitimately be made from the operationalizations in your study to the theoretical constructs on
which those operationalizations are based.

Constructivism
The belief that you construct your view of the world based on your experiences and perceptions.

Consumer’s risk
The risk of accepting a poor-quality lot. This is a Type II error.

Contamination
Intrasession events that cause doubt that the experimental and control groups differ only on the
studied variable.

Content analysis
A flexible, widely applicable tool for measuring the semantic content of a communication,
including counts, categorizations, associations, interpretations, etc. (e.g., used to study the content of
speeches, newspaper and magazine editorials, focus group and IDI transcripts); contains four
types of items: syntactical, referential, propositional, and thematic; initial process is done by computer. [#] The
systematic and quantitative study of some form of communication (e.g. speeches, TV programs,
newspaper articles, advertisements, etc.). [#] The analysis of text documents. The analysis can
be quantitative, qualitative, or both. Typically, the major purpose of content analysis is to identify
patterns in text.

Content validity
A check of the operationalization against the relevant content domain for the construct. [#]
Approach to measurement validity that judges the content of the test (for example, an
achievement test) for its adequacy in representing the domain being covered.

Contingency coefficient
A measure of association for nominal, nonparametric variables; used with any size chi-square
table; the upper limit varies with the table’s size; does not provide direction of the association or
reflect causation.

Contingency table
A table used to summarize observed and expected frequencies for a test of independence. [#] A
cross-tabulation table constructed for statistical testing, with the test determining whether the
classification variables are independent.

Contingency table
Cross-tabulation among two or more variables.

Continuous measure
Type of quantitative variable that can take on any value in its possible range, for example, a
person’s height, which can be measured in fractions of an inch or meter.

Continuous random variable
A random variable that may assume any value in an interval or collection of intervals.

Contraindicated
Disadvantageous, perhaps dangerous; a treatment that should not be used in certain individuals
or conditions due to risks. For instance, a drug may be contraindicated for pregnant women and
people with high blood pressure. Such individuals should not be involved in the study.

Control
The ability to replicate a scenario and dictate particular outcomes; the ability to exclude, isolate, or
manipulate the influence of a variable in a study; a critical factor in inference from an experiment,
it implies that all factors, with the exception of the independent variable (IV), must be held constant
and not confounded with another variable that is not part of the study.

Control dimension
In quota sampling, a descriptor used to define the sample’s characteristics (e.g., education,
religion).

Control group
A group of participants that is not exposed to the independent variable being studied but still
generates a measure for the dependent variable. [#] This is a
feature of experimental research, and is there to provide a contrast
to the experimental group through the removal of the independent
variable. The use of a control group may be necessary in order to
measure the validity of a research finding. [#] In experimental
research, a group that, for the sake of comparison, does not
receive the treatment the experimenter is interested in. [#] The
group in an experimental design that receives either no treatment
or a different treatment from the experimental group. This group
can thus be compared to the experimental group. [#] Condition in
which the experimental treatment is withheld to provide a
comparison with the treated group.

Control variable
A variable introduced to help interpret the relationship between variables.

Controlled test market
Real-time test of a product through arbitrarily selected distribution partners.

Controlled Variables
Researchers may control some variables in order to allow the research to focus on specific
variables without being distorted by the impact of the excluded variables. A common way to
control a variable is to be selective; e.g., gender is controlled by selecting as respondents only men
or only women; age can be partially controlled by restricting a sample to one age range, rather
than any age. See also Control group.

Controlled vocabulary
Carefully defined subject hierarchies used to search some bibliographic databases. [#]
Set of terms officially designed and recognized by a catalog or file system.

Convenience Sample
A non-probability sample that is determined by selecting participants that are readily accessible
(convenient) to the researcher. [#] Non-probability sample where element selection is based on
ease of accessibility. [#] A non-probabilistic method of sampling whereby elements are selected
on the basis of convenience. [#] Non-probability sampling where researchers use any readily
available individuals as participants.

Convergent Inference
This is when the conclusions or interpretations of two strands of a mixed methods study are
consistent with each other (i.e., agree with each other).

Convergent interviewing
An IDI technique for interviewing a limited number of experts as participants in a sequential
series of IDIs: after each successive interview, the researcher refines the questions, hoping to
converge on the central issues in a topic area: sometimes called convergent and divergent
interviewing.

Convergent validity
The degree to which the operationalization is similar to (converges on) other operationalizations
to which it should be theoretically similar.

Conversion Mixed Model Design
This is a multistrand concurrent design in which mixing of QUAL and QUAN approaches occurs in
all components/stages, with data transformed (qualitized or quantitized) and analyzed both
qualitatively and quantitatively.

Cook's D
A measure of the influence of an observation based on the residual and leverage.

Correlation
The extent to which two or more things are related ("co-related") to one another. This is usually
expressed as a correlation coefficient.

Correlation coefficient
A numerical measure of the strength of the linear association between two variables that takes
values between -1 and +1. Values near +1 indicate a strong positive linear relationship, while
values near -1 indicate a strong negative linear relationship. Values near zero indicate lack of a
linear relationship. [#] A statistical measurement of the degree of correlational
relationship between two variables. Values of correlation coefficients range from –1.00 to +1.00.
A correlation coefficient of 0.00 indicates no relationship between the variables. Correlations
approaching –1.00 or +1.00 indicate strong relationships between the variables.
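
The sketch below shows one way to compute a Pearson correlation coefficient, as the sample covariance divided by the product of the two sample standard deviations. The function names and data are illustrative, and other correlation coefficients (for example, for ranked data) would be computed differently.

```python
from statistics import mean, stdev


def covariance(x, y):
    """Sample covariance of two equally long lists of values."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation coefficient: covariance scaled to lie between -1 and +1."""
    return covariance(x, y) / (stdev(x) * stdev(y))


# Hypothetical data: a perfect positive and a perfect negative linear relationship.
print(correlation([1, 2, 3, 4], [2, 4, 6, 8]))
print(correlation([1, 2, 3, 4], [8, 6, 4, 2]))
```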

Correlation hypothesis
A statement indicating that variables occur together in some specified manner without implying
that one causes the other

Correlation matrix
A table of correlations showing all possible relationships among a set of variables. The diagonal of
a correlation matrix (the numbers that go from the upper-left corner to the lower right) always
consists of ones because these are the correlations between each variable and itself (and a
variable is always perfectly correlated with itself). Off-diagonal elements are the correlation of the
variables represented by that row and column in the matrix.
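
A small sketch of building such a matrix is shown here. It assumes Pearson correlations; the variable names "a", "b", and "c" and their values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(variables):
    """variables: dict of name -> values. Returns a nested dict of correlations."""
    names = list(variables)
    return {
        row: {
            # A variable is perfectly correlated with itself, so the diagonal is 1.
            col: 1.0 if row == col else pearson_r(variables[row], variables[col])
            for col in names
        }
        for row in names
    }


matrix = correlation_matrix({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]})
for name, row in matrix.items():
    print(name, {col: round(value, 2) for col, value in row.items()})
```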

Correlation relationship
Two variables that perform in a synchronized manner. [#] Relationship by which two or more variables change
together, such that systematic changes in one accompany systematic changes in
the other.

Correlation
A single number that describes the degree of relationship between two variables.

Correlational design
Research approach in which the independent variable is measured rather than fixed by an
intervention.

Correlational Relationship
A relationship where two variables are associated (this can be measured in terms of strength and
direction using statistical tests) but not causally related. They vary together in some way, but the
variation of one does not itself cause the variation of the other.

Cost-benefit analysis
Approach to efficiency assessment that assigns monetary value to the benefits of a program and
compares them with the monetary costs of the program.

Cost-effectiveness analysis
Approach to efficiency assessment that compares different programs producing the same type of
nonmonetary benefit in terms of their respective monetary costs.

Counterbalancing
Technique for studying all possible orders of multiple interventions in an incomplete within-subjects
design by using different subjects who, taken together, experience all possible orders.
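
The idea can be sketched by listing every possible order of the interventions and handing the orders out to subjects in turn. The helper name, intervention labels, and subject identifiers are hypothetical, and at least as many subjects as orders are needed for full coverage.

```python
from itertools import permutations


def counterbalance(interventions, subjects):
    """Assign each subject one order so that, taken together, all orders are covered."""
    orders = list(permutations(interventions))
    return {subject: orders[i % len(orders)] for i, subject in enumerate(subjects)}


# Three interventions give 3! = 6 possible orders, so six subjects cover them all.
plan = counterbalance(["A", "B", "C"], ["s1", "s2", "s3", "s4", "s5", "s6"])
for subject, order in plan.items():
    print(subject, order)
```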

Covariance
A numerical measure of linear association between two variables. Positive values indicate a
positive relationship, while negative values indicate a negative relationship.

Covariates
Variables you adjust for in your study.

Covariation
A state that exists when two things - such as the price and the sales of a commodity - vary
together. Measures of association are designed to capture the degree of covariation.

Covenation association
Level of one variable predicts the level of another.

Cover story
False explanation for the experiment used to distract the subject from guessing the true nature of
the study.

Cramer’s V
A measure of association for nominal, nonparametric variables; used with larger than 2 x 2
chi-square tables; does not provide direction of the association or reflect causation; ranges from zero
to +1.0.
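
As an illustration, Cramer's V and the contingency coefficient can both be derived from the chi-square statistic of a contingency table, using the usual formulas V = sqrt(chi-square / (n x (k - 1))), where k is the smaller of the number of rows and columns, and C = sqrt(chi-square / (chi-square + n)). The function names and the table of counts are assumptions made for this sketch.

```python
from math import sqrt


def chi_square(table):
    """Return (chi-square statistic, total count) for a table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # expected under independence
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (k - 1)))


def contingency_coefficient(table):
    chi2, n = chi_square(table)
    return sqrt(chi2 / (chi2 + n))


# Hypothetical 3 x 3 table with a perfect association between rows and columns.
observed = [[10, 0, 0], [0, 10, 0], [0, 0, 10]]
print(round(cramers_v(observed), 3), round(contingency_coefficient(observed), 3))
```

Note that with perfect association V reaches 1 while the contingency coefficient stays below 1, since its upper limit depends on the size of the table.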

Criterion validity
Approach to measurement validity that correlates the measure to be validated with another called
a criterion, which is accepted as valid.

Criterion-related validity
The validation of a measure based on its relationship to another independent measure as
predicted by your theory of how the measures should behave.

Critical incident technique
An IDI technique involving sequentially asked questions to reveal, in narrative form, what led up
to an incident being studied; exactly what the observed party did or did not do that was especially
effective or ineffective; the outcome or result of this action; and why this action was effective or
what more effective action might have been expected.

Critical path method (CPM)
A scheduling tool for complex or large research proposals that cites milestones and time involved
between milestones.

Critical realism
The belief that there is an external reality independent of a person’s thinking (realism) but that we
can never know that reality with perfect accuracy.

Critical value
A value that is compared with the test statistic to determine whether H0 should be rejected. [#]
The dividing point(s) between the region of acceptance and the region of rejection: these values
can be computed in terms of the standardized random variable due to the normal distribution of
sample means. [#] Values derived from the probability distribution of an inferential statistic used
to determine the statistical significance of an observed value of the statistic at any given alpha
level.
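
For a test statistic that follows the standard normal distribution, the critical value at a given alpha level can be looked up as sketched below. The function names are illustrative, and a z test is assumed; other statistics (t, chi-square, F) have their own distributions and critical values.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed=True):
    """Critical value of the standard normal distribution for the given alpha level."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


def reject_null(z_statistic, alpha=0.05):
    """Two-tailed decision: reject H0 when the statistic falls in the rejection region."""
    return abs(z_statistic) > z_critical(alpha)


print(round(z_critical(0.05), 2))
print(reject_null(2.5), reject_null(1.0))
```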

Critiquing
Uses the same principles as literature reviews. In a critique of a paper or piece of research you
must analyse method as well as content, and say whether or not you agree with the arguments
and why. If you disagree you need to be able to say what evidence supports your objection. You
also need to identify gaps in the literature.

Cronbach’s Alpha
One specific method of estimating the reliability of a measure. Although not calculated in this
manner, Cronbach’s Alpha can be thought of as analogous to the average of all possible split-half
correlations.

Cross-sectional survey
Survey conducted at one time.

Cross-tabulation
Tabular summary of an association in which each individual is assigned to one and only one cell,
representing a combination of the levels of the variables.

Cross-sectional correlation
Association of variables measured at the same time; also synchronous, static, or on-time
correlation.

Crossbreaks
Also called cross-tabulation ("crosstabs") and cross partitions. A way of arranging data about
categorical variables in a matrix so that relations can be more clearly seen. This is not to be
confused with a factorial table, in which two or more variables are related to a third. While not all
researchers make these distinctions in the terms, the concepts are quite distinct.

Cross-level inference
Drawing a causal conclusion at one level of analysis using data from another level.

Cross-sectional data
Data collected at the same or approximately the same point in time.

Cross-sectional
A study that takes place at a single point in time.

Cross-tabulating
A process of analysing data according to one or more key variables. A common example is to
analyse data by the gender of the research subject or respondent, so that you can compare
findings for men with findings for women. Also known as cross-referencing.

Cross-tabulation
A tabular summary of data for two variables. Classes for one variable are represented by the
rows, while classes for the other variable are represented by the columns.
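
A cross-tabulation of two variables can be sketched as a simple count of each combination of classes. The function name and the gender/answer data are hypothetical.

```python
from collections import Counter


def cross_tabulate(row_values, col_values):
    """Count how often each (row class, column class) combination occurs."""
    counts = Counter(zip(row_values, col_values))
    row_labels = sorted(set(row_values))
    col_labels = sorted(set(col_values))
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


# Hypothetical responses of five participants.
gender = ["m", "f", "f", "m", "f"]
answer = ["yes", "no", "yes", "yes", "yes"]
table = cross_tabulate(gender, answer)
for label, row in table.items():
    print(label, row)
```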

Cumulative frequency distribution
A tabular summary of quantitative data showing the number of items with values less than or
equal to the upper class limit of each class.

Cumulative or Guttman scale
A method of scaling in which the items are assigned scale values that allow them to be placed in
a cumulative ordering with respect to the construct being scaled.

Cumulative percent frequency distribution
A tabular summary of quantitative data showing the percentage of items with values less than or
equal to the upper class limit of each class.

Cumulative relative frequency distribution
A tabular summary of quantitative data showing the fraction or proportion of items with values
less than or equal to the upper class limit of each class.
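
The cumulative, cumulative relative, and cumulative percent frequency distributions can all be derived from the same class frequencies, as the following sketch shows. The function name, upper class limits, and frequencies are made up for illustration.

```python
from itertools import accumulate


def cumulative_distributions(upper_limits, frequencies):
    """Return rows of (upper class limit, cumulative count, cumulative proportion, cumulative percent)."""
    total = sum(frequencies)
    running = accumulate(frequencies)  # running total of the class frequencies
    return [(limit, c, c / total, 100 * c / total) for limit, c in zip(upper_limits, running)]


# Hypothetical classes with upper limits 10, 20, 30 and frequencies 2, 5, 3.
rows = cumulative_distributions([10, 20, 30], [2, 5, 3])
for row in rows:
    print(row)
```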

Cycles
Patterns in time series marked by recurring highs and lows.

Data
Information collected by a researcher. (Data is the plural term; datum the singular). Data are often
thought of as statistical or quantitative, but they may take many other forms as well--such as
transcripts of interviews or videotapes of social interactions. Non-quantitative data such as
transcripts or videotapes are often coded or translated into numbers to make them easier to
analyze.

Data
The facts and figures that are collected, analyzed, and interpreted.

Data Analysis
Subjecting data to a systematic analysis which can range from statistical to textual analysis,
either manually or electronically.

Data audit
The process of reviewing data collection procedures and data to make judgments about the
potential for bias or distortion.

Data base
A collection of data organized for rapid search and retrieval, usually by a computer; often a
consolidation of many records previously stored separately.

Data collection bias
When procedures for collecting survey data lead to a consistent distortion from the true value of
the sample.

Data collection error
The random inconsistencies produced in gathering survey data.

Data Consolidation
This means combining qualitative and quantitative data to create new or consolidated variables or
data sets.

Data Conversion/Transformation
Collected quantitative data types are converted into narratives that can be analyzed qualitatively
(i.e., qualitized), and/or qualitative data types are converted into numerical codes that can be
statistically analyzed (i.e., quantitized).

Data Entry
Data may be entered on to computer directly via a keyboard, or by transferring blocks of coded or
raw data into the analysis program. With a manual approach it is usual to set out a matrix on
squared paper, with a square for each data item.

Data Quality
This is the degree to which the collected data (results of measurement or observation) meet the
standards of quality to be considered valid (trustworthy) and reliable (dependable). This term
has been used by Punch (1998) to represent “quality control of data” (p. 257) “in terms of
procedures in the collection of the data, and … in terms of three technical aspects of quality of the
data: reliability, validity, and reactivity” (p. 257).
(1) Data/Measurement validity: Do the results of
data collection truly represent the construct or phenomenon that they are expected to capture
(measure or represent)? “How well the data represent the phenomena for which they stand”
(Punch, 1998, p. 258). See also convergent validity and discriminant validity.
(2) Data/Measurement reliability: Do the obtained results of measurement or observation
accurately reflect the magnitude, intensity, or quality of the attribute or phenomenon that is being
measured or observed?

Data set
All the data collected in a particular study.

Data set
A collection of related data items, such as answers given by respondents to all questions on a
survey.

Data
In everyday conversation 'data' and 'information' are often used as meaning much the same
thing, but sometimes they are used differently. In research it is common to refer to the raw
material gathered as 'data', and it is then processed manually or on computer. It becomes
information when it acquires meaning through aggregation or interpretation by the researcher or
automated analysis. Some depict a hierarchy of: data -> information -> knowledge.

Debriefing
After running a study, explaining to a participant what happened and what the study is for,
explaining any deception used in the study, asking for any remaining comments or concerns, and
ensuring that the participant is left with no adverse consequences from the experience. This
sometimes involves providing contact information for groups that can provide support regarding a
difficult issue.

Debriefing
Researcher’s interview with the subject after the experiment to check the subject’s beliefs about
the study and to tell the subject about the purpose of the study.

Deception
The intentional withholding of information from participants, or deception about the study’s
purpose and exact nature, that is deemed necessary by the researcher in order to meet the
study’s goals. Deception should only be used when the researcher feels that participant
knowledge about the study would alter participants’ behavior or responses in the study.
Deception should not cause any adverse consequences to the participants, and participants
should be debriefed after running the study.

Deduction
Drawing of specific assertions from general principles.

Deductive
Top-down reasoning that works from the more general to the more specific.

Deductive inference (in research cycle)
This is a process in which hypotheses or predictions are formed on the basis of (1) a conceptual
framework that is constructed from the literature, (2) the inferences of a previous strand of a
mixed methods study, or (3) an existing theory. See also inference and inference quality.

Deductive logic
(Erzberger & Kelle, 2003) (1) This refers to the application of general rules to specific cases. For
example, from the general rule that all men are mortal, it can be deduced that if Socrates is a
man, then he will be mortal. (2) This refers to a type of reasoning usually applied if a link is drawn
from an already formulated theoretical statement to a statement about observable empirical facts,
a link that can be generalized in the following term: “If A (a theoretical statement) is true, then we
would expect the fact C to happen.”

Degrees of freedom
A parameter of the t distribution. When the t distribution is used in the computation of an interval
estimate of a population mean, the appropriate t distribution has n-1 degrees of freedom, where n
is the size of the simple random sample.

Degrees of freedom (df)
A statistical term that is a function of the sample size. In the t-test formula, for instance, the df is
the sum of the persons in both groups minus 2.
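
The two counting rules quoted above (n-1 for a one-sample t interval, and the persons in both groups minus 2 for a two-group t-test) can be sketched as follows. The function names are illustrative assumptions, not part of any library.

```python
def df_one_sample(n: int) -> int:
    """Degrees of freedom for a one-sample t interval: n - 1."""
    if n < 2:
        raise ValueError("need at least two observations")
    return n - 1


def df_two_group_t(n1: int, n2: int) -> int:
    """Degrees of freedom for a two-group t-test: n1 + n2 - 2."""
    if n1 < 1 or n2 < 1 or n1 + n2 < 3:
        raise ValueError("need at least three observations in total")
    return n1 + n2 - 2


print(df_one_sample(25))       # 24
print(df_two_group_t(12, 15))  # 25
```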

Degrees of freedom
In inferential statistics, amount of free information left in the data after calculating the inferential
statistic.

Delphi method
A qualitative forecasting method that obtains forecasts through group consensus.

Demand Characteristics
A bias that results when participants display characteristics because they are aware that they are
being observed.

Demand characteristics
Cues in the experimental situation that guide a subject’s view of the study.

Demographics
Information about the sample that includes areas such as age, sex, social class, presence of
children, etc.

Dependent variable
The variable that is being predicted or explained. It is denoted by y.

Dependent variable
The presumed effect in a study; so called because it "depends" on another variable. [#] The
variable whose values are predicted by the independent variable, whether or not caused by it. For
example, in a study to see if there is a relationship between students' drinking of alcoholic
beverages and their grade point averages, the drinking behavior would be the presumed cause
(independent variable); the grade point average would be the effect (dependent variable). [#] A
variable that varies due (at least in part) to the impact of the independent variable; that is, its
value “depends” on the value of the independent variable. For the variables “sex” and “academic
major,” academic major is the dependent variable: your major can’t determine whether you are
male or female, but your sex might indirectly lead you to favor one major over another (nationally,
men tend to major in engineering, women in education).

Descriptive association
Observing a relationship between variables without claiming causality.

Descriptive Research
Describes certain characteristics of populations, and identifies and explores relationships
between variables.

Descriptive statistics
Tabular, graphical and numerical methods used to summarize data. [#] Statistics that summarize
a data set, e.g., mean, median, mode, standard deviation. [#] Statistics used to describe the basic
features of the data in a study. [#] Statistics used to characterize a group of observations (for
example, central tendency or variability) or an association of two or more variables.
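
As a minimal sketch, the summary measures named above can be computed with Python's standard `statistics` module. The `scores` list is made-up illustrative data.

```python
import statistics

# Hypothetical sample of seven test scores (illustrative data only).
scores = [4, 8, 6, 5, 3, 8, 9]

summary = {
    "mean": statistics.mean(scores),      # central tendency
    "median": statistics.median(scores),  # middle value of the sorted data
    "mode": statistics.mode(scores),      # most frequent value
    "stdev": statistics.stdev(scores),    # sample standard deviation (variability)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```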

Descriptive Study
Any study that is not truly experimental (e.g., quasi-experimental studies, correlational studies,
record reviews, case histories, and observational studies).

Descriptive
Characterizing something or some relationship.

Deseasonalize
To remove the seasonal cycle from a time series by a statistical method.

Design
In research, the arrangement of subjects, experimental manipulation, and observation of results.

Desk Study
An umbrella name given to sedentary research, primarily reading and note taking, and thinking.

Detrend
To remove the trend from a time series by a statistical method.
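
One possible statistical method, shown here only as an assumption since the entry does not name one, is to fit a straight line by least squares against the time index and keep the residuals. The function name `detrend` is illustrative.

```python
def detrend(series):
    """Remove a fitted linear trend; returns the residuals around the line."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


# A perfectly linear series leaves residuals of zero.
print(detrend([1, 3, 5, 7, 9]))
```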

Developmental Research
A variant of applied research in that the research has a problem-solving function and leads to
further research on the basis of its own findings.

Dialectical position
(Greene & Caracelli, 2003) To think dialectically is to invite the juxtaposition of opposed or
contradictory ideas, to interact with the tensions invoked by these contesting arguments, or to
engage in the play of ideas. The arguments and ideas that are engaged in this dialectic stance
emanate from the assumptions that constitute philosophical paradigms—assumptions about the
social world, social knowledge, and the purpose of science in society.

Diary
A record of events, ideas or feelings in the life of, or affecting, the diary author. More specifically, a
research diary aims to be a record made by a respondent as close as possible to the time of
occurrence of the events/ideas/feelings, and may be both structured (e.g. a particular format for
recording) and focused (e.g. recording specific items only).

Dichotomous question
A question with two possible responses.

Diffusion or imitation of treatment
A social threat to internal validity that occurs because a comparison group learns about the
program either directly or indirectly from program group participants. In a school context, children from
different groups within the same school might share experiences during lunch hour. Or,
comparison group students, seeing what the program group is getting, might set up their own
experience to try to imitate that of the program group.

Direct causal path
In a theory, a simple, one-way causal connection between two constructs.

Direct causation
Impact of one variable on another, not involving any other variable.

Direct observation
The process of observing a phenomenon to gather information about it. This process is
distinguished from participant observation in that a direct observer doesn’t typically try to
become a participant in the context and does strive to be as unobtrusive as possible so as not to
bias the observations.

Discount rate
Interest rate used to adjust future monetary benefits for the rate of gain that could be expected by
some alternative use of the same funds.
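
A minimal sketch of how a discount rate adjusts a future benefit, assuming annual compounding (the entry itself does not specify a compounding convention). The function name `present_value` is illustrative.

```python
def present_value(future_amount: float, rate: float, years: int) -> float:
    """Value today of an amount received `years` from now, discounted at `rate` per year."""
    if rate <= -1:
        raise ValueError("rate must be greater than -1")
    return future_amount / (1 + rate) ** years


# A benefit of 110 received one year from now, discounted at 10% per year.
print(round(present_value(110, 0.10, 1), 2))  # 100.0
```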

Discrete measure
Type of quantitative variable that can take on only certain values between which there are gaps;
for example, counting the number of students enrolled in a class, for which fractions would be
nonsensical.

Discrete random variable
A random variable that may assume either a finite number of values or an infinite sequence of
values.

Discrete uniform probability distribution
A probability distribution for which each possible value of the random variable has the same
probability.
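
For illustration, a fair six-sided die is a discrete uniform distribution: each of the n = 6 values has probability 1/n. The sketch below uses exact fractions; the function name is an assumption.

```python
from fractions import Fraction


def discrete_uniform_pmf(values):
    """Assign the same probability 1/n to each of the n distinct values."""
    distinct = list(dict.fromkeys(values))  # drop repeats, keep order
    n = len(distinct)
    if n == 0:
        raise ValueError("need at least one value")
    return {v: Fraction(1, n) for v in distinct}


die = discrete_uniform_pmf(range(1, 7))
print(die[3])             # 1/6
print(sum(die.values()))  # 1
```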

Discriminant validity
The degree to which concepts that should not be related theoretically are, in fact, not interrelated
in reality.

Dispersion
The spread of the values around the central tendency. The two common measures of dispersion
are the range and the standard deviation.

Dissemination
The mechanisms by which the results of research are communicated to stakeholders and other
interested parties.

Distribution
The manner in which a variable takes different values in your data.

Distribution-free methods
Another name for non-parametric statistical methods that indicates the lack of assumptions about
the population probability distribution.

Divergent inference
(Erzberger & Kelle, 2003) This is when the inferences made on the basis of the two strands of a
mixed methods study are inconsistent or dissonant (Rossman & Wilson, 1985); that is, they do
not agree with each other. Inconsistencies between qualitative and quantitative findings might be
a consequence of the inadequacy of the applied theoretical concepts. It might, therefore, be
necessary to revise and modify the initial theoretical assumptions and to draw on further
theoretical concepts that have not yet been related to the domain in question.

Dot plot
A simple graphical summary of data with each observation represented by a dot placed above a
horizontal axis that shows the range of values for the observations.

Double entry
An automated method for checking data-entry accuracy in which you enter data once and then
enter it a second time, with the software automatically stopping each time a discrepancy is
detected until the data enterer resolves the discrepancy. This procedure assures extremely high
rates of data-entry accuracy, although it requires twice as long for data entry.
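
The comparison step of double entry can be sketched as below. Real data-entry software stops interactively at each mismatch; this simplified version, with an illustrative function name, only lists the positions where the two passes disagree.

```python
def find_discrepancies(first_pass, second_pass):
    """Return (position, first value, second value) for every mismatch."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of entries")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(first_pass, second_pass))
        if a != b
    ]


# The second entry was keyed differently in the two passes.
print(find_discrepancies([12, 47, 30], [12, 74, 30]))  # [(1, 47, 74)]
```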

Double pretest design
A design that includes two waves of measurement prior to the program.

Double-Blind Design
An experiment in which neither the participants nor the research staff who interact with them
know the membership of the experimental or control groups. Also known as Double-Masked
Design.

Dow Jones Averages
Aggregate price indexes designed to show common stock price trends and movement on the
New York Stock Exchange.

Dummy variable
A variable used to model the effect of qualitative independent variables. A dummy variable may
take only the value zero or one. [#] A variable that uses discrete numbers, usually 0 and 1, to
represent different groups in your study in the equations of the GLM.
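
A minimal sketch of dummy coding a qualitative variable into 0/1 columns. One category is treated as the reference level and gets no column of its own, which is a common convention assumed here rather than stated in the entry. All names are illustrative.

```python
def dummy_code(values, reference):
    """Return one dict of 0/1 indicators per observation, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return [
        {f"is_{level}": int(value == level) for level in levels}
        for value in values
    ]


groups = ["control", "drug_a", "drug_b", "control"]
for row in dummy_code(groups, reference="control"):
    print(row)
```

Observations in the reference category are represented by zeros in every column.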

Durbin-Watson test
A test to determine whether first-order autocorrelation is present.
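
The test statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, and values near 2 suggest no first-order autocorrelation. The sketch below computes only the statistic, not the critical values needed for a formal decision; the function name is illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Residuals that flip sign every period (negative autocorrelation) push the value above 2.
print(durbin_watson([1, -1, 1, -1]))  # 3.0
```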

Ecological fallacy
False interpretation of aggregate level data in individual-level terms.

Ecological fallacy
Faulty reasoning that results from making conclusions about individuals based only on analyses
of group data.

Ecological Fallacy
The fallacy one commits when making inferences about individuals from information about groups
or aggregates.

Ecological transferability
This refers to generalizability or applicability of inferences obtained in a study to other settings or
contexts. Subsumes the QUAN terms ecological validity and ecological external validity, and the
QUAL term transferability. See inference transferability.

Ecological validity
Extent to which a research situation represents the natural social environment.

Editing Data
The process of going over the data and ensuring that they are complete and acceptable for data
analysis.

Effect construct
Your abstract idea or theory of what the outcome is in a cause-effect relationship you are
investigating.

Effect size
This refers to the intensity, magnitude, or practical significance of an obtained result (e.g.,
relationship, difference) in the QUAL or QUAN strands of a mixed methods
study. Onwuegbuzie and Teddlie (2003) explicitly relate this historically QUAN term to QUAL
research, naming several new terms, including manifest effect size, frequency (manifest) effect
size, and intensity (manifest) effect size. [#] Numerical index of the magnitude of a relationship
found in a study; commonly used in meta-analysis.

Effect to variability ratio
Magnitude of an effect, such as the effect of an experimental treatment compared to a control
condition, that takes into account the dispersion of scores in the groups.
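
One widely used ratio of this kind is Cohen's d: the difference between group means divided by the pooled standard deviation. The entry does not name a specific index, so Cohen's d is an assumed example here, and the data are made up.

```python
import statistics


def cohens_d(treatment, control):
    """Mean difference divided by the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1 = statistics.variance(treatment)
    v2 = statistics.variance(control)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Means of 7 and 5 with a pooled standard deviation of 1.
print(cohens_d([6, 7, 8], [4, 5, 6]))  # 2.0
```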

Efficiency analysis
Stage of evaluative research that weighs the program’s outcomes by its costs.

Efficiency in Sampling
Attained when the sampling design chosen either results in a cost reduction to the research or
offers a greater degree of accuracy in terms of the sample size.

Electronic mail (e-mail)
The most useful of Internet services that allows one to send and receive messages from all over
the world almost instantaneously.

Electronic questionnaire
Online questionnaire administered when the microcomputer is hooked up to computer networks.

Element
A single member of the population. [#] Unit from whom survey information is collected, usually a
person.

Elements
The entities on which data are collected.

Emancipated Minor
A legal status given to those individuals who have not yet attained the age of
legal competency as defined by state law, but who are entitled to adult treatment because of
assuming adult responsibilities such as being self-supporting and not living at home, marriage, or
procreation.

Emancipatory Research
Emancipatory research is conducted on and with people from marginalised groups/communities.
It is led by a researcher or research team who is either an indigenous or external insider; is
interpreted within intellectual frameworks of that group; and is conducted largely for the purpose
of empowering members of that community and improving services for them. It also engages
members of the community as co-constructors or validators of knowledge.

Empirical
Based on direct observations and measurements of reality.

Empirical criterion approach
Measurement construction approach that selects items according to their ability to discriminate
groups known to differ on the dimension to be measured.

Empirical Research
Research conducted 'in the field', where data are gathered first hand. Case studies and surveys
are examples of empirical research.

Empirical rule
A rule that states the percentages of items that are within one, two and three standard deviations
from the mean for mound-shaped, or bell-shaped, distributions.
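
For an exactly normal distribution, the percentages behind the rule can be computed from the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; real mound-shaped data will only approximate these figures.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
# Roughly 68%, 95% and 99.7% of items respectively.
```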

Endogenous construct
In a theory, a construct that is caused by one or more other constructs, exogenous or
endogenous, within the theory.

Enterprise resource planning
Integrated system solution for standard business requirements for the enterprise, often supported
by a single application package.

Entry
First step in which the researcher gains access to the social setting to be studied.

Enumeration
List of all elements in the population; usually not available.

Environmental impact statements
Social and environmental impact reports required by law to be prepared and discussed before
new projects can begin.

Epistemological assumptions
The assumptions that underlie the theory of methods.

Epistemology
Branch of philosophy dealing with the nature of knowledge and our ability to know. [#]
The philosophy of knowledge or of how you come to know about the world.

Equitable
Fair or just; used in the context of selection of participants to indicate that the benefits and
burdens of research are fairly distributed.

Error
The difference between an observed score and a predicted or estimated score. Symbolized as e
or E. [#] The deviation of observed scores from true scores, including both random errors and
such nonrandom sources as bias.

Error term
A term in a regression equation that captures the degree to which the line is in error (for example,
the residual) in describing each point.

Estimated multiple regression equation
The estimate of the multiple regression equation based on sample data and the least squares
method.

Estimated regression equation
The estimate of the regression equation developed from sample data by using the least squares
method. For simple linear regression, the estimated regression equation is $\hat{y} = b_0 + b_1 x$.
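
A minimal sketch of the least squares method for simple linear regression, where the estimated equation has an intercept b0 and a slope b1. The function name and the data are illustrative assumptions.

```python
def fit_simple_regression(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


b0, b1 = fit_simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # 1.0 2.0
```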

Ethical committees
Health sector research to be conducted with or about patients has to be approved by the
appropriate local ethical committee. The main functions of ethical committees are to protect
patients, their families and staff, and to promote and uphold good research practice and
standards.

Ethical Research
Research that follows widely held guidelines about what is ethical, moral and responsible in
research settings (e.g. not plagiarizing others’ work, not misreporting sources, not submitting
questionable data, not destroying or concealing sources, etc.) and that considers its role in the
broader community and the effect of its findings on the community.

Ethics
Branch of philosophy that pertains to the study of right and wrong conduct. [#] Code of conduct
or expected societal norms of behavior.

Ethnocentrism
Perceptual bias because of one’s own cultural beliefs.

Ethnographic Research
Ethnography is the study of people and their cultures. Ethnographic research involves
observation of and interactions with the people or group being studied in the group’s own
environment, often for long periods of time.

Ethnography
It is a combination of ethnos = people or race, and graphy = to describe or write about. The
primary method used is observation, and the key features are a focus on description, multi-
dimensionality and noting processes. It is an all-embracing approach - 'ethnographers tend to go
looking, rather than go looking for something'.

Ethnomethodology
It is the study of common social knowledge, in particular as it concerns the understanding of
others and the varieties of circumstance in which it can take place.

Ethnography
Field research technique originating in anthropology that emphasizes the phenomenological
approach. [#] Study of a culture using qualitative field research.

Evaluation
A form of research used to assess the value or effectiveness of social care interventions or
programmes.

Evaluation apprehension
Subject’s anxiety generated by being tested.

Event
A collection of sample points.

Ex Post Facto Design
Studying subjects who have already been exposed to a stimulus and comparing them to those
not so exposed, so as to establish a cause-and-effect relationship (in contrast to establishing
cause-and-effect relationships by manipulating an independent variable in a lab or a field
setting). [#] Experimental design in which the control group is created after the treatment has
already taken place.

Exaggerating contamination
Problem of control subjects moving in a direction opposite to that of the experimental subjects (for
example, by resentful demoralization); has the effect of increasing the difference between
experimental and control conditions.

Exempt
Very low-risk category of review in which the investigator seeks clearance by an appointed
administrator such as the department chair rather than the IRB.

Exception dictionary
In a content analysis study, a dictionary that includes all nonessential words such as "is," "and," and "of."

Exception fallacy
A faulty conclusion reached as a result of basing a conclusion on exceptional or unique cases.

Exhaustive
The property of a variable that occurs when you include all possible answerable responses.

Exogenous construct
In a theory, a construct that causes other constructs but which itself has no cause specified within
the theory.

Exogenous variable
A variable that exerts an influence on the cause and effect relationship between two variables in
some way, and needs to be controlled.

Expected value
A measure of the mean, or central location, of a random variable.
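
For a discrete random variable, the expected value is the sum of each value multiplied by its probability. A minimal sketch with an assumed example (the number of heads in two fair coin tosses) follows; the function name is illustrative.

```python
from fractions import Fraction


def expected_value(pmf):
    """Sum of value * probability over a discrete probability distribution."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * prob for value, prob in pmf.items())


# Number of heads in two fair coin tosses.
heads = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
print(expected_value(heads))  # 1
```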

Expedited review
Category of IRB review involving low-risk research in which just one member of the IRB judges
the proposal in order to hasten its assessment. [#] Review of proposed research by the IRB chair
or a designated voting member or group of voting members rather than by the entire IRB. Federal
rules permit expedited review for certain kinds of research involving no more than minimal risk
and for minor changes in approved research.

Experiment
A study undertaken in which the researcher has control over some of the conditions in which the
study takes place and control over some aspects of the independent variables being studied.
Random assignment of the subjects to control and experimental groups is usually thought of as a
necessary criterion of a true experiment. For example, if you interviewed moviegoers as they
exited a theater to see if what they saw influenced their attitudes, this would not be experimental
research; you had no control over who the subjects were or what film they watched or the
conditions under which they watched it. On the other hand, if you chose a room, a film, and
subjects to assign randomly to control and experimental groups and interviewed these subjects
about the effects of the film on their attitudes, that would be an experiment.

Experimentwise Type I error
The probability of making a Type I error on at least one of several pairwise comparisons.
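
If each comparison is tested at the same level alpha and the comparisons are treated as independent (a simplifying assumption that pairwise comparisons on the same data do not strictly satisfy), this probability is 1 - (1 - alpha)^c for c comparisons. The function names below are illustrative.

```python
import math


def pairwise_comparisons(groups: int) -> int:
    """Number of distinct pairs that can be formed from the groups."""
    return math.comb(groups, 2)


def experimentwise_error(alpha: float, comparisons: int) -> float:
    """Chance of at least one Type I error across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons


c = pairwise_comparisons(3)  # three groups give three pairs
print(c, round(experimentwise_error(0.05, c), 4))  # 3 0.1426
```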

Experimental construct validity
Extent to which a manipulated independent variable reflects the intended construct.

Experimental design
The art of planning and executing experiments. The greatest strength of an experimental
research design, due largely to random assignment, is its internal validity: One can be more
certain than with any other design about attributing cause to the independent variables. The
greatest weakness of experimental designs may be external validity: It may be hard to generalize
results beyond the laboratory. [#] A study design in which the researcher might create an artificial
setting, control some variables, and manipulate the independent variable to establish
cause-and-effect relationships. [#] A study design that calls for the control or manipulation of the
independent variable in some way. A study design in which participants are randomly assigned to
experimental groups and receive treatment in the form of the independent variable. [#] Research
approach in which the independent variable is fixed by a manipulation or natural occurrence.

Experimental Group
A group receiving some treatment in an experiment. Data collected about people in the
experimental group are compared with data about people in a control group (who received no
treatment) and/or another experimental group (who received a different treatment). [#] In
experimental conditions it is common to test the validity of a cause/effect relationship by having
two groups of research subjects, an experimental group and a control group. In the former group
the causal (independent) variable is present: in the latter it is explicitly excluded. For example, in
a study to test the impact of counselling on carer stress, the experimental group of carers would
receive counselling, the control group would not. [#] The group exposed to a treatment in an
experimental design. [#] The group in an experimental design study that receives treatment in
the form, or in various forms, of the independent variable. This group can thus be compared to
the control group.

Experimental realism
Extent to which experimental procedures produce a high level of psychological involvement and,
presumably, natural behavior regardless of the degree of mundane realism.

Experimental Research
It is a style of research in which the researcher generates or manipulates a causal factor and then
seeks to observe or measure the effects which follow. In a drug trial, for example, a group of
patients with a particular illness are given a drug which it is hoped will alleviate or cure the illness,
and the effect of the drug is monitored. A pure experimental approach involves the random
selection of research subjects and control of extraneous variables, as well as manipulation of the
independent variable. See also Quasi-experimental.

Experimental units
The objects of interest in the experiment.

Experimental
Term used to denote a therapy (drug, device, procedure) that is unproven or not yet scientifically
validated in terms of safety and efficacy. A procedure may be considered “experimental” without
necessarily being part of a formal study to evaluate its usefulness.

Experimenter expectancy
Mechanism(s) by which the researcher biases the behavior of the subject to get the
hypothesized results; also called self-fulfilling prophecy and Pygmalion effect.

Expert sampling
A sample of people with known or demonstrable experience and expertise in some area.

Expert system
An inference engine that uses stored knowledge and rules of if-then relationships to solve
problems.

Explorative Research
Seeks to find out more about phenomena which are little known. Explorative studies approach a
topic broadly to identify the range of issues and opinions associated with it. They are often the
fore-runners of more specific research which studies the identified topics in greater depth. [#]
Data collection and analysis aimed at formulating hypotheses.

Exploratory study
A research study where very little knowledge or information is available on the subject under
investigation.

Exponential probability distribution
A continuous probability distribution that is useful in computing probabilities for the time it takes to
complete a task.
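
For an exponential distribution with mean mu, the probability that the task takes at most x0 time units is 1 - e^(-x0/mu). A minimal sketch, with an illustrative function name and made-up figures:

```python
import math


def exponential_cdf(x0: float, mean: float) -> float:
    """P(X <= x0) for an exponential distribution with the given mean."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    if x0 < 0:
        return 0.0
    return 1 - math.exp(-x0 / mean)


# If a task takes 15 minutes on average, the chance it is done within 15 minutes:
print(round(exponential_cdf(15, 15), 4))  # 0.6321
```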

External consultants
Research experts outside the organization who are hired to study a specific problem and find
solutions.

External validity
The degree to which the conclusions in your study would hold for other places and at other times.
[#] The extent of generalizability of the results of a causal study to other field settings. [#] The
extent to which the findings of a study are relevant to subjects and settings beyond those in the
study. [#] This is defined by Cook and Campbell (1979, p. 37) as follows: “the approximate validity
with which we can infer that the presumed causal relationship can be generalized to and across
alternate measures of the cause and effect and across different types of persons, settings, and
times.”

Externalities
Costs or benefits that accrue to third parties not directly involved in the project as provider or
consumer; they may be accounted and paid for, or ‘internalized’, by the project.

Extraneous Variables
When an experiment is seeking to monitor the impact of one variable on another (like counselling
on stress level), attention has to be paid to other variables which could have an impact (that is,
other factors which could affect a person's stress level). These other variables are called
'extraneous'.

Face scale
A particular representation of the graphic scale, depicting faces with expressions that range from
smiling to sad.

Face Validity
An aspect of validity examining whether the item on the scale, on the face of it, reads as if it
indeed measures what it is supposed to measure. [#] A validity that checks that “on its face” the
operationalization seems like a good translation of the construct.

Face-to-Face interview
Information gathering when both the interviewer and interviewee meet in person.

Factor
(a) In analysis of variance, an independent variable, that is, a variable presumed to cause or
influence another variable; (b) in factor analysis, a cluster of related variables that are
distinguishable components of a larger set of variables; (c) a number by which another number is
multiplied, as in the statement: real estate values increased by a factor of three, meaning they
tripled.

Factor analysis
Statistical approach to measurement construction that measures the extent to which test items
agree with a common underlying dimension or factor.

Factorial design
Experimental design in which each subject is assigned to one or another combination of the
levels of two or more independent variables.

Factorial designs
Designs that focus on the program or treatment, its components, and its major dimensions and
enable you to determine whether the program has an effect, whether different subcomponents
are effective, and whether there are interactions in the effects caused by subcomponents.

Factorial experiment
An experimental design that allows statistical conclusions about two or more factors.

Factorial Validity
That which indicates, through the use of factor analytic techniques, whether a test is a pure
measure of some specific factor or dimension.

Faithful subject
Method for avoiding deception by asking subjects to comply with the experimental procedure and
to suspend their suspicions.

Fallibilism
In epistemology, the posture of doubting our own inductions.

Falsifiability
Aspect of an assertion that makes it vulnerable to being proven false; an essential ingredient in
the process of science.

Federal Policy (The)
The federal policy that provides regulations for the involvement of human participants in research.
The Policy applies to all research involving human participants that is conducted, supported, or
otherwise subject to regulation by any federal department or agency.

Feminist Research
Feminism is a conceptual tool for critiquing traditional sociological research with key concepts of
empowerment of women, the equality of the research relationship and understanding the social
constructions of gender. It can be understood in terms of values and epistemology more than
techniques and methodology. (See also Emancipatory research.)

Field experiment
An experiment done to detect cause-and-effect relationships in the natural environment in which
events normally occur.

Field research
A research method in which the researcher goes into the field to observe the phenomenon in its
natural state. [#] Behavioral, social, or anthropological research involving the study of people or
groups in their own environment and without manipulation for research purposes. Research
conducted in natural, real-life settings, outside the laboratory. This involves observation and, in
many cases, interactions with the people being studied.

Field research
Generally, any social research taking place in a natural setting; more narrowly, equivalent to
qualitative research.

Field study
A study conducted in the natural setting with a minimal amount of researcher interference with
the flow of events in the situation.

File drawer problem
Risk that statistically significant published findings are really Type I errors left after many
statistically nonsignificant but valid research reports are left in file drawers because of the
publishing bias against negative findings.

Filter
When only a section of the total sample is required to answer the question, e.g. if the question
asks why people are dissatisfied with a particular service, only those who are dissatisfied should
answer the question. Those who are satisfied will skip to the next question that is to be asked of
all respondents.

Filter or contingency question
A question you ask the respondents to determine whether they are qualified or experienced
enough to answer a subsequent one.

Finite population correction factor
The term $\sqrt{(N-n)/(N-1)}$ that is used in the formulas for $\sigma_{\bar{x}}$ and $\sigma_{\bar{p}}$ whenever a finite population, rather than an infinite
population, is being sampled. The generally accepted rule of thumb is to ignore the finite
population correction factor whenever $n/N \le 0.05$.
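
A minimal sketch of the correction factor in its standard form, the square root of (N - n)/(N - 1) for population size N and sample size n, together with the 5% rule of thumb. The function names are illustrative assumptions.

```python
import math


def finite_population_correction(N: int, n: int) -> float:
    """Factor that shrinks the standard error when sampling from a finite population."""
    if N < 2 or not 1 <= n <= N:
        raise ValueError("require N >= 2 and 1 <= n <= N")
    return math.sqrt((N - n) / (N - 1))


def can_ignore_fpc(N: int, n: int) -> bool:
    """Rule of thumb: ignore the factor when the sample is at most 5% of the population."""
    return n / N <= 0.05


print(round(finite_population_correction(100, 20), 4))  # 0.8989
print(can_ignore_fpc(100, 20))                          # False
```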

Fishing and the error rate problem
A problem that occurs as a result of conducting multiple analyses and treating each one as
independent.

Five-number summary
An exploratory data analysis technique that uses the following five numbers to summarize the
data set: smallest value, first quartile, median, third quartile and largest value.
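
The five numbers can be sketched with Python's `statistics.quantiles`. Textbooks and software use slightly different quartile conventions; the "inclusive" method is assumed here, so other tools may report different first and third quartiles for the same data.

```python
import statistics


def five_number_summary(data):
    """Return (smallest, first quartile, median, third quartile, largest)."""
    ordered = sorted(data)
    if len(ordered) < 2:
        raise ValueError("need at least two observations")
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


print(five_number_summary([9, 1, 5, 3, 7, 2, 8, 4, 6]))
# (1, 3.0, 5.0, 7.0, 9)
```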

Focus Groups
A group consisting of 8 to 10 members randomly chosen, who discuss a product or any given topic
for about 2 hours with a moderator present, so that their opinions can serve as the basis for further
research. [#] They are open-ended, discursive, and are used to gain a deeper understanding of
respondents' attitudes and opinions. Focus groups typically involve between 6-10 people, and last
for 1-2 hours. A key feature is that participants are able to interact with, and react to, each other. In
order to facilitate this group dynamic it is important to ensure that participants do not know each
other beforehand and that they are broadly 'compatible'.

Forced Choice
Elicits the ranking of objects relative to one another.

Formative evaluation
Evaluations that strengthen or improve the object being evaluated. Formative evaluations are
used to improve programs while they are still under development.

Frame
A list of the sampling units for a study. The sample is drawn by selecting units from the frame.

Frequencies
The number of times various subcategories of a phenomenon occur, from which the percentage
and cumulative percentage of any occurrence can be calculated.

Frequency distribution
A tabular summary of data showing the number (or frequency) of items in each of several non-
overlapping classes. [#] A summary of the frequency of individual values or ranges of values for a
variable. [#] Visual summary of a group of observations in which the number of occurrences of
each score (frequency) is indicated on the vertical axis and the value of the score on the
horizontal axis.
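
A tabular frequency distribution, with the percentage and cumulative percentage for each value, can be sketched as follows. The function name and the responses are illustrative.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (value, frequency, percent, cumulative percent), sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100 * counts[value] / total
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows


# Hypothetical answers on a 1-3 rating scale.
for row in frequency_table([1, 2, 2, 3]):
    print(row)
```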

Frequency polygon
Type of frequency distribution in which a line joins points representing the frequencies of the
scores.

Frequency Tables/Tabulations
A set of data which provides a count of the number of occasions on which a particular
answer/response has been given across all of those respondents who tackled the question.

Full Board Review
Review of proposed research at a convened meeting at which the majority of the IRB members
are present, including one member whose primary concerns are in nonscientific areas. For the
research to be approved, it must receive the approval of a majority of those members present at
the meeting.

Full review
Category of IRB review in which the entire committee analyzes the research proposal.

Fully integrated mixed model design
This is a multistrand concurrent design in which mixing of QUAL and QUAN approaches occurs in
an interactive (i.e., dynamic, reciprocal, interdependent, iterative) manner at all stages of the
study. At each stage (e.g., in formulating questions), one approach (e.g., QUAL) affects the
formulation of the other (e.g., QUAN).

Fully-crossed factorial design
A design that includes the pairing of every combination of factor levels.

Fundamental principle of mixed methods research
Johnson and Turner (2003) define this principle as follows: “Methods should be mixed in a way
that has complementary strengths and non-overlapping weaknesses. … It involves the
recognition that all methods have their limitations as well as their strengths. The fundamental
principle is followed for at least three reasons: (a) to obtain convergence or corroboration of
findings, (b) to eliminate or minimize key plausible alternative explanations for conclusions drawn
from the research data, and (c) to elucidate the divergent aspects of a phenomenon. The
fundamental principle can be applied to all stages or components of the research process.”

Funneling Technique
The questioning technique that consists of initially asking general and broad questions, and
gradually narrowing the focus thereafter on more specific themes.

Gamma
PRE (proportional reduction in error) measure of association for ordinal variables.

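General linear model
A model of the form $y = \beta_0 + \beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_p z_p + \varepsilon$, where each of the
independent variables $z_j$, $j = 1, 2, \ldots, p$, is a function of $x_1, x_2, \ldots, x_k$, the variables for
which data have been collected.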

Generalisable
It has both a standard and technical use in research methods. As in normal conversation, it can
describe the extent to which the findings from a study of a sample can be generalised into
conclusions about the total research population. However, it should be used in a more technical
way, in terms of meaning how results from a sample can be generalised to a greater or lesser
extent according to the outcome of statistical tests of significance.

Generality
Theory attribute of being widely applicable, that is, being able to account for many different
observations.

Generalizability
The applicability of research findings in one setting to others. [#] The extent to which you can
come to conclusions about one thing (often a population) based on information about another
(often a sample). [#] The ability to apply the results of a specific study to groups or situations
beyond those actually studied.

Generative theory
The cause produces the effect, a view of causation that requires strong tests.

Gestalt principle
This refers to the whole or the totality. Gestalt psychology is known for the principle (among many
others) stating that the whole is bigger than the sum of its parts. The Gestalt principle is applied to
mixed methods ... to demonstrate that global inferences made at the end of mixed methods
studies are more than the simple sum of the inferences gleaned from QUAL and QUAN strands.

GLM (General Linear Model)
A system of equations that is used as the mathematical framework for most of the statistical
analyses used in applied social research.

Going native
Role shift in which the researcher gives up the neutral scientific perspective and becomes a
committed member or proponent of the group under study.

Goodness of fit test
A statistical test conducted to determine whether to reject a hypothesized probability distribution
for a population.
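One common instance is the chi-square goodness of fit test. The sketch below computes only the test statistic, assuming observed and expected counts are already tabulated; the function name and the die-roll counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: a die rolled 60 times; a fair die expects 10 per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
statistic = chi_square_statistic(observed, expected)
print(statistic)
```

The statistic is then compared with a chi-square critical value for (number of categories - 1) degrees of freedom. With 5 degrees of freedom and a 0.05 level of significance that value is about 11.07, so a statistic below it does not lead to rejecting the hypothesized distribution.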

Goodness of Measures
Attests to the reliability and validity of measures.

Gradient of similarity
The dimension along which your study context can be related to other potential contexts to which
you might wish to generalize. Contexts that are closer to yours along the gradient of similarity of
place, time, people, and so on can be generalized to with more confidence than ones that are
further away.

Graphic Rating Scale
A scale that graphically illustrates the responses that can be provided, rather than specifying any discrete response
categories.

Grey Documents
They are an umbrella heading for the paperwork which circulates around governmental and
private organisations, such as committee minutes, internal discussion documents, planning
papers and so forth. It is literature which is not 'published' in the conventional sense, but is
usually available on request.

Grounded Theory
Usually relates to qualitative research. The researcher starts by collecting evidence on a topic (or
phenomenon), and then sees what theoretical propositions the evidence will support. This is
described as an inductive process, or one in which the theory that arises is 'grounded' in the
evidence.

Grounded theory
A theory rooted in observation about phenomena of interest. Also, a method for achieving such a
theory.

Group threats
Internal validity threats to between-subjects designs; protection against such threats is provided by
random assignment to groups.

Group Videoconferencing
Video transmittal technology that enables remote groups of people to participate in a conference
using video cameras and monitors.

Grouped data
Data available in class intervals as summarized by a frequency distribution. Individual values of
the original data are not available.

Groupware
A software that enables teams on a network to work on joint projects and access data
simultaneously.

Guardian
An individual who is authorized under applicable state or local law to give permission on behalf of
a child to general medical care.

Hard Data
Precise data, like dates of birth or income levels, which can reasonably be subjected to precise
forms of analysis, such as statistical testing.

Hawthorne effect
A type of demand characteristic in which the researcher’s attention was supposed to increase
subject’s effort; not confirmed by recent research.

Heterogeneity sampling
Sampling for diversity or variety.

Heterogeneous irrelevancies
Variations across studies in method, population, place, and time that are not expected to affect
the outcome of replications.

Hierarchical modeling
The incorporation of multiple units of analysis within a single analytical model. For instance, in an
educational study, you might want to compare student performance with teacher expectations.
Examining this relationship would require averaging student performance for each class because
each teacher has multiple students and you are collecting data at both the teacher and student
level.

High leverage points
Observations with extreme values for the independent variables.

Histogram
A graphical presentation of a frequency distribution, relative frequency distribution, or percent
frequency distribution of quantitative data. It is constructed by placing the class intervals on the
horizontal axis and the frequencies on the vertical axis.

Histogram
Type of frequency distribution in which vertical bars represent the frequencies of the scores.

Historical Controls
Control participants (followed at some time in the past or whose data are available through
records) who are used for comparison with participants being treated concurrently. The study is
considered historically controlled when the present condition of participants is compared with their
own condition on a prior regimen or treatment.

History Effects
A threat to the internal validity of the experimental results, when events unexpectedly occur while
the experiment is in progress and contaminate the cause-and-effect relationship.

History threat
A threat to internal validity that occurs when some historical event affects your study outcome.

History
Time threat to internal validity in which some event unrelated to the experimental intervention
causes the observed change.

Hypergeometric probability function
The function used to compute the probability of x successes in n trials when the trials are
dependent.
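As a hedged illustration, the trials are dependent because items are drawn without replacement from a finite population. The sketch below assumes a population of N items of which r are successes; the symbols N and r and the function name are not from the entry above and are introduced here only for the example.

```python
from math import comb

def hypergeom_pmf(x, n, r, N):
    """Probability of x successes in n draws without replacement
    from a population of N items containing r successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Hypothetical example: 10 items, 4 of them successes, draw 3 without replacement.
print(hypergeom_pmf(1, 3, 4, 10))  # 4 * 15 / 120 = 0.5
```

Summing the function over every possible x for fixed n, r, and N gives 1, which is a useful sanity check.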

Hypothesis
A statement which research sets out to prove or disprove. There
are two types of hypothesis: 'experimental' where the hypothesis is
a positive statement, such as 'carers who attend a support group
have better coping skills' or 'null' where the statement contains a
negative, for example, 'carers who attend a support group do not
have better coping skills'. [#] An educated conjecture about the
logically developed relationship between two or more variables,
expressed in the form of testable statements. [#] A testable
statement of how two or more variables are expected to be related to one another.

Hypothesis test
Procedure by which a hypothesis is checked for its fit or agreement with observations.

Hypothesis Testing
A means of testing if the if-then statements generated from the theoretical framework hold true
when subjected to rigorous examination.

Hypothesis
Prediction about operational variables, usually drawn by deduction from a theory.

Hypothetical-deductive model
A model in which two hypotheses are tested such that together they cover all possible outcomes,
so that if one hypothesis is accepted the second must therefore be rejected.

Hypothetico-deductive Method of Research
A seven-step process of observing, preliminary data gathering, theorizing, hypothesizing, collecting
further data, analyzing data, and interpreting the results to arrive at conclusions.

Idiographic
Laws or rules that relate to individuals.

Idiographic
Research approach that tries to understand persons or situations for their unique characteristics
without trying to generalize (as opposed to nomothetic approach).

Incapacity
Refers to a person’s mental status and means the inability to understand information presented,
to appreciate the consequences of acting (or not acting) on that information, and to make a
choice.

Incidence
Number of new cases of a disorder appearing in a given time period.

Incompetence
Used as a legal term to indicate the inability to manage one’s own affairs.

Incomplete factorial design
A design in which some cells or combinations in a fully-crossed factorial design are intentionally
left empty.

Incomplete within subjects design
Multiple intervention design in which not all possible orders of presentation of the experimental
manipulations are given to each subject.

Independent samples
Samples selected from two (or more) populations in such a way that the elements making up one
sample are chosen independently of the elements making up the other sample(s).

Independent variable
The presumed cause in a study. Also a variable that can be used to predict the values of another
variable. Compare dependent variable. Some authors use the term "independent variable" for
experimental research only; for non-experimental research, they use predictor variable. [#] The
variable that is doing the predicting or explaining. It is denoted by x.
[#] A variable that influences the dependent or criterion variable and
accounts for (or explains) its variance. [#] The conditions of an
experiment that are systematically manipulated by the investigator. [#]
A variable that is not impacted by the dependent variable, and that
itself impacts the dependent variable. [#] The variable that you
manipulate. For instance, program or treatment is typically an
independent variable. [#] In a research project which seeks to
establish cause and effect between variables, the potential causal variable is known as the
independent variable, and the variable(s) where effects are under scrutiny is dependent. [#] In
hypothesis tests, a variable that is supposed to cause one or more other variables and is not
caused by them, that is, it is independent of them.

In-Depth Interview
A method of data collection in which a participant is interviewed in detail about a certain research
participant. In this format, the interviewer leads the discussion flexibly along some pre-structured
topics, but also allows the participant to expand upon topics in-depth and to explore new avenues
of discussion.

Index
In the context of an online catalog search, the part of the instructions that tells the computer what
type of file to search- author, title, or subject.

Indirect causation
A set of two or more causal connections by which one construct or variable causes a second
indirectly via one or more intervening constructs or variables.

Indirect measure
An unobtrusive measure that occurs naturally in a research context.

Individual-level data
Based on unit of analysis consisting of individuals.

Individual Review
A review of a single paper or article. In a review of a paper or piece of research you must analyse
method as well as content, and say whether or not you agree with the arguments and why. If you
disagree you need to be able to say what evidence supports your objection. You also need to
identify gaps in the literature. Also known as critiquing.

Induction
Creation of general principles from specific observations. [#] The process by which general
propositions based on observed facts are established.

Inductive inference (in research cycle)
This refers to a process of creating meaningful and consistent explanations,
understandings, conceptual frameworks, and/or theories by integrating (a) the current knowledge
gleaned from the literature, (b) concrete observations or facts, (c) results of data analysis in a
research project, and (d) findings of a previous strand of a mixed methods study
(Tashakkori & Teddlie, 1998).

Inductive
Bottom-up reasoning that begins with specific observations and measures and ends up as
general conclusion or theory.

Inference quality
This is proposed as a mixed methods term to incorporate the QUAN term internal validity and the
QUAL terms trustworthiness and credibility of interpretations (Tashakkori & Teddlie, 1998, 2003).
The definition of the term is as follows: the degree to which the interpretations and conclusions
made on the basis of the results meet the professional standards of rigor, trustworthiness, and
acceptability as well as the degree to which alternative plausible explanations for the obtained
results can be ruled out. Inference quality consists of Design Quality (within-design consistency)
and Interpretive Rigor [conceptual (or inferential) consistency, interpretive agreement (or
interpretive consistency), and interpretive distinctiveness].

Inference transferability
This refers to generalizability or applicability of inferences obtained in a study to other individuals
or entities (see population transferability), other settings or situations (see ecological
transferability), other time periods (see temporal transferability), or other methods of
observation/measurement (see operational transferability). It subsumes the QUAN terms external
validity and generalizability as well as the QUAL term transferability.

Inference
This is an umbrella term referring to a final outcome of a study. The outcome may consist of a
conclusion about, an understanding of, or an explanation for an event, a behavior, a relationship,
or a case. [#] This is “a conclusion reached” where there is either (a) a “deduction from premises
that are accepted as true” or (b) an induction by “deriving a conclusion from factual statements
taken as evidence for the conclusion” (Angeles, 1981, p. 133).

Inferential statistics
Statistical analyses used to reach conclusions that extend beyond the immediate data alone. [#]
Statistics that help to establish relationships among variables and draw conclusions therefrom. [#]
Statistics with a known probability distribution that can be computed to determine whether an effect
observed in a sample or samples is due to chance.

Influential observation
An observation that has a strong influence or effect on the regression results.

Informant
In field research, a person who is “native” to the social situation being studied, who assists the
researcher by providing insider information and serving as a go-between.

Information System
The system that acquires, stores, and retrieves all relevant information for a specific group of
functions (e.g., manufacturing information system).

Information
In everyday conversation we often use 'information' and 'data' as meaning much the same thing,
but it is necessary to differentiate them. In a computerised database or information system it is
common to refer to the raw material entered into the computer as 'data', and it is then processed
by the computer to become 'information'. Knowledge can be seen as a higher level interpretation
of the information.

Informed Consent
An agreement to take part in research which is based on a full explanation and understanding of
why the research is being undertaken and any impact/effects it
might have on participants. How you obtain informed consent is a
major ethical consideration in research, especially with people who
are mentally confused or who have a learning disability. [#] A
policy of informing study participants about the procedures and
risks involved in research that ensures that all participants must
give their consent to participate. [#] The principle that
potential participants are given adequate and accurate information
about a study before they are asked to agree to participate, and
that they do in fact agree (consent) to participate. In giving
informed consent, participants may not waive or appear to waive
any of their legal rights, or release or appear to release the
investigator, the sponsor, the institution or agents thereof from liability for negligence. [#] Key
requirement for IRB approval in which the subject must give voluntary written consent before
participating based on adequate information and ability.

Inkblot Tests
A motivational research technique that uses colored patterns of inkblots to be interpreted by the
subjects.

Institutional Review Board (IRB)
A specially constituted review body established or designated by an entity to protect the welfare
of human participants recruited to participate in biomedical or behavioral research. [#] A panel of
people who review grant proposals with respect to ethical implications and decide whether
additional actions need to be taken to assure the safety and rights of participants.

Institutional review boards (IRBs)
Committees established by U.S. federal regulations at each research institution to protect human
subjects from abuses through prior review of research proposals.

Institutionalized Cognitively Impaired
Persons who are confined, either voluntarily or involuntarily, in a facility for the care of the
mentally or otherwise disabled (e.g. a psychiatric hospital, home, or school for the mentally
disabled).

Institutionalized
Confined, either voluntarily or involuntarily (e.g., in a hospital, prison, or nursing home).

Instrumental utilization
Evaluation use that actually affects decision making.

Instrumentation Effects
The threat to internal validity in experimental designs caused by changes in the measuring
instrument between the pre test and the post test.

Instrumentation threat
A threat to internal validity that arises when the instruments (or observers) used on the posttest
and the pretest differ.

Integrity
In science, utter honesty in conducting research including seeking and reporting data contrary to
one’s own belief.

Interaction
The effect produced when the levels of one factor interact with the levels of another factor
influencing the response variable.

Interaction effect
An effect that occurs when differences on one factor depend on which level you are on another
factor.

Interaction
Effect of one independent variable depends on the level of another independent variable.

Interactive causation
Direct causation of one variable by another that varies with the level of another variable.

Interactive model
(Maxwell & Loomis, 2003) Applied to mixed methods research, this model indicates that “the
different components of actual mixed methods studies are … connected in a network or web
rather than a linear or cyclic sequence.”

Intercept
In regression analysis, the point on the vertical axis where it meets the regression line, that is the
estimated value of the outcome variable when the predictor variable has the value of zero; usually
symbolized by the letter a.

Internal Consistency
Homogeneity of the items in the measure that tap a construct.

Internal rate of return (IRR)
Yield in monetary terms of a program investment expressed as an annual rate.

Internal Validity of Experiments
Attests to the confidence that can be placed in the cause-and-effect relationship found in
experimental designs.

Internal validity
The extent to which the results of a study (usually an experiment) can be attributed to the
treatments rather than a flaw in the research design; in other words, the degree to which one can
draw valid conclusions about the causal effects of one variable on another. [#] The approximate
truth about inferences regarding cause-effect relationships. [#] Truthfulness of the assertion that
the observed effect is due to the independent variable(s) in the study.

Internet
A vast network of computers connecting people and information worldwide.

Interpretive agreement (or interpretive consistency)
This refers to consistency of interpretations across people (e.g., consistency among scholars,
consistency with participants’ construction of reality).

Interpretive distinctiveness
This is the degree to which the inferences are distinctively different from (and superior to) other
possible interpretations of the results and the rival explanations are ruled out (eliminated).

Interrater Reliability
The consistency of the judgment of several raters on how they see a phenomenon or interpret
the activities in a situation.

Interrupted time series design
Single-group quasi-experiment that assesses a treatment with numerous pretests and posttests.

Interitem Consistency Reliability
A test of the consistency of responses to all the items in a measure to establish that they hang
together as a set.

Interval estimate
An estimate of a population parameter that provides an interval believed to contain the value of
the parameter.

Interval level response
A response measured on an interval level, where the size of the interval between potential
response values is meaningful. Most 1-to-5 rating responses can be considered interval level.

Interval level
Type of measurement that assigns scores on a scale with equal intervals.

Interval scale
A scale of measurement for a variable that has the properties of ordinal data and the interval
between observations is expressed in terms of a fixed unit of measure. Interval data are always
numeric. [#] A multipoint scale that taps the differences, the order, and the equality of the
magnitude of the differences in the responses.

Intervening Variable
A variable that surfaces as a function of the independent variable, and helps in conceptualizing
and explaining the influence of the independent variable on the dependent variable. [#]
Measured variable in a hypothesis test or a theoretical variable in a theory that is the effect of
one variable and a cause of another.

Interview guide
Checklist of topics that the qualitative interviewer wants to cover.

Interview: Group
Interviewing people in a group, rather than as separate individuals, is a way of finding out if a
consensus view exists in a homogeneous group on any given subject, or to identify the range of
views that might be held. It also allows the opportunity to observe and record group dynamics,
and to gather data which arises from individuals in the group stimulating each other. The focus
group is a well known example of a group interview.

Interview: Individual
The individual interview allows you to obtain personalised data about each respondent, and, if
using a structured format, develop a data set for quantitative analysis. Individual interviews can
take place face to face, by telephone or through the internet, via email.

Interview: Semi-structured
Contains a mix of structured questions, often to get factual data, and more general open-ended
questions which allow the respondent to elaborate on particular issues.

Interview: Structured
Uses questions, commonly in a questionnaire, which are precisely worded and where possible
responses are 'pre-coded'.

Interview: Unstructured
Also sometimes called 'in depth' or 'free story' interviews. Unstructured interviews can be thought
of as conversations with a purpose, but don't be tempted to regard them as vague chats.

Interviewing
A data collection method in which the researcher asks for information verbally from the
respondents.

Intranet
A network that connects people and resources within the organization.

Intrasession history
Events internal to the experimental procedure.

Interquartile range (IQR)
A measure of variability defined as the difference between the third and first quartiles.
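A minimal sketch of the calculation, using Python's standard `statistics.quantiles`. Textbooks and software use several conventions for locating quartiles, so the `"inclusive"` method chosen here is an assumption and other conventions can give slightly different values; the function name and data are hypothetical.

```python
from statistics import quantiles

def interquartile_range(data):
    """Third quartile minus first quartile."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical scores
print(interquartile_range(scores))    # Q1 = 3, Q3 = 7, so IQR = 4.0
```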

Investigator
In clinical trials, the individual who actually conducts the investigation.

Itemized Rating Scale
A scale that offers several categories of responses, out of which the respondent picks the one
most relevant for answering the question.

Judgment Sampling
A purposive, non probability sampling design in which the sample subject is chosen on the basis
of the individual’s ability to provide the type of special information needed by the researcher. [#]
A non-probabilistic method of sampling whereby element selection is based on the judgment of
the person doing the study.

Justice
An ethical principle that requires fairness in the distribution of burdens and benefits; often
expressed in terms of treating persons of similar circumstances or characteristics similarly.

Kappa
Measure of interrater agreement adjusted for chance agreement.
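As an illustration, the sketch below assumes the two-rater form usually called Cohen's kappa: the observed agreement minus the agreement expected by chance, divided by one minus the chance agreement. The function name and ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-adjusted agreement between two raters coding the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two raters to four items.
a = ["yes", "yes", "no", "no"]
b = ["yes", "no", "no", "no"]
print(cohens_kappa(a, b))  # observed 0.75, expected 0.50, kappa 0.5
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance. The formula is undefined when chance agreement equals 1, for example when both raters use a single category throughout.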

Key Informants
Are people who are known to have knowledge, experience, expertise and/or opinions specific to
the subject of the research, and who are selected as data sources for this reason. [#] Member of
social setting who serves as a major source of information about the setting for a qualitative
researcher.

Key Words
[or sometimes key concepts] Are those words or short phrases you identify as best describing
important aspects of your subject area. In practice there are two places
where key words are most frequently encountered - in the index of a
book, and as labels for carrying out a bibliographic search, usually by
computer. Key words are sometimes also known as 'descriptors'. [#] In
the context of an online catalog search, the part of the instructions that
tells the computer the specific term to search for – for example, the
author’s name or the book’s title.

Kinesics
Pertaining to bodily movements, especially the study of the communicative aspects of such
movements.

Kruskal-Wallis test
A non-parametric test for identifying differences among three or more populations.

Kurtosis
Degree to which the frequency distribution is flat or peaked.

Lab Experiment
An experimental design set up in an artificially contrived setting where controls and manipulations
are introduced to establish cause-and-effect relationships among variables of interest to the
researcher.

Lambda
PRE (proportional reduction in error) measure of association for nominal variables.

Laspeyres index
A weighted aggregate price index in which the weight for each item is its base-period quantity.
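A short sketch of the computation, scaled so that the base period equals 100. The function name and the prices and quantities are hypothetical.

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """Cost of the base-period basket at current prices,
    relative to its cost at base-period prices, times 100."""
    current_cost = sum(p * q for p, q in zip(current_prices, base_quantities))
    base_cost = sum(p * q for p, q in zip(base_prices, base_quantities))
    return 100 * current_cost / base_cost

# Hypothetical two-item basket.
print(laspeyres_index(base_prices=[2, 5],
                      current_prices=[3, 6],
                      base_quantities=[10, 4]))  # 100 * 54 / 40 = 135.0
```

Because the weights stay fixed at base-period quantities, only price changes move the index.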

Latent variable
Unmeasured variable constructed statistically from two or more measured variables.

Law of large numbers
Principle that larger sample sizes make for better estimates of population values.

Leading Questions
Questions phrased in such a manner as to lead the respondent to give the answers that the
researcher would like to obtain.

Least squares method
The method used to develop the estimated regression equation. It minimizes the sum of squared
residuals (the deviations between the observed values of the dependent variable, $y_i$, and the
estimated values of the dependent variable, $\hat{y}_i$).

Least squares
The criterion for fitting a regression line so that you minimize the sum of the squares of the
residuals from the regression line.
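A minimal sketch for the simplest case of one predictor, using the standard closed-form estimates of the slope and intercept. The function names and the data points are hypothetical.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between observed and fitted values."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

x = [1, 2, 3, 4]  # hypothetical predictor values
y = [2, 4, 5, 8]  # hypothetical outcome values
intercept, slope = fit_line(x, y)
print(intercept, slope)
print(sum_squared_residuals(x, y, intercept, slope))
```

Any other line through the same data, for example one with a slightly different slope, produces a larger sum of squared residuals, which is what the criterion means.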

Legally Authorized Representative
A person authorized either by statute or by court appointment to make decisions on behalf of
another person. In human participants research this refers to an individual (or judicial or other body) authorized
under applicable law to consent on behalf of a participant to the participant’s participation in the
research.

Level of significance
The maximum probability of a Type I error.

Level
A subdivision of a factor into components or features.

Levels of measurement
Categorization of measurement into types based on the amount of information in the measure:
nominal, ordinal, interval, ratio.

Leverage
A measure of how far the values of the independent variables are from their mean values.
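For simple linear regression with a single independent variable, the leverage of observation i is commonly computed as 1/n plus the squared distance of x_i from the mean, divided by the total squared distance of all x values from the mean. The sketch below assumes that one-predictor case; the function name and data are hypothetical.

```python
def leverage(x):
    """Leverage of each observation in a one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]

x = [1, 2, 3, 4, 10]  # hypothetical values; 10 lies far from the others
for xi, hi in zip(x, leverage(x)):
    print(xi, round(hi, 2))
```

The observation at x = 10 receives by far the largest leverage, which is how high leverage points are flagged. In this one-predictor case the leverages always sum to 2.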

Life History
It is a record of an event/events in a respondent's life told (written down, but increasingly audio or
video recorded) by the respondent from his/her own perspective in his/her own words. A life
history is different from a 'research story' in that it covers a longer time span, perhaps a complete
life, or a significant period in a life.

Likert Scale
An interval scale that specifically uses the five anchors of Strongly Disagree, Disagree, Neither
Disagree nor Agree, Agree, and Strongly Agree. [#] A method of scaling in which the items are
assigned interval-level scale values and the responses are gathered using an interval level
response format.

Linear assumption
In regression analysis, the assumption that the relationship between the studied variables is best
described by a straight line.

Linear model
Any statistical model that uses equations to estimate lines.

Literature Review
It brings together individual reviews of papers, etc. It should weave
together the individual reviews into an overview of the area. The aim is
to convey an awareness of the current state of knowledge in the subject.
It is commonly used to set the scene for introducing new research or a
new perspective on the research. [#] Analysis of all research on a topic
that tries to identify consensus findings or to resolve conflicts in the
work.

Literature Search
A manual and/or electronic search of the literature to find out what research has been carried out
in your area of enquiry.

Literature Survey >>> Literature Review.

Literature
All of the research reports on a single question.

Loaded Questions
Questions that would elicit highly biased emotional responses from subjects.

Local Area Network (LAN)
Computers in close proximity connected together, enabling people to share information, files, and
other necessary materials.

Longitudinal
A study that takes place over time.

Longitudinal correlation
Association of variables measured at different times. One variable is said to “lead” (come before)
or “lag” (come after) the other.

Longitudinal Research
May use any method of data gathering (observation, survey, experiment, etc.), but its particular
characteristic is that the process is repeated on several occasions over a period of time, as far as
possible replicating the chosen methodology each time. It follows that a key aim of such research
is to monitor changes over time.

Longitudinal Study
A research study for which data are gathered at several points in time to answer a research
question. [#] A study in which data are collected from the same sample at least two different
times. A study designed to follow participants through time.

Longitudinal survey
Survey conducted at two or more times (for example, panel, trend, and cohort longitudinal survey
designs).

Lot
A group of items such as incoming shipments of raw materials or purchased parts as well as
finished goods from final assembly.

Main effect
An outcome that shows consistent difference between all levels of a factor.

Management Information System (MIS)
A generic term for information within an enterprise, facilitated by software and technology. [#]
Record-keeping procedures for a program by which managers or others can routinely monitor
its operation.

Manipulation Checks
Measures used to assess the effectiveness of the manipulation.

Manipulation
How the researcher exposes the subjects to the independent variable to determine cause-and-
effect relationships in experimental designs.

Mann-Whitney-Wilcoxon (MWW) test


A non-parametric statistical test for identifying differences between two populations based on the
analysis of two independent samples.
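
As a minimal sketch of the idea, the U statistic behind this test can be obtained by counting, over every pair of observations, how often a value from one sample exceeds a value from the other. The function name and the scores below are hypothetical, and the sketch stops at the statistic; it does not compute a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by pairwise counting; a tie counts one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical scores from two independent groups.
group_a = [12, 15, 11]
group_b = [18, 20, 17]
print(mann_whitney_u(group_a, group_b))  # (0.0, 9.0)
```

A U value near zero for one group, as here, suggests that its values sit almost entirely below those of the other group.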

Margin of error
The ± value added to and subtracted from a point estimate in order to develop a confidence
interval.
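
A short illustration, assuming a sample mean from a population whose standard deviation (sigma) is known, so that the normal distribution applies. The function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error for a mean when the population sigma is known."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


def confidence_interval(point_estimate, sigma, n, confidence=0.95):
    """Point estimate minus and plus the margin of error."""
    m = margin_of_error(sigma, n, confidence)
    return point_estimate - m, point_estimate + m


# Hypothetical survey: mean score 50, sigma 10, sample of 100 respondents.
low, high = confidence_interval(50, sigma=10, n=100)
print(round(low, 2), round(high, 2))
```

When sigma is unknown and the sample is small, a t-based critical value would normally replace z.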

Matched samples
Samples in which each data value of one sample is matched with a corresponding data value of
the other sample.

Matching
A method of controlling known contaminating factors in experimental studies, by deliberately
spreading them equally across the experimental and control groups, so as not to confound the
cause-and-effect relationship. [#] Assigning subjects to experimental and control conditions to
equalize the groups on selected characteristics; can be combined with random assignment but
when used alone cannot guarantee group equivalence on variables not used in the matching.

Math anxiety
Fear of math and statistics, which can result in avoidance of math-based courses or careers.

Math phobia
The fear and consequent avoidance of math-related material.

Matrix
A table of numbers such as correlations.

Maturation Effects
A threat to internal validity that is a function of the biological, psychological, and other processes
taking place in the respondents as a result of the passage of time.

Maturation threat
A threat to validity that occurs as a result of natural maturation that occurs between pre- and post-
measurement.

Maturation
Time threat to internal validity in which internal developmental processes cause the observed
change.

Mature Minor
Someone who has not reached adulthood (as defined by state law) but who may be treated as an
adult for certain purposes (e.g. consenting to medical care). A mature minor is not necessarily
an emancipated minor (See >>> Emancipated Minor).

Mean (M)
Measure of central tendency consisting of the sum divided by the number of observations,
symbolized by M or X̄.

Mean Square Error


The unbiased estimate of the variance of the error term. It is denoted by MSE or s². It measures
the accuracy of a forecasting model.

Mean
The average of a set of figures. [#] A description of the central tendency in which you add up all
the values and divide by the number of values. [#] A measure of central location for a data set. It
is computed by summing all the data values and dividing by the number of items. [#] The
arithmetic average of a set of data in which the values of all observations are added together and
divided by the number of observations.

Measure of Central Tendency


Descriptive statistics of a data set such as the mean, median, or mode.

Measure of Dispersion
The variability in a set of observations, represented by the range, variance, standard deviation,
and the interquartile range.
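
The four measures can be sketched with Python's standard library. The data set is hypothetical, and the quartiles use the default ("exclusive") convention of statistics.quantiles; other textbooks and packages use slightly different quartile rules and may give a different interquartile range.

```python
import statistics

# Hypothetical set of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)            # largest minus smallest value
sample_variance = statistics.variance(data)   # divides by n - 1
sample_sd = statistics.stdev(data)            # square root of the variance
q1, _, q3 = statistics.quantiles(data, n=4)   # quartile cut points
iqr = q3 - q1                                 # spread of the middle half

print(data_range, round(sample_variance, 3), round(sample_sd, 3), iqr)
```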

Measurement association
The correlation between observed variables that derives from their serving as measures of a latent
variable.

Measurement decay
Time threat to internal validity in which changes in the measurement process cause the
observed change, also called instrumentation.

Measurement error
Any influence on an observed score not related to what you are attempting to measure.

Median
The central item in a group of observations arranged in
ascending or descending order. [#] A measure of central
location. It is the value that splits the data into two equal
groups, one with values greater than or equal to the median
and one with values less than or equal to the median. [#] The
middle number in a series of numbers. For example in Thurstone
scaling the median is the value above and below which 50
percent of the ratings fall. [#] The outcome that divides an
ordered distribution exactly into halves. [#] The score found at
the exact middle or fiftieth percentile of the set of values. One
way to compute the median is to list all scores in numerical
order and then locate the score in the center of the sample.
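
The "list the scores in order and locate the centre" procedure can be sketched as follows. The function name is illustrative, and averaging the two central scores when the count is even is the usual convention rather than something stated in the entry.

```python
def median(values):
    """Middle score of the ordered values; mean of the two middle scores if even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # 5
print(median([4, 1, 3, 2]))  # 2.5
```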

Memoing
A process for recording your thoughts and ideas as they evolve throughout the study.

Meta-analytic review
Literature review approach that reduces each study to a few summary effect sizes, which can
then be analyzed statistically.

Meta-inference (or integrated mixed inference)


This is an inference developed through an integration of the inferences that are obtained on the
basis of QUAL and QUAN strands of a mixed methods study.

Meta-interpretation
Forthofer (2003) describes this term as follows: “Mixed methods designs are inherently more
complex, and those that attempt any integration or synthesis of results across methodologies
require an additional phase of ‘meta-interpretation.’”

Method effects
Source of construct invalidity in which measures of different constructs using the same
procedure fail to diverge.

Method/Methodology
While 'method' describes what you as a researcher have done, methodology is about your
reasons for doing it.

Methodological triangulated design


(Morse, 2003) This is a project that is comprised of two or more subprojects, each of which
exhibits methodological integrity. While complete in themselves, these projects fit to complement
or enable the attainment of the overall programmatic research goals.

Methodology
The methods you use to try to understand the world better.

Minimal Risk
A risk is minimal where the probability and magnitude of harm or discomfort anticipated in the
proposed research are not greater, in and of themselves, than those ordinarily encountered in
daily life or during the performance of routine physical or psychological examinations or tests. The
definition of minimal risk for research involving prisoners differs somewhat from that given for
non-institutionalized adults.

Mixed design
Experimental design that includes both within-subjects and between-subjects features.

Mixed Method Design


This is a design that includes both QUAL and QUAN data collection and analysis in parallel form
(concurrent mixed method design, in which two types of data are collected and analyzed), in
sequential form (sequential mixed method design, in which one type of data provides a basis for
collection of another type of data), or where the data are converted (qualitized or quantitized) and
analyzed again (conversion mixed method design). [#] (Bazeley, 2003) This design includes
studies that “use mixed data (numerical and text) and alternative tools (statistics and text
analysis) but apply the same method, for example, in developing a grounded theory.” See also
Mixed Model Design.

Mixed methods sampling


(Kemper, Stringfield, & Teddlie, 2003). This is a simultaneous selection of units of study through
both probability (to increase generalizability/transferability) and purposive sampling strategies (to
increase inference quality).

Mixed model design


This is a design in which mixing of QUAL and QUAN approaches occurs in all stages of the study
(formulation of research questions, data collection procedures and research method, and
interpretation of the results to make final inferences) or across stages of the study (e.g., QUAL
questions, QUAN data). In multistrand designs, either the strands are parallel (concurrent mixed
model design) or sequential (sequential mixed model design, in which inferences of one strand
lead to questions of the next strand) or the data are converted and analyzed again to answer
different questions (conversion mixed model design).

Modal instance sampling


Sampling for the most typical case.

Mode
A measure of location, defined as the most frequently occurring data value. [#] The most
frequently occurring value in the set of scores. [#] Measure of central tendency consisting of the
most frequently occurring scores; if the distribution has two modes, the distribution is called
bimodal.
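
A brief sketch of finding the mode, including the bimodal case, using statistics.multimode from the Python standard library (Python 3.8 or later). The scores are hypothetical.

```python
from statistics import multimode

# Hypothetical scores in which both 2 and 3 occur twice.
scores = [1, 2, 2, 3, 3, 4]

modes = multimode(scores)        # every value tied for the highest frequency
is_bimodal = len(modes) == 2

print(modes, is_bimodal)  # [2, 3] True
```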

Model specification
The process of stating the equation that you believe best summarizes the data for a study.

Model
One possible set of causal paths that we can compare with observed data.

Modem
A device for linking computers by telephone line; an abbreviation for “modulator-demodulator”.

Moderating Variable
A variable on which the relationship between two other variables is contingent. That is, if the
moderating variable is present, the theorized relationship between the two variables will hold
good, not otherwise. [#] In interactive causation, the variable that determines the effect of one
variable on another.

Monitoring
The collection and analysis of data as the project progresses to assure the appropriateness of the
research, its design and participant protections.

Mono-method bias
A threat to construct validity that occurs when you rely on only a single
implementation of your independent variable, cause, program, or treatment in your study.

Monomethod design: See >>> monostrand design.

Monostrand design
These designs use a single research method or data collection technique (QUAL or QUAN) and
corresponding data analysis procedures to answer research questions. They are also known as
single-phase designs.

Mortality
The loss of research subjects during the course of the experiment, which confounds the cause-
and-effect relationship.

Mortality threat
A threat to validity that occurs because a significant number of participants drop out.

Mortality
Subject attrition from pretest to posttest, which casts doubt on the validity of the study; here
conceptualized as a threat to measurement construct validity. Protection against this threat is not
provided by a control group or random assignment but rather by care in defining the subjects to
be measured in evaluating experimental impact.

Motivational Research
A particular data gathering technique directed toward surfacing information, ideas, and thoughts
that either are not easily verbalized or remain at the unconscious level in the respondents.

Moving averages
A method of forecasting or smoothing a time series by averaging each successive group of data
points.
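
A minimal sketch of a simple (unweighted) moving average. The function name, the window length of three, and the sales figures are illustrative assumptions.

```python
def moving_average(series, window):
    """Average each successive group of `window` consecutive data points."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the series length")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical sales for five consecutive periods.
sales = [10, 12, 14, 16, 18]
print(moving_average(sales, 3))  # [12.0, 14.0, 16.0]
```

Used for forecasting, the last average in the list would serve as the forecast for the next period.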

Multicollinearity
The term used to describe the correlation among the independent variables.

Multilevel mixed methods design


This is a design in which QUAL data are collected at one level (e.g., child), and QUAN data are
collected at another level (e.g., family) in a concurrent or sequential manner to answer different
aspects of the same research question. Both types of data are analyzed accordingly, and the
results are used to make inferences. Because the questions and inferences all are in one
approach (QUAL or QUAN), this is a predominantly QUAL or QUAN study with some added
components. In practice, because research questions and the inferences that are made at the
end of the study are usually both QUAL and QUAN (using mixed models), this design is not
common. See also multilevel mixed model design.

Multilevel mixed model design
This is a design in which QUAL data are collected at one level (e.g., child) and QUAN data are
collected at another level (e.g., family) in a concurrent or sequential manner to answer
interrelated research questions with multiple approaches (QUAL and QUAN). Both types of data
are analyzed accordingly, and the results are used to make multiple types of inferences (QUAL
and QUAN) that are pulled together at the end of the study in the form of “global inferences.” See
>>> multilevel mixed method design.

Multilevel mixed sampling


(Kemper et al., 2003). This is a sampling strategy in which probability and purposive sampling
techniques are used at different levels of the study (e.g., student, class, school, district).

Multimethod matrix (MTMM)


A matrix of correlations arranged to facilitate the assessment of construct validity.

Multimethods design
This refers to designs in which the research questions are answered by using two data collection
procedures or two research methods, both with either the QUAL or QUAN approach. See also
multimethods QUAL study and multimethods QUAN study.

Multimethods QUAL study


This refers to designs in which the research questions are answered by using two QUAL data
collection procedures or two QUAL research methods.

Multimethods QUAN study


This refers to designs in which the research questions are answered by using two QUAN data
collection procedures or two QUAN research methods.

Multinomial population
A population in which each element is assigned to one and only one of several categories. The
multinomial probability distribution extends the binomial probability distribution from two to three
or more categories.

Multioption variable
A question format in which the respondent can pick multiple variables from a list.

Multiple-baseline design


Multiple intervention design in which the group or subject receives more than one experimental
manipulation.

Multiple coefficient of determination


A measure of the goodness of fit of the estimated multiple regression equation. It can be
interpreted as the proportion of the variation in the dependent variable that is explained by the
estimated regression equation.
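
As a rough sketch, the coefficient can be computed from observed values and the values predicted by an already fitted equation as 1 - SSE/SST. For a least-squares fit that includes an intercept, this equals the proportion of variation explained. The function name and the numbers are hypothetical; the predicted values are simply assumed to come from some estimated regression equation.

```python
def r_squared(observed, predicted):
    """Coefficient of determination computed as 1 - SSE / SST."""
    mean_y = sum(observed) / len(observed)
    # SSE: variation left unexplained by the fitted equation.
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # SST: total variation of the dependent variable about its mean.
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1 - sse / sst


y = [1, 2, 3, 4]              # observed dependent variable
y_hat = [1.1, 1.9, 3.2, 3.8]  # hypothetical fitted values
print(round(r_squared(y, y_hat), 2))  # 0.98
```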

Multiple comparison procedures


Statistical procedures used to conduct statistical comparisons between pairs of the population
means or treatments.

Multiple group threat


An internal validity threat that occurs in studies that use multiple groups, for instance, a program
and a comparison group.

Multiple methods design
This refers to designs in which more than one research method or data collection and analysis
technique is used to answer research questions. They include mixed methods designs (QUAL +
QUAN) and multimethods designs (QUAN + QUAN or QUAL + QUAL).

Multiple Regression Analysis


A statistical technique to predict the variance in the dependent variable by regressing the
independent variables against it. [#] Regression analysis involving two or more independent
variables. [#] The mathematical equation relating the expected or mean value of the dependent
variable to the values of the independent variables.

Multiple regression model


The mathematical equation that describes how the dependent variable y is related to the
independent variables x1, x2, ..., xp and an error term ε.

Multiple regression
One statistical procedure for conducting multivariate research.

Multiple sampling plan


A form of acceptance sampling in which more than one sample or stage is used. On the basis of
the number of defective items found in a sample, a decision will be made to accept the lot, reject
it, or continue sampling.

Multiplication law
A probability law used to compute the probability of the intersection of two events, A and B.
It is P(A ∩ B) = P(A)P(B | A) or P(A ∩ B) = P(B)P(A | B). For independent events, it
reduces to P(A ∩ B) = P(A)P(B).
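
A small worked example of the law, using exact fractions. The card and coin scenarios and the function name are illustrative assumptions.

```python
from fractions import Fraction


def p_intersection(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a


# Dependent events: two aces drawn in a row without replacement.
p_two_aces = p_intersection(Fraction(4, 52), Fraction(3, 51))

# Independent events: P(B | A) is just P(B), e.g. two fair coin flips.
p_two_heads = p_intersection(Fraction(1, 2), Fraction(1, 2))

print(p_two_aces, p_two_heads)  # 1/221 1/4
```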

Multiplicative time series model


A model whereby the separate components of the time series are multiplied to identify the actual
time series values. When the four components of trend, cyclical, seasonal, and irregular are
assumed present, we obtain Yt = Tt × Ct × St × It.
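
A minimal numerical sketch of the model, assuming the four components named in the entry (trend, cyclical, seasonal, irregular), with the last three expressed as indexes around 1.0. All names and values are hypothetical.

```python
def multiplicative_value(trend, cyclical, seasonal, irregular):
    """Series value as the product of its four components."""
    return trend * cyclical * seasonal * irregular


def deseasonalize(observed, seasonal):
    """Remove the seasonal effect by dividing it out."""
    return observed / seasonal


# Hypothetical quarter: trend 100 units, mild upswing, weak season, small shock.
y_t = multiplicative_value(trend=100, cyclical=1.05, seasonal=0.90, irregular=1.02)
print(round(y_t, 2))                       # 96.39
print(round(deseasonalize(y_t, 0.90), 2))  # 107.1
```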

Multistage Cluster Sampling


A probability sampling design that is a stratified sampling of clusters.

Multistage Sample
A probability sample that involves several stages (and frequently a cluster sampling stage), such
as randomly selecting clusters from a population, then randomly selecting people from each of
the clusters.

Multi-stage sampling
The combining of several sampling techniques to create a more efficient or effective sample than
the use of any one sampling type can achieve on its own.

Multistrands design
This refers to designs that use more than one research method or data
collection procedure. See also multimethods design.

Multitrait-multimethod matrix
This is a matrix of correlations between multiple methods of measuring each of a set of attributes.
The diagonal values indicate the reliability of each measure/method. The off-diagonal values
indicate the convergent validity and discriminant validity of each procedure/instrument. This
method was introduced by Campbell and Fiske (1959) to evaluate the quality of data obtained
from measurement instruments.

Multivariate Analysis
Any of several methods for examining multiple variables at the same time. Usage varies. (a)
Stricter usage reserves the term for designs with two or more independent variables and two or
more dependent variables. (b) More loosely, multivariate analysis applies to designs with more
than one independent variable or more than one dependent variable or both. Whichever usage
you prefer, either allows researchers to examine the relation between two variables while
simultaneously controlling for the influence of other variables. Examples include path analysis,
factor analysis, multiple regression analysis, MANOVA, LISREL, canonical correlations, and
discriminant analysis.

Multivariate research statistics


Analytic procedures that can study three or more variables simultaneously; for example, two or
more predictors in relation to an outcome variable.

Mundane realism
Extent to which a research setting resembles in physical detail a real social setting.

Mutually exclusive
Said of two events, conditions, or variables which cannot occur at the same time. For example,
one cannot be both male and female, or both Protestant and Catholic. Thus, the categories male
and female, or Catholic and Protestant are said to be mutually exclusive.

Mutually exclusive
The property of a variable that ensures that the respondent is not able to assign two attributes
simultaneously. For example, gender is a variable with mutually exclusive options if it is
impossible for the respondents to simultaneously claim to be both male and female.

Narrative
[or biographical] approaches to research are primarily qualitative, and include gathering/using
data in the form of diaries, stories and life histories.

Natural selection theory of knowledge


A theory that ideas have survival value and that knowledge evolves through a process of
variation, selection, and retention.

Near Market
It is a descriptor given to research which is placed in an economic context of being commercially
exploitable. It is commonly used to describe applied research (see entry) set in the market sector,
where there is or should be a market interest in supporting and funding it.

Negative relationship
A relationship between variables in which high values for one variable are associated with low
values on another variable.

Nonsampling error
All types of errors other than sampling errors, such as measurement error, interviewer error and
processing error.

Nominal response format


A response format that has a number beside each choice where the number has no meaning
except as a placeholder for that response.

Nominal scale
A scale of measurement for a variable that uses a label or name to identify an attribute of an
element. Nominal data may be non-numeric or numeric. [#] A scale that categorizes individuals
or objects into mutually exclusive and collectively exhaustive groups, and offers basic, categorical
information on the variable of interest.

Nomological network
A network that includes the theoretical framework for what you are trying to measure, an
empirical framework for how you are going to measure it, and specification of the linkages among
and between these two frameworks.

Nomothetic
Refers to laws or rules that pertain to the general case.

Non-contrived Setting


Research conducted in the natural environment where activities take place in the normal manner
(i.e. the field setting)

Non-directional Hypothesis


An educated conjecture of a relationship between two variables, the directionality of which cannot
be guessed.

Non-participant Observer
A researcher who collects observational data without becoming an integral part of the system.

Non-probability Sampling


A sampling design in which the elements in the population do not have a known or predetermined
chance of being selected as sample subjects.

Non-affiliated Member
Member of an Institutional Review Board who has no ties to the parent institution, its staff, or
faculty. This individual is usually from the local community (e.g., minister, business person,
attorney, teacher).

Nonequivalent Dependent Variables (NEDV) design


A single-group pre-post quasi-experimental design with two outcome measures where only one
measure is theoretically predicted to be affected by the treatment and the other is not.

Nonequivalent-Group design (NEGD)


A pre-post two-group quasi-experimental design structured like a pretest-posttest randomized
experiment, but lacking random assignment to group.

Nonparametric methods
Statistical methods that require few, if any, assumptions about the population probability
distributions and the level of measurement. These methods can be applied when nominal or
ordinal data are available.

Non-probability Sample
A subset of the population chosen in a way that does not give every member of the population a
known (nonzero) chance of being selected.

Non-probability sampling
Sampling that does not involve random selection.

Non-proportional quota sampling


A sampling method where you sample until you achieve a specific number of sampled units for each
subgroup of a population, where the proportions in each group are not the same.

Non-response Bias
The bias that results from differences between those who agree to participate in a survey and
those who don’t.

Non-therapeutic Research
Research that has no likelihood or intent of producing a diagnostic, preventive, or therapeutic
benefit to the current participants, although it may benefit participants with a similar condition in
the future.

Normal probability distribution


A continuous probability distribution. Its probability density function is bell-shaped and determined
by its mean and standard deviation.

Normal probability plot


A graph of normal scores plotted against values of the standardized residuals. This plot helps
determine whether the assumption that the error term has a normal probability distribution
appears to be valid.

np chart
A control chart used to monitor the output of a process in terms of the number of defective items.

Nuisance Variable
A variable that contaminates the cause-and-effect relationship.

Null case
A situation in which the treatment has no effect.

Null hypothesis
The hypothesis that describes the possible outcomes other than the alternative hypothesis.
Usually the null hypothesis predicts there will be no effect of a program or
treatment you are studying. [#] The conjecture that postulates no
differences or no relationship between or among variables. [#] The
hypothesis tentatively assumed true in the hypothesis testing procedure.
[#] The proposition, to be tested statistically, that the experimental
intervention has “no effect,” meaning that the treatment and control
groups will not differ as a result of the intervention. Investigators usually
hope that the data will demonstrate some effect from the intervention, thus allowing the
investigator to reject the null hypothesis.

Numerical Scale
A scale with bipolar attributes with five points or seven points indicated on the scale.

O
Symbol used in design diagrams to represent one or more observations collected at some time
point.

Objectivity
Interpretation of the result on the basis of the results of data analysis, as opposed to subjective or
emotional interpretations.

Observation bias
Data collection bias that occurs in the interviewing or measurement stage (for example, the tendency
of respondents to give answers that are socially desirable).

Observation: Non-participant
Where the researcher attempts to remove or detach themselves as an actor from the research
situation.

Observation: Participant
Observing something as an insider, as someone who is involved in the processes being
observed.

Observational Survey
Collection of data by observing people or events in the work environment and recording the
information.

Observed Score
Also called fallible score, the value obtained by the measurement procedure and assumed to
contain some degree of error.

Obtrusive versus unobtrusive measurement


Dimension of measurement that separates observations known to the subject from those
occurring outside the subject’s awareness.

Office of Research Integrity (ORI)


Institutional unit that protects against and judges cases of research misconduct such as fraud.

Ogive
A graph of a cumulative distribution.

OMRS. Optical Mark Reader System


An electronic method of analysing responses to questionnaires/surveys.

One-Shot Study >>> Cross-sectional Study.

One-tailed hypothesis
A hypothesis that specifies a direction, for example, when your hypothesis predicts that your
program will increase the outcome.

One-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in
one tail of the sampling distribution.
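
A sketch of an upper-tail test, assuming the test statistic is a z value that follows the standard normal distribution when the null hypothesis is true. The function names and the default significance level of 0.05 are illustrative.

```python
from statistics import NormalDist


def upper_tail_p_value(z):
    """Probability of a z value at least this large under the null hypothesis."""
    return 1 - NormalDist().cdf(z)


def reject_null(z, alpha=0.05):
    """Reject only when z falls in the upper tail of the sampling distribution."""
    return upper_tail_p_value(z) < alpha


print(round(upper_tail_p_value(1.645), 3))  # 0.05
print(reject_null(2.0), reject_null(1.0))   # True False
```

A lower-tail test would use the cumulative probability itself rather than its complement.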

Online data collection


Using a computer to receive and store data directly from an experiment in progress.

Online
Connected to a computer for direct interaction with the electronic database.

Ontological assumptions
The assumptions you hold about reality. For instance, realism is an ontological assumption that
holds that there is an external reality apart from your experience of it.

Ontology
Branch of philosophy dealing with the ultimate nature of things.

Open coding
A phase of the grounded theory method where you consider the data in minute detail while
developing some initial categories.

Open Design
An experimental design in which both the investigator(s) and the participants know the treatment
group(s) to which participants are assigned.

Open-ended Questions
Leave the answer entirely to the respondent, either with a blank
space on the questionnaire for recording the reply, or by
phrasing a question in an interview in such a way as to elicit a
longer answer. This approach is used when there is no way of
knowing what answers the respondents are likely to give, or if
you want quotable responses. Often they are used in pilot
studies in order to develop a pre-coded version for the main
study. [#] Questions that the respondent can answer in a free-
flowing format without restricting the range of choices to a set of
specific alternatives suggested by the researcher. [#]
Survey questions that allow respondents to answer in their own
words.

Operating characteristic curve


A graph showing the probability of accepting the lot as a function of the percentage defective in
the lot. This curve can be used to help determine whether a particular acceptance sampling plan
meets both the producer’s and the consumer’s risk requirements.
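
Points on such a curve can be sketched with the binomial distribution, assuming a single sampling plan (sample size n, acceptance number c) and a lot that is large relative to the sample. The plan values and the function name are hypothetical.

```python
from math import comb


def prob_accept(n, c, p):
    """P(accept lot): at most c defectives in a sample of n, defect rate p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical plan: inspect 15 items, accept the lot only if none is defective.
for pct_defective in (0.01, 0.05, 0.10):
    print(pct_defective, round(prob_accept(15, 0, pct_defective), 4))
```

Plotting the acceptance probability against the fraction defective traces the curve; it falls as lot quality worsens.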

Operational Definition
Definition of a construct in measurable terms by reducing it from its level of abstraction through
the delineation of its dimensions and elements. [#] Statements of the specific ways in which the
absence, presence, and/or the degree of presence of a phenomenon will be determined in a
specific research process. [#] Procedure that translates a construct into manifest or observable
form.

Operational transferability
This is the degree to which the inferences that are made on the basis of the results of the study
are generalizable to other methods of observing/measuring the entities or attributes that the
inference is about. Subsumes the QUAN terms external validity of operations and operational
external validity.

Operationalization
The act of translating a construct into its manifestation, for example translating the idea of your
treatment or program into the actual program, or translating the idea of what you want to measure
into the real measure. The result is also referred to as an operationalization, that is, you might
describe your actual program as an operationalized program. [#] Your translation of an idea or
construct into something real and concrete.

Operations Research
A quantitative approach taken to analyze and solve problems of complexity.

Opportunity cost
Value of the best alternative use of the project’s resources, that is, the value forgone by the
decision to invest in the project.

Ordinal level
Type of measurement that assigns observations to ordered categories.

Ordinal response format


A response format in which respondents are asked to rank the possible answers in order of
preference.

Ordinal scale
A scale of measurement for a variable that has the properties of nominal data and can be used to
rank or order the data. Ordinal data may be non-numeric or numeric. [#] A scale that not only
categorizes the qualitative differences in the variable of interest, but also allows for the rank-
ordering of these categories in a meaningful way.

Outlier
A data point or observation that does not fit the pattern shown by the remaining data; an
unusually small or unusually large data value.

Oversample
Drawing a disproportionately large number of elements to assure an adequate number of
elements from small clusters or strata.

P
Symbol for probability that an observed inferential statistic occurred by chance (for example p <
.05).

P chart
A control chart used when the output of a process is measured in terms of the proportion
defective.
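
A minimal sketch of the control limits for such a chart, assuming the conventional three-standard-error limits around the average proportion defective and a constant sample size. The function name and figures are hypothetical.

```python
from math import sqrt


def p_chart_limits(p_bar, n, sigmas=3):
    """Lower and upper control limits for the proportion defective."""
    standard_error = sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - sigmas * standard_error)  # a proportion cannot be negative
    ucl = min(1.0, p_bar + sigmas * standard_error)  # nor exceed one
    return lcl, ucl


# Hypothetical process: 10% defective on average, samples of 100 items.
lcl, ucl = p_chart_limits(0.10, 100)
print(round(lcl, 2), round(ucl, 2))  # 0.01 0.19
```

A sample proportion outside these limits would signal that the process may be out of control.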

Paasche index
A weighted aggregate price index in which the weight for each item is its current-period quantity.
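
As conventionally computed, the Paasche index weights both current and base-period prices by current-period quantities. The sketch below follows that convention; the function name and the two-item basket are hypothetical.

```python
def paasche_index(base_prices, current_prices, current_quantities):
    """100 * sum(p_t * q_t) / sum(p_0 * q_t), weighted by current quantities."""
    numerator = sum(p * q for p, q in zip(current_prices, current_quantities))
    denominator = sum(p * q for p, q in zip(base_prices, current_quantities))
    return 100 * numerator / denominator


# Hypothetical basket of two items.
index = paasche_index(
    base_prices=[2, 5],
    current_prices=[3, 6],
    current_quantities=[10, 4],
)
print(index)  # 135.0
```

A value of 135 indicates that the current basket costs 35 percent more at current prices than at base-period prices.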

Paired Comparisons
Respondents choose between two objects at a time, with the process repeated with a small
number of objects.

Panel Studies
Studies conducted over a period of time to determine the effects of certain changes made in a
situation, using a panel or group of subjects as the sample base.

Panel survey
Longitudinal survey design involving multiple interviews with the same subjects, who are known as a
panel.

Panel
Correlational design in which a group of subjects is surveyed or measured at more than one time
point; also the group itself is called a panel.

Paradigm
Shared framework involving common theory and data collection tools in which researchers
ordinarily approach scientific problems.

Paradigm shift
The revolution in assumptions about and perception of a research problem during which one
perception replaces another.

Paradigm
(Mertens, 2003). A conceptual model of a person’s worldview, complete with the assumptions
that are associated with that view. [#] (Caracelli and Green, 2003) Paradigms are social
constructions, historically and culturally embedded discourse practices, and therefore neither
inviolate nor unchanging.

Paradox
Apparent contradiction between two different theories, between two different observations, or
between a theory and observations.

Parallel mixed model design See >>> Concurrent Mixed Model Design.

Parallel-Form Reliability
That form of reliability which is established when responses to two comparable sets of measures
tapping the same construct are highly correlated.

Parameter
A numerical characteristic of a population, such as a population mean, a population standard
deviation, a population proportion, and so on.

Parametric Statistics
Statistics used to test hypotheses when the population from which the sample is drawn is
assumed to be normally distributed.

Parsimony
Efficient explanation of the variance in the dependent variable of interest through the use of a
smaller, rather than a larger number of independent variables.

Parsimony
Theory attribute of being simple or sparing of constructs and relationships.

Partial correlation
Measure of association between two variables after statistically controlling one or more other
variables. Order of correlation is the number of variables controlled (for example, zero-order is
simple correlation, first-order partial controls one variable, and so on).
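
A first-order partial correlation can be sketched from the three zero-order (simple) correlations using the standard formula. The function name and the correlation values are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after statistically controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical zero-order correlations among three variables.
print(round(partial_correlation(r_xy=0.5, r_xz=0.5, r_yz=0.5), 3))  # 0.333
```

Here the association between x and y shrinks from 0.5 to about 0.33 once the shared influence of z is removed.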

Participant observation
A method of qualitative observation where the researcher becomes a participant in the culture or
context being observed. [#] Common qualitative research method in which the researcher enters
the social setting to be studied and actively joins the subjects in their normal activities.

Participant
Individuals whose physiological or behavioral characteristics and responses are the object of
study in a research project. Under federal regulations, human participants are defined as: living
individual(s) about whom an investigator conducting research obtains: (1) data through
intervention or interaction with the individual; or (2) identifiable private information.

Participant-Observer
A researcher who collects observational data by becoming a member of the system from which
data are collected.

Partitioning
The process of allocating the total sum of squares and degrees of freedom to the various
components.

Paternalism
Making decisions for others against or apart from their wishes with the intent of doing them good.

Path analysis
Diagram of a causal model that includes statistical estimates of relationships.

Path coefficient
Standardized regression coefficient from a multiple regression analysis that describes the
association between two variables in path analysis.

Pattern matching
The degree of correspondence between two data items. For instance, you might look at a pattern
match of a theoretical expectation pattern with an observed pattern to see if you are getting the
outcomes you expect.

Pattern-Matching NEDV design
A single-group pre-post quasi-experimental design with multiple outcome measures where there
is a theoretically-specified pattern of expected effects across the measures. To assess the
treatment effect, the theoretical pattern of expected outcomes is correlated or matched with the
observed pattern of outcomes as measured.

Pearson product moment correlation coefficient (R)
Non-PRE measure of association best suited for interval or ratio variables.

Pearson Product Moment Correlation
A particular type of correlation used when both variables can be assumed to be measured at an
interval level of measurement.
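
As a rough sketch of how this coefficient is computed, the function below divides the sum of cross-products of deviations by the square root of the product of the two sums of squared deviations. The name `pearson_r` is an illustrative choice, not something taken from the text, and the sketch assumes two equal-length lists of interval-level scores that are not constant.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)
```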

Percent frequency distribution
A tabular summary of data showing the percentage of items in each of several non-overlapping
classes.

Percentile
A value such that at least p percent of the items are less than or equal to this value and at least
(100-p) percent of the items are greater than or equal to this value. The 50th percentile is the
median.
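
One common way to find a value that satisfies this definition is to sort the data, compute the position index i = (p/100)n, round up when i is not a whole number, and average two neighbouring values when it is. The sketch below follows that convention; other textbooks and software packages use slightly different rules, and the function name `percentile` is only illustrative. It assumes p is an integer strictly between 0 and 100.

```python
def percentile(data, p):
    """Return the p-th percentile (integer p, 0 < p < 100) of data."""
    values = sorted(data)
    n = len(values)
    # Position index i = (p / 100) * n, kept in integer arithmetic.
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up and take the value in that position.
        return values[whole]
    # i is an integer: average the values in positions i and i + 1.
    return (values[whole - 1] + values[whole]) / 2
```

Calling the function with p = 25, 50 and 75 gives the first, second and third quartiles under the same convention.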

Permission
The agreement of parent(s) or guardians to the participation of their child or ward in research.

Persuasive utilization
Evaluation use in which the research justifies decisions already made; also called symbolic
utilization.

Phenomenology
A philosophical perspective as well as an approach to qualitative methodology that focuses on
people’s subjective experiences and interpretations of the world. [#] Philosophical perspective
that emphasizes the discovery of meaning from the point of view of the studied group or
individual.

Pie chart
A graphical device for presenting data summaries based on sub-division of a circle into sectors
that correspond to the relative frequency for each class.

Pilot Study
A trial, both to examine the effectiveness of various aspects of the proposed research, such as
procedures for data gathering, and to aid the completion of detailed project plans. [#] Small scale
research with the experimental manipulation to determine its effectiveness before using it in the
main study.

Placebo
A chemically inert substance (e.g., sugar pills) given to control groups as if it were the medicine
or treatment for its psychologically suggestive effect; it is used in controlled clinical trials to
determine whether improvement and side effects may reflect imagination or anticipation rather
than actual power of a drug. [#] Intervention that simulates an authentic treatment but with no
active ingredient.

Plagiarism
Falsely claiming credit for work authored by another.

Plausible rival hypothesis
Believable or possible alternative explanation for an observation.

Point estimate
A single numerical value used as an estimate of a population parameter.

Point estimator
The sample statistic that provides the point estimate of the population parameter.

Poisson probability distribution
A probability distribution showing the probability of x occurrences of an event over a specified
interval of time or space.

Poisson probability function
The function used to compute Poisson probabilities.
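
The standard form of this function is f(x) = μ^x e^(−μ) / x!, where μ is the expected number of occurrences in the interval. A minimal sketch follows; the name `poisson_probability` and the symbol `mu` are illustrative choices rather than notation from the text.

```python
import math


def poisson_probability(x, mu):
    """Probability of exactly x occurrences when the expected number is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)
```

For example, `poisson_probability(5, 10)` gives the probability of exactly 5 occurrences in an interval where 10 are expected on average.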

Pooled variance
An estimate of the variance of a population based on the combination of two (or more)
sample results. The pooled variance estimate is appropriate whenever the variances of two (or
more) populations are assumed equal.
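
A pooled estimate weights each sample variance by its degrees of freedom. The sketch below is one hedged way to write this, assuming every sample has at least two observations; the function name `pooled_variance` is hypothetical, while `statistics.variance` is the sample variance from the Python standard library.

```python
from statistics import variance


def pooled_variance(*samples):
    """Pooled estimate of a common variance from two or more samples."""
    # Weight each sample variance by its degrees of freedom (n - 1).
    weighted_sum = sum((len(s) - 1) * variance(s) for s in samples)
    degrees_of_freedom = sum(len(s) - 1 for s in samples)
    return weighted_sum / degrees_of_freedom
```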

Population Frame
A listing of all the elements in the population from which the sample is drawn.

Population parameter
A numerical value used as a summary measure for a population of data (e.g., the population
mean, the population variance, and the population standard deviation). [#] The mean or average
you would obtain if you were able to sample the entire population.

Population transferability
This refers to generalizability or applicability of inferences obtained in a study to other individuals
or entities. Subsumes the QUAN terms population validity and population external validity, and the
QUAL term transferability. See >>> Inference transferability.

Population validity (or population external validity) See >>> Inference transferability

Population
A group of persons that one wishes to describe or about which one wishes to generalize. To
generalize about a population, one often studies a sample that is meant to be representative of
the population. [#] The entire group (or set or type) of people from which a researcher samples,
and to which she or he would ideally like to generalize. [#] The entire group of people, events, or
things that the researcher desires to investigate. [#] The group you want to generalize to and the
group you sample from in a study. [#] The set of all elements of interest in a particular study. [#]
Collection of all elements to whom survey results are to be generalized.

Positive relationship
A relationship between variables in which high values for one variable are associated with high
values on another variable and low values are associated with low values.

Positivism
An approach to knowledge based on the assumption of an objective reality that can be
discovered with observed data. [#] The philosophical position that the only meaningful inferences
are ones that can be verified through experience or direct measurement. Positivism is often
associated with the stereotype of the hard-headed, lab-coat scientist who refuses to believe in
something if it can’t be seen or measured directly.

Post-positivism
The rejection of positivism in favour of a position that one can make reasonable inferences about
phenomena based upon theoretical reasoning combined with experience-based evidence.

Posttest
A test given to subjects to measure the dependent variable after exposing them to a treatment.

Posttest-only non-experimental design
A research design in which only a posttest is given. It is referred to as nonexperimental because
no control group exists.

Posttest-only randomized experiment
An experiment in which the groups are randomly assigned and receive only a posttest.

Power curve
A graph of the probability of rejecting H0 for all possible values of the population parameter not
satisfying the null hypothesis. The power curve provides the probability of correctly rejecting the
null hypothesis.

Power
The probability of rejecting H0 when it is false. [#] The probability of a statistical test correctly rejecting a
false null hypothesis (or 1- beta).
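
To make the idea concrete, the sketch below computes power for one specific case: an upper-tail z test of a mean with known population standard deviation. That setting, the function name `power_upper_tail_z`, and all parameter names are assumptions chosen for illustration; `statistics.NormalDist` is part of the Python standard library (3.8 and later).

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the test of H0: mu <= mu0 when the true mean is mu1."""
    standard_error = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this critical value.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Probability that the sample mean lands in the rejection region.
    return 1 - NormalDist(mu1, standard_error).cdf(critical_mean)
```

Evaluating the function over a range of `mu1` values above `mu0` traces out a power curve for this test.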

Precision
The degree of closeness of the estimated sample characteristics to the population parameters,
determined by the extent of the variability of the sampling distribution of the sample mean.

Pre-coded Questions
These have a list of answers from which to choose in order to facilitate analysis, or to better
control the interview process. In a self-completion questionnaire the respondent chooses the
option or options. In an interview the options are either read out or shown to the respondent who
then chooses. In this type of question care must be taken that the options are exclusive and
exhaustive. The category 'Other' is often added in case the list is not complete, but keep in mind
that if there are possible answers which are not on your list, bias can ensue.

Precision statement
A probability statement about the sampling error.

Prediction interval estimate
The interval estimate of an individual value of y for a given value of x.

Predictive Research
It is concerned with identifying indicators of future behaviour or demand in a population on the
basis of the current behaviour and demands of a sample. Predictive techniques use a number of
statistical approaches.

Predictive Study
A study that enables the prediction of the relationships among the variables in a particular
situation.

Predictive validity
A type of construct validity based on the idea that your measure is able to predict what it
theoretically should be able to predict. [#] The ability of the measure to differentiate among
individuals as to a criterion predicted for the future.

Predictor Variable >>> Independent Variable.

Pre-experiments
Class of experimental designs that are very vulnerable to threats to internal validity.

Pre-post nonequivalent groups quasi-experiment
A research design in which groups receive both a pre- and posttest and group assignment is not
randomized and therefore the groups may be nonequivalent, making it a quasi-experiment.

Pre-project Research
It is an activity undertaken in the planning stages of research to clarify issues such as the focus of
the research, its aims, access to sampling frame, likely response rate, most appropriate
methodology and means of analysis. Overlaps somewhat with a pilot study.

Pre-randomization
Class of experimental designs that are very vulnerable to threats to internal validity.

Present values
Value of future program benefits adjusted downward by some discount rate.

Pretest sensitization
Production of changes in later interviews by the experience of a prior interview.

Pretest
A test given to subjects to measure the dependent variable before exposing them to a treatment.

Pretesting Survey Questions
Test of the understandability and appropriateness of the questions planned to be included in a
regular survey, using a small number of respondents.

Prevalence
Number of cases existing at some time.

Primary Data
Data collected firsthand for subsequent analysis to find solutions to the problem researched.

Primary Sources
A primary source is that which provides the initial basic data set under discussion, while a
secondary source is one which offers further analysis or commentary on the data. Generally it is
better, if you can, to make reference to primary sources.

Principal Investigator
The scientist or scholar with primary responsibility for the design and conduct of a research
project.

Prisoner
An individual involuntarily confined in a penal institution, including persons: (1) sentenced under
a criminal or civil statute; (2) detained pending arraignment, trial, or sentencing; and (3) detained
in other facilities.

Privacy
A person’s capacity to control the extent, timing, and circumstances of sharing oneself (physically,
behaviorally, or intellectually) with others.

Probabilistic
Based on probabilities.

Probabilistic equivalence
The notion that two groups, if measured infinitely, would on average perform identically. Note that
two groups that are probabilistically equivalent would seldom obtain the exact same average
score.

Probabilistic sampling
Any method of sampling for which the probability of each possible sample can be computed.

Probability density function
A function used to compute probabilities for a continuous random variable. The area under the
graph of a probability density function over an interval represents probability.

Probability distribution
A description of how the probabilities are distributed over the values the random variable can
assume.

Probability distribution
In inferential statistics, the likelihood of occurrence (or probability) of each level of the inferential
statistic for any number of degrees of freedom.

Probability function
A function, denoted by f(x), that provides the probability that x assumes a particular value for a
discrete random variable.

Probability proportionate to size (PPS)
Method of preserving equal probability of sampling across all elements in the population by which
the number of elements drawn from each cluster is proportionate to the size of the cluster.

Probability Sample
A subset of the population chosen in such a way that every member of the population has a
known (nonzero) chance of being selected into the sample.

Probability sampling
Method of sampling that utilizes some form of random selection. [#] The sampling design in
which the elements of the population have some known chance or probability of being selected
as sample subjects.

Probability sampling
Sampling method in which all elements have equal probability of being drawn.

Problem Definition
A precise, succinct statement of the question or issue that is to be investigated.

Problem Statement >>> Problem Definition.

Process analysis
Procedure for measuring selected grammatical or nonlexical forms in speech or text.

Producer’s risk
The risk of rejecting a good-quality lot. This is a Type I error.

Program audit
Nonroutine evaluation by an outsider of a program’s operation.

Program evaluation
Social research that judges a program’s success, usually in one or more of the following: program
impact, efficiency analysis, or utilization.

Program impact
Stage of evaluative research that determines whether the program has an effect.

Program monitoring
Stage of evaluative research that checks whether the program’s operation follows its plan.

Projective Methods
Ways of eliciting responses that are otherwise difficult to obtain, through such means as word
association, sentence completion, and thematic apperception tests.

Projective question
A hypothetically framed question. For example, you might ask the respondent how much money
people they know typically give in a year to charitable causes.

Projective tests
Measurement procedures by which subjects respond to ambiguous stimuli; presumed to reflect
significant personality characteristics.

Prompt
Blinking symbol on a computer screen showing readiness for the next step in the procedure.

Proportional quota sampling
A sampling method where you sample until you achieve a specific number of sampled units for
each subgroup of a population, where the proportions in each group are the same.

Proportional reduction of error (PRE)
Quality of some correlation coefficients that measure association as the degree of improvement in
predicting one variable from the other.

Proportionate Stratified Random Sampling
A probability sampling design in which the number of sample subjects drawn from each stratum is
proportionate to the total number of elements in the respective strata.
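
The allocation rule can be sketched as follows. This is an illustration only: it assumes each stratum is available as a list of members, and the function name `proportionate_stratified_sample`, the `seed` parameter, and the simple rounding rule are all choices made for this sketch rather than details from the text.

```python
import random


def proportionate_stratified_sample(strata, sample_size, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding can make the total differ slightly from sample_size.
        quota = round(sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, quota)
    return sample
```

With strata of 500, 300 and 200 elements and a total sample of 100, the quotas are 50, 30 and 20.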

Prospective Studies
Studies designed to observe outcomes or events that occur after the group of participants has
been identified. Prospective studies do not have to involve manipulation or intervention but may
be purely observational or involve only the collection of data instead.

Protocol
The formal design or plan of an experiment or research activity; specifically, the plan submitted to
an IRB for review and to an agency for research support. The protocol includes a description of
the research design or methodology to be employed, the eligibility requirements for prospective
participants and controls, the treatment regimen(s), and the proposed methods of analysis that
will be performed on the collected data.

Proxemics
Pertaining to interpersonal spacing, especially the study of communicative aspects, causes, and
effects of spacing.

Proximal Similarity Model
A model for generalizing from your study to other contexts based upon the degree to which the
other context is similar to your study context.

Proxy-Pretest design
A post-only design in which, after the fact, a pretest measure is constructed from pre-existing
data. This is usually done to make up for the fact that the research did not include a true pretest.

Pseudo-effect
Apparent treatment effect caused by contrast with noncomparable control group.

Pseudoscience
Body of assertions that appears scientific because it involves observation, but is not, for lack of
falsifiability; for example, astrology.

Psychometrics
Research devoted to evaluating and improving reliability and validity of social research measures.

Pure
Often used in the same way as 'Basic Research', though sometimes to imply the purity of
methodological approach (that is, an emphasis on what is methodologically correct with minimal
compromise with practical issues).

Purposive Sampling
A nonprobability sampling design in which the required information is gathered from special or
specific targets or groups of people on some rational basis. [#] Non-probability sampling method
that involves choosing elements with certain characteristics.

Purposiveness in Research
The situation in which research is focused on solving a well-identified and defined problem, rather
than aimlessly looking for answers to vague questions.

p-value
The probability, when the null hypothesis is true, of obtaining a sample result that is at least as
unlikely as what is observed. It is often called the observed level of significance.
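
As a hedged illustration, the sketch below computes a two-tailed p-value for one particular case: a test statistic z that follows the standard normal distribution when the null hypothesis is true. That assumption and the function name `two_tailed_p_value` are not from the text; other tests use other reference distributions.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability, under H0, of a z statistic at least this extreme."""
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

A z statistic of about 1.96 in either direction gives a p-value of about 0.05.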

Qualitative data
Data in which the variables are not in a numerical form, but are in the form of text, photographs,
sound bites, and so on. [#] Data that are labels or names used to identify an attribute of each
element. Qualitative data may be non-numeric or numeric. [#] Data that are not immediately
quantifiable unless they are coded and categorized in some way.

Qualitative independent variable
An independent variable with qualitative data.

Qualitative measures
Data not recorded in numerical form.

Qualitative Research
(a) When referring to variables, "qualitative" is another term for categorical or nominal. (b) When
speaking of kinds of research, "qualitative" refers to studies of subjects that are hard to quantify,
such as art history. Qualitative research tends to be a residual category for almost any kind of
non-quantitative research. [#] The collection of non-numerical data. Often multi-method in focus,
qualitative research involves an interpretive, meaning-driven approach to its subject matter. [#]
Social research based on observations made in the field and analyzed in nonstatistical ways.

Qualitative Study
Research involving analysis of data / information that are descriptive in nature and not readily
quantifiable.

Qualitative variable
A variable that is not in numerical form. [#] A variable with qualitative data.

Qualitative variables
Type of variable for which observations are assigned to levels rather than given precise
quantitative values, for example, religious preference.

Qualitative
These data will normally be presented discursively (though multi-media is increasingly used) and
will focus on depth and subtlety in a single or small number of settings rather than counting
characteristics over a larger number of settings or responses from more people. This method can
provide a rich and more in-depth data set. Researchers will often use qualitative methods to
complement quantitative methods and vice versa.

Qualitizing
This is the process by which quantitative data are transformed into data that can be analyzed
qualitatively.

Quality control
A series of inspections and measurements that determine whether quality standards are being
met.

Quantitative
These data can be stored in reasonably well-defined categories, and in sufficient volume (i.e.,
number of responses) to permit tabular and cross-tabular presentations, and possibly statistical
analysis. In other words it is about counting and offering findings as numbers or percentages. The
strength of this approach lies in the precision and clarity with which findings can be stated, and
the scope which exists (via appropriate statistical tests) for establishing general validity. In some
sectors statistical presentation is respected more than any other format.

Quantitative data
Data that appear in numerical form. [#] Data that indicate how much or how many of something.
Quantitative data are always numeric.

Quantitative measurement
Collecting and reporting observations numerically.

Quantitative Research
Said of variables or research that can be handled numerically. Usually (too sharply) contrasted
with qualitative variables and research. [#] The collection of numerical data in order to describe,
explain, predict and/or control phenomena of interest.

Quantitative variable
Data in the form of numbers. [#] A variable with quantitative data.

Quantitative Variables
Type of variable for which observations are assigned measured values, for example, temperature.

Quantity index
An index that is designed to measure changes in quantities over time.

Quartiles
The 25th, 50th and 75th percentiles, referred to as the first quartile, the second quartile (median)
and third quartile, respectively. Quartiles can be used to divide the data set into four parts, each
part containing approximately 25% of the data.

Quasi-experiment
A type of research design for conducting studies in field or real-life situations where the
researcher may be able to manipulate some independent variables but cannot randomly assign
subjects to control and experimental groups. The procedures of quasi-experimentation were
developed mainly in the context of evaluation research projects. For example, you cannot cut off
someone's unemployment benefits to see how well he or she could get along without them or to
see whether an alternative job-training program would be more effective for some unemployed
persons. But you could try to find volunteers for the new program. You could compare the results
for the volunteer group (experimental group) with those of people in the regular program (control
group). The study is quasi-experimental because you were unable to assign subjects at random
to treatment and control groups. [#] An experimental design that is missing one or more aspects
of the (classic) controlled experiment.

Quasi-experimental design
Experimental approach in which the researcher does not assign subjects randomly to treatment
and control conditions.

Quasi-experimental designs
Research designs that have several of the key features of randomized experimental designs,
such as pre-post measurement and treatment-control group comparisons, but lack random
assignment to treatment group. [#] Research designs that look like randomized or true
experiments (they have multiple groups and pre-post measurement) but use nonrandom
assignment to assign the groups.

Quasi-Experimental
A quasi-experiment is an experiment in which a potential cause (independent variable) has been
manipulated, but conditions do not permit the use of a random selection of research subjects
and/or the effective control of extraneous variables. Most field research which seeks to be an
experiment is likely to fall into the quasi-experimental category. See also Experimental research.

Questionnaire
A questionnaire comprises the questions to be asked of respondents. There are three main types:
questionnaires to be used in face to face or telephone interviews; self completion questionnaires,
which are read, completed and returned by respondents; and computer administered
questionnaires, which allow more complex question patterns than paper questionnaires. [#] A
group of written questions to which subjects respond. Some restrict the use of the term
"questionnaire" to written responses. [#] A data collection method in which participants read and
answer questions in a written format. [#] A preformulated written set of questions to which the
respondent records the answers, usually within rather closely delineated alternatives.

Quota Sampling
A form of purposive sampling in which a predetermined proportion of people from different
subgroups is sampled. [#] Any sampling method where you sample until you achieve a specific
number of sampled units for each subgroup of a population.

Quota sampling
Nonprobability sampling method that creates a sample matching a predetermined
demographic profile.

R chart
A control chart used when the output of a process is measured in terms of the range of a variable.

R
Symbol used in design diagrams to represent random assignment to groups.

Random assignment
Method of placing subjects in different conditions so that each subject has an equal chance of
being in any group, thus avoiding systematic subject differences between the experimental and
control groups. [#] Process of assigning your sample into two or more subgroups by chance.
Procedures for random assignment can vary from flipping a coin to using a table of random
numbers to using the random number capability built into a computer.
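
The computer-based option can be sketched briefly. The function name `randomly_assign`, the default group labels, and the `seed` parameter are assumptions made for this illustration; the sketch shuffles the subjects and then deals them out in turn so that group sizes differ by at most one.

```python
import random


def randomly_assign(subjects, group_names=("treatment", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for position, subject in enumerate(shuffled):
        groups[group_names[position % len(group_names)]].append(subject)
    return groups
```

Passing a fixed `seed` makes the assignment reproducible, which can help when the procedure has to be documented or audited.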

Random digit dialing (RDD)
Method of drawing a sample of respondents for telephone interviews by composing phone
numbers randomly in order to reach phone subscribers with unlisted numbers.

Random error
Random deviation, which tends to average to zero over numerous sample subjects or items.

Random Sample
A specific type of probability sample in which participants are selected from a population list using
a table of random numbers or a random number generator. (A random sample requires a list of
population members in which each member can be assigned a discrete number.) Random
assignment, by contrast, is the assignment of participants to different treatments, interventions, or
conditions according to chance, rather than systematically. Random assignment of participants
increases the probability that differences observed between participant groups are the result of
the experimental intervention.

Random sampling
Drawing a representative group from a population by a method that gives every member of the
population an equal chance of being drawn.

Random selection
Process or procedure that assures that the different units in your population are selected by
chance.

Random variable
A numerical description of the outcome of an experiment.

Randomization
The process of controlling the nuisance variables by randomly assigning members among the
various experimental and control groups, so that the confounding variables are randomly
distributed across all groups.

Randomized block design
An experimental design employing blocking. [#] Experimental design in which the sample is
grouped into relatively homogeneous subgroups or blocks within which your experiment is
replicated. This procedure reduces noise or variance in the data.

Range
The highest value minus the lowest value. [#] A measure of variability, defined as the largest
value minus the smallest value. [#] The spread in a set of numbers indicated by the difference in
the two extreme values in the observations. [#] A measure of variability consisting of the span
between the lowest and highest scores.

Ranking Scale
Scale used to tap preferences between two or among more objects or items.

Rapport
Trusting relationship between interviewer and interviewee.

Rating scale
Scale with several response categories that evaluate an object on a scale.

Ratio level
Type of measurement that assigns scores on a scale with equal intervals and a true zero point.

Ratio scale
A scale of measurement for a variable that has all the properties of interval data and for which the
ratio of two values is meaningful. Ratio data are always numeric.

Ratio Scale
A scale that has an absolute zero origin, and hence indicates not only the magnitude, but also the
proportion of the differences.

Reactivity
Extent to which a measure causes a change in the behaviour of the subject.

Realism
In ontology, the view that the sources of our perceptions are real and not fictions.

Recall-dependent Question
Questions that elicit from the respondents information that involves recall of experiences from the
past that may be hazy in their memory.

Reciprocal causation
A two-way causal connection between two constructs or variables in which each causes the
other.

Record Sheets
The medium on which your recorded data will appear, whether on paper, through the computer or
via audio or video. The record may be anything from a blank page on which you write, through to
a structured recording schedule where you perhaps just have to fill in relevant ticks or ring
numbers, like the pre-coded answer block on a questionnaire. (See also Optical Mark Reader
System.)

References
In the context of a research report a reference is a formal system for drawing attention to a
literature source, usually published, both in the report itself (for instance, when you want to
identify the source of a quotation) and in the bibliography or reading list at the end of the report.

Regression analysis
A general statistical analysis that enables us to model relationships in data and test for treatment
effects. In regression analysis, we model relationships that can be depicted in graphic form with
lines that are called regression lines.

Regression artifact >>> Regression Threat.

Regression equation
The equation that describes how the mean or expected value of the dependent variable is related
to the independent variable.

Regression line
A line that describes the relationship between two or more variables.

Regression model
The equation describing how y is related to x and an error term.
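
For the simplest case of one independent variable, the estimated regression equation ŷ = b0 + b1·x can be fitted by least squares. The sketch below is a minimal illustration of that case only; the function name `fit_simple_regression` and the symbols `b0` and `b1` are illustrative and not taken from the text.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates (b0, b1) for the line y-hat = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-products divided by sum of squares of x.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    b1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1
```

The error term of the regression model corresponds to the part of each observed y that the fitted line does not account for.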

Regression Point Displacement (RPD)
A pre-post quasi-experimental research design where the treatment is given to only one unit in
the sample with all remaining units acting as controls. This design is particularly useful to study
the effects of community-level interventions where outcome data is routinely collected at the
community level. [#] A pre-post quasi-experimental design where the program or treatment is
given to only a single unit (for example, person or school) but many units are used as a
comparison group.

Regression threat
A statistical phenomenon that causes a group’s average performance on one measure to regress
towards or appear closer to the mean of that measure than anticipated or predicted. Regression
occurs whenever you have a nonrandom sample from a population and two measures that are
imperfectly correlated. A regression threat will bias your estimate of the group’s posttest
performance and can lead to incorrect causal inferences.

Regression to the mean >>> Regression threat.

Regression toward the mean
When the same measure is applied more than one time, the movement toward the mean of
subsequent scores, to the extent that the measure is unreliable; also a group threat to internal
validity caused by the tendency of extreme scores on unreliable measures to move toward the
mean on subsequent tests.

Regression-Discontinuity (RD)
A pretest-posttest, program-comparison group quasi-experimental design in which a cutoff
criterion on the preprogram measure is the method of assignment to group.

Regularity theory
Causation is shown by a nonspurious association between two variables, a view of causation that
requires a weak test.

Rejection region
The range of values that will lead to the rejection of a null hypothesis.

Relationship
Refers to the correspondence between two variables.

Relative efficiency
Given two unbiased point estimators of the same population parameter, the point estimator with
the smaller standard deviation is more efficient.

Relative frequency distribution
A tabular summary of data showing the fraction or proportion (relative frequency) of data items in
each of several non-overlapping classes.
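
Such a summary can be built by counting the items in each class and dividing by the total. The sketch below assumes each item already carries its class label; the function name `relative_frequencies` is hypothetical.

```python
from collections import Counter


def relative_frequencies(items):
    """Map each class to the fraction of items that fall in it."""
    counts = Counter(items)
    total = len(items)
    return {label: count / total for label, count in counts.items()}
```

Multiplying each relative frequency by 100 gives the corresponding percent frequency distribution.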

Relative frequency method


A method of assigning probabilities on the basis of experimentation or historical data.

Relevance
It is about the closeness with which the data being gathered feeds into the aims of the study.

Reliability
The degree to which a measure is consistent or dependable; the degree to which it would give
you the same result over again, assuming the underlying phenomenon is not changing. [#]
Attests to the consistency and stability of the measuring instruments. [#] The degree to which a
measure yields consistent results. [#] The consistency or stability of a measure or test from one
use to the next. When repeated measurements of the same thing give identical or very similar
results, the measure is said to be reliable.

Reliability coefficient
Estimate of the extent to which a measure is free of random error, usually arrived at by correlating
measures of the same type, for example, inter-item, interrater, parallel-form, or test-retest.

Reliability
Extent to which a measure reflects systematic or dependable sources of variation rather than
random error.

Remuneration
Payment for participation in research; this is different from compensation, which typically refers to
payment for research-related injuries.

Repeated measures
Two or more waves of measurement over time.

Repeated Measures Design: Same as Within-participants Design.

Replicability
The repeatability of similar results when identical research is conducted at different times or in
different organizational settings.

Replicability/Replication
As the name suggests, replication is the process of repeating a study undertaken by someone
else, in the sense of using the same methodology. Commonly the location and research subjects
will be different, though sometimes studies return to the same group of subjects after a period of
time has passed - e.g. with child development studies. A good research report always includes
enough information on the methods used to enable someone else to carry out a replication.

Replication
Repetition of a study to see if the same results are obtained. [#] The number of times each
experimental condition is repeated in an experiment.

Representative Sample
A sample in which the participants closely match the characteristics of the population, and thus,
all segments of the population are represented in the sample. A representative sample allows
results to be generalized from the sample to the population.

Representativeness of the Sample


The extent to which the sample that is selected possesses the same characteristics as the
population from which it is drawn.

Requests For Proposals (RFPs)


These RFPs, published by government agencies and some companies, describe some problem
that the agency would like researchers to address. Typically, the RFP describes the problem that
needs addressing, the contexts in which it operates, the approach the agency would like you to take
to investigate the problem, and the amount it would be willing to pay for such
research.

Research
An organized, systematic, critical, scientific inquiry or investigation into a specific problem,
undertaken with the objective of finding answers or solutions thereto.

Research Design
The science and art of planning procedures for conducting studies so as to get the most valid
findings. Called "design" for short. When designing a research study, one draws up a set of
instructions for gathering evidence and for interpreting it.

Research Mindedness
Includes the following essential elements: A faculty for critical reflection informed by knowledge
and research; An ability to use research to inform practice which counters unfair discrimination,
racism, poverty, disadvantage and injustice, consistent with core social work values; An
understanding of the process of research and the use of research to theorise from practice.

Research Plan
This is the researcher's guidebook for the project, and the yardstick against which the various
stages of progress can be judged. It states the outputs to be delivered and the timescale.

Research Population [or its derivatives such as 'survey population' and 'experimental
population']
Is the total number of potential subjects for your research. If this population is larger than you
need or can cope with, then you should use a rational and unbiased process for reducing the
number (sampling).

Research proposal
A document that sets out the purpose of the study and the research design details of the
investigation to be carried out by the researcher.

Research question
The central issue being addressed in the study, which is typically phrased in the language of
theory.

Research
A systematic investigation (i.e., the gathering and analysis of information) designed to develop or
contribute to generalizable knowledge.

Researcher Interference
The extent to which the person conducting the research interferes with the normal course of work
at the study site.

Resentful demoralization
A social threat to internal validity that occurs when the comparison group knows what the
program group is getting and instead of developing a rivalry, control group members become
discouraged or angry and give up.

Residual analysis
The analysis of the residuals used to determine whether the assumptions made about the
regression model appear to be valid. Residual analysis is also used to identify unusual and
influential observations.

Residual Error
In regression analysis, the distance or deviation between the estimated value from the regression
line and the actual value of the outcome variable for any value of the predictor variable; also
called residuals or deviations.

Residual path
In path analysis, the causal path representing the effect on the outcome variable of all variables
not specified in the study.

Residual plots
Graphical representations of the residuals that can be used to determine whether the
assumptions made about the regression model appear to be valid.

Residual
The difference between the observed value of the dependent variable and the value predicted
using the estimated regression equation. [#] The vertical distance from the regression line to
each point. The residual in regression analysis refers to the portion of the outcome or dependent
variable that you cannot predict with your regression equation.
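
A minimal sketch of the calculation for a simple linear regression, using hypothetical data and the usual least-squares formulas for the slope and intercept.

```python
# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates of the slope (b1) and intercept (b0).
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x

predicted = [b0 + b1 * xi for xi in x]
# Residual = observed value minus predicted value.
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

print(f"y-hat = {b0:.1f} + {b1:.1f}x")
print([round(e, 1) for e in residuals])
```

With an intercept in the model, the least-squares residuals sum to zero (up to floating-point rounding), which is a quick check on the arithmetic.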

Respect for Persons


An ethical principle requiring that individual autonomy be respected and that persons with
diminished autonomy be protected ( >>> Cognitively Impaired, Incompetence,
and Incapacity).

Respondents
Research participants, who fill out a survey, are interviewed, participate in an experiment, are
observed in a naturalistic setting, or who are otherwise studied.

Response brackets
A question response type that includes groups of answers, such as between 30 and 40 years old,
or between $50,000 and $100,000 annual income.

Response format
The format you use to collect the answer from the respondent.

Response Rate
The proportion of people asked to take part in research who actually become respondents. Non-
response occurs when you have selected a sample and some of them do not provide data - for all
sorts of reasons, but basically through your inability to make contact or their refusal. Usually face-
to-face surveys will have a non-response rate of around 25% and postal surveys nearer 50%.

Response scale
A sequential numerical response format, such as a 1-to-5 rating format.

Response set
Tendency of a person to answer items in a way designed to produce a preferred image (for
example, social desirability); can reduce construct validity.

Response style
Person’s habitual manner of responding to test items that is independent of item content (for
example, acquiescence); can reduce the construct validity of the measure.

Response
A specific measurement value that a sampling unit supplies.

Restricted Probability Designs >>> Complex Probability Sampling.

Retrospective Studies
Research conducted by reviewing records from the past (e.g., birth and death certificates,
medical records, school records, or employment records) or by obtaining information about past
events elicited through interviews or surveys. Case studies are an example of this type of
research.

Reverse causation
Threat to internal validity for an observed association in which the causal direction is opposite to
that hypothesized.

Review (of Research)


The concurrent oversight of research on a periodic basis by an IRB. In addition to the at least
annual reviews mandated by the federal regulations, reviews may, if deemed appropriate, also be
conducted on a continuous or periodic basis.

Right to service
The ethical issue involved when participants do not receive a service that they would be eligible
for if they were not in your study. For example, the member of a control group might not receive a
drug because they are in a study.

Rigor
The theoretical and methodological precision adhered to in conducting research.

Risk
The probability of harm or injury (physical, psychological, social, or economic) occurring as a
result of participation in a research study. Both the probability and magnitude of possible harm
may vary from minimal to significant (See >>> Minimal Risk).

Robust
Relative immunity of an inferential statistic to violation of its assumptions.

Rules of integration
(Erzberger & Kelle, 2003) A set of rules can be formulated that may be helpful for drawing
inferences from the results of qualitative and quantitative studies in mixed methods designs.
These rules should be understood as general guidelines whose significance varies according to the
research question, the empirical domain under investigation, and the specific methods employed.
Erzberger and Kelle list eight rules of integration. >>> Inference

Sample bias
When numerous samples are on average unrepresentative of the population.

Sample error
Unavoidable, random deviations of different sample estimates from each other.

Sample point
An element of the sample space. A sample point represents an experimental outcome.

Sample population
The population from which the sample is taken.

Sample size
The actual number of subjects chosen as a sample to represent the population characteristics.

Sample space
The set of all experimental outcomes.

Sample stages
Steps at which elements or clusters are drawn as part of the sampling design.

Sample statistic
A numerical value used as a summary measure for a sample (e.g. the sample mean, the sample
variance and the sample standard deviation). The value of the sample statistic is used to estimate
the value of the population parameter.

Sample
A subset of a given population used for research purposes. [#] A subset of the population. [#] A
subset or subgroup of the population. [#] Subset of individuals selected from a larger population.
[#] The actual units you select to participate in your study.

Sampling distribution
The theoretical distribution of an infinite number of samples of the population of interest in your
study. [#] A probability distribution consisting of all possible values of the sample statistic.

Sampling error
Error in measurement associated with sampling. [#] The absolute value of the difference
between an unbiased point estimator and the corresponding population parameter. It is the error
that occurs because a sample, and not the entire population, is used to estimate a population
parameter.

Sampling Frame
Available list of elements from which sample can actually be
drawn, usually not a complete enumeration. [#] The list from
which you draw your sample. In some cases, there is no list; you
draw your sample based upon an explicit rule. For instance, when
doing quota sampling of passers-by at the local mall, you do not
have a list per se, and the sampling frame is the population of people
who pass by within the time frame of your study and the rule(s)
you use to decide whom to select. [#] The sampling frame is
what you have when you have checked out your research population (that is, all potential
respondents), and have left the names and contact details of all of those from the research
population who genuinely can become respondents, if they are willing.

Sampling model
A model for generalizing in which you identify your population, draw a fair sample and conduct
your research, and finally generalize your results to other population groups.

Sampling Unit
Either the element or grouping of elements selected at a sampling stage. [#] The units selected for
sampling. A sampling unit may include several elements.

Sampling with replacement


Once an element has been included in the sample, it is returned to the population. A previously
selected element can be selected again and therefore may appear in the sample more than once.

Sampling without replacement


Once an element has been included in the sample, it is removed from the population and cannot
be selected a second time.
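
The contrast between sampling with and without replacement can be sketched with Python's standard random module. The population of ten numbered elements, the sample size, and the seed are hypothetical.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

population = list(range(1, 11))  # hypothetical population of 10 numbered elements

# With replacement: each draw is made from the full population,
# so an element may appear more than once.
with_replacement = random.choices(population, k=5)

# Without replacement: a selected element is removed from further draws,
# so every element in the sample is distinct.
without_replacement = random.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```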

Sampling
The process by which you reduce the total number of possible respondents for a research project
(the research population) to a number which is practically feasible and theoretically acceptable
(the sample). [#] The process of selecting items from the population so that the sample
characteristics can be generalized to the population. Sampling involves both design choice and
sample size decisions.

Sampling-Non random
'Not random' sampling means that the principle of randomness has not been maintained in the
selection of a sample. Often it involves structured sampling whereby the sample group is carefully
matched to the overall population on key variables. 'Non-random sampling' is often convenient, or
the only approach possible in the circumstances.

Sampling-Random
The aim of random sampling is to combine chance (that everyone in the frame has the same
chance of being chosen) with balance (that the chosen sample will be an accurate microcosm of
the research population as a whole).

Scalar structure
Hierarchical pattern in a set of items in which harder or less frequently chosen items are chosen
only if easier or more frequently chosen items are also chosen by most respondents.

Scale
A group of related measures of a variable. The items in a scale are arranged in some order of
intensity or importance. A scale differs from an index in that the items in an index need not be in a
particular order, and each item usually has the same weight or importance. [#] A tool or
mechanism by which individuals, events, or objects are distinguished on the variables of interest in
some meaningful way.

Scaling
The branch of measurement that involves the construction of an instrument that associates
qualitative constructs with quantitative metric units.

Scatter diagram
A graphical presentation of the relationship between two quantitative variables. The independent
variable is shown on the horizontal axis and the dependent variable is shown on the vertical axis.

Scattergram
Graphic presentation of an association in which each point indicates the two scores of an
individual.

Scenario writing
A qualitative forecasting method that consists of developing a conceptual scenario of the future
based on a well-defined set of assumptions.

Scientific Investigation
A step-by-step, logical, organized and rigorous effort to solve problems.

Search Engine
Software program designed to search and locate information through “keywords”, typically in
documents on the world wide web.

Secondary analysis
Analysis that makes use of already existing data sources.

Secondary Data
Data that have already been gathered by researchers, data published in statistical and other journals,
and information available from any published or unpublished source available either within or
outside the organization, all of which might be useful to the researcher.

Secondary source
A primary source is that which provides the initial basic data set under discussion, while a
secondary source is one which offers further analysis or commentary on the data. For example,
the primary source for demographic data in the UK is likely to be the publications of the Office of
Population Censuses and Surveys - OPCS - while there are many secondary sources which make
use of OPCS output. Generally it is better, if you can, to make reference to primary sources.

Selection Bias
A bias in the way the experimental and/or comparison groups are selected, resulting in pre-
existing differences between the groups that may serve as confounding factors.

Selection Effects
The threat to internal validity that is a function of improper or unmatched selection of subjects for
the experimental and control groups.

Selection threat >>> Selection Bias.

Selection
Group threat to internal validity in which differences observed between groups at the end of the
study existed prior to the intervention because of the way members were sorted into groups.

Selection-by-time interaction
Group internal validity threat in which subjects with different likelihoods of experiencing
time-related changes (for example, maturation or history) are placed into different groups.

Selection-history threat
A threat to internal validity that results from any other event occurring between pretest and
posttest that the groups experience differently.

Selection-instrumentation
A threat to internal validity that results from differential changes in the test used for each group
from pretest to posttest.

Selection-maturation threat
A threat to internal validity that arises from any differential rate of normal growth between pretest
and posttest for the groups.

Selection-mortality
A threat to internal validity that arises when there is differential nonrandom dropout between
pretest and posttest.

Selection-regression
A threat to internal validity that occurs when there are different rates of regression to the mean in
the groups.

Selection-testing threat
A threat to internal validity that occurs when a differential effect of taking the pretest exists
between groups on the posttest.

Selective coding
Systematic coding with respect to a previously developed core concept.

Semantic differential
A scaling method in which an object is assessed by the respondent on a set of bipolar adjective
pairs.

Semantic Differential Scale


Usually a seven-point scale with bipolar attributes indicated at its extremes.

Separate Pre-Post Samples


A design in which the people who receive the pretest are not the same as the people who take
the posttest.

Sequential explanatory design
This design “is characterized by the collection and analysis of quantitative data followed by the
collection and analysis of qualitative data. Priority is typically given to the quantitative data, and
the two methods are integrated during the interpretation phase of the study.” [#] By contrast, the
sequential exploratory design “is characterized by an initial phase of qualitative data collection and
analysis, followed by a phase of quantitative data collection and analysis. Therefore, the priority is
given to the qualitative aspects of the study.”

Sequential Mixed Method Design


A design in which one type of data (e.g. QUAN) provides a basis for the collection of another type
of data (e.g. QUAL). It answers one type of question (QUAL or QUAN) by collecting and
analyzing two types of data (QUAL and QUAN). Inferences are based on the analysis of both
types of data. This term subsumes “sequential study,” “two-phase design,” “sequential QUAL-QUAN
analysis” and “sequential QUAN-QUAL analysis”.

Serial correlation
Correlation between the residuals (the differences between estimated and actual values) at
successive points in a time series. Regression analysis of a time series assumes that the residuals
are uncorrelated; when they are correlated, the errors are said to be serially correlated, which
requires special care in the analysis. [#] Same as auto-correlation.

Setting
Experimental arrangement of the study or, more broadly, the larger social context in which the
study takes place.

Shadow prices
Estimated monetary value of program resources that are otherwise unpriced (for example,
volunteer time), based on estimated value in the marketplace.

Sign test
A non-parametric statistical test for identifying differences between two populations based on the
analysis of nominal data.

Simple linear regression


Regression analysis involving one independent variable and one dependent variable in which the
relationship between the variables is approximated by a straight line.

Simple random sampling bell curve


The histogram or bar chart describing the frequency of each possible measurement
response.

Simple random sampling


A method of sampling that involves drawing a sample from a population so that every possible
sample has an equal probability of being selected. [#] A probability sampling design in which every
single element in the population has a known and equal chance of being selected as a subject.
[#] Finite population : a sample selected such that each possible sample of size n has the same
probability of being selected. Infinite population : a sample selected such that each element
comes from the same population and the elements are selected independently.

Simulation
A model-building technique for assessing the possible effects of changes that might be
introduced in a system.

Single-Blind Design
Typically, a study design in which the investigator, but not the participant, knows the identity of
the treatment assignment. Occasionally the participant, but not the investigator, knows the
assignment.

Single-factor experiment
An experiment involving only one factor with k populations or treatments.

Single-group threat
A threat to internal validity that occurs in a study that uses only a single program or treatment
group and no comparison or control.

Single-Masked Design: >>> Single-Blind Design.

Single-option variable
A question response list from which the respondent can check only one response.

Single-subject design
Also called an n = 1 design; usually a multiple-intervention design applied to a single subject.

Site Visit
A visit by agency officials, representatives, or consultants to the location of a research activity to
assess the adequacy of IRB protection of human participants or the capability of personnel to
conduct the research.

Skepticism
Attitude of doubting and challenging assertions.

Skewness
Degree of departure from symmetry (that is, one side or tail of the distribution is longer than the
other). If the longer tail is to the right, it is called positive skew; to the left, negative skew.

Slope
In regression analysis, the angle of the best-fitting line, that is, how many units on the vertical
axis the line rises or falls for each unit on the horizontal axis; commonly symbolized by the letter
b. [#] The change in y for a change in x of one unit.

Smoothing constant
A parameter of the exponential smoothing model that provides the weight given to the most
recent time series value in the calculation of the forecast value.
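
The role of the smoothing constant can be seen in a short sketch of simple exponential smoothing, where the next forecast is alpha times the latest actual value plus (1 - alpha) times the previous forecast. The function name, the sales figures, and the convention of starting from the first observation are assumptions made for illustration.

```python
def exponential_smoothing(series, alpha):
    """Return one-step-ahead forecasts F_2 ... F_{n+1} for a time series.

    F_{t+1} = alpha * Y_t + (1 - alpha) * F_t, starting from F_1 = Y_1.
    """
    forecast = series[0]  # F_1 is set to the first observation (a common convention)
    forecasts = []
    for y in series:
        forecast = alpha * y + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts

sales = [17, 21, 19, 23]  # hypothetical time series values
print(exponential_smoothing(sales, alpha=0.2))
```

A larger alpha makes the forecast respond more quickly to recent values; a smaller alpha gives a smoother forecast that reacts more slowly.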

Snowball Sample
A non-probability sample that is created by using members of the group of interest to identify
other members of the group (for example, asking a participant at the end of an interview for
suggestions about who else to interview).

Snowball sampling
A sampling method in which you sample participants based upon referral from prior participants.

Social area analysis


Community description usually based on archival records and used in needs assessment.

Social Desirability
The respondents’ need to give socially or culturally acceptable responses to the questions posed
by the researcher even if they are not true.

Social discount rate
Discount rate used in adjusting future returns to social programs, usually lower than the prevailing
private investment rate of return.

Social Experimentation
Systematic manipulation of, or experimentation in, social or economic systems; used in planning
public policy.

Social impact assessment (SIA)


Evaluation of all possible future costs and benefits of one or more intervention plans.

Social indicators
Regular reports on the psychological and social well-being of the population similar to what
economic indicators such as unemployment rates do for economic well-being.

Social significance
In contrast to statistical significance, the societal value or importance placed on a study or its
outcome.

Social threats to internal validity


Threats to internal validity that arise because social research is conducted in real-world human
contexts where people will react to not only what affects them, but also to what is happening to
others around them.

Sociometry
Measurement approach that describes a person’s social relationships from the number of
“choices” of that person made by others.

Soft data
Data such as people's ideas and opinions. A characteristic of qualitative research.

Software
Technology that is capable of designing programs to meet the different computing needs of
individuals and companies.

Solomon Four-Group Design


The experimental design that sets up two experimental groups and two control groups, subjecting
one experimental group and one control group to both the pretest and the posttest, and the other
experimental group and control group to only the posttest. [#] This design has four groups. Two of the groups
receive the treatment and two do not. Furthermore, one of the treatment groups and one of the
controls receive a pretest and the other two do not. By explicitly including testing as a factor in the
design, you can assess experimentally whether a testing threat is operating.

Spearman rank-correlation coefficient


A correlation measure based on rank-ordered data for two variables.
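
A minimal sketch of the calculation, using the shortcut formula r_s = 1 - 6Σd² / (n(n² - 1)), which holds only when there are no tied values. The function names and the ratings are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation for paired data without ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical scores given to five candidates by two raters.
rater_a = [86, 72, 91, 65, 78]
rater_b = [80, 75, 95, 60, 70]
print(spearman(rater_a, rater_b))
```

Here the two rank orders differ only for two candidates, so the coefficient is close to 1. Data with ties would need averaged ranks and a Pearson correlation computed on the ranks instead of the shortcut formula.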

Specification error
In regression analysis, omission of an important causal variable; can lead to mis-estimation of the
relationships among variables included in the analysis.

Split-Half Reliability
Coefficient between one half of the items measuring a concept and the other half.

Spontaneous remission
Apparent natural improvement of control subjects that may be due in part to compensatory
contamination, that is, their acquisition of unreported treatment.

Spuriousness
Two variables associated because both are caused by another variable.

Stability of a Measure
The ability of the measure to yield the same results over time with low vulnerability to changes in the
situation.

Stakeholders
People with an interest (stake) in the research. Examples are subjects of the research, including
service users and carers, researchers, agency staff, funding bodies and commissioners of
research. Politicians and policy makers are also increasingly research stakeholders.

Standard deviation (SD)


Measure of variability; a type of weighted average of distances
from the mean of all the observations. [#] A measure of
dispersion for parametric data; the square root of the variance.
[#] A measure of variability for a data set, found by taking the
positive square root of the variance. [#] The spread or variability
of the scores around their average in a single sample. [#] The
square root of the variance. The standard deviation and variance
both measure dispersion, but because the standard deviation is
measured in the same units as the original measure and the
variance is measured in squared units, the standard deviation is usually the more directly
interpretable and meaningful.

Standard error of the difference


A statistical estimate of the standard deviation one would obtain from the distribution of an infinite
number of estimates of the difference between the means of two groups.

Standard error of the estimate


The square root of the mean square error, denoted by s. It is the estimate of σ,
the standard deviation of the error term ε.

Standard error of the mean


Sampling variability estimate based on the standard deviation of the sample, which is divided by
the square root of the sample size.
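
As a brief sketch with hypothetical measurements, the calculation is the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics

sample = [12, 15, 11, 14, 13, 16, 12, 15]  # hypothetical sample measurements

s = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
sem = s / math.sqrt(len(sample))  # standard error of the mean

print(f"mean = {statistics.mean(sample):.2f}, s = {s:.3f}, SEM = {sem:.3f}")
```

Because the sample size sits under a square root, quadrupling the sample roughly halves the standard error.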

Standard error
The spread of the averages around the average of averages in a sampling distribution. [#] The
standard deviation of a point estimator.

Standard normal probability distribution


A normal distribution with a mean of zero and a standard deviation of one.

Standard score (z)


Individual’s score, minus the group mean, divided by the group’s standard deviation.
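
A short sketch of the calculation with hypothetical group scores. It assumes the group is treated as a whole population, so the population standard deviation is used.

```python
import statistics

scores = [55, 60, 65, 70, 75, 80, 85]  # hypothetical group scores

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # treats the group as the whole population

def z_score(x: float) -> float:
    """Standard score: distance from the group mean in standard deviation units."""
    return (x - mean) / sd

print(z_score(85))
print(z_score(70))
```

A positive z places the individual above the group mean and a negative z below it; a score equal to the mean has z = 0.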

Standard variable
In social science research, especially in survey analysis, there are a range of variables which are
usually considered 'standard' or 'key', in the sense that some analysis is undertaken in relation to
each of them. The list will change according to the specific research project, but may well include
such items as age, gender, socio-economic group, ethnicity, employment, family background,
housing.

Standardization
Arrangement of measurement procedures so that they will be identical (or nearly so) when
applied to different subjects, at different times, or by different raters; also, in experimentation, the
reduction of human experimenter variability in treatment of subjects by use of a fixed script.

Standardized residual
The value obtained by dividing a residual by its standard deviation.

Stapel Scale
A scale that measures both the direction and intensity of the attributes of a concept.

Static Panel
A panel that consists of the same group of people serving as subjects over an extended period of
time for a research study.

Statistical Regression
The threat to internal validity that results when various groups in the study have been selected on
the basis of extreme (very high or very low) scores on some important variables.

Statistic
A value that is estimated from data.

Statistical inference validity


Type of validity tested by inferential statistics, namely, the confidence that a sample finding is not
due to chance.

Statistical inference
The process of using data obtained from a sample to make estimates or test hypotheses about the
characteristics of a population.

Statistical power
The probability of correctly concluding that there is a treatment or program effect in your data.

Statistical probability
Being able to draw a conclusion from a sample and generalising it to a wider population.

Statistical significance
Tests of statistical significance, of which the best known is probably the chi-square test, provide a
measure of probability. Where a research sample has been used, it is important to know whether
the findings are valid or came about by chance. [#] In inferential statistics, the judgment that a
finding was not due to chance. [#] The probability that differences between the outcomes of the
control and experimental groups are great enough that they are unlikely to be due solely to chance. The
probability that the null hypothesis can be rejected at a predetermined significance level (0.05 or
0.01).

Statistical Tests
Researchers use statistical tests to make quantitative decisions about whether a study’s data
indicate a significant effect from the intervention and allow the researcher to reject the null
hypothesis. That is, statistical tests show whether the differences between the outcomes of
the control and experimental groups are great enough to be statistically significant. If differences
are found to be statistically significant, it means that the probability (or likelihood) that these
differences occurred solely due to chance is relatively low. Most researchers agree that a
significance value of .05 or less (differences this large would be expected less than 5% of the
time if chance alone were operating) sufficiently determines significance.

Statistics
Numerical summaries of observations, either descriptive or inferential. [#] The art and science of
collecting, analyzing, presenting and interpreting data.

Stem-and-leaf display
An exploratory data analysis technique that simultaneously ranks quantitative data and provides
insights into the shape of the distribution.

Stories
From a research viewpoint a story is a record of an event in a respondent's life told by the
respondent from his/her own perspective in his/her own words.

Stratification
Categorization of elements having some common characteristic. The group of all elements having
such a common characteristic is called a stratum, for example males or females.

Stratified random sampling
A method of sampling that involves dividing your population into homogeneous subgroups and
then taking a simple random sample in each group. [#] A probability sampling method in which
the population is first divided into strata and a simple random sample is then taken from each
stratum. [#] Probability sampling design that first divides the population into meaningful,
non-overlapping subsets and then randomly chooses the subjects from each subset. [#] Probability
sampling that includes elements from each of the mutually exclusive strata within a population.

Stratified Sample
A probability sample that is determined by dividing the population into groups or strata defined by
the presence of certain characteristics and then sampling randomly from each of the strata. This is
a good way to make sure that a student sample is racially diverse (for instance). [#] A method of
selecting a sample in which the population is first divided into strata and a simple random sample
is then taken from each stratum.

Stratified sampling, disproportionate
A probability sampling technique in which each stratum’s size is not proportionate to the stratum’s
share of the population; allocation is usually based on variability of measures expected from the
stratum, cost of sampling from a given stratum, and size of various strata.

Stratified sampling, proportionate
A probability sampling technique in which each stratum’s size is proportionate to the stratum’s
share of the population; higher statistical efficiency than a simple random sample.
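
Proportionate allocation can be sketched in a few lines of Python. The function name `proportionate_stratified_sample`, the strata labels and the population sizes are hypothetical, and rounding each stratum's share is a simplifying assumption that may make the total differ slightly from the requested sample size.

```python
import random


def proportionate_stratified_sample(strata, n, seed=0):
    """Draw a simple random sample from each stratum, sized by its share.

    strata: dict mapping a stratum label to the list of its members.
    n: desired overall sample size.
    Rounding each share means the total may differ from n by a unit or two.
    """
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for label, members in strata.items():
        share = len(members) / population_size
        sample[label] = rng.sample(members, round(n * share))
    return sample


# Hypothetical population of 1,000 people: 600 women and 400 men
strata = {
    "female": list(range(0, 600)),
    "male": list(range(600, 1000)),
}
sample = proportionate_stratified_sample(strata, n=50)
print({label: len(chosen) for label, chosen in sample.items()})
```

A disproportionate design would replace the `share` calculation with an allocation chosen by the researcher, for example one based on expected variability or sampling cost in each stratum.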

Stress index
An index used in multidimensional scaling that ranges from 1 (worst fit) to 0 (perfect fit).

Structural equation modeling (SEM)
Uses analysis of covariance structures to explain causality among constructs.

Structural Variables
Factors related to the form and design of the organization such as the roles and positions,
communication channels, control systems, reward systems, and span of control.

Structured interview
An IDI that often uses a detailed interview guide similar to a questionnaire to guide the question
order; questions generally use an open-ended response strategy. [#] A data collection method in
which an interviewer reads a standardized interview schedule to the respondent and records the
answers. (Not to be confused with an in-depth interview.)

Structured Interviews
Interviews conducted by the researcher with a predetermined list of questions to be asked of the
interviewee.

Structured Observational Studies
Studies in which the researcher observes and notes specific activities and behaviour that have
been clearly delinked as o

Structured response formats
A response format that is predetermined prior to administration.

Structured response
Participant’s response is limited to specific alternatives provided; a.k.a. closed response.

Sum of squares
Sum of the squared deviations from the mean in calculations of variance or from the regression
line in assessing its fit.
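
A short Python sketch of the first case, the sum of squared deviations from the mean, is given below. The function name `sum_of_squares` and the data are illustrative only.

```python
def sum_of_squares(values):
    """Sum of the squared deviations of each value from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5
print(sum_of_squares(scores))       # 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16
```

Dividing this quantity by the number of observations, or by that number minus 1 for a sample, gives the variance.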

Subject
A single member of the sample. [#] An individual who is studied.

Subjective method
A method of assigning probabilities on the basis of judgment.

Subjectivist
The belief that there is no external reality and that the world as you see it is solely a creation of
your own mind.

Suggestion
In social research, the effect on subjects of their beliefs about their situation.

Summated rating scale
Category of scale where participants agree or disagree with evaluation statements; the Likert scale
is the best known of this type of scale.

Summative evaluations
Evaluations that examine the effects or outcomes of some program or treatment.

Supergroup
A group interview involving up to 20 people.

Survey
A measurement process using a highly structured interview; employs a measurement tool called a
questionnaire, measurement instrument, or interview schedule. [#] A research design in which a
sample of subjects is drawn from a population and studied (usually interviewed) to make inferences
about the population. This design is often contrasted with the true experiment in which subjects are
randomly assigned to conditions or treatments. [#] A study in which the same data are collected
from all members of the sample using a highly structured questionnaire and analyzed
using statistical tests.

Survey via personal interview
A two-way communication initiated by an interviewer to obtain information from a participant;
face-to-face, phone, or internet.

Systematic Sample
A probability sample that is determined by selecting every ‘nth’ (5th, 10th, 50th, etc.) person from a
list of the entire population, after the first person has been randomly selected.

Switching-Replication design
A two-group design in two phases defined by three waves of measurement. The implementation
of the treatment is repeated in both phases. In repetition of the treatment, the two groups switch
roles: the original control group in phase 1 becomes the treatment group in phase 2, whereas the
original treatment group acts as the control. By the end of the study, all participants have received the
treatment.

Symbolic interactionism
Theoretical perspective concerned with the meanings that things and events have for human
beings and the production of these meanings in human interchange.

Symmetric matrix
A square table where the value for each numbered row and column is identical to the value for
the same numbered column and row.

Symmetrical relationship
When two variables vary together but without causation.

Symmetry
In a frequency distribution, the degree of similarity of shape of the left and right sides of the
distribution.

Syndicated data provider
Tracks the change of one or more measures over time, usually in a given industry.

Synergy
The process at the foundation of group interviewing that encourages members to react to and
build on the contributions of others in the group.

Synopsis
A brief summary of the research study.

Systematic error
Error that results from bias; see also systematic variance.

Systematic observation
Data collection through observation that employs standardized procedures, trained observers,
schedules for recording, and other devices for the observer that mirror the scientific procedures of
other primary data methods.

Systematic random sample
A sampling method where you determine randomly where you want to start selecting in the
sampling frame and then follow a rule to select every xth element in the sampling frame list
(where the ordering of the list is assumed to be random).

Systematic review
A summary of relevant literature that uses explicit methods to perform a thorough literature
search and critical appraisal of individual studies and that uses appropriate statistical techniques
to combine these valid studies.

Systematic sampling
A method of choosing a sample by randomly selecting the first element and then selecting every
kth element thereafter. [#] A probability sample drawn by applying a calculated skip interval to a
sample frame; population (N) is divided by the desired sample (n) to obtain a skip interval (k).
Using a random start between 1 and k, each kth element is chosen from the sample frame; usually
treated as a simple random sample but statistically more efficient. [#] A probability sampling
design that involves choosing every nth element in the population for the sample.
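
The skip-interval procedure can be sketched in Python as follows. The function name `systematic_sample` and the numbered frame are hypothetical, and the sketch assumes the desired sample size does not exceed the size of the frame.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select n elements from a sampling frame using a skip interval.

    k = N // n is the skip interval; a random start between 1 and k
    (counting from 1) picks the first element, then every kth element
    is taken. Assumes 1 <= n <= len(frame).
    """
    rng = random.Random(seed)
    k = len(frame) // n            # skip interval
    start = rng.randint(1, k)      # random start between 1 and k, inclusive
    selected = [frame[i] for i in range(start - 1, len(frame), k)]
    return selected[:n]            # trim extras when N is not a multiple of n


# Hypothetical frame of 100 numbered households, desired sample of 10
frame = list(range(1, 101))
chosen = systematic_sample(frame, n=10, seed=1)
print(chosen)
```

Because the frame here holds 100 elements and 10 are wanted, the skip interval is 10, so the chosen households are always spaced exactly 10 apart whatever the random start.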

Systematic sampling
Probability sampling technique in which every nth element is sampled from an existing list of
available elements.

Systematic variance
The variation that causes measurements to skew in one direction or another.

t distribution
A family of probability distributions that can be used to develop interval estimates of a population
mean whenever the population standard deviation is unknown and the population has a normal or
near-normal probability distribution.

Tabular review
Traditional approach to literature review that summarizes each study by a line in one or more
tables.

Tabulations
A set of data which provide a count of the number of occasions on which a particular
answer/response has been given across all of those respondents who tackled the question.

Tactics
Specific, timed activities that execute a strategy.

Target population
The population about which inferences are made.

Target question
Measurement question that addresses the core investigative questions of a specific study; these
can be structured or unstructured questions.

Target question, structured
A measurement question that presents the participant with a fixed set of categories per variable.

Target question, unstructured
Measurement question that presents the participant with the context for a participant-framed answer;
a.k.a. open-ended question, free response question (nominal, ordinal, or ratio data).

Tau (τ)
A measure of association that uses table marginals to reduce prediction errors, with measures from
0 to 1.0 reflecting the percentage of error estimates for prediction of one variable based on another
variable.

Tau b (τb)
A refinement of gamma for ordinal data that considers “tied” pairs, not only discordant and
concordant pairs (values from -1.0 to +1.0); used best on square tables (one of the most widely used
measures for ordinal data).

Tau c (τc)
A refinement of gamma for ordinal data that considers “tied” pairs, not only discordant and
concordant pairs (values from -1.0 to +1.0); used for any size table (one of the most widely used
measures for ordinal data).

t-distribution
A bell-shaped distribution, similar to the normal, with more tail area than the Z (standard normal) distribution.

Technical report
A report written for an audience of researchers.

Technology
Any mechanism that transforms inputs to outputs.

Telephone focus group
A type of focus group where participants are connected to the moderator and each other by
modern teleconferencing equipment; participants are often in separate teleconferencing facilities;
may be remotely moderated or monitored.

Telephone interview
A study conducted wholly by telephone contact between participant and interviewer.

Telephone Interview
The information-gathering method by which the interviewer asks the interviewee over the
telephone, rather than face to face, for information needed for the research.

Temporal precedence
One criterion for establishing a causal relationship that holds that the cause must occur before
the effect.

Tertiary sources
Aids to discover primary or secondary sources, such as indexes, bibliographies, and internet
search engines; also may be an interpretation of a secondary source.

Test market
A controlled experiment conducted in a carefully chosen marketplace (e.g., Web site, store, town,
or other geographic location) to measure and predict sales or profitability of a product.

Test reactivity
A time threat to internal validity that occurs when the observed change over time stems from the
pretest rather than the experimental variable.

Test statistic
A value computed from sample data that is used to decide whether to reject the null hypothesis in
a hypothesis test.

Test unit
An alternative term for a subject within an experiment (a person, an animal, a machine, a
geographic entity, an object, etc.).

Testability
The ability to subject the data collected to appropriate statistical tests, in order to substantiate or
reject the hypotheses developed for the research study.

Testing Effects
The distorting effects on the experimental results (the post test scores) caused by the prior
sensitization of the respondents to the instruments through the pre test.

Testing threat
A threat to internal validity that occurs when taking the pretest affects how participants do on the
posttest.

Test-retest Reliability
A way of establishing the stability of a measuring instrument by correlating the scores obtained
through its administration to the same set of respondents at two different points in time.

Textual analysis
Used in analysis of secondary source data and also in qualitative research. It involves working on
a text in depth, looking for keywords and concepts and making links between them. The term also
extends to literature reviewing. Increasingly, much textual analysis is done using computer
programs.

Thematic Apperception Test (TAT)
A projective test that requires the respondent to develop a story around a picture.

Thematic Apperception Test
A projective technique where participants are confronted with a picture (usually a photograph or
drawing) and asked to describe how the person in the picture feels and thinks.

Theoretical
Pertaining to theory. Social research is theoretical, meaning that much of it is concerned with
developing, exploring, or testing the theories or ideas that social researchers have about how the
world operates.

Theoretical Framework
A logically developed, described, and explained network of associations among variables of
interest to the research study.

Theoretical sampling
A nonprobability sampling process where conceptual or theoretical categories of participants
develop during the interviewing process; additional participants are sought who will challenge
emerging patterns.

Theoretical variable
Concept or construct as distinct from a variable that is measured.

Theory
A set of systematically interrelated concepts, definitions, and propositions that are advanced to
explain or predict phenomena (facts); the generalizations we make about variables and the
relationships among variables.

Theory
A general explanation about a specific behavior or set of events that is based on known
principles and serves to organize related events in a meaningful way. A theory is not as specific
as a hypothesis. [#] Tentative or preliminary explanations of causal relationships.

Third-variable problem
An unobserved variable that accounts for a correlation between two variables.

Threat to conclusion validity
Any factor that can lead you to reach an incorrect conclusion about a relationship in your
observations.

Threats to validity
Reasons your conclusion or inference might be wrong.

Time sampling
The process of selecting certain time points or time intervals to observe and record elements,
acts, or conditions from a population of observable behaviors or conditions to represent the
population as a whole; three types include time-point samples, time-interval samples, and
continuous real-time samples.

Time series
Data for a variable at numerous time points, for example, the unemployment rate for each month
for several years. [#] Many waves of measurements over time.

Time threats
Internal validity threats to within-subjects designs; protected against by control groups. (See table
9-1 regarding the four time threats of history, maturation, instrumentation (or measurement
decay), and test reactivity.)

Topic outline
Report planning format; uses key words or phrases rather than complete sentences to draft each
report section.

Total survey error
The sum of all bias and errors from sampling and data collection, or the total difference between
the sampled survey estimate and the true or population value.

Two-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in
either tail of the sampling distribution.

Traces
Physical records based either on wear or erosion or on leavings or accretions.

Trait
Personality characteristic or behavioral style.

Translation validity
A type of construct validity related to how well you translated the idea of your measure into its
operationalization.

Treatment
In experiments, a treatment is what researchers do to the subjects in the experimental group, but
not to those in the control group. A treatment is thus an independent variable. [#] The
manipulation of the independent variable in experimental designs so as to determine its effects on
a dependent variable of interest to the researcher.

Treatment levels
The arbitrary or natural grouping within the independent variable of an experiment.

Treatment
The experimental factors to which participants are exposed. [#] Different levels of a factor.

Tree diagram
A graphical representation helpful in identifying the sample points of an experiment involving
multiple steps.

Trend survey
Longitudinal survey design involving a series of cross-sectional surveys each based on a
different sample.

Trends
Patterns in time series marked by long-term increases or decreases.

Triad
A group interview involving three people.

Trials
Repeated measures taken from the same subject or participant.

Triangulation
Research design that combines several qualitative with quantitative methods; most common are
simultaneous QUAL/QUANT in single or QUANT-QUAL, sequential QUAL-QUANT-QUAL. [#] A
multi-method or pluralistic approach, using different methods in order to focus on the research
topic from different viewpoints and to produce a multi-faceted set of data. Also used to check the
validity of findings from any one method. [#] Method of comparing observations from different
times and sources to arrive at a correct analysis.

True experiment
Experimental design in which subjects are randomly assigned to two or more differently treated
conditions.

True score theory
A theory that maintains that every measurement is an additive composite of two components: the
true ability of the respondent and random error.

True score
That part of the observed score that reflects the construct of interest.

Truncation
A search protocol that allows a symbol (usually "?" or "*") to replace one or more characters or
letters in a word or at the end of a word root.
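
The behaviour of truncation symbols can be imitated with Python's standard `fnmatch` module, as in the sketch below. The word list is invented, and real bibliographic databases each define their own wildcard rules, so this only illustrates the idea. In `fnmatch`, "*" stands for any run of characters (including none) and "?" for exactly one character.

```python
from fnmatch import fnmatchcase

# Hypothetical index terms to search
terms = ["evaluate", "evaluation", "evaluative", "value", "woman", "women"]

# "*" at the end of a word root picks up every ending of that root
root_matches = [t for t in terms if fnmatchcase(t, "evaluat*")]

# "?" inside a word stands in for a single letter
letter_matches = [t for t in terms if fnmatchcase(t, "wom?n")]

print(root_matches)
print(letter_matches)
```

Searching on the root "evaluat*" retrieves all three variants of the word in one query, while "wom?n" retrieves both the singular and the plural.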

t-test
A parametric test to determine the statistical significance between a sample distribution mean and
population parameter; used when the population standard deviation is unknown and sample
standard deviation is used as a proxy. [#] A statistical test that establishes a significant mean
difference in a variable between two groups. [#] A statistical test of the difference between the
means of two groups, often a program and comparison group. The t-test is the simplest variation of
the one-way Analysis of Variance (ANOVA).

t-value
The estimate of the difference between the groups relative to the variability of the scores in the
groups.
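
One common way of computing a t-value for two independent groups is shown in the Python sketch below. It assumes the equal-variance (pooled) form of the test, which is a choice made for this example rather than something the glossary specifies, and the function name `t_value` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_value(group_a, group_b):
    """Pooled-variance t statistic for two independent groups.

    Numerator: the difference between the group means.
    Denominator: the standard error of that difference, built from the
    variability of the scores within the two groups.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical post-test scores for a program and a comparison group
program = [16, 18, 17, 15, 19]
comparison = [12, 14, 11, 13, 12]
t = t_value(program, comparison)
print(round(t, 2))
```

The resulting value is then compared with the t distribution on n_a + n_b - 2 degrees of freedom to judge statistical significance.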

Two-group posttest-only randomized experiment
A research design in which two randomly assigned groups participate. Only one group
receives the program and both groups receive a posttest.

Two-independent-sample tests
Parametric and nonparametric tests used when the measurements are taken from two samples
that are unrelated (Z test, t-test, chi-square, etc.).

Two-related-sample tests
Parametric and nonparametric tests used when the measurements are taken from closely
matched samples or phenomena are measured twice from the same sample (t-test, McNemar test, etc.).

Two-stage design
A design in which exploration as a distinct stage precedes a descriptive or causal design.

Two-tailed hypothesis
A hypothesis that does not specify a direction. For example, in a study on self-esteem, you do not
predict whether your assisted-living program will have a positive or negative effect on the
self-esteem of the respondents in your study.

Two-tailed test
A nondirectional test to reject the hypothesis that the sample statistic is either greater than or less
than the population parameter.

Type II error
The error of accepting H0 when it is false.

Type I error
Error when one rejects a true null hypothesis (there is no difference); the alpha (α) value, called
the level of significance, is the probability of rejecting the true null hypothesis. [#]
When a test wrongly shows an effect or condition to be present (e.g. that a woman is pregnant
when, in fact, she is not). When a researcher falsely rejects the null hypothesis. [#] Error of
rejecting the null hypothesis when it is true. [#] The error of rejecting H0 when it is true.

Type II Error
When a test wrongly shows an effect or condition to be absent (e.g. that a woman is not pregnant
when, in fact, she is). When a researcher fails to reject the null hypothesis. [#] Error of not rejecting
the null hypothesis when it is false. [#] When one fails to reject a false null hypothesis; the beta (β)
value is the probability of failing to reject the false null hypothesis; the power of the test is 1 - β and
is the probability that the test will correctly reject the false null hypothesis.
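
The relationship between alpha, beta and power can be worked through exactly for a small case. The Python sketch below assumes a made-up coin-tossing study: 20 tosses, a null hypothesis that the coin is fair (p = 0.5), a decision rule that rejects the null for 5 or fewer or 15 or more heads, and a true probability of heads of 0.7 for the Type II calculation. All of these numbers and the function names are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n, p, lower, upper):
    """P(X <= lower or X >= upper) when X is Binomial(n, p)."""
    return sum(
        binomial_pmf(k, n, p) for k in range(n + 1) if k <= lower or k >= upper
    )


n = 20
# Type I error rate: chance of rejecting H0 when the coin really is fair
alpha = rejection_probability(n, 0.5, lower=5, upper=15)
# Power: chance of rejecting H0 when the true probability of heads is 0.7
power = rejection_probability(n, 0.7, lower=5, upper=15)
# Type II error rate: chance of failing to reject the false H0
beta = 1 - power

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

With this decision rule alpha comes out a little above 0.04, so the test operates within the conventional 0.05 level; widening the rejection region would raise power (lower beta) at the cost of a larger alpha.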

Typology of Research Purposes
A systematic classification of types of purposes for conducting mixed methods research.

Unbalanced rating scale
Has an unequal number of favorable and unfavorable response choices.

Unbalanced Rating Scale
An even-numbered scale that has no neutral point.

Unbiased Questions
Questions posed in accordance with the principles of wording and measurement, and the right
questioning technique, so as to elicit the least biased responses.

Unbiasedness
A property of a point estimator when the expected value of the point estimator is equal to the
population parameter it estimates.

Unforced-choice rating scale
Provides participants with an opportunity to express no opinion when they are unable to make a
choice among the alternatives offered.

Unidimensional scale
Instrument scale that seeks to measure only one attribute of the participant or object.

Unit effect
Pattern of preexisting differences across spatial units that accounts for discrepancy between
analysis over time and analysis over space.

Unit of analysis
The entity that you are analyzing in your analysis: for example, individuals, groups, or social
interactions.

Unit of analysis
The level of aggregation of the data collected during data analysis.

Univariate Analysis
Studying the distribution of cases of one variable only--for example, studying the ages of welfare
recipients but not considering their gender, ethnicity, and so on.

Univariate statistics
Descriptive statistics for one variable.

Unobtrusive measures
A set of observational approaches that encourage creative and imaginative forms of indirect
observation, archival searches, and variations on simple and contrived observation, including
physical traces observation (erosion and accretion). [#] Measurement of the data gathered from
sources other than people, such as examination of birth and death records or a count of the number
of cigarette burns in the ashtray. [#] Methods used to collect data without interfering in the lives of
the respondents.

Unobtrusive
With respect to qualitative research, unobtrusive observation involves disguised entry and
participation without the knowledge of the subjects that they are under scientific scrutiny.

Unrestricted Probability Sampling >>> Simple Random Sampling.

Unsolicited proposal
A suggestion by a contract researcher for research that might be done.

Unstructured interview
A customized IDI with no specific questions or order of topics to be discussed; usually starts with
a participant narrative.

Unstructured interviewing
An interviewing method that uses no predetermined interview protocol or survey and where the
interview questions emerge and evolve as the interview proceeds.

Unstructured Interviews
Interviews conducted with the primary purpose of identifying some important issues relevant to the
problem situation, without prior preparation of a planned or predetermined sequence of
questions.

Unstructured observational studies
Studies in which the researcher observes and makes notes of almost all activities and behaviour
that occur in the situation without predetermining what particular variables will be of specific interest
to the study.

Unstructured response formats
A response format that is not predetermined and that the respondent or interviewer is allowed to
determine. An open-ended question is a type of unstructured response format.

Unstructured response
Where participant’s response is limited only by space, layout, instructions, or time; usually free
response or fill-in response strategies.

User-involvement in research
The politicization of service-users as active participants in service design and delivery, rather than
passive recipients, has extended to research activities. User involvement in research situates
users as active stakeholders in the research process and involvement may be on a range from
responsibility for the design, delivery and reporting of research to inclusion as a key stakeholder
group. User-involvement impacts on methodologies, which questions are asked, the way they are
asked, who asks them, and what happens to the answers. (>>> Emancipatory Research)

Utilitarianism
Ethical approach that seeks a rational balancing of costs and benefits of behaviors.

Utility score
A score in conjoint analysis used to represent each aspect of a product or service in a participant’s
overall preference ratings.

Utilization
Stage of evaluative research that gauges the extent to which the research report is used and
provides guidance for better dissemination of evaluation results.

Validity
A term to describe a measurement instrument or test that measures what it is supposed to
measure; the extent to which a measure is free of systematic error. For example, a bathroom scale
that provides a reliable measure of weight cannot give a valid measure of height. [#] A characteristic
of measurement concerned with the extent that a test measures what the researcher actually wishes
to measure; and that differences found with a measurement tool reflect true differences among
participants drawn from a population. [#] Concerns the extent to which your research findings can
be said to be accurate and reliable, and the extent to which the conclusions are warranted. [#]
Evidence that the instrument, technique, or process used to measure a concept does indeed
measure the intended concept. [#] Extent to which a measure reflects the intended phenomenon
(for example, construct, criterion, or content domain); more generally the truth value of an
assertion. [#] The best available approximation of the truth of a given proposition, inference, or
conclusion.

Validity [Face]
At face value, does the measure seem valid?

Validity [Internal]
Does a study’s conclusions about causal relationships agree with what is actually true?

Validity coefficient
Estimate of the agreement of the measure being validated with a criterion (criterion validity) or a
measure thought to reflect the target construct (convergent type construct validity).

Validity [Construct]
Does the measure of a given concept relate to a measure of another theoretically associated
concept?

Validity [Criterion]
Is the measure associated with expected behaviors?

Validity [Content]
Does the measure cover diverse meanings of the concept?

Validity, construct
The degree to which a research instrument is able to provide evidence based on theory.

Validity, content
The extent to which measurement scales provide adequate coverage of the investigative
questions.

Validity, criterion-related
The success of a measurement scale for prediction or estimation; types are predictive and
concurrent.

Variability
Term for measures of spread or dispersion within a data set. [#] In descriptive statistics, the
dispersion of individual scores from each other (for example, standard deviation).

Variable (research variable)
A characteristic, trait, or attribute that is measured; a symbol to which values are assigned;
includes several different types: continuous, control, decision, dependent, dichotomous, discrete,
dummy, extraneous, independent, intervening, and moderating variables.

Variable selection procedures
Methods to select a subset of the independent variables for a regression model.

Variable
Any characteristic or trait that can vary from one person to another (race, sex, academic major) or
for one person over time (age, political beliefs). [#] Any entity that can take on different values.
For instance, age can be considered a variable because age can take different values for
different people at different times. [#] Any factor which may be relevant to a research study. In a
survey, for example, you may choose to analyse data by the age and gender of respondents.
'Age' and 'gender' are variables. [#] Anything that can take on differing or varying values. [#]
Measure or indicator thought to represent an underlying construct or concept and produced by an
operational definition of the construct or concept.

Variance
A measure of score dispersion about the mean; calculated as the average of the squared deviation
scores from the data distribution's mean; the greater the dispersion of scores, the greater the
variance in the data set. [#] A measure of the spread of scores in a distribution of scores, that is, a
measure of dispersion. The larger the variance, the further the individual cases are from the mean.
The smaller the variance, the closer the individual scores are to the mean. [#] A measure of
variability of a data set, based on the squared deviations of the data values about the mean. It is
also a measure of the variability, or dispersion, of a random variable. [#] Indicates the dispersion of
a variable in the data set, and is obtained by subtracting the mean from each of the observations,
squaring the results, summing them, and dividing the total by the number of observations. [#] The
spread of the scores around the mean of a distribution. Specifically, the variance is the sum of the
squared deviations from the mean divided by the number of observations minus 1.
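
The definitions above divide the sum of squared deviations either by the number of observations or by that number minus 1. The Python sketch below, using the standard `statistics` module and invented scores, shows both versions: the first is normally used when the data are the whole population, the second when they are a sample used to estimate the population variance.

```python
from statistics import pvariance, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, sum of squared deviations 32

# Population variance: sum of squared deviations divided by n
population_variance = pvariance(scores)

# Sample variance: sum of squared deviations divided by n - 1
sample_variance = variance(scores)

print(population_variance, round(sample_variance, 3))
```

The sample version is always slightly larger, and the gap shrinks as the number of observations grows.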

Verbal versus nonverbal measurement
The dimension of measurement that separates observations of verbal communication (for example,
self-report of ratings of speed behavior) from observations of other kinds.

Videoconferencing focus group
A type of focus group where researchers use the videoconference facilities of a firm to connect
participants with moderators and observers; unlike telephone focus groups, participants can see
each other; can be remotely moderated, and in some facilities can be simultaneously monitored by
client observers via Internet technology.

Virtual test market
A test of a product using a computer simulation of an interactive shopping experience.

Visitors from another planet
A projective technique (imagination exercise) where participants are asked to assume that they
are aliens and are confronting the product for the first time; they then describe their reactions,
questions, and attitudes about purchase or retrial.

Visual aids
Presentation tools used to facilitate understanding of content (e.g., chalkboard, whiteboards,
handouts, flip chart, overhead transparencies, slides, computer-drawn visuals, computer
animation).

Vocational filter
Phenomenon of people removing themselves from desirable career paths due to math
avoidance.

Voice recognition
Computer systems programmed to record verbal answers to questions.

Voluntary Participation
The principle that study participants choose to participate of their own free will, rather than being
coerced or forced to participate. For IRB purposes, this is a key part of your study proposal; you
must demonstrate that participants will be participating voluntarily for a study to be approved by
the IRB. [#] For ethical reasons, researchers must ensure that study participants are taking part
in a study voluntarily and are not coerced.

Weak inference
Conclusion of causality based on the regularity theory of causation. Involves finding
unconfounded covariation between the variables in question that is consistent with a model:
equally good-fitting alternative models are not ruled out.

Web site
Site accessible on the Internet or an intranet, created by individuals and organizations for the
purpose of sharing information.

Web-based questionnaire
A measurement instrument both delivered and collected via the Internet; data processing is
ongoing. Two options currently exist: proprietary solutions offered through research firms and
off-the-shelf software for researchers who possess the necessary knowledge and skills; a.k.a. online
survey, online questionnaire, Internet survey.

Web-enabled test market
Test of a product using online distribution.

Weighted mean
The mean for a data set obtained by assigning each data value a weight that reflects its
importance within the set.
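
The definition can be written as a short formula. In this sketch the symbols (data values x_i, weights w_i) and the three example marks are illustrative assumptions, not figures from the text.

```latex
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}
\qquad\text{e.g.}\qquad
\bar{x}_w = \frac{1(70) + 2(80) + 3(90)}{1 + 2 + 3} = \frac{500}{6} \approx 83.3
```

The unweighted mean of the same three values is 80; the weighted mean is higher because the largest value carries the largest weight.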

Weighted moving average
A method of forecasting or smoothing a time series by computing a weighted average of past data
values. The sum of the weights must equal one.
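
As a sketch of the calculation, assume a three-period average with weights 0.5, 0.3 and 0.2 applied to the most recent, second most recent and third most recent observations. The notation and the numbers are illustrative assumptions.

```latex
\hat{y}_{t+1} = w_1 y_t + w_2 y_{t-1} + \dots + w_k y_{t-k+1},
\qquad \sum_{i=1}^{k} w_i = 1
\\[4pt]
\hat{y}_{t+1} = 0.5(120) + 0.3(110) + 0.2(100) = 60 + 33 + 20 = 113
```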

Weighting
Stratifying your sampling to achieve an outcome which is typical of the research population can
result in some of the strata being too small for tabular or statistical presentation. For example,
carers stratified by gender may give a very small group of males. In this instance you may choose
to increase the size of the male stratum solely in order to get a viable number. This is weighting
your sample. In the same way you may want to weight clusters.

White noise
In time series analysis the residuals between the estimated and actual values that have no
correlation among themselves at any lag.

Wilcoxon signed-rank test
A non-parametric statistical test for identifying differences between two populations based on the
analysis of two matched or paired samples.

Within subjects design
Experimental design in which the change in subjects from before to after the manipulation
measures the treatment effect.

Within-Design Consistency
The consistency of the procedures of the study from which the inferences emerged.

Within-participants Design
A research design in which each participant experiences, at different times, all levels of
the independent variable (or both the experimental and control treatment). Thus, each participant
is tested once in each condition.

Word association
A projective method of identifying respondents’ attitudes and feelings by asking them to associate
a specified word with the first thing that comes to their mind.

Word or picture association
A projective technique where participants are asked to match images, experiences, emotions,
products and services, and even people and places to whatever is being studied.

World wide web (The Web)
A mass market means of communication, the web is a collection of standards and protocols used
to access information available on the internet.

X
Symbol used in design diagrams to represent the presence of an experimental manipulation.

X chart
A control chart used when the output of a process is measured in terms of the mean value of a
variable such as a length, weight, temperature, and so on.

Yahoo
An acronym for yet another hierarchically officious oracle, a worldwide directory of Web sites
developed in 1994 by two Stanford University engineering students to organize Web content in
a hierarchical system of subject categories. Yahoo! also provides other Web-based services
(news, weather, travel, e-mail, shopping, games, etc.). It uses a smaller database than most other
Web search engines, but searches in Yahoo! usually have high precision because the Web sites
it lists are selected by human beings rather than robot software. Jonathan Swift coined
the term "yahoo" in Gulliver’s Travels (1726) to refer to an imaginary race of coarse, brutish
creatures in human form. Mark Twain later applied it to any boorish person.

Yapp binding
A form of limp or semi-limp leather binding with rounded corners and bent-in
edges that overlap the sections, sometimes by as much as half the thickness
of the text block, named after William Yapp, the 19th-century bookseller who
designed the style for pocket bibles sold in England. Geoffrey
Glaister notes in Encyclopedia of the Book (Oak Knoll/British Library, 1996) that
a similar style of binding with tooled edges was used in the mid-16th century.

Yearbook
An annual documentary, historical, or memorial compendium of facts, photographs, statistics,
etc., about the events of the preceding year, often limited to a specific country,
institution, discipline, or subject. Optional yearbooks are offered by some publishers of
general encyclopedias. Most libraries place yearbooks on continuation order and shelve them in
the reference collection. Yearbooks of historical significance may be stored in archives or special
collections.

Yellow press
A popular name for newspapers and periodicals of the early 20th century that published news
stories of a vulgarly sensational nature, comparable to the modern tabloid.

z-score
A value found by dividing the deviation about the mean by the standard deviation s. A z-score is
referred to as a standardized value and denotes the number of standard deviations a data value x is
from the mean.
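
In symbols, with x̄ for the mean and s for the standard deviation as in the definition, the calculation is as follows. The example numbers are assumed purely for illustration.

```latex
z = \frac{x - \bar{x}}{s}
\qquad\text{e.g.}\qquad
z = \frac{85 - 70}{10} = 1.5
```

A data value of 85 in a data set with mean 70 and standard deviation 10 therefore lies 1.5 standard deviations above the mean.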

Z distribution
The normal distribution of measurements assumed for comparison.

Z score >>> standard score.

Z test
A parametric test to determine the statistical significance of the difference between a sample
distribution mean and a population parameter; employs the Z distribution.
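
For the common one-sample case, the test statistic can be sketched as below. The notation (hypothesized mean μ0, known population standard deviation σ, sample size n) and the example numbers are assumptions added for illustration.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\qquad\text{e.g.}\qquad
z = \frac{52 - 50}{10 / \sqrt{100}} = \frac{2}{1} = 2.0
```

Since 2.0 exceeds 1.96 in absolute value, this illustrative result would be significant at the 5% level in a two-tailed test.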

“The dictionary is the only reference book that has no index.”


<< Evan Esar

A2Z

PhD
Thesis
Reflections on Academic Research

Appendix - I

Detailed Guidelines for Chapters

The following pages describe a prototype format, with details of the subsections of each chapter,
for a five-chapter structural model of a PhD Thesis. This illustration forms a basis for
developing a draft thesis and guides scholars in the process of completing the final
Thesis.

The salient features are:

• In each chapter, possible subsections are indicated.

• A range of subsections in each chapter is provided.

• Tips to develop the subsections are provided for Chapter I. Using these tips,
research scholars could develop the subsections of other chapters based on the
requirements of the individual research study.

• The format gives the model for the contents of the first page of each chapter.

• The first page of each chapter will contain a brief summary of the chapter.

• The proportion of each chapter in the total Thesis is given as a percentage.

• The maximum number of sections, with the possible number of subsections, is indicated
for each chapter.

• The format offers very high flexibility so that, within the five-chapter structure, even the
title of a chapter could be changed; for example, the chapter ‘Literature Review’
may be renamed as either ‘Theoretical Framework’ or ‘Conceptual Framework’.
Research scholars thus have ample freedom to choose the titles of these chapters.

Chapter I
Introduction

Contents [5%] [maximum 8 sections, each having 2/3 subsections]


------------------------------------------------------------------------------------------------------------

1.1 Background to the research
[Here the broad field of study is outlined and then the focus of the research problem is mentioned.
This section is short and aims to orient the readers and capture their attention.]

1.2 Research problem and hypotheses
[Here the core idea of the research is given, starting with the research problem and its
sub-problems; the thought process of the researcher is also indicated.]

1.3 Justification for the research
[Here the importance of the specific area of study, the relative neglect of the specific research
problem by previous researchers, the relative neglect of the research's methodologies by previous
researchers and the usefulness of potential applications of the research's findings are explained.]

1.4 Methodology
[An introductory overview of the methodology is placed here. This section should refer to the
sections in chapters 2 and 3 where the methodology is justified and described.]

1.5 Outline of the Thesis
[Each chapter is briefly described in this section to give an overview of the Thesis.]

1.6 Definitions
[Definitions adopted by researchers are often not uniform, so key and controversial terms are
defined to establish positions taken in the PhD research. Definitions should match the underlying
assumptions of the research and scholars may need to justify some of their definitions.]

1.7 Delimitations of the Scope
[This section builds a solid defence around the research findings by stating the delimitations that
are additional to the limitations and key assumptions established in the previous section on definitions.]

1.8 Conclusion
[A summary of the discussions and achievements of this chapter is provided here.]

------------------------------------------------------------------------------------------------------------
Brief description of Chapter I [in 10 – 15 sentences]

Chapter II
Literature Review

Contents [30%] [maximum 8 sections, each having 3/5 subsections]


------------------------------------------------------------------------------------------------------------

2.1 Introduction
2.2 Parent disciplines and classification models
2.3 Developing and Current Literature
2.4 Earlier Literature
2.5 Immediate discipline and analytical models
2.6 Research Gap in the available Literature
2.7 Area identified for the Research Study
2.8 Conclusion

------------------------------------------------------------------------------------------------------------
Brief description of Chapter II [in 10 – 15 sentences]

Chapter III
Research Methodology

Contents [15%] [maximum 5 sections, each having 2/3 subsections]


------------------------------------------------------------------------------------------------------------

3.1 Introduction
3.2 Justification of Methodology
3.3 Details of Research Procedures
3.4 Ethical considerations
3.5 Conclusion

------------------------------------------------------------------------------------------------------------
Brief description of Chapter III [in 10 – 15 sentences]

Chapter IV
Analysis of Data

Contents [25%] [maximum 5 sections, each having 2/6 subsections]


------------------------------------------------------------------------------------------------------------

4.1 Introduction
4.2 Statistical Tools used
4.3 Data about subjects
4.4 Detailed Pattern of data
4.5 Conclusion

------------------------------------------------------------------------------------------------------------
Brief description of Chapter IV [in 10 – 15 sentences]

Chapter V
Conclusions

Contents [25%] [maximum 8 sections, each having 2/4 subsections]


------------------------------------------------------------------------------------------------------------
5.1 Introduction
5.2 Discussion about each research question
5.3 Discussion about the research problem
5.4 Implications for theory
5.5 Delimitations
5.6 Major suggestions
5.7 Major recommendations
5.8 Scope for Further Research

------------------------------------------------------------------------------------------------------------
Brief description of Chapter V [in 10 – 15 sentences]

Typical Model of Chapter I
[to be positioned as the first page of Chapter I]

Chapter I
Introduction

Contents
------------------------------------------------------------------------------------------------------------

1.1 Background to the research 1

1.2 Research problem and hypotheses 3

1.3 Justification for the research 5

1.4 Methodology 8

1.5 Outline of the Thesis 10

1.6 Definitions 15

1.7 Delimitations of the Scope 17

1.8 Conclusion 19

------------------------------------------------------------------------------------------------------------
The background of the research is described, leading to the identification of the research problem
and allied hypotheses. Based on these, the research study is justified, followed by a brief
account of the methodology to be adopted. The method and manner in which
the research scholar proposes to carry out the study are clearly
specified and explained. A bird’s-eye view of the proposed thesis is given so that the
full picture of the research is available. New words, phrases and jargon required for
the research study are introduced, and the definitions to be used in the research are identified
and explained. As in any research study, the delimitations [or limitations!] contemplated and
anticipated are clearly explained. Finally, a summary listing the key achievements of this
chapter is provided.


Appendix II
Simple Guide to SPSS
SIMPLE GUIDE - SPSS
When conducting any statistical analysis, you need to get familiar with your data and
perform an examination of it in order to lessen the odds of having biased results that can
make all of your hard work essentially meaningless or substantially weak.

Source: www.sociology-data.sju.edu/documents/data_analysis_guide_spss.doc

Getting to Know SPSS


When you run a procedure in SPSS, such as frequencies, you need to select the
variables in the dialog box. On the left side of the dialog box, you see the list of
variables in your data file. You click a variable name to select it, and then click the right-
arrow button to move the variable into the Variable(s) list.

TIP 1

You can change the appearance of the variables so that they appear as variable
names rather than variable labels [see above], which is the default option. You can also
make the variables appear alphabetically. I recommend switching to the variable names
option and having them listed alphabetically so that you can more easily find the
variables of interest to you. With this setting, you can type the first letter of the
name of the variable that you want in the variable display section of the dialog box and
SPSS will jump to the first variable that starts with that letter, and then to every subsequent
variable that starts with that letter.

• Directions: Pull down the Edit menu, select Options, select the General tab,
and under Variable Lists select Display names and Alphabetical.

TIP 2 [What is this variable?]

In case you forget the label that you gave to a variable when you go to a dialog box,
such as the frequency dialog box above, highlight the variable that you are interested in
and click the right mouse button. This will provide a pop-up window that offers the
Variable Information section. In other words, if you were presented with the frequency
table above, you might highlight “size of the company [size],” click the right mouse
button and select variable information. This action provides you with the name of the
variable, its label, measurement setting [e.g. ordinal], and value labels for the variable
[i.e. categories].

TIP 3 [What is this statistic?]

If you are unsure of what a particular statistic is used for, highlight the
item, right-click on the selected statistic [e.g. mean], and you will receive a brief
description of what that statistic provides. If the statistic is one
that seems useful to you, select it by placing a check in the box
next to it. Then click OK and you will return to the main frequencies
box.

TIP 4 [What is in this output?]

To obtain help on the output screen [i.e. the spss viewer], you need to double click a
pivot table in order to activate it so that you can make modifications. When activated, it
will appear to have “railroad track” lines surrounding it. See Table a below.

Table a
Favor or Oppose Death Penalty for Murder

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Favor           1074      71.6            77.4                 77.4
          Oppose           314      20.9            22.6                100.0
          Total           1388      92.5           100.0
Missing   DK               106       7.1
          NA                 6        .4
          Total            112       7.5
Total                     1500     100.0

Once you have activated the pivot table, then you should right-click a row or
column header for a pop up menu, such as the column labeled valid percent.
Choose what’s this? This will bring up a pop-up window explaining what the
particular column or row is addressing. If you forget to activate the pivot table
and simply right-click on a column or row, you will get the following message:
[Displays output. Click once to select an object (for example, so that you can copy it to
the clipboard). Double-click to activate an object for editing. If the object is a pivot table,
you can obtain detailed help on items within the table by right-clicking on row and
column labels after the table is activated.]

If you want more than a pop-up delivers, choose results coach from the list
instead of what’s this? Essentially, this will take you through a subsection of the
SPSS tutorial.

Dressing Up Your Output

Changing Text

Activate the pivot table in the SPSS viewer [see Output screen] and then double-
click on the text that you wish to change. Enter the new text and then follow the
same procedure as needed. If you wish to get rid of the title, then you can select
the title and then hit the delete button on your keyboard and you will obtain a table
like the one below.

Changing the Number of Decimal Places

Select the cell entries with too many [or too few] decimal places. From the format
menu, choose Cell properties, select the number of decimal places that you want
and click OK.

Showing/Hiding Cells

Activate the table and then select the row or column you wish to hide by using Ctrl-Alt-click
on the column heading or row label. From the view menu, choose hide. To
resurrect the hidden cells at a later time, activate the table and then, from the view menu,
choose show all. If you’re sure that you never want to resurrect the information,
you can simply delete the cells and they will be permanently removed. See
Table a above, which shows a table with unhidden columns. See Table b for an
example of a table with a hidden valid percent column.

Table b
Favor or Oppose Death Penalty for Murder

                     Frequency   Percent   Cumulative Percent
Valid     Favor           1074      71.6                 77.4
          Oppose           314      20.9                100.0
          Total           1388      92.5
Missing   DK               106       7.1
          NA                 6        .4
          Total            112       7.5
Total                     1500     100.0

Rearranging the rows, columns, and layers

Activate the pivot table, and from the pivot menu, choose pivoting trays. A
schematic representation of a pivot table appears with 3 areas [trays] labeled layer, row,
and column. Colored icons in these trays represent the contents of the table, one for
each variable and one for statistics. Place your mouse pointer over one of them to
see what it represents and if you wish to change the structure of the table, then
you can drag an icon and the table will rearrange itself. See Table b above for a
pre-modification version of the table and Table c below for a post-modification
version.

Table c = post-modification version of Table b
Favor or Oppose Death Penalty for Murder

Valid     Favor    Frequency              1074
                   Percent                71.6
                   Cumulative Percent     77.4
          Oppose   Frequency               314
                   Percent                20.9
                   Cumulative Percent    100.0
          Total    Frequency              1388
                   Percent                92.5
Missing   DK       Frequency               106
                   Percent                 7.1
          NA       Frequency                 6
                   Percent                  .4
          Total    Frequency               112
                   Percent                 7.5
Total              Frequency              1500
                   Percent               100.0

Editing Your Charts

Double-click the chart in the viewer to open it in a new chart editor window. Note:
To access some chart editing capabilities, such as identifying points on a scatterplot or
changing the width of bars in a histogram, you must click an element of the chart to
select it. For example, you must click any point in a scatterplot or any bar in a
histogram or bar chart. You can change labels [double-click any text and
substitute your own], create reference lines, and change colors, line types and
sizes. When you close the chart editor window, the original chart in the viewer
updates to show any changes that you made.

Using Syntax

You should ALWAYS use syntax when running statistical analyses. There are 2 ways
that you can do this. You can select the paste button when you run a statistical
analysis using one of the dialog boxes, such as that for frequencies. However,
when you use the paste function, the analysis does not run by itself: you have to
remember to go to the syntax window (a newly created one or one that you created
in a previous session), highlight the commands and run them.

The other method is to open a new syntax file so that you can type in any
commentary and syntax, or copy and paste from an already existing syntax file.
The syntax below is what you would receive if you used the paste command in SPSS after
using a dialog box, such as that for frequencies. You would have the same
command if you copied and pasted it from prior commands in an already existing
syntax file.

FREQUENCIES
  VARIABLES=cappun
  /PIECHART PERCENT
  /ORDER= ANALYSIS .

Whichever method you use to create syntax, you MUST always type in commentary that
explains what the command does. This ensures that you have a way of checking back
to see the methodology that you used and the steps that were taken when you
conducted your analysis. This is useful in case something goes wrong and you need to
make corrections and just to provide you and others with a guide for how the analyses
occurred in case replications need to be done. Commentary should be written in the
following way when dealing with commands:

*frequencies of attitudes toward capital punishment and gun laws.*

Notice that there are asterisks at either end and that a period (.) comes just before the closing
asterisk. This tells the computer that this is not command text, so that while the
computer may highlight it during a run of the analysis, it will not treat it as a command.
If you were going to combine the commentary and the command syntax in a syntax file,
it would appear as you see it below.

*frequencies of attitudes toward capital punishment and gun laws.*

FREQUENCIES
  VARIABLES=cappun
  /PIECHART PERCENT
  /ORDER= ANALYSIS .
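
A note on the comment form may help here. Under the general SPSS syntax rules, a comment starts with an asterisk and ends at a period that is the last character on a line (or at a blank line), so the trailing asterisk is not required. A minimal sketch of the same commented command in that plainer form, using the cappun variable from the example above:

```spss
* Frequencies of attitudes toward capital punishment.
FREQUENCIES
  VARIABLES=cappun
  /PIECHART PERCENT
  /ORDER=ANALYSIS.
```

Ending the comment line with the period itself means the command on the next line cannot be read as a continuation of the comment, even when no blank line separates them.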

In addition, you MUST keep a Log of the analyses that you run, which will appear in the
output [SPSS viewer] file. To do this, go to Edit, then options, and select
the viewer tab. Under that tab, be sure that initial output state has “log” listed in the
pull-down menu and that display commands in the log is checked. This ensures that the
program enters the text of any analysis that you run right before it
displays the results of that analysis. This is another way to let yourself and others know
what type of analysis you did, and to evaluate whether it is the appropriate analysis and
whether it has been done properly. See the sample just below this text.

FREQUENCIES
  VARIABLES=cappun
  /ORDER= ANALYSIS .

Frequencies

Statistics
Favor or Oppose Death Penalty for Murder
N   Valid      1388
    Missing     112

Favor or Oppose Death Penalty for Murder

                     Frequency   Percent   Cumulative Percent
Valid     Favor           1074      71.6                 77.4
          Oppose           314      20.9                100.0
          Total           1388      92.5
Missing   DK               106       7.1
          NA                 6        .4
          Total            112       7.5
Total                     1500     100.0

Introducing Data

Typing your own data

If your data aren’t already in a computer-readable SPSS format, you can enter the
information directly into the SPSS Data Editor. From the menus, choose file, then
new, then data, which opens the data editor in data view. If you type a number
into the first cell, SPSS will label that column with the variable name VAR00001.
To create your own variable names, click the variable view tab.
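
For a handful of cases, the same result can be reached from a syntax window instead of the Data Editor. The sketch below is illustrative only: the variable names id, sex and cappun and the three data rows are assumptions made up for the example.

```spss
* Read three illustrative cases typed directly into the syntax file.
DATA LIST FREE / id sex cappun.
BEGIN DATA
1 1 1
2 2 2
3 1 2
END DATA.
```

Running these lines creates a working data file with three numeric variables, which can then be named, labelled and saved as described in the following sections.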

Assigning Variable names and properties

In the name column, enter a unique name for each variable in the order in which
you want to enter the variables. The name must start with a letter, but the remaining
part of the variable can be letters or digits. A name can’t end with a period, contain
blanks or special characters, or be longer than 64 characters.

Assigning Descriptive Labels

• Variable Labels: Assign descriptive text to a variable by clicking the cell and
then entering the label. For instance, for the variable “cappun” the label says
“favor or oppose death penalty for murder.”
• Value Labels: To label individual values, click the button in the Value column.
This opens its dialog box. For cappun, the values are coded 1 = favor, 2 = oppose.
The sequence of operations is to: enter the value, enter its label, click add,
and repeat this process for each value.
   o Note: Labels for individual values are useful only for variables with a
   limited number of categories whose codes aren’t self-explanatory. You
   don’t want to attach value labels to individual ages; however, you should
   label the missing value codes for all variables if you use more than one
   code.
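
The same labels can be assigned with syntax, which keeps a record of how the file was documented. A minimal sketch, using the cappun variable and the 1 = favor, 2 = oppose coding described above:

```spss
* Attach a descriptive label to the variable and to each of its codes.
VARIABLE LABELS cappun 'Favor or oppose death penalty for murder'.
VALUE LABELS cappun
  1 'Favor'
  2 'Oppose'.
```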

Assigning Missing Values

To indicate which codes were used for each variable when information is not available,
click in the missing column, and assign missing values. Cases with these codes will be
treated differently during statistical analysis. If you don’t assign codes for missing
values, even nonsensical values are accepted. A value of -1 for age would be
considered a real age. The missing-value codes that you assign to a variable are
called user-missing values. System-missing values are assigned by SPSS to any
blank numeric cell in the Data Editor or to any calculated value that is not defined. A
system-missing value is indicated with a period (.).

• Note: You can’t assign missing values to a string variable that is more than
8 characters in width. For string variables, uppercase and lowercase
letters are treated as distinct characters. This means that if you use the code
NA (not available) as a missing value code, entries coded as na will not be
treated as missing. Also, if a string variable is 3 characters wide and the missing
value code is only 2 characters wide, the placement of the two characters in the
field of 3 affects what’s considered missing. Blanks at the end of the field (trailing
blanks) are ignored in missing-value specifications.
• Warning: DON’T use a blank space as a missing value. Use a specific number
or character to signify “I looked for this value and I don’t know what it is.”
DON’T use missing-value codes that are between the smallest and largest valid
values, even if these particular codes don’t occur in the data.
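
In syntax, user-missing codes are declared with the MISSING VALUES command. The sketch below assumes, purely for illustration, that DK was entered as 8 and NA as 9 for cappun; the guide does not state the actual codes, so substitute whatever codes your own file uses.

```spss
* Assumed codes: 8 = DK (don't know), 9 = NA (no answer).
MISSING VALUES cappun (8, 9).
ADD VALUE LABELS cappun
  8 'DK'
  9 'NA'.
```

Both assumed codes lie outside the valid range of 1 to 2, in line with the warning above about not placing missing-value codes between the smallest and largest valid values.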

Assigning Levels of Measurement

Click in a cell in the Measure column to assign a level of measurement to each variable.
You have 3 choices: nominal, ordinal, and scale.

• Warning 1: If you don’t specify the scale, SPSS attempts to divine it based on
characteristics of the data, but its judgment in this matter is fallible. For example,
string variables are always designated as nominal. In some procedures, SPSS
uses different icons for the 3 types of variables. The scale on which a variable is
measured doesn’t necessarily dictate the appropriate statistical analysis for a
variable. For example, an ID number assigned to subjects in an experiment is
usually classified as a nominal variable. If the numbers are assigned
sequentially, however, they can be plotted on a scale to see if subject responses
change with time. Velleman and Wilkinson (1993) discuss the problems
associated with stereotyping variables.
• Warning 2: Although SPSS assigns a level of measurement to each variable,
this information is seldom used to guide you. SPSS will let you calculate means
for nominal variables as long as they have numeric values. Certain statistical
procedures don’t allow string variables in particular fields in the dialog boxes.
For example, you can’t calculate the mean of a string variable.
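
Rather than relying on the level SPSS guesses, you can state it explicitly in syntax. A minimal sketch: cappun and sex come from the examples in this guide, while age is a hypothetical variable added only to show a scale assignment.

```spss
* Set measurement levels explicitly instead of accepting the defaults.
VARIABLE LEVEL cappun sex (NOMINAL)
  /age (SCALE).
```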

Saving the Data File

You MUST always save your data periodically so that you don’t have to start from
scratch if anything goes wrong. You can also include text information in an SPSS data
file by choosing utilities and data file comments, which will appear in the syntax
screen. Anyone using the file can read the text associated with it. You can also elect to
have the comments displayed in the output. This is similar to including your own
comments in the syntax file to note what steps you are taking in your data analysis.
I recommend the syntax-file comments because you will already be in the syntax window
rather than having to switch back and forth, but data file comments are a possible option.

* Data File Comments.
PRESERVE.
SET PRINT OFF.
ADD DOCUMENT
  'test of data file comments'.
RESTORE.
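
Saving can also be done from the syntax window, which makes it easy to save at fixed points in an analysis. In this sketch the file name survey_data.sav is a hypothetical placeholder; give the full path to your own file.

```spss
* Save the working data file. The file name is a placeholder.
SAVE OUTFILE='survey_data.sav'.
```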

Selecting Cases for Analyses

If you wish to perform analyses on a subset of your cases, this command is
invaluable. For instance, suppose that you want to examine gender differences in
support for or opposition to capital punishment. Choose select cases from the data
menu and all analyses will be restricted to the cases that meet the criteria you
specify. After choosing select cases, choose select if condition is satisfied and
click on the “if” button. This takes you to a dialog box that allows you to
complete the command syntax necessary to carry out the procedure.

In this example, males are coded 1 and females are coded 2, and I am
interested in calculating results separately for both groups. Therefore, I click on
sex in the variable list and use the arrow to put it in the box allocated for
formulas. Once this variable has been transferred, I click on the = sign on the
calculator provided and then on “1” to tell the computer that I am only
interested in selecting males. Then I hit continue and go back to the
original select cases dialog box, where I can choose “unselected cases are
filtered” or “unselected cases are deleted.” If you wish to keep both males and
females in the dataset but conduct separate analyses for each group,
choose “filtered.” If you wish to get rid of the cases that don’t meet
the criterion, i.e. you want to delete the females from the data set permanently,
choose “deleted.” If you look at the Data Editor when Select Cases is
in effect, you’ll see lines through the cases that did not meet the selection criteria
[only for filtering of cases]. They won’t be included in any statistical analysis or
graphical procedures.
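
Pasting from the select cases dialog with the “filtered” option produces syntax along the following lines. This is a sketch rather than the exact pasted text, which also adds labels and formats; filter_$ is the filter variable name the dialog normally generates, and the coding 1 = male follows the example above.

```spss
* Filter: keep every case in the file, but analyse males only.
USE ALL.
COMPUTE filter_$ = (sex = 1).
FILTER BY filter_$.
EXECUTE.
FREQUENCIES VARIABLES=cappun.

* Switch the filter off so that later analyses use all cases again.
FILTER OFF.
USE ALL.
EXECUTE.
```

The “deleted” option corresponds to a SELECT IF (sex = 1). command followed by EXECUTE. Because that removes the unselected cases from the working file, it is safest to run it only on a copy of the data.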

Repeating the Analysis for Different Groups of Cases

If you want to perform the same analysis for several groups of cases, choose Split File
from the Data menu. A separate analysis is done for each combination of values of the
variables specified in the Split File Dialog box.

SORT CASES BY sex .
SPLIT FILE
  LAYERED BY sex .

Frequencies

Statistics
Favor or Oppose Death Penalty for Murder
Male     N   Valid      607
             Missing     34
Female   N   Valid      781
             Missing     78

Favor or Oppose Death Penalty for Murder

Respondent's Sex               Frequency   Percent   Valid Percent   Cumulative Percent
Male     Valid     Favor             502      78.3            82.7                 82.7
                   Oppose            105      16.4            17.3                100.0
                   Total             607      94.7           100.0
         Missing   DK                 34       5.3
         Total                       641     100.0
Female   Valid     Favor             572      66.6            73.2                 73.2
                   Oppose            209      24.3            26.8                100.0
                   Total             781      90.9           100.0
         Missing   DK                 72       8.4
                   NA                  6        .7
                   Total              78       9.1
         Total                       859     100.0

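You can also select how you want the output displayed: either the groups compared side by
side within each table (LAYERED BY, as above) or all of the output for one subgroup together,
followed by all of the output for the next subgroup (SEPARATE BY, as below).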

SORT CASES BY sex .
SPLIT FILE
  SEPARATE BY sex .

Frequencies

Respondent's Sex = Male

Statistics (a)
Favor or Oppose Death Penalty for Murder
N   Valid      607
    Missing     34
a. Respondent's Sex = Male

Favor or Oppose Death Penalty for Murder (a)

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Favor            502      78.3            82.7                 82.7
          Oppose           105      16.4            17.3                100.0
          Total            607      94.7           100.0
Missing   DK                34       5.3
Total                      641     100.0
a. Respondent's Sex = Male

Respondent's Sex = Female

Statistics (a)
Favor or Oppose Death Penalty for Murder
N   Valid      781
    Missing     78
a. Respondent's Sex = Female

Favor or Oppose Death Penalty for Murder (a)

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Favor            572      66.6            73.2                 73.2
          Oppose           209      24.3            26.8                100.0
          Total            781      90.9           100.0
Missing   DK                72       8.4
          NA                 6        .7
          Total             78       9.1
Total                      859     100.0
a. Respondent's Sex = Female
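
Split-file processing stays in effect for every later procedure until it is switched off. A short sketch of how to return to analysing the whole file:

```spss
* Return to analysing all cases as a single group.
SPLIT FILE OFF.
FREQUENCIES VARIABLES=cappun.
```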

Preparing Your Data

Checking Variable Definitions

Using the Utilities Menu

Choose utilities and then variables to get data-definition information for each variable
in your data file. Make sure that all of your missing-value codes are correctly identified.

TIP 5

If you click Go To, you find yourself in the column of the Data Editor for the
selected variable if the data editor is in data view. To edit the variable information
from the data editor in data view, double-click the variable name at the top of that
column. This takes you to the variable view for that variable.

TIP 6

To get a listing of the information for all of the variables without having to select
the variables individually, choose File, then display data file information, then
working file. This lists variable information for the whole data file. The
disadvantage is that you can’t quickly go back to the data editor to fix mistakes.
An advantage is that codes that are defined as missing are identified, so it’s easier
to check the labels.

Checking Your Case Count

Eliminating Duplicate Cases

If you have entered your own data, you may have entered the same case twice or
even more often. To find any duplicates, choose Data, then Identify Duplicate
Cases. If you entered a supposedly unique ID variable for each case, move the
name of that ID variable into the Define Matching Cases By list. If it takes more
than one variable to guarantee uniqueness (for example, college and student ID),
move all of these variables into the list. When you click OK, SPSS checks the file
for cases that have duplicate values of the ID variables.

TIP 7

DON’T automatically discard cases with the same ID number unless all of the other
values also match. It’s possible that the problem is merely that a wrong ID number was
entered.
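
If it helps to see the logic of the duplicate check outside SPSS, here is a minimal Python sketch. The records and field names (college, student_id, age) are made up for illustration, and this is not how SPSS implements Identify Duplicate Cases. It groups cases by the identifying variables and then, in the spirit of TIP 7, checks whether the cases in each group match on every value.

```python
from collections import defaultdict

# Hypothetical records; the field names are illustrative only.
cases = [
    {"college": "A", "student_id": 101, "age": 21},
    {"college": "A", "student_id": 102, "age": 23},
    {"college": "A", "student_id": 101, "age": 21},  # exact re-entry of the first case
    {"college": "B", "student_id": 101, "age": 30},  # same student_id, different college
    {"college": "A", "student_id": 102, "age": 32},  # same key, different values
]

def find_duplicates(records, key_fields):
    """Group records by the identifying fields; keep groups with more than one case."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        groups[key].append(record)
    return {key: group for key, group in groups.items() if len(group) > 1}

duplicates = find_duplicates(cases, ["college", "student_id"])

# A group is a true repeat only if every record in it is identical.
exact_repeats = {
    key: all(record == group[0] for record in group)
    for key, group in duplicates.items()
}
```

Groups flagged False in exact_repeats share an ID but differ elsewhere, so they are candidates for a mistyped ID number rather than for deletion.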

Adding Missing Cases

Run any procedure and look at the count of the total cases processed. That’s always
the first piece of output. Table d shows the summary from the Crosstabs procedure for
sex by cappun.

Table d
Crosstabs Case Processing Summary

                                                  Cases
                                    Valid           Missing         Total
                                    N     Percent   N     Percent   N     Percent
Respondent's Sex * Favor or         1388  92.5%     112   7.5%      1500  100.0%
Oppose Death Penalty for Murder

You see that the data file has 1500 cases, but only 1388 have valid (nonmissing) values
for the sex and cappun variables. If the count isn’t what you think it should be and if you
assigned sequential numbers to cases, you can look for missing ID numbers.
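
As a rough illustration of looking for missing ID numbers, the short Python sketch below compares the IDs that were entered against the range you expected. The ID values and the range of 1 to 10 are hypothetical.

```python
# Hypothetical ID numbers as entered; cases were supposed to be numbered 1 through 10.
entered_ids = [1, 2, 3, 5, 6, 8, 9, 10]
expected_ids = range(1, 11)

# IDs that were expected but never entered.
missing_ids = sorted(set(expected_ids) - set(entered_ids))
```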

Checking Your Case Count

Warning: Data checking is not an excuse to get rid of data values that you don’t like.
You are looking for values that are obviously in error and need to be corrected or
replaced with missing values. This is not the time to deal with unusual but correct data
points. You’ll deal with those during the actual data analysis phase.

Making Frequency Tables

Use the Frequencies procedure to count the number of times each value of a variable
occurs in your data. For example, how many people in the GSS survey support capital
punishment? You can also graph this information using pie charts, bar charts, or
histograms.

TIP 8

You can acquire the information that you need for the descriptive statistics [e.g. mean,
minimum, maximum] through the frequency dialog box by selecting the statistics tab and
checking on those statistics of interest to you.

To obtain your frequencies and descriptives, follow the instructions below.

• Go to Analyze, scroll down to descriptive statistics, and select frequencies.
• Click on the variables of interest and move them into the box for analysis
using the arrow shown. To obtain the descriptives, click the statistics
tab in the frequency dialog box and select the mean, minimum, maximum,
and standard deviation boxes. Then click on charts and decide whether
you wish to run a pie chart, a bar chart, or a histogram with a normal
curve. You can only run one type of graph/chart at a time.

When conducting frequency analyses, you want to consider the following questions
when reviewing the results presented in your output.

- Are the codes that you used for missing values labeled as missing values
in the frequency table? If the codes are not labeled, go back to the data editor
and specify them as missing values.
- Do the value labels correctly match the codes? For example, if you see that
50% of your customers are very dissatisfied with your product, make sure that
you haven’t made a mistake in assigning the labels.
- Are all of the values in the table possible? For example, if you asked the
number of times a person has been married and you see values of -2, you know
that’s an error. Go back to the source and see if you can figure out what the
correct values are. If you can’t, replace them with codes for missing values.
- Are there values that are possible, but highly unlikely? For example, if you
see a subject who claims to own 11 toasters, you want to check whether the
value is correct. If the value is correct, you’ll have to take that into account
when analyzing the data.
- Are there unexpectedly large or small counts for any of the values? If you’re
studying the relationship of highest educational degree to subscription to Web
services offered by your company and you see that no one in your sample has a
college degree, suspect problems.
TIP 9

To search for a particular data value for a variable, go to data view, highlight the
column of the variable that you are interested in, choose edit, then find, and then
type in the value that you are interested in finding.

• Looking At the Distribution of Values
- For a scale variable with too many values for a frequency table [e.g. income in
dollars], you need different tools for checking the data values because counting
how often different values occur isn’t useful anymore.
• Are the smallest and largest values sensible?
- You don’t want to look solely at the single largest and single smallest values;
instead, you want to look at a certain percentage or number of cases with the
largest and smallest values. There are several ways to do this. The simplest, but
most limited way, is to choose:
  o Analyze, then Descriptive Statistics, then either descriptives or
  explore. Click statistics in the explore dialog box and select outliers
  in the explore statistics dialog box. You will receive a list of cases with
  the 5 smallest and the 5 largest values for a particular variable. Values
  that are defined as missing aren’t included, so if you see missing values
  in the list, there’s something wrong. Check the other values if they
  appear to be unusual. [see Table e]

Table e [using the Explore command]
Extreme Values

Hours Per Day Watching TV

                 Case Number   Value
Highest   1         1402         24
          2          466         22
          3          300         20
          4         1360         20
          5          115         16 (a)
Lowest    1         1500          0
          2         1400          0
          3         1373          0
          4         1372          0
          5         1356          0 (b)

a. Only a partial list of cases with the value 16 are shown in the table of upper extremes.
b. Only a partial list of cases with the value 0 are shown in the table of lower extremes.

You can see that one of the respondents claims to watch television 24 hours a day. You
know that’s not correct. It’s possible that he or she understood the question to mean
how many hours the TV set is on. When analyzing the TV variable, you’ll have to decide
what to do with people who have reported impossible values. In Table e, you see that
there are only 4 cases with values of 16 hours or greater and then there is a gap until 12
hours. You might want to set values greater than 12 hours to 12 hours when analyzing
the data. This is similar to the way many people top-code a variable for “age,” grouping
everyone above a chosen age into a single highest value.
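
To make the capping idea concrete, here is a minimal Python sketch. The list of reported hours is hypothetical; only the ceiling of 12 hours comes from the example above. The capped values go into a new list so that the original values are kept.

```python
# Hypothetical reported hours of TV per day; None stands for a missing value.
tv_hours = [0, 2, 3, 24, 12, None, 5, 22, 16, 1]
CAP = 12  # ceiling taken from the example in the text

# Values above the cap are set to the cap; missing values stay missing.
tv_hours_capped = [
    None if hours is None else min(hours, CAP)
    for hours in tv_hours
]
```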

• Is there anything strange about the distribution of values?
- The next task is to examine the distribution of the values using histograms or
stem-and-leaf plots. Make a stem-and-leaf plot [for small data sets] or a
histogram of the data using either Graphs/Histogram or the Explore Plots dialog
box. You want to look for unusual patterns in your data. For example, look at
the histogram of ages in Table f. Ask yourself: where have all of the 30-year-olds
gone? Why are there no people above the age of 90? Were there really no
people younger than 18 in the survey?
• Are there logical impossibilities?
- For example, if you have a data file of hospital admissions, you can make a
frequency table to count the reason for admission and the number of male and
female admissions. Looking at these tables, you may not notice anything
strange. However, if you look at these 2 variables together in a Crosstabs table,
you may uncover unusual events. For instance, you may find males giving birth
to babies, and women undergoing prostate surgery.
- Sometimes, pairs of variables have values that must be ordered in a particular
way. For example, if you ask a woman her current age, her age at first marriage,
and the duration of her first marriage, you know that the current age must be
greater than or equal to the age at first marriage. You also know that the age at
first marriage plus the duration of first marriage cannot exceed the current age.
Start by looking at the simplest relationship: Is the age at first marriage less than
the current age? You can plot the two variables on a scatterplot and look for
cases that have unacceptable values. You know that all of the points must fall on
or above the identity line.
TIP 10

For large data files, the drawback to this approach is that it’s tedious and prone to error.
A better way is to create a new variable that is the difference between the current age
and the age at first marriage. Then use data, select cases to select cases with
negative values and analyze, then reports, then case summaries to list the
pertinent information. Once you’ve remedied the age problem, you can create a
new variable that is the sum of the age at first marriage and the duration of first
marriage. You can then find the difference between this sum and the current age.
Reset the select cases criteria and use case summaries to list cases with
offending values.
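
The two checks described in TIP 10 can be sketched in Python as follows. The cases and field names are hypothetical, and the sketch only mirrors the logic (compute a difference, then list the cases where it is negative); in SPSS you would use Compute, Select Cases, and Case Summaries.

```python
# Hypothetical cases; the field names are illustrative only.
cases = [
    {"id": 1, "age": 45, "age_first_marriage": 22, "years_first_marriage": 10},
    {"id": 2, "age": 30, "age_first_marriage": 34, "years_first_marriage": 2},
    {"id": 3, "age": 50, "age_first_marriage": 25, "years_first_marriage": 30},
    {"id": 4, "age": 28, "age_first_marriage": 28, "years_first_marriage": 0},
]

# Check 1: current age minus age at first marriage must not be negative.
negative_age_gap = [
    case["id"] for case in cases
    if case["age"] - case["age_first_marriage"] < 0
]

# Check 2: age at first marriage plus duration must not exceed current age.
marriage_too_long = [
    case["id"] for case in cases
    if case["age"] - (case["age_first_marriage"] + case["years_first_marriage"]) < 0
]
```

A case that fails the first check will usually fail the second as well, which is one reason to fix the age problem before running the second check.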

• Is there consistency?
- For a survey, you often have questions that are conditional. For example, first
you ask Do you have a car? and then, if the answer is Yes, you ask insightful
questions about the car. You can make Crosstabs tables of the responses to the
main question with those to the subquestions to find cases whose answers
contradict each other. You have to decide how to deal with these
inconsistencies: do you impute answers to the main question, or do
you discard answers to subquestions? It’s your call.
• Is there agreement?
- This refers to whether you have pairs of variables that convey similar information
in different ways. For example, you may have recorded both years of education
and highest degree earned. Or, you may have created a new variable that
groups age into 3 categories, such as less than 25, 25 to 50, and older than 50.
Compare the values of the 2 variables using crosstabs. The table may be large,
but it’s easy to check the correspondence between the 2 variables. You can also
identify problems by plotting the values of the 2 variables.
• Are there unusual combinations of values?
- Identify any outliers so that you can make sure the values of these variables are
correct and make any necessary adjustments. What counts as an outlier
depends on the variables that are being considered together.

TIP 11

You can identify points in a scatterplot by specifying a variable in the Label Cases By
text box in the Scatterplot dialog box. Double-click the plot to activate it in the Chart
Editor. From the Elements menu, choose Data Label Mode or click on the Data
Label Mode icon on the toolbar. This changes your cursor to a black box. Click
the cursor over the point that you want identified by the value of the labeling
variable. To go to that case in the Data Editor, right click on the point, and then
left click. Make sure that the Data Editor is in Data View. To turn Data Label Mode
off, click on the Data Label Mode icon on the toolbar.

Transforming Your Data

Before you transform your data, be sure that you know the values of the variables that
you are interested in so that you know how to code the information. See the earlier
instructions about how to use the utilities menu to obtain information on the variables
either individually or for the entire working data file.

Computing a New Variable

If you want to perform the same calculation for all of the cases in your data file, the
transformation is called unconditional. If you want to perform different computations
based on the values of 1 or more variables, the transformation is conditional. For
example, if you compute an index differently for men and women, the transformation is
conditional. Both types of transformations can be performed in the Compute Variable
dialog box.

One Size Fits All: Unconditional Transformation

Choose compute from the transform menu to open the compute variable dialog
box. At the top left, assign a new name to the variable that you will be computing.
To do so, click in the target variable box and type in the desired name. You must
follow the same rules for assigning variable names as you did when naming variables in
the Data Editor. Also, don’t forget to enter the information in the type and label tab in the
dialog box.

Warning: You MUST use a new variable name rather than one already in use. If you
reuse the same name and make a mistake specifying the transformation, you’ll replace
the values of the original variable with values that you don’t want. If you don’t catch the
mistake right away, and you save the data file, the original values of the variable are lost.
SPSS will ask you for permission to proceed if you try to use an existing variable name.

To specify the formula for the calculations that you want to perform, either type directly in
the Numeric Expression text box or use the calculator pad. Each time you want to refer
to an existing variable, click it in the variable list and then click the arrow button. The
variable name will appear in the formula at the blinking insertion point. Once you click
ok, the variable is added to your data file as the last variable. However, remember that
you want to click the paste button and then run the syntax command from the syntax
window so that you know what commands you specified. You also want to remember to
use commentary information above the pasted syntax in order to tell yourself and the
reviewer, in this case me, what you did to conduct your analysis.

TIP 12

Right-click your mouse on any button (except the #s) on the calculator pad or any
function for an explanation of what it means.

Using a Built-in Function

The function groups are located in the dialog box and can be used to perform your
calculations, if necessary. There are 7 main groups of functions: arithmetic, statistical,
string, date and time, distribution, random-variable, and missing-values. To use a
function, place the blinking insertion point where you want to insert the function
into your formula, click the function, and then click the up arrow button. The function
will appear in your formula, but it will have question marks for the arguments. The
arguments of a function are the numbers or strings that it operates on. In the expression
SQRT(25), 25 is the sole argument of this function. Enter a value for the argument, or
double-click a variable to move it into the argument list. If there are more
question-mark arguments, select them in turn and enter a value, move a variable,
or otherwise supply whatever the function needs.

TIP 12

For detailed information about any function and its arguments, from the Help menu,
choose Topics, click the index tab, and type the word functions. You can then select the
type of function that you want.

If and Then: Conditional Transformation

If you want to use different formulas, depending on the values of one or more existing
variables, you have to enter the formula and then click the button labeled if at the bottom
of the compute variable dialog box. This takes you to a secondary dialog box in which
you choose “include if case satisfies condition” and enter your conditional expression.
For example, if you wish to compute a new variable, you would specify how
the new target variable is coded in reference to the “if, then” expression.
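
The difference between the two kinds of transformation can be sketched in a few lines of Python. The cases and both formulas below are invented purely for illustration; they do not come from any real index. Note that each result is stored under a new name, in line with the warning above about not reusing an existing variable name.

```python
# Hypothetical cases and formulas; neither comes from the text.
cases = [
    {"sex": "male", "score": 10},
    {"sex": "female", "score": 10},
    {"sex": "male", "score": 4},
]

for case in cases:
    # Unconditional: the same calculation for every case.
    case["score_doubled"] = case["score"] * 2

    # Conditional: the formula depends on the value of another variable.
    if case["sex"] == "male":
        case["index"] = case["score"] + 5
    else:
        case["index"] = case["score"] + 3
```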

Changing the Coding Scheme

Recode into Same Variables

If you wish to change the coding of a variable but not create a totally different variable,
you would select transform, recode, into same variables, and click on the variable
or variables of interest and move them into the variable box by clicking the arrow.
To specify how you wish to recode the values within a variable, select old
and new values. On the left side of the dialog box, choose the numbers that
you wish to change; on the right side of the dialog box, choose what you want
them to become, and click add. When done, select continue to go back to the
previous dialog box and paste the command syntax so that you can run it. Again,
don’t forget to type in a commentary of what the command is doing. In other
cases, you might choose the “IF” tab to specify the conditions under which a recode
will take place.

TIP 13

If you wish to recode a group of variables using the same coding scheme, such as
recode a 2 into a 1 for a set of variables even if the numbers stand for different value
labels, you can enter several variables into the dialog box at once.

Recode into Different Variables

If you want to recode an existing variable into a new one, every original value
has to be transformed into a value of the new variable. Click transform, recode, into
different variables and you will get a dialog box. In this dialog box, select the
name of the variable that will be recoded. Then in the output variable name text
box, enter a name for the new variable. Click the change button and the new
name appears after the arrow in the central list. Once this is done, click “old and
new values” and enter the recode criteria that will comprise the command syntax.
SPSS carries out the recode specifications in the order they are listed in the old to new
list.

TIP 14

Always specify all of the values even if you’re leaving them unchanged. To do so, select
all other values and then copy old values. Remember to click the add button after
entering each specification to move it into the old to new list; otherwise, it is
ignored.

Checking the Recode

The easiest method is to make a crosstabs table of the original variable with the new
variable containing recoded values.

Warning: After you’ve created a new variable with recode, go to the variable view in the
Data Editor and set the missing values for each newly created variable.
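
Here is a small Python sketch of recoding into a new variable and then checking the result with a table of old value by new value. The five-point codes and the mapping are hypothetical. The mapping lists every original value, including the one that is left unchanged, as TIP 14 recommends.

```python
from collections import Counter

# Hypothetical five-point codes (1 = like very much ... 5 = dislike very much).
original = [1, 2, 2, 3, 4, 5, 5, 5]

# Every old value is listed, including the one that stays the same.
recode_map = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
recoded = [recode_map[value] for value in original]

# Crosstab of old value by new value: each old value should pair with exactly one new value.
crosstab = Counter(zip(original, recoded))
```

If any old value shows up against more than one new value, or against a new value you did not intend, the recode specification needs another look.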

Describing Your Data

Examining Tables and Chart Counts

Frequency Tables

Rap Music

                                Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Like Very Much             41        2.7          2.9               2.9
          Like It                   145        9.7         10.1              13.0
          Mixed Feelings            266       17.7         18.6              31.6
          Dislike It                401       26.7         28.0              59.6
          Dislike Very Much         578       38.5         40.4             100.0
          Total                    1431       95.4        100.0
Missing   DK Much About It           58        3.9
          NA                         11         .7
          Total                      69        4.6
Total                              1500      100.0

Imagine that you were interested in analyzing respondents’ views regarding rap music.
You would run a frequency table like the one above to find a count of the level of like or
dislike of rap music reported by respondents. Each row of the table corresponds to one
of the recorded answers. Make sure that the counts presented appear to be
correct, including those for the missing data listing.

The 3rd-5th columns contain percentages. The 3rd column, labeled simply percent, is the
% of all cases in the data file with that value. 9.7% of respondents reported that they like
rap music. However, the 4th column, labeled valid percent, indicates that 10.1% of
respondents like rap music. Why the difference? The 4th column bases the % only on
people who actually responded to the question.

Warning: A large difference between the % and valid % columns can signal big
problems for your study. If the missing values result from people not being asked the
question because that’s the design of the study, you don’t have to worry. If people
weren’t asked because the interviewer decided not to ask them or if they refused to
answer, that’s a different matter.

The 5th column, labeled cumulative percent is the sum of the valid % for that row and all
of the rows before it. It’s useful only if the variable is measured at least on an ordinal
scale. For example, the cumulative % for “like” tells you that 13% of respondents either
reported that they like rap music or that they like it very much. The valid data value that
occurs most frequently is called the mode. For these data, “dislike very much” is the
modal category since 578 of the respondents reported that they disliked rap music very
much. The mode is not a particularly good summary measure, and if you report it, you
should always indicate the percentage of cases with that value. For variables measured
on a nominal scale, the mode is the only summary statistic that makes sense, but that
isn’t the case for this variable because there is a natural order to the responses [i.e.
ordinal variable].
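
To see where the three percentage columns come from, the following Python sketch recomputes them from the counts in the Rap Music frequency table. Only the counts are taken from the table; the variable names are illustrative. Percent divides by all cases, valid percent divides by the cases with a valid answer, and cumulative percent is a running total of the valid counts.

```python
# Counts copied from the Rap Music frequency table above.
valid_counts = {
    "Like Very Much": 41,
    "Like It": 145,
    "Mixed Feelings": 266,
    "Dislike It": 401,
    "Dislike Very Much": 578,
}
missing_count = 58 + 11  # the two missing-value rows (DK and NA)

valid_total = sum(valid_counts.values())
grand_total = valid_total + missing_count

rows = []
running = 0
for label, count in valid_counts.items():
    running += count
    rows.append({
        "label": label,
        "percent": round(100 * count / grand_total, 1),
        "valid_percent": round(100 * count / valid_total, 1),
        "cumulative_percent": round(100 * running / valid_total, 1),
    })

# The mode is the valid category with the largest count.
mode_label = max(valid_counts, key=valid_counts.get)
```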

Frequency Tables as Charts

You can display the numbers in a frequency table in a pie chart or a bar chart, although
prominent statisticians advise that one should “never use a pie chart.”

[Pie chart: Rap Music. Like Very Much 2.87%; Like It 10.13%; Mixed Feelings 18.59%;
Dislike It 28.02%; Dislike Very Much 40.39%]

Warning: If you create a pie chart by choosing Descriptive Statistics, then frequencies, a
slice for missing values is always included. Use graph, then select pie if you don’t want
to include a slice for missing values. This was the way that I obtained the pie chart
above.

[Bar chart: Rap Music, with Percent on the vertical axis (0.0% to 50.0%) and one bar each
for Like Very Much, Like It, Mixed Feelings, Dislike It, and Dislike Very Much]

Examining Tables and Chart Counts

Now you know how people as a group feel about rap music, but what about more
nuanced information about the kinds of people who hold these views? Are they male?
College educated? Racial and ethnic minorities? To find out, you need
to look at attitudes regarding rap music in conjunction with other variables. A
crosstabulation is a 2-way table of counts, here for attitudes toward rap music and
gender. Gender is the row variable since it defines the rows of the table, and attitudes
toward rap music is the column variable since it defines the columns. Each of the
unique combinations of the values of the 2 variables defines a cell of the table. The
numbers in the total row and column are called marginals because they are in the
margins of the table. They are frequency tables for the individual variables.

TIP 15

DON’T be alarmed if the marginals in the crosstabulation aren’t identical to the
frequency tables for the individual variables. Only cases with valid values for both
variables are in the crosstabulation, so if you have cases with missing values for one
variable but not the other, they will be excluded from the crosstabulation. Respondents
who tell you their gender but not their attitudes about rap music are included in the
frequency table for gender but not in the crosstabulation of the 2 variables.

The table below shows a crosstabulation that contains information solely on the number
of cases that meet both criteria, but not a % distribution.
Respondent's Sex * Rap Music Crosstabulation

Count
                                              Rap Music
                         Like Very   Like It   Mixed      Dislike It   Dislike     Total
                         Much                  Feelings                Very Much
Respondent's   Male         17          62        97         181          258        615
Sex            Female       24          83       169         220          320        816
Total                       41         145       266         401          578       1431

Percentages

The counts in the cells are the basic elements of the table, but
they are usually not the best choice for reporting findings because they cannot be easily
compared if there are different totals in the rows and columns of the table. For example,
if you know that 17 males and 24 females like rap music very much, you can conclude
little about the relationship between the 2 variables unless you also know the total number
of men and women in the sample.

For a crosstabulation, you can compute 3 different percentages:

• Row %: the cell count divided by the number of cases in the row times 100
• Column %: the cell count divided by the number of cases in the column times
100
• Total %: the cell count divided by the total number of cases in the table times
100

The 3 % convey different information, so be sure to choose the correct one for your
problem. If one of the 2 variables in your table can be considered an independent
variable and the other a dependent variable, make sure the % sum to 100 for each
category of the independent variable.

Respondent's Sex * Rap Music Crosstabulation

                                                                Rap Music
                                      Like Very    Like It      Mixed Dislike It    Dislike      Total
                                           Much              Feelings             Very Much
Male    Count                                17         62         97        181        258        615
        % within Respondent's Sex          2.8%      10.1%      15.8%      29.4%      42.0%     100.0%
        % within Rap Music                41.5%      42.8%      36.5%      45.1%      44.6%      43.0%
        % of Total                         1.2%       4.3%       6.8%      12.6%      18.0%      43.0%
Female  Count                                24         83        169        220        320        816
        % within Respondent's Sex          2.9%      10.2%      20.7%      27.0%      39.2%     100.0%
        % within Rap Music                58.5%      57.2%      63.5%      54.9%      55.4%      57.0%
        % of Total                         1.7%       5.8%      11.8%      15.4%      22.4%      57.0%
Total   Count                                41        145        266        401        578       1431
        % within Respondent's Sex          2.9%      10.1%      18.6%      28.0%      40.4%     100.0%
        % within Rap Music               100.0%     100.0%     100.0%     100.0%     100.0%     100.0%
        % of Total                         2.9%      10.1%      18.6%      28.0%      40.4%     100.0%

Since gender is the independent variable here, you want to calculate the row %
because they tell you what % of men and what % of women fall into each
of the attitudinal categories. These % aren’t affected by unequal numbers of males and
females in your sample. From the row % displayed above, you find that 2.8% of males
like rap music very much, as do 2.9% of females. So with regard to strong positive
feelings about rap music, there are no visible differences. Note: No
statistical differences are examined yet. From the column % displayed above, you find
that among those who like rap music very much, 41.5% are male and 58.5% are female.
This does not tell you that females are significantly more likely than males to report
liking rap music very much. Instead, it tells you only that, of the people who like rap
music very much, a larger share are women than men. Note: The column % depend on
the number of men and women in the sample as well as how they feel about rap music.
If men and women have identical attitudes but there are twice as many men as women
in the survey, the column % for men will be twice as large as the column % for
women. You can’t draw any conclusions based on only the column %.
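
The three kinds of percentages can be recomputed from the cell counts with a short Python sketch. The counts come from the Respondent's Sex * Rap Music crosstabulation; the function and variable names are illustrative only.

```python
# Counts copied from the Respondent's Sex * Rap Music crosstabulation.
categories = ["Like Very Much", "Like It", "Mixed Feelings", "Dislike It", "Dislike Very Much"]
counts = {
    "Male": [17, 62, 97, 181, 258],
    "Female": [24, 83, 169, 220, 320],
}

row_totals = {sex: sum(row) for sex, row in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(row_totals.values())

def percentages(sex, index):
    """Return (row %, column %, total %) for one cell, rounded to one decimal."""
    cell = counts[sex][index]
    return (
        round(100 * cell / row_totals[sex], 1),
        round(100 * cell / column_totals[index], 1),
        round(100 * cell / grand_total, 1),
    )

male_like_very_much = percentages("Male", 0)
female_like_very_much = percentages("Female", 0)
```

The row % for the two cells are nearly identical, while the column % differ mainly because there are more women than men in the table.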

TIP 16

If you use row %, compare the % within a column. If you use column %, compare the %
within a row.
Multiway Tables of Counts as Charts

You can plot the % in the table above by using a clustered bar chart like the one below.
For each attitudinal category regarding rap music, there are separate bars for men and
women since gender is the cluster variable. The values plotted are the % of all men and
the % of all women who gave each response. You can easily see that females are about
as likely as males to like rap music very much. Although the same information is
in the crosstabulation, it is easier to see in the bar chart.

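[Clustered bar chart: Rap Music by Respondent's Sex, with Percent on the vertical axis
(0.0% to 50.0%). Male: Like Very Much 2.76%, Like It 10.08%, Mixed Feelings 15.77%,
Dislike It 29.43%, Dislike Very Much 41.95%. Female: Like Very Much 2.94%, Like It 10.17%,
Mixed Feelings 20.71%, Dislike It 26.96%, Dislike Very Much 39.22%.]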

TIP 17

Always select % in the clustered bar chart dialog boxes; otherwise, you’ll have a difficult
time making comparisons within a cluster, since the height of the bars will depend on the
number of cases in each subgroup. For example, you won’t be able to tell if the bar for
men who always read newspapers is higher because men are more likely to read a
newspaper daily or because there are more men in the sample.

Control Variables
You can examine the relationship between gender and attitudes toward rap music
separately for each category of another variable, such as education [i.e., the control
variable]. See the crosstabulation model below to show you how the information would
look when entered into the crosstabulation dialog box.

Respondent's Sex * Rap Music * RS Highest Degree Crosstabulation

                                                                                Rap Music
RS Highest      Respondent's                          Like Very    Like It      Mixed Dislike It    Dislike      Total
Degree          Sex                                        Much              Feelings             Very Much
Less than HS    Male    Count                                 5         11         14         30         55        115
                        % within Respondent's Sex          4.3%       9.6%      12.2%      26.1%      47.8%     100.0%
                Female  Count                                10         18         19         35         59        141
                        % within Respondent's Sex          7.1%      12.8%      13.5%      24.8%      41.8%     100.0%
                Total   Count                                15         29         33         65        114        256
                        % within Respondent's Sex          5.9%      11.3%      12.9%      25.4%      44.5%     100.0%
High school     Male    Count                                 9         36         50         87        110        292
                        % within Respondent's Sex          3.1%      12.3%      17.1%      29.8%      37.7%     100.0%
                Female  Count                                11         45         95        134        175        460
                        % within Respondent's Sex          2.4%       9.8%      20.7%      29.1%      38.0%     100.0%
                Total   Count                                20         81        145        221        285        752
                        % within Respondent's Sex          2.7%      10.8%      19.3%      29.4%      37.9%     100.0%
Junior college  Male    Count                                 1          4          4         13         14         36
                        % within Respondent's Sex          2.8%      11.1%      11.1%      36.1%      38.9%     100.0%
                Female  Count                                 1          3         13         15         18         50
                        % within Respondent's Sex          2.0%       6.0%      26.0%      30.0%      36.0%     100.0%
                Total   Count                                 2          7         17         28         32         86
                        % within Respondent's Sex          2.3%       8.1%      19.8%      32.6%      37.2%     100.0%
Bachelor        Male    Count                                 2          8         22         32         41        105
                        % within Respondent's Sex          1.9%       7.6%      21.0%      30.5%      39.0%     100.0%
                Female  Count                                 2         11         30         27         52        122
                        % within Respondent's Sex          1.6%       9.0%      24.6%      22.1%      42.6%     100.0%
                Total   Count                                 4         19         52         59         93        227
                        % within Respondent's Sex          1.8%       8.4%      22.9%      26.0%      41.0%     100.0%
Graduate        Male    Count                                            3          7         19         38         67
                        % within Respondent's Sex                     4.5%      10.4%      28.4%      56.7%     100.0%
                Female  Count                                            5         12          9         16         42
                        % within Respondent's Sex                    11.9%      28.6%      21.4%      38.1%     100.0%
                Total   Count                                            8         19         28         54        109
                        % within Respondent's Sex                     7.3%      17.4%      25.7%      49.5%     100.0%

You see that the largest difference in strong dislike of rap music between men and
women occurs among those with a graduate degree: 56.7% of males strongly dislike
rap compared to 38.1% of females. The % are almost equal for those with a high school
education.
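
As a check on this reading of the table, the Python sketch below recomputes the male and female % who strongly dislike rap music within each degree category and finds where the gap is largest and smallest. The counts are taken from the three-way crosstabulation; the names and the use of an absolute gap are choices made for this illustration.

```python
# (strongly dislike count, subgroup total) copied from the three-way table.
strong_dislike = {
    "Less than HS": {"Male": (55, 115), "Female": (59, 141)},
    "High school": {"Male": (110, 292), "Female": (175, 460)},
    "Junior college": {"Male": (14, 36), "Female": (18, 50)},
    "Bachelor": {"Male": (41, 105), "Female": (52, 122)},
    "Graduate": {"Male": (38, 67), "Female": (16, 42)},
}

def percent(count, total):
    """Row percentage for one subgroup."""
    return 100 * count / total

# Absolute male-female gap, in percentage points, within each degree category.
gaps = {
    degree: abs(percent(*groups["Male"]) - percent(*groups["Female"]))
    for degree, groups in strong_dislike.items()
}

largest_gap = max(gaps, key=gaps.get)
smallest_gap = min(gaps, key=gaps.get)
```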

As the number of variables in a crosstabulation increases, it becomes unwieldy to plot all
of the categories of a variable. Instead you can restrict your attention to a particular
response.

T-Tests

When using these statistical tests, you are testing the null hypothesis that 2 population
means are equal. The alternative hypothesis is that they are not equal. There are 3
different ways to go about this, depending on how the data were obtained.

Deciding Which T-test to Use

Neither the one-sample t test nor the paired samples t test requires any assumption
about the population variances, but the 2-sample t test does.

TIP 18

When reporting the results of a t test, make sure to include the actual means,
differences, and standard errors. Don’t give just a t value and the observed significance
level.

One-sample T test

If you have a single sample of data and want to know whether it might be from a
population with a known mean, you have what’s termed a one-sample design, which can
be analyzed with a one-sample t test.

• Examples
- You want to know whether CEOs have the same average score on a personality
inventory as the population on which it was normed. You administer the test to a
random sample of CEOs. The population value is assumed to be known in
advance. You don’t estimate it from your data.
- You’re suspicious of the claim that the normal body temperature is 98.6 degrees.
You want to test the null hypothesis that the average body temperature for
human adults is the long assumed value of 98.6, against the alternative
hypothesis that it is not. The value 98.6 isn’t estimated from the data; it is a
known constant. You take a single random sample of 1,000 adult men and
women and obtain their temperatures.
- You think that 40 hours no longer defines the traditional work week. You want to
test the null hypothesis that the average work week is 40 hours, against the
alternative that it isn’t. You ask a random sample of 500 full-time employees how
many hours they worked last week.
- You want to know whether the average IQ score for children diagnosed with
schizophrenia differs from 100, the average for the population of all children.
You administer an IQ test to a random sample of 700 schizophrenic children.
Your null hypothesis is that the population value for the average IQ score for
schizophrenic children is 100, and the alternative hypothesis is that it isn’t.
 Data Arrangement
 For the one-sample t test, you have one variable that contains the values for
each case. For example:
 A manufacturer of high-performance automobiles produces disc brakes that must
measure 322 millimeters in diameter. Quality control randomly draws 16 discs
made by each of eight production machines and measures their diameters. This
example uses the file brakes.sav. Use One Sample T Test to determine whether
or not the mean diameters of the brakes in each sample significantly differ from
322 millimeters. A nominal variable, Machine Number, identifies the production
machine used to make the disc brake. Because the data from each machine
must be tested as a separate sample, the file must first be split into groups by
Machine Number.

Open the split file dialog box (Data, then Split File) and select compare groups. Select
machine number from the variable listing and move it into the box for “groups based on.”
Because the file isn’t already sorted, be sure that “sort the file by grouping variables”
is selected.

Next select one-sample T test from the analyze tab.

 Select analyze, then compare means, and then one-sample T test.

Select the test variable, i.e. disc brake diameter (mm), type 322 as the test value,
and click options.

In the options dialog box for the one-sample T test, type 90 in the confidence interval %
box and make sure that missing values are set to “exclude cases analysis by analysis.”
Click continue, click paste so that the syntax is entered in the syntax viewer, and then
select ok.

 Note: A 95% confidence interval is generally used, but the examples below
reflect a 90% confidence interval.

The Descriptives table displays the sample size, mean, standard deviation, and standard
error for each of the eight samples. The sample means disperse around the 322mm
standard by what appears to be a small amount of variation.

The test statistic table shows the results of the one-sample T test.

The t column displays the observed t statistic for each sample, calculated as the mean
difference divided by the standard error of the sample mean.

The df column displays degrees of freedom. In this case, this equals the number of cases
in each group minus 1.

The column labeled Sig. (2-tailed) displays a probability from the t distribution with 15
degrees of freedom. The value listed is the probability of obtaining a t statistic whose
absolute value is greater than or equal to that of the observed t statistic, if the difference
between the sample mean and the test value is purely random.

The Mean Difference is obtained by subtracting the test value (322 in this example) from
each sample mean.

The 90% Confidence Interval of the Difference provides an estimate of the boundaries
between which the true mean difference lies in 90% of all possible random samples of 16
disc brakes produced by this machine.

Since their confidence intervals lie entirely above 0.0, you can safely say that machines 2,
5 and 7 are producing discs that are significantly wider than 322mm on the average.

Similarly, because its confidence interval lies entirely below 0.0, machine 4 is producing
discs that are not wide enough.
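
The columns described above can also be reproduced by hand. The Python sketch below is only an illustration and is not SPSS output: the 16 diameters are hypothetical values made up for the example (they are not taken from brakes.sav), and the critical value 1.753 is the tabled t value for a 90% two-sided interval with 15 degrees of freedom.

```python
import math
import statistics

def one_sample_t(values, test_value, t_crit):
    """Return the quantities reported for a one-sample t test.

    t_crit is the two-sided critical t value for the chosen confidence
    level and len(values) - 1 degrees of freedom (looked up in a t table).
    """
    n = len(values)
    mean_diff = statistics.mean(values) - test_value      # Mean Difference
    std_error = statistics.stdev(values) / math.sqrt(n)   # Std. Error Mean
    t_stat = mean_diff / std_error                         # t
    df = n - 1                                             # df
    lower = mean_diff - t_crit * std_error                 # lower CI bound
    upper = mean_diff + t_crit * std_error                 # upper CI bound
    return {"mean_diff": mean_diff, "std_error": std_error, "t": t_stat,
            "df": df, "lower": lower, "upper": upper}

# Hypothetical diameters (mm) for one machine; NOT the brakes.sav data.
diameters = [322.01, 322.03, 321.99, 322.02, 322.00, 322.04, 322.01, 322.02,
             321.98, 322.03, 322.01, 322.02, 322.00, 322.03, 322.01, 322.02]

result = one_sample_t(diameters, test_value=322, t_crit=1.753)

# The machine differs from 322 mm at the 10% level when the 90% interval
# for the mean difference lies entirely above or entirely below 0.
differs = result["lower"] > 0 or result["upper"] < 0
print(result, differs)
```

Running the same function once per machine mirrors what the split file setting does: each machine's 16 discs are treated as a separate sample.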

The one-sample t test can be used whenever sample means must be compared to a
known test value. As with all t tests, the one-sample t test assumes that the data are
reasonably normally distributed, especially with respect to skewness. Extreme or
outlying values should be carefully checked; boxplots are very handy for this.

Paired-Samples T test

You use a paired-samples (also known as matched-cases) T test if you want to test
whether 2 population means are equal, and you have 2 measurements from pairs of
people or objects that are similar in some important way. For example, you’ve observed
the same person before and after treatment, or you have personality measures for each
CEO and their non-CEO sibling. Each “case” in this data file represents a pair of
observations.

 Examples
 You are interested in determining whether self-reported weights and actual
weights differ. You ask a random sample of 200 people how much they weigh
and then you weigh them on a scale. You want to compare the means of the 2
related sets of weights.
 You want to test the null hypothesis that husbands and wives have the same
average years of education. You take a random sample of married couples and
compare their average years of education.
 You want to compare 2 methods for teaching reading. You take a random
sample of 50 pairs of twins and assign each member of a pair to one of the 2
methods. You compare average reading scores after completion of the program.
 Data Arrangement
 In a paired-samples design, both members of a pair must be on the same data
record. Different variable names are used to distinguish the 2 members of a pair.
For example:
 A physician is evaluating a new diet for her patients with a family history of heart
disease. To test the effectiveness of this diet, 16 patients are placed on the diet
for 6 months. Their weights and triglyceride levels are measured before and after
the study, and the physician wants to know if either set of measurements has
changed.
 This example uses the file dietstudy.sav . Use Paired-Samples T Test to
determine whether there is a statistically significant difference between the pre-
and post-diet weights and triglyceride levels of these patients.

o Select Analyze, then compare means, then paired-samples T test

Select Triglyceride and Final Triglyceride as the first set of paired variables.

 Select Weight and final weight as the second pair and click ok.

The Descriptives table displays the mean, sample size, standard deviation, and standard
error for both groups. The information is presented in pairs: pair 1 comes first in the
table and pair 2 comes second.

Across all 16 subjects, triglyceride levels dropped between 14 and 15 points on average
after 6 months of the new diet.

The subjects clearly lost weight over the course of the study; on average, about 8 pounds.

The standard deviations for pre- and post-diet measurements reveal that subjects were
more variable with respect to weight than to triglyceride levels.

At -0.286, the correlation between the baseline and six-month triglyceride levels is not
statistically significant. Levels were lower overall, but the change was inconsistent across
subjects. Several lowered their levels, but several others either did not change or
increased their levels.

On the other hand, the Pearson correlation between the baseline and six-month weight
measurements is 0.996, almost a perfect correlation. Unlike the triglyceride levels, all
subjects lost weight and did so quite consistently.

The Mean column in the paired-samples t test table displays the average difference
between triglyceride and weight measurements before the diet and six months into the
diet.

The Std. Deviation column displays the standard deviation of the difference scores.

The Std. Error Mean column provides an index of the variability one can expect in
repeated random samples of 16 patients similar to the ones in this study.

The 95% Confidence Interval of the Difference provides an estimate of the boundaries
between which the true mean difference lies in 95% of all possible random samples of 16
patients similar to the ones participating in this study.

The t statistic is obtained by dividing the mean difference by its standard error.

The Sig. (2-tailed) column displays the probability of obtaining a t statistic whose absolute
value is equal to or greater than the obtained t statistic.

Since the significance value for change in weight is less than 0.05, you can conclude that
the average loss of 8.06 pounds per patient is not due to chance variation, and can be
attributed to the diet.

However, the significance value greater than 0.10 for change in triglyceride level shows
the diet did not significantly reduce their triglyceride levels.

 Warning: When you click the first variable of a pair, it doesn’t move to the list
box; instead, it moves to the lower left box labeled Current Selections. Only
when you click a second variable and move it into Current Selections can you
move the pair into the Paired Variable list.
Two-Independent-Samples T test

If you have 2 independent groups of subjects, such as CEOs and non-CEOs, men and
women, or people who received a treatment and people who didn’t, and you want to test
whether they come from populations with the same mean for the variable of interest, you
have a 2-independent samples design. In an independent-samples design, there is no
relationship between people or objects in the 2 groups. The T test you use is called an
independent-samples T test.

 Examples
 You want to test the null hypothesis that, in the U.S. population, the average
hours spent watching TV per day is the same for males and females.
 You want to compare 2 teaching methods. One group of students is taught by
one method, while the other group is taught by the other method. At the end of
the course, you want to test the null hypothesis that the population values for the
average scores are equal.
 You want to test the null hypothesis that people who report their incomes in a
survey have the same average years of education as people who refuse.
 Data Arrangement
 If you have 2 independent groups of subjects, e.g., boys and girls, and want to
compare their scores, your data file must contain two variables for each child:
one that identifies whether a case is a boy or a girl, and one with the score. The
same variable name is used for the scores for all cases. To run the 2
independent samples T test, you have to tell SPSS which variable defines the
groups. That’s the variable Gender, which is moved into the Grouping Variable
box. Notice the 2 question marks after a variable name. They will disappear
after you use the Define Groups dialog box to tell SPSS which values of the
variable should be used to form the 2 groups.

TIP 18

Right-click the variable name in the Grouping Variable box and select variable
information from the pop-up menu. Now you can check the codes and value labels that
you’ve defined for that variable.

Warning: In the define groups dialog box, you must enter the actual values that you
entered into the data editor, not the value labels. If you used the codes of 1 for male and
2 for female and assigned them value labels of m and f, then you enter the values 1 and
2, not the labels m and f, into the define groups dialog box.

An analyst at a department store wants to evaluate a recent credit card promotion. To
this end, 500 cardholders were randomly selected. Half received an ad promoting a
reduced interest rate on purchases made over the next three months, and half received
a standard seasonal ad.

 Select Analyze, then compare means, then independent samples T test

Select money spent during the promotional period as the test variable. Select type of
mail insert received as the grouping variable. Then click define groups.

Type 0 as the group 1 value and 1 as the group 2 value under define groups. By
default, the program should have “use specified values” selected. Then click
continue and ok.

The Descriptives table displays the sample size, mean, standard deviation, and standard
error for both groups. On average, customers who received the interest-rate promotion
charged about $70 more than the comparison group, and they vary a little more around
their average.

The procedure produces two tests of the difference between the two groups. One test
assumes that the variances of the two groups are equal. The Levene statistic tests this
assumption.

In this example, the significance value of the statistic is 0.276. Because this value is
greater than 0.10, you can assume that the groups have equal variances and ignore the
second test. Using the pivoting trays, you can change the default layout of the table so
that only the "equal variances" test is displayed.

Activate the pivot table. Then under pivot, select pivoting trays.

Drag assumptions from the row to the layer and close the pivoting trays window.

With the test table pivoted so that assumptions are in the layer, the Equal variances
assumed panel is displayed.

The df column displays degrees of freedom. For the independent samples t test, this
equals the total number of cases in both samples minus 2.

The column labeled Sig. (2-tailed) displays a probability from the t distribution with 498
degrees of freedom. The value listed is the probability of obtaining a t statistic whose
absolute value is greater than or equal to that of the observed t statistic, if the difference
between the sample means is purely random.

The Mean Difference is obtained by subtracting the sample mean for group 2 (the New
Promotion group) from the sample mean for group 1.

The 95% Confidence Interval of the Difference provides an estimate of the boundaries
between which the true mean difference lies in 95% of all possible random samples of
500 cardholders.

Since the significance value of the test is less than 0.05, you can safely conclude that the
average of 71.11 dollars more spent by cardholders receiving the reduced interest rate is
not due to chance alone. The store will now consider extending the offer to all credit
customers.

Churn propensity scores are applied to accounts at a cellular phone company. Scores
range from 0 to 100; an account scoring 50 or above may be looking to change providers.
A manager with 50 customers above the threshold randomly samples 200 below it,
wanting to compare them on average minutes used per month.

 Select analyze, then compare means, then independent samples T test


 Select average monthly minutes as the test variable and propensity to
leave as the group variable. Then select define groups.
 Select cut point and type 50 as the cut point value. Then click continue and
ok.

The Descriptives table shows that customers with propensity scores of 50 or more are
using their cell phones about 78 minutes more per month on the average than
customers with scores below 50.
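
The effect of the cut point can be pictured with a short Python sketch. It is an illustration only: the six accounts are hypothetical, and the rule it encodes (scores at or above the cut point form one group, scores below it form the other) is assumed from the way the two groups are described above.

```python
import statistics

def split_by_cut_point(records, cut_point):
    """Split (score, minutes) records into two groups at cut_point.

    Scores greater than or equal to the cut point form one group and
    scores below it form the other.
    """
    at_or_above = [minutes for score, minutes in records if score >= cut_point]
    below = [minutes for score, minutes in records if score < cut_point]
    return at_or_above, below

# Hypothetical (propensity score, average monthly minutes) pairs.
accounts = [(72, 410), (55, 380), (50, 395), (49, 300), (20, 310), (35, 320)]

high, low = split_by_cut_point(accounts, cut_point=50)

# Difference between the two group means, as in the Descriptives table.
print(statistics.mean(high) - statistics.mean(low))
```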

The significance value of the Levene statistic is greater than 0.10, so you can assume
that the groups have equal variances and ignore the second test. Using the pivoting
trays, change the default layout of the table so that only the "equal variances" test is
displayed. Play around with the pivot tray link if you wish.

The t statistic provides strong evidence of a difference in monthly minutes between
accounts more and less likely to change cellular providers.

Analyzing Truancy Data: The Example

To perform this analysis in order to test your skills using a T test, please see the spss file
on the course blackboard page.

One-Sample T test

Consider whether the observed truancy rate before intervention [the % of school days
missed because of truancy] differs from an assumed nationwide truancy rate of 8%. You
have one sample of data [students enrolled in the truancy reduction program, TRP]
and you want to compare the results to a fixed population value specified in advance.

The null hypothesis is that the sample comes from a population with an average truancy
rate of 8%. [Another way of stating the null hypothesis is that the difference in the
population means between your population and the nation as a whole is 0.] The
alternative hypothesis is that your sample doesn’t come from a population with a truancy
rate of 8%.

To obtain the table below, you would do one of the following: Go to Analyze, choose
descriptive statistics, then descriptives, select the variable to be examined, in this
case prepct, then go to options in the descriptives dialog box and select mean,
minimum, maximum, and standard deviation, then select continue and OK. You
can also choose frequencies under the descriptive statistics link, select the
variable to be examined, go to statistics and pick the same statistics as above,
select continue, and then OK.

Descriptive Statistics

                                              N     Minimum   Maximum   Mean      Std. Deviation
prepct Percent truant days pre intervention   299   .00       72.08     14.2038   13.07160
Valid N (listwise)                            299

From the table above, you see that, for the 299 students in this sample, the average
truancy rate is 14.2%. You know that even if the sample is selected from a population in
which the true rate is 8%, you don’t expect your sample to have an observed rate of
exactly 8%. Samples from the population vary. What you want to determine is whether
it’s plausible for a sample of 299 students to have an observed truancy rate of 14.2% if
the population value is 8%.

TIP 19

Before you embark on actually computing a one-sample T test, make certain checks.
Look at the histogram of the truancy rates to make sure that all of the values make
sense. Are there percentages smaller than 0 or greater than 100? Are there values that
are really far from the rest? If so, make sure they’re not the result of errors. If you have
a small number of cases, outliers can have a large effect on the mean and the standard
deviation.

Checking the Assumptions

To use the one-sample T test, you have to make certain assumptions about the data:

 The observations must be independent of each other. In this data file, students
came from 17 schools, so it’s possible that students in the same school may be
more similar than students in different schools. If that’s the case, the estimated
significance level may be smaller than it should be, since you don’t have as much
information as the sample size indicates. [If you have 10 students from 10
different schools, that’s more information than having 10 students from the same
school because it’s plausible that students in the same school are more similar
than students from different schools.] Independence is one of the most important
assumptions that you have to make when analyzing data.
 In the population, the distribution of the variable must be normal, or the sample
size must be large enough so that it doesn’t matter. The assumption of normally
distributed data is required for many statistical tests. The importance of the
assumption differs, depending on the statistical test. In the case of a one-sample
T test, the following guidelines are suggested: If the number of cases is < 15, the
data should be approximately normally distributed; if the number of cases is
between 15 and 40, the data should not have outliers or be very skewed; for
samples of 40 or more, even markedly skewed distributions are acceptable.
Because you have close to 300 observations, there’s little need to worry about
the assumption of normality.

TIP 20

If you have reason to believe that the assumptions required for the T test are violated in
an important way, you can analyze the data using a nonparametric test.

Testing the Hypothesis

Compute the difference between the observed sample mean and the hypothesized
population value. [14.2%-8% = 6.2%]

Compute the standard error of the difference. This is a measure of how much you
expect sample means, based on the same number of cases from the same population,
to vary. The hypothetical population value is a constant and doesn’t contribute to the
variability of the differences, so the standard error of the difference is just the standard
error of the mean. Based on the standard deviation in the table above, the standard
error equals:

 SE = std. deviation / SQRT(sample size) = 13.07 / SQRT(299) = .756

[Note: You should be able to obtain this value using the frequencies command and
selecting standard error mean under statistics. This is a way for you to double
check if you are unsure of your calculations. See the table below.]

Statistics

prepct Percent truant days pre intervention

N                    Valid      299
                     Missing    0
Mean                            14.2038
Std. Error of Mean              .75595
Std. Deviation                  13.07160

You can calculate the t statistic by hand if you divide the observed difference by the
standard error of the difference.

 t = (Observed Mean [prepct] - Predicted Mean) / Std. Error of the Mean
= (14.204 - 8) / 0.756 = 8.21
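
The hand calculation above can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic; the variable names are made up for the illustration, and the numbers are the summary statistics reported for prepct.

```python
import math

# Summary statistics reported for prepct in the tables above.
n = 299
sample_mean = 14.2038
std_deviation = 13.07160
test_value = 8.0  # assumed nationwide truancy rate (%)

std_error = std_deviation / math.sqrt(n)   # standard error of the mean
difference = sample_mean - test_value      # observed minus hypothesized value
t_stat = difference / std_error            # one-sample t statistic
df = n - 1                                 # degrees of freedom

print(f"SE = {std_error:.5f}, difference = {difference:.4f}, "
      f"t = {t_stat:.3f}, df = {df}")
```

Carrying more decimal places than the hand calculation gives a t of about 8.207, which rounds to the 8.21 obtained above.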

You can also conduct a one-sample T test using SPSS by going to analyze, compare
means, one-sample T test, selecting the relevant variable, [i.e. prepct] and
entering it into the test variable box and entering the number 8 in the test value
box at the bottom of the dialog box and running the analysis. You will get the
following output as shown below.

T-TEST
  /TESTVAL = 8
  /MISSING = ANALYSIS
  /VARIABLES = prepct
  /CRITERIA = CI(.95) .

T-Test

One-Sample Statistics

                                              N     Mean      Std. Deviation   Std. Error Mean
prepct Percent truant days pre intervention   299   14.2038   13.07160         .75595

One-Sample Test

prepct Percent truant days pre intervention
Test Value = 8
                                                  95% Confidence Interval
                                                  of the Difference
t       df    Sig. (2-tailed)   Mean Difference   Lower     Upper
8.207   298   .000              6.20378           4.7161    7.6915

Use the T distribution to determine if the observed t statistic is unlikely if the null
hypothesis is true. To calculate the observed significance level for a T statistic, you
have to take into account both how large the actual T value is and how many degrees of
freedom it has. For a one-sample T test, the degrees of freedom [dof] is one fewer than
the number of cases. From the table above, you see that the observed significance level
is < .0001. Your observed results are very unlikely if the true rate is 8%, so you reject
the null hypothesis. Your sample probably comes from a population with a mean larger
than 8%.

TIP 21

To obtain observed significance levels for an alternative hypothesis that specifies
direction, often known as a one-sided or one-tailed test, divide the observed two-tailed
significance level by two, provided the observed difference is in the direction the
alternative hypothesis specifies. Be very cautious about using one-sided tests.
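
As a small illustration of this tip, the Python sketch below converts a two-tailed significance level into a one-tailed one. The function name and the example value 0.04 are hypothetical. Halving applies only when the observed difference points in the direction named by the alternative hypothesis; when it points the other way, the one-tailed level is large, because the t distribution is symmetric.

```python
def one_tailed_p(two_tailed_p, direction_matches):
    """Convert a two-tailed significance level to a one-tailed one.

    direction_matches is True when the observed difference points in the
    direction named by the alternative hypothesis.
    """
    if direction_matches:
        return two_tailed_p / 2
    # The data point the other way, so the one-tailed level is large.
    return 1 - two_tailed_p / 2

print(one_tailed_p(0.04, True))
print(one_tailed_p(0.04, False))
```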

Examining the Confidence Interval

If you look at the 95% Confidence Interval for the population difference, you see that it
ranges from 4.7% to 7.7%. You don’t know whether the true population difference is in
this particular interval, but you know that 95% of the time, 95% confidence intervals
include the true population values. Note that the value of 0 is not included in the
confidence interval. If your observed significance level had been larger than 0.05, 0
would have been included in the 95% confidence interval.
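
The limits of the interval can be rebuilt from the mean difference and its standard error. The Python sketch below is an illustration under one stated assumption: the critical value 1.968 is the approximate two-sided 95% t value for 298 degrees of freedom, taken from a t table rather than computed.

```python
# Values reported in the One-Sample Test output.
mean_difference = 6.20378
std_error = 0.75595

# Two-sided 95% critical t value for 298 degrees of freedom, read from a
# t table (approximately 1.968); this value is an assumption of the sketch.
t_crit = 1.968

margin = t_crit * std_error
lower = mean_difference - margin
upper = mean_difference + margin

# The null hypothesis of no difference is rejected at the 5% level
# exactly when 0 falls outside the 95% interval.
contains_zero = lower <= 0 <= upper

print(f"95% CI: {lower:.2f} to {upper:.2f}; contains 0: {contains_zero}")
```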

TIP 22

There is a close relationship between hypothesis testing and confidence intervals. You
can reject the null hypothesis that your sample comes from a population with any value
outside of the 95% confidence interval. The observed significance level for the
hypothesis test will be less than 0.05.

Paired-Samples T test

You’ve seen that your students have a higher truancy rate than the country as a whole.
Now the question is whether there is a statistically significant difference in the truancy
rates before and after the truancy reduction programs. For each student, you have 2
values for unexcused absences. One is for the year before the student enrolled in the
program; the other is for the year in which the student was enrolled in the program.
Since there are two measurements for each subject, a before and an after, you want to
use a paired-samples T test to test the null hypothesis that the average before and after
rates are equal in the population.

TIP 23

The reason for doing a paired-samples design is to make the 2 groups as comparable as
possible on characteristics other than the one being studied. By studying the same
students before and after intervention, you control for differences in gender,
socioeconomic status, family supervision, and so on. Unless you have pairs of
observations that are quite similar to each other, pairing has little effect and may, in fact,
hurt your chances of rejecting the null hypothesis when it is false.

Before running the paired-samples T test procedure, look at the histogram of the
differences shown. You should see that the shape of the distribution is symmetrical [i.e.
not too far from normal]. Many of the cases cluster around 0, indicating that the
difference in the before and after scores is small for these students.

Checking the Assumptions

The same assumptions about the distributions of the data are required for this test as
those in the one-sample T test. The observations should be independent; if the sample
size is small, the distribution of differences should be approximately normal. Note that
the assumptions are about the differences, not the original observations. That’s
because a paired-samples T test is nothing more than a one-sample T test on the
differences. If you calculate the differences between the pre- and post-values and use
the one-sample T test with a population value of 0, you’ll get exactly the same statistic
as using the paired-samples T test.
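
The equivalence can be checked on a small made-up data set. In the Python sketch below, the five pre and post values are hypothetical and the function names are chosen only for the illustration. One function runs a one-sample t test on the differences; the other computes the paired statistic directly from the two columns, using the fact that the variance of the differences equals the sum of the two variances minus twice their covariance.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """t statistic for testing whether the mean of values equals test_value."""
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return (statistics.mean(values) - test_value) / std_error

def paired_t(first, second):
    """Paired-samples t statistic computed directly from the two columns."""
    n = len(first)
    mean_1, mean_2 = statistics.mean(first), statistics.mean(second)
    # Sample covariance between the two members of each pair.
    cov = sum((a - mean_1) * (b - mean_2)
              for a, b in zip(first, second)) / (n - 1)
    # Variance of the differences: var(first) + var(second) - 2 * cov.
    var_diff = (statistics.variance(first) + statistics.variance(second)
                - 2 * cov)
    return (mean_1 - mean_2) / math.sqrt(var_diff / n)

# Hypothetical truancy percentages for five students (illustration only).
pre = [10, 12, 9, 14, 11]
post = [8, 11, 9, 10, 9]

differences = [after - before for before, after in zip(pre, post)]
t_from_differences = one_sample_t(differences, 0)
t_from_pairs = paired_t(post, pre)
print(t_from_differences, t_from_pairs)
```

Both routes give the same t statistic, which is why the assumptions apply to the differences rather than to the original observations.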

Testing the Hypothesis

From the table below, you see that the average truancy rate before intervention is 14.2%
and the average truancy rate after intervention is 11.4%. That’s a difference of about
2.8%. To get the table below, you should go to descriptives and select the prepct
and postpct variables and enter them into the variable list, be sure that the right
statistics are checked off [e.g. standard deviation], and then click OK.

Paired Samples Statistics

                                                         Mean      N     Std. Deviation   Std. Error Mean
Pair 1   postpct Percent truant days post intervention   11.4378   299   11.18297         .64673
         prepct Percent truant days pre intervention     14.2038   299   13.07160         .75595

To see how often you would expect to see a difference of at least 2.8% when the null
hypothesis of no difference is true, look at the paired-samples T test table below.

To obtain the table below, do the following: go to analyze, then select compare
means, then select paired-samples T test and choose the 2 variables that make up
the pair, i.e., prepct and postpct, then select OK.

Paired Samples Test

Pair 1: postpct Percent truant days post intervention - prepct Percent truant days pre intervention

Paired Differences
                                              95% Confidence Interval
                                              of the Difference
Mean       Std. Deviation   Std. Error Mean   Lower      Upper      t        df    Sig. (2-tailed)
-2.76602   12.69355         .73409            -4.21067   -1.32137   -3.768   298   .000

The T statistic, 3.8 in absolute value, is computed by dividing the average difference
[2.77%] by the standard error of the mean difference [0.73]. The sign in the table is
negative because the difference is taken as postpct minus prepct. The degrees of
freedom is the number of pairs minus one. The observed significance level is < .001, so
you can reject the null hypothesis that the pre-intervention and post-intervention truancy
rates are equal in the population. Intervention appears to have reduced the truancy rate.

Warning: The conclusions you can draw about the effectiveness of truancy reduction
programs from a study like this are limited. Even if you restrict your conclusions to the
schools from which these children are a sample, there are many problems. Since you
are looking at differences in truancy rates between adjacent years, you aren’t controlling
for possible increases or decreases in truancy that occur as children grow older. For
example, if truancy increases with age, the effect of the truancy reduction program may
be larger than it appears. There is also potential bias in the determination of what is
considered an “excused” absence.

The 95% confidence interval for the population change is from 1.3% to 4.2%. It appears
that if the program has an effect, it is not a very large one. On average, assuming a
180-day school year, students in the truancy reduction program attended school five
more days after the program than before. The 95% confidence interval for the number
of days “saved” is from 2.3 days to 7.6 days.
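
The conversion from percentages to school days is simple arithmetic, sketched below in Python. The function name is hypothetical; the 180-day school year and the three percentages are the ones used in the text.

```python
SCHOOL_DAYS = 180  # school-year length assumed in the text

def percent_to_days(percent, school_days=SCHOOL_DAYS):
    """Convert a truancy percentage into a number of school days."""
    return percent / 100 * school_days

mean_days = percent_to_days(2.77)    # average reduction
lower_days = percent_to_days(1.3)    # lower limit of the 95% interval
upper_days = percent_to_days(4.2)    # upper limit of the 95% interval

print(f"{mean_days:.1f} days (95% CI {lower_days:.1f} to {upper_days:.1f})")
```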

A paired-samples design is effective only if you have pairs of similar cases. If your
pairing does not result in a positive correlation coefficient of close to 0.5 between the
2 measurements, you may lose power [your computer stays on, but your ability to reject
the null hypothesis when it is false fizzles] by analyzing the data as a paired-samples
design. The correlation table below shows that the correlation coefficient between the
pre- and post-intervention rates is close to 0.5 [0.461], so pairing was probably
effective.

Paired Samples Correlations

                                                         N     Correlation   Sig.
Pair 1   postpct Percent truant days post intervention
         & prepct Percent truant days pre intervention   299   .461          .000
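
To see why the correlation matters, the Python sketch below rebuilds the standard error of the mean difference from the two standard deviations and the correlation reported above, and compares it with the value you would get if the two measurements were uncorrelated. It is an illustration only; the function name is made up, and small discrepancies from the SPSS output come from using the rounded correlation .461.

```python
import math

# Values reported in the Paired Samples Statistics and Correlations tables.
n = 299
sd_post = 11.18297
sd_pre = 13.07160
r = 0.461

def se_of_mean_difference(sd_1, sd_2, correlation, n_pairs):
    """Standard error of the mean paired difference."""
    # Variance of a difference: the two variances minus twice the covariance.
    var_diff = sd_1**2 + sd_2**2 - 2 * correlation * sd_1 * sd_2
    return math.sqrt(var_diff / n_pairs)

se_observed = se_of_mean_difference(sd_post, sd_pre, r, n)
se_if_uncorrelated = se_of_mean_difference(sd_post, sd_pre, 0.0, n)
print(f"{se_observed:.3f} {se_if_uncorrelated:.3f}")
```

The positive correlation shrinks the standard error from roughly 0.99 to roughly 0.73, which makes the same average difference easier to detect.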

Warning: Although well-intentioned, paired designs often run into trouble. If you give a
subject the same test before and after an intervention, the practice effect, instead of the
intervention, may be responsible for any observed change. You must also make sure
that there is no carryover effect; that is, the effect of one intervention must be completely
gone before you impose another.

Two-Independent Samples T test

You’ve seen that intervention seems to have had a small, although statistically
significant, effect. One of the questions that remains is whether the effect is similar for
boys and girls. Is the average truancy rate the same for boys and girls prior to
intervention? Is the average truancy rate the same for boys and girls after intervention?
Is the change in truancy rates before and after intervention the same for boys and girls?

Group Statistics

                                                gender Gender   N     Mean      Std. Deviation   Std. Error Mean
prepct Percent truant days pre intervention     f Female        152   13.0998   12.25336         .99388
                                                m Male          147   15.3453   13.81620         1.13954
postpct Percent truant days post intervention   f Female        152   11.5130   11.43948         .92786
                                                m Male          147   11.3599   10.94995         .90314
diffpct Pre - Post                              f Female        152   1.5866    11.72183         .95077
                                                m Male          147   3.9850    13.55834         1.11827

The table above shows summary statistics for the 2 groups for all 3 variables. Boys had
somewhat larger average truancy scores prior to intervention than did girls. The
average scores after intervention were similar for the 2 groups. The difference between
the average pre- and post-intervention is larger for boys. You must determine whether
these observed differences are large enough for you to conclude that, in the population,
boys and girls differ in average truancy rates. You can use the 2 independent-samples
T test to test all 3 hypotheses.

Checking the Assumptions

You must assume that all observations are independent. If the sample sizes in the
groups are small, the data must come from populations that have normal distributions. If
the sum of the sample sizes in the 2 groups is greater than 40, you don’t have to worry
about the assumption of normality. The 2-independent-samples T test also requires
assumptions about the variances in the 2 groups. If the 2 samples come from
populations with the same variance, you should use the “pooled” or equal-variance T
test. If the variances are markedly different, you should use the separate-variance T
test. Both of these are shown below.

Independent Samples Test

                                Levene's Test for       t-test for Equality of Means
                                Equality of Variances
                                F           Sig.        t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
prepct Percent truant days pre intervention
  Equal variances assumed       5.248       .023        -1.488   297       .138              -2.24550          1.50904                 -5.21527       .72426
  Equal variances not assumed                           -1.485   290.226   .139              -2.24550          1.51207                 -5.22151       .73051
postpct Percent truant days post intervention
  Equal variances assumed       .122        .727        .118     297       .906              .15309            1.29578                 -2.39698       2.70317
  Equal variances not assumed                           .118     296.969   .906              .15309            1.29483                 -2.39511       2.70130
diffpct Pre - Post
  Equal variances assumed       1.679       .196        -1.638   297       .102              -2.39839          1.46426                 -5.28003       .48326
  Equal variances not assumed                           -1.634   287.906   .103              -2.39839          1.46782                 -5.28740       .49063

95% CI Lower and 95% CI Upper are the limits of the 95% Confidence Interval of the Difference.

You can test the null hypothesis that the population variances in the 2 groups are equal
using the Levene test, shown above. If the observed significance level is small [in the
column labeled sig. under Levene’s Test], you reject the null hypothesis that the
population variances are equal. For this example, you can reject the null hypothesis that
the pre-intervention truancy variances are equal in the 2 groups. For the other 2
variables, you can’t reject the null hypothesis that the variances are equal.
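
The idea behind the Levene test can be sketched in a few lines of Python. The sketch assumes the mean-centered form of the test: each score is replaced by its absolute distance from its own group mean, and a one-way analysis of variance is run on those distances. The two small groups are hypothetical, the function name is made up for the illustration, and no significance level is computed, since that requires the F distribution.

```python
import statistics

def levene_statistic(*groups):
    """Levene's F statistic using absolute deviations from each group mean."""
    # Step 1: replace every score by its absolute distance from its group mean.
    deviations = []
    for group in groups:
        center = statistics.mean(group)
        deviations.append([abs(value - center) for value in group])

    # Step 2: run a one-way analysis of variance on those distances.
    total_n = sum(len(d) for d in deviations)
    k = len(deviations)
    grand_mean = sum(sum(d) for d in deviations) / total_n
    between = sum(len(d) * (statistics.mean(d) - grand_mean) ** 2
                  for d in deviations)
    within = sum((z - statistics.mean(d)) ** 2 for d in deviations for z in d)
    return (between / (k - 1)) / (within / (total_n - k))

# Two small hypothetical groups (illustration only).
group_a = [1, 2, 3]
group_b = [2, 4, 6]
print(levene_statistic(group_a, group_b))
```

A large F means the typical distance from the group mean differs between the groups, which is evidence against equal variances.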

Testing the Hypothesis

In the 2-independent-samples T test, the T statistic is computed the same as for the
other 2 tests. It is the difference between the 2 sample means divided by the
standard error of the difference. The standard error of the difference is computed
differently, depending on whether the 2 variances are assumed to be equal or not.
That’s why you see 2 sets of T values in the table above. In this example, the 2 T values
and confidence intervals based on them are very similar. That will always be the case
when the sample size in the 2 groups is almost the same.

The degrees of freedom for the T statistic also depend on whether you assume that the
2 variances are equal. If the variances are assumed to be equal, the degrees of
freedom are 2 fewer than the sum of the number of cases in the 2 groups. If you don’t
assume that the variances are equal, the degrees of freedom are calculated from the
actual variances and the sample sizes in the groups. The result is usually not an
integer.
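
The 2 standard errors and the 2 degrees-of-freedom calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS implementation: the function name two_sample_t and the 2 small samples are hypothetical and are not the truancy data.

```python
import math
from statistics import mean, variance


def two_sample_t(a, b):
    """Return (t, df) for the pooled and the separate-variance t tests."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 denominator)
    diff = mean(a) - mean(b)

    # Equal variances assumed: pool the 2 variances into one estimate.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Equal variances not assumed: each group keeps its own variance.
    se_separate = math.sqrt(v1 / n1 + v2 / n2)
    df_separate = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )

    return {
        "pooled": (diff / se_pooled, df_pooled),
        "separate": (diff / se_separate, df_separate),
    }


# Hypothetical data, for illustration only.
group_a = [12.0, 15.5, 9.0, 20.0, 14.5, 11.0]
group_b = [18.0, 22.5, 16.0, 30.0, 25.5, 13.0, 21.0, 19.5]

for name, (t, df) in two_sample_t(group_a, group_b).items():
    print(f"{name:9s} t = {t:.3f}, df = {df:.3f}")
```

When the 2 groups have the same number of cases, the 2 standard errors coincide, so the 2 T values are identical and only the degrees of freedom differ.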

From the column labeled Sig. [2-tailed], you can’t reject any of the 3 hypotheses of
interest. The observed results are not incompatible with the null hypotheses that boys
and girls are equally truant before the program, equally truant after it, and show the
same average change between the 2 periods.

Warning: When you compare 2 independent groups, one of which has a factor of
interest and the other of which doesn’t, you must be very careful about drawing
conclusions. For example, if you compare people enrolled in a weight-loss program to
people who aren’t, you cannot attribute observed differences to the program unless the
people have been randomly assigned to the 2 groups.


Crosstabulations

You classify cases based on values for 2 or more categorical variables [e.g. type of
health insurance coverage and satisfaction with health care.] Each combination of
values is called a cell. To test whether the two variables that make up the rows and
columns are independent, you calculate how many cases you expect in each cell if the
variables are independent, and compare these expected values to those actually
observed using the chi-square statistic. If your observed results are unlikely if the null
hypothesis of independence is true, you reject the null hypothesis. You can measure
how strongly the row and column variables are related by computing measures of
association. There are many different measures, and they define association in different
ways. In selecting a measure of association, you should consider the scale on which the
variables are measured, the type of association you want to detect, and the ease of
interpretation of the measure. You can study the relationship between a dichotomous
[2-category] risk factor and a dichotomous outcome [e.g. family history of a disease and
development of the disease], controlling for other variables [e.g. gender] by computing
special measures based on the odds.

Chi-Square Test: Are Two Variables Independent?

If you think that 2 variables are related, the null hypothesis that you want to test is that
they are not related. Another way of stating the null hypothesis is that the 2 variables
are independent. Independence has a very precise meaning in this situation. It means
that the probability that a case falls into a particular cell of a table is the product of the
probability that a case falls into that row and the probability that a case falls into that
column.

Warning: The word independent as used here has nothing to do with dependent and
independent variables. It refers to the absence of a relationship between 2 variables.

As an example of testing whether 2 variables are independent, look at the table below, a
crosstabulation of highest educational attainment [degree] and perception of life’s
excitement [life] based on the gssdata.sav file. From the row percentages, you see
that the % of people who find life exciting is not exactly the same in the 5 degree groups,
although it is fairly similar for the first 2 degree groups. Slightly less than half of those
with less than a high school education or with a high school education find life exciting.
However, there are substantial differences between those 2 groups and the 3 groups
with at least some exposure to college [junior college, bachelor, or graduate degree].
For those respondents, almost 2/3 find that life is exciting.
degree Highest degree * life Is life exciting, routine or dull? Crosstabulation

                                        life Is life exciting, routine or dull?
degree Highest degree                   1 Exciting   2 Routine   3 Dull    Total
0 Lt high school    Count                   59           67         10       136
                    Expected Count          70.8         60.2        5.0     136.0
                    % within degree         43.4%        49.3%       7.4%    100.0%
1 High school       Count                  218          232         18       468
                    Expected Count         243.7        207.1       17.2     468.0
                    % within degree         46.6%        49.6%       3.8%    100.0%
2 Junior college    Count                   41           23          2        66
                    Expected Count          34.4         29.2        2.4      66.0
                    % within degree         62.1%        34.8%       3.0%    100.0%
3 Bachelor          Count                   94           46          3       143
                    Expected Count          74.4         63.3        5.3     143.0
                    % within degree         65.7%        32.2%       2.1%    100.0%
4 Graduate          Count                   55           29          0        84
                    Expected Count          43.7         37.2        3.1      84.0
                    % within degree         65.5%        34.5%        .0%    100.0%
Total               Count                  467          397         33       897
                    Expected Count         467.0        397.0       33.0     897.0
                    % within degree         52.1%        44.3%       3.7%    100.0%

Warning: The chi-square test requires that all observations be independent. This
means that each case can appear in only one cell of the table. For example, if you apply
2 different treatments to the same patients and classify them both times as improved or
not improved, you can’t analyze the data with the chi-square test of independence.

Computing Expected Values

You use the chi-square test to determine if your observed results are unlikely if the 2
variables are independent in the population. 2 variables are independent if knowing the
value of one variable tells you nothing about the value of the other variable. The level of
education one attains and one’s perception of life are independent if the probability of
any level of educational attainment/perception of life combination is the product of the
probability of that level of educational attainment times the probability of that perception
of life. For example, under the independence assumption, the probability of being a
college graduate and finding life exciting is:

P = Probability(bachelor degree) x Probability(life exciting)

P = 143/897 x 467/897 = .083

If the null hypothesis is true, you expect to find in your table .083 x 897 = 74.4, or
about 74, excited people with bachelor’s degrees. You see this expected value in the
row labeled Expected Count in the table above.

The chi-square test is based on comparing these 2 counts: the observed number of
cases in a cell and the expected number of cases in a cell if the 2 variables are
independent. The Pearson chi-square statistic is:

χ² = ∑ (observed − expected)² / expected
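
As a rough check on the arithmetic, the expected counts and the Pearson chi-square statistic can be recomputed from the observed counts in the crosstabulation above. The sketch below uses plain Python; the function names expected_counts and pearson_chi_square are illustrative choices, not part of SPSS.

```python
# Observed counts from the degree-by-life crosstabulation above.
# Rows: Lt high school, High school, Junior college, Bachelor, Graduate.
# Columns: Exciting, Routine, Dull.
observed = [
    [59, 67, 10],
    [218, 232, 18],
    [41, 23, 2],
    [94, 46, 3],
    [55, 29, 0],
]


def expected_counts(table):
    """Expected count in each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi_square = pearson_chi_square(observed)

print(f"Expected count, Bachelor and Exciting: {expected[3][0]:.1f}")
print(f"Pearson chi-square: {chi_square:.3f}")
```

The expected count for bachelor's degree holders who find life exciting comes out at about 74.4, matching the Expected Count row of the table.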

TIP 24

By examining the differences between observed and expected values in the cells [the
residuals], you can see where the independence model fails. You can examine actual
residuals and residuals standardized by estimates of their variability to help you pinpoint
departures from independence by requesting them in the Cells dialog box of the
Analyze/Descriptive Statistics/Crosstabs procedure.
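
To make the idea of residuals concrete, the sketch below computes, for the degree-by-life table, each cell's standardized residual as (observed - expected) divided by the square root of the expected count. This definition is an assumption about which standardization is wanted; SPSS also offers adjusted residuals, which are not computed here. The label lists are added only for printing.

```python
import math

# Observed counts from the degree-by-life crosstabulation above.
observed = [
    [59, 67, 10],
    [218, 232, 18],
    [41, 23, 2],
    [94, 46, 3],
    [55, 29, 0],
]
row_labels = ["Lt high school", "High school", "Junior college", "Bachelor", "Graduate"]
col_labels = ["Exciting", "Routine", "Dull"]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Standardized residual = (observed - expected) / sqrt(expected).
standardized = []
for i, row in enumerate(observed):
    residual_row = []
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        residual_row.append((count - expected) / math.sqrt(expected))
    standardized.append(residual_row)

for label, residual_row in zip(row_labels, standardized):
    cells = "  ".join(f"{c}: {r:+.2f}" for c, r in zip(col_labels, residual_row))
    print(f"{label:15s} {cells}")
```

Cells with large positive or negative values are the ones that depart most from what independence predicts.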

Determining the Observed Significance Level

From the calculated chi-square value, you can estimate how often in a sample you
would expect to see a chi-square value at least as large as the one you observed if the
independence hypothesis is true in the population. If the observed significance level is
small enough, you reject the null hypothesis that the 2 variables are independent. The
value of chi-square depends on the number of rows and columns in the table. The
degrees of freedom for the chi-square statistic is calculated by finding the product of one
fewer than the number of rows and one fewer than the number of columns. [The degrees
of freedom is the number of cells in a table that can be arbitrarily filled when the row and
column totals are fixed.] In this example, with 5 rows and 3 columns, the degrees of
freedom is 4 x 2 = 8.

From the table below, you see that the observed significance level for the Pearson
chi-square is displayed as .000 [that is, less than .0005], so you can reject the null
hypothesis that level of educational attainment and perception of life are independent.
Chi-Square Tests

                                 Value        df    Asymp. Sig. (2-sided)
Pearson Chi-Square               34.750 (a)    8    .000
Likelihood Ratio                 37.030        8    .000
Linear-by-Linear Association     29.373        1    .000
N of Valid Cases                 897

a. 2 cells (13.3%) have expected count less than 5. The minimum expected count is 2.43.

Warning: A conservative rule for use of the chi-square test requires that the expected
values in each cell be greater than 1 and that most cells have expected values greater
than 5. After SPSS displays the pivot table with the statistics, it displays the number of
cells with expected values less than 5 and the minimum expected count. If more than
20% of your cells have expected values less than 5, you should combine categories, if
that makes sense for your table, so that most expected values are greater than 5.
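
The rule of thumb above can be checked directly from the observed counts. The following sketch is a hypothetical helper, small_expected_cells, not an SPSS feature; it counts the cells whose expected value is below 5 and reports the smallest expected count for the degree-by-life table.

```python
def small_expected_cells(table, threshold=5.0):
    """Return (cells below threshold, total cells, minimum expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    below = sum(1 for e in expected if e < threshold)
    return below, len(expected), min(expected)


# Observed counts from the degree-by-life crosstabulation above.
observed = [
    [59, 67, 10],
    [218, 232, 18],
    [41, 23, 2],
    [94, 46, 3],
    [55, 29, 0],
]

n_small, n_cells, smallest = small_expected_cells(observed)
share = 100 * n_small / n_cells
print(f"{n_small} of {n_cells} cells ({share:.1f}%) have expected count below 5")
print(f"Minimum expected count: {smallest:.2f}")
if share > 20 or smallest <= 1:
    print("Consider combining categories before relying on the chi-square test.")
```

For this table the share of small cells is under 20% and the minimum expected count is above 1, so the rule is satisfied.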

Examining Additional Statistics

SPSS displays several statistics in addition to the Pearson chi-square when you ask for
a chi-square test as shown above.

• The likelihood-ratio chi-square has a different mathematical basis than the
Pearson chi-square, but for large sample sizes, it is close in value to the Pearson
chi-square. It is seldom that these 2 statistics will lead you to different
conclusions.
• The linear-by-linear association statistic is also known as the Mantel-Haenszel
chi-square. It is based on the Pearson correlation coefficient. It tests whether
there is a linear association between the 2 variables. You SHOULD NOT use
this statistic for nominal variables. For ordinal variables, the test is more likely
to detect a linear association between the variables than is the Pearson
chi-square test; it is more powerful.
• A continuity-corrected chi-square [not shown here] is shown for tables with 2
rows and 2 columns. Some statisticians claim that this leads to a better estimate
of the observed significance level, but the claim is disputed.
• Fisher’s exact test [not shown here] is calculated if any expected value in a 2 by
2 table is < 5. You get exact probabilities of obtaining the observed table or one
more extreme if the 2 variables are independent and the marginals are fixed.
That is, the number of cases in the rows and columns of the table are determined
in advance by the researcher.

Warning: The Mantel-Haenszel test is calculated using the actual values of the row and column
variables, so if you coded 3 unevenly spaced dosages of a drug as 1, 2, and 3, those values are
used for the computations.

Are Proportions Equal?

A special case of the chi-square test for independence is the test that several
proportions are equal. For example, you want to test whether the % of people who
report themselves to be very happy has changed during the time that the GSS has been
conducted. The table below is a crosstabulation of the % of people who say they were
very happy for each of the decades. This uses the aggregatedgss.sav file. About 34% of
the people questioned in the 1970s claimed that they were very happy, compared to
31% in this millennium.
happy GENERAL HAPPINESS * decade decade of survey Crosstabulation

                                          decade decade of survey
happy GENERAL HAPPINESS               1 1972-1979  2 1980-1989  3 1990-1999  4 2000-2002    Total
1 VERY HAPPY      Count                   3637         4475         4053         1296       13461
                  Expected Count          3403.4       4516.7       4211.5       1329.4     13461.0
                  % within decade         34.3%        31.8%        30.9%        31.3%       32.1%
2 PRETTY HAPPY    Count                   6977         9611         9081         2850       28519
                  Expected Count          7210.6       9569.3       8922.5       2816.6     28519.0
                  % within decade         65.7%        68.2%        69.1%        68.7%       67.9%
Total             Count                  10614        14086        13134         4146       41980
                  Expected Count         10614.0      14086.0      13134.0       4146.0     41980.0
                  % within decade        100.0%       100.0%       100.0%       100.0%      100.0%

Calculating the Chi-Square Statistic

If the null hypothesis is true, you expect 32.1% of people to be very happy in each
decade, the overall very happy rate. You calculate the expected number in each decade
by multiplying the total number of people questioned in each decade by 32.1%. The
expected number of not very happy people is 67.9% multiplied by the number of people
in each decade. These values are shown in the table above. The chi-square statistic is
calculated in the usual fashion.
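
The calculation just described can be reproduced from the counts in the table above. The sketch below is a plain-Python illustration with variable names of our own choosing: it applies the overall very-happy rate to each decade's total and sums the chi-square contributions of all 8 cells.

```python
# Observed counts by decade (1972-1979, 1980-1989, 1990-1999, 2000-2002).
very_happy = [3637, 4475, 4053, 1296]
pretty_happy = [6977, 9611, 9081, 2850]

decade_totals = [v + p for v, p in zip(very_happy, pretty_happy)]
overall_rate = sum(very_happy) / sum(decade_totals)  # about 32.1%

# Under the null hypothesis every decade shares the overall rate.
expected_very = [overall_rate * n for n in decade_totals]
expected_pretty = [(1 - overall_rate) * n for n in decade_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for o, e in zip(very_happy + pretty_happy, expected_very + expected_pretty)
)
df = (2 - 1) * (len(decade_totals) - 1)

print(f"Overall very happy rate: {overall_rate:.1%}")
print(f"Expected very happy, 1972-1979: {expected_very[0]:.1f}")
print(f"Pearson chi-square: {chi_square:.3f} with {df} degrees of freedom")
```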

From the table below, you see that the observed significance level for the chi-square
statistic is < .001, leading you to reject the null hypothesis that in each decade people
are equally likely to describe themselves as very happy. Notice that the differences
between decades aren’t very large; the largest % is 34.3% for the 1970s, while the
smallest is 30.9% for the 1990s. The sample sizes in each group are very large, so even
small differences are statistically significant, although they may have little practical
implication.
Chi-Square Tests

                                 Value        df    Asymp. Sig. (2-sided)
Pearson Chi-Square               34.180 (a)    3    .000
Likelihood Ratio                 33.974        3    .000
Linear-by-Linear Association     25.746        1    .000
N of Valid Cases                 41980

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 1329.43.

Introducing a Control Variable

To see whether both men and women experienced changes in happiness during this
time period, you can compute the chi-square statistic separately for men and for women,
as shown below:

• Go to Analyze, then Descriptive Statistics, then Crosstabs. Put the variable
happy in the row box, decade in the column box, and sex into layer 1 of 1.
Under cells in the crosstabs dialog box, select the boxes marked observed
and expected counts and column %. Then select the statistics box to
request a chi-square test, and select ok.
Chi-Square Tests

sex RESPONDENTS SEX                               Value        df    Asymp. Sig. (2-sided)
1 Male      Pearson Chi-Square                    3.677 (a)     3    .298
            Likelihood Ratio                      3.668         3    .300
            Linear-by-Linear Association          .901          1    .343
            N of Valid Cases                      18442
2 Female    Pearson Chi-Square                    42.987 (b)    3    .000
            Likelihood Ratio                      42.712        3    .000
            Linear-by-Linear Association          35.904        1    .000
            N of Valid Cases                      23538

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 586.01.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 742.96.

You see that for men, you can’t reject the null hypothesis that happiness has not
changed with time. You can reject the null hypothesis for women. From the line plot
below, you see that in the sample, happiness decreases with time for women, but not
for men. The steps for obtaining the plot are as follows:

• Go to the graphs menu, choose line, then select the multiple icon and
summaries for groups of cases, and then click define. Next move decade
into the category axis box and sex into the define lines by box in the dialog
box that appears. Select other statistic, then move happy into the variable
list, and then click change statistic. In the statistic subdialog box, select %
inside and type 1 into both the low and high text boxes. Click continue,
and then click OK.

[Figure: line chart of %in(1,1) GENERAL HAPPINESS (vertical axis, 30 to 36) by decade of
survey (1972-1979, 1980-1989, 1990-1999, 2000-2002), with separate lines for
RESPONDENTS SEX (Male, Female). Cases weighted by number of cases.]

Measuring Change: McNemar Test

The chi-square test can also be used to test hypotheses about change when the same
people or objects are observed at two different times. For example, the table below is a
crosstabulation of whether a person voted in 1996 and whether he or she voted in 2000.
[See gssdata.sav file]
vote00 DID R VOTE IN 2000 ELECTION * vote96 DID R VOTE IN 1996 ELECTION Crosstabulation

Count
                                           vote96 DID R VOTE IN 1996 ELECTION
vote00 DID R VOTE IN 2000 ELECTION         1 VOTED    2 DID NOT VOTE    Total
1 VOTED                                     1539          151           1690
2 DID NOT VOTE                               187          502            689
Total                                       1726          653           2379

An interesting question is whether people were more likely to vote in one of the years
than the other. The cases on the diagonal of the table don’t provide any information
because those people behaved the same way in both elections. You have to look at the
off-diagonal cells, which correspond to people who voted in one election but not the
other [151 and 187 cases here]. If the null hypothesis that likelihood of voting did not
change is true, a case should be equally likely to fall into either of the 2 off-diagonal
cells. The binomial distribution is used to calculate the exact probability of observing a
split between the 2 off-diagonal cells at least as unequal as the one observed, if cases
in the population are equally likely to fall into either off-diagonal cell. This test is called
the McNemar test.
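
The exact binomial calculation behind the McNemar test can be sketched with the Python standard library. The function name mcnemar_exact is hypothetical, and doubling the smaller tail to get the 2-sided probability is an assumption about how the 2-sided value is formed; the 2 arguments are the off-diagonal counts from the voting table.

```python
from math import comb


def mcnemar_exact(b, c):
    """Exact 2-sided McNemar probability from the 2 off-diagonal counts.

    Under the null hypothesis each discordant case falls into either
    off-diagonal cell with probability 1/2, so the smaller count follows
    a binomial distribution with n = b + c and p = 0.5.
    """
    n = b + c
    k = min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Voted in 2000 but not 1996: 151. Voted in 1996 but not 2000: 187.
p_value = mcnemar_exact(151, 187)
print(f"Exact 2-sided significance: {p_value:.3f}")
```

Only the 338 people who changed their behavior enter the calculation; the 2,041 cases on the diagonal play no part.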
Chi-Square Tests

                        Value    Exact Sig. (2-sided)
McNemar Test                     .057 (a)
N of Valid Cases        2379

a. Binomial distribution used.

McNemar’s test can be calculated for a square table of any size to test whether the
upper half and the lower half of a square table are symmetric. For a table with 2 rows
and 2 columns, it is labeled McNemar Test, as in the table above. For tables with more
than 2 rows and columns, it is labeled the McNemar-Bowker test. From the table above,
you see that you can’t reject the null hypothesis that people who voted in only one of the
2 elections were equally likely to have voted in either one [observed significance
level = .057].

Warning: Since the same person is asked whether he or she voted in 1996 and whether
he or she voted in 2000, you can’t make a table in which the rows are years and the
columns are whether he or she voted. Each case would appear twice in such a table.

How Strongly are 2 Variables Related?

If you reject the null hypothesis that 2 variables are independent, you may want to
describe the nature and strength of the relationship between the 2 variables. There are
many statistical indexes that you can use to quantify the strength of the relationship
between 2 variables in a cross-classification. No single measure adequately
summarizes all possible types of association. Measures vary in the way they define
perfect and intermediate association and in the way they are interpreted. Some
measures are used only when the categories of the variables can be ordered from
lowest to highest on some scale.

Warning: Don’t compute a large number of measures and then report the most
impressive as if it were the only one examined.

You can test the null hypothesis that a particular measure of association is 0 based on
an approximate T statistic shown in the output. If the observed significance level is small
enough, you reject the null hypothesis that the measure is 0.

TIP 25

Measures of association should be calculated with as detailed data as possible. Don’t
combine categories with small numbers of cases, as was suggested above for the
chi-square test of independence.
“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.”
<< Aaron Levenstein

Appendix III

APA CITATION STYLE
An Annotated Guide

Introduction

This guide provides a basic introduction to the APA citation style. It is based on the 6th edition of
the Publication Manual of the American Psychological Association, published in 2009 [copyright 2010].

The Publication Manual is generally used for academic writing in the social sciences. The manual
itself covers many aspects of research writing including selecting a topic, evaluating sources,
taking notes, plagiarism, the mechanics of writing, the format of the research paper as well as the
way to cite sources.

This guide provides basic explanations and examples for the most common types of citations
used by research scholars. For additional information and examples, refer to the Publication
Manual.

Signal Phrases to Contextualize

Never--NEVER--simply drop a quotation into your text without indicating in some way why
it’s there. The easiest way to incorporate quotations gracefully is with “signal phrases,” which
serve to link the quotation with your sentences and to name the author of the quoted material.

There are many “signaling verbs” one can use in a signal phrase:

acknowledges    concludes    emphasizes    replies
agrees          concurs      observes      responds
asserts         describes    offers        suggests
claims          disagrees    remarks       writes

[a] Use signal phrases to introduce quotations which support your view:

Body modification appears to be universal among human species, perhaps due to the
nature of the human body itself. As Germaine Greer writes, “Humans are the only animals
which can consciously and deliberately change their appearance according to their own
whims”

[b] Which extend an argument you make:

Many more mundane elements of our social life can be thought of as body modification
rituals. For instance, as Greer argues, “Fashion, because it is beyond logic, is deeply
revealing”

[c] Or which you wish to challenge:

While Greer argues that “beautification and mutilation are the same activity”, I wish to
suggest that some body modifications should be clearly seen and judged to be harmful
and cruel by civilized societies--in effect, that beautification and mutilation must be kept
separate.

[A] Direct Quotations


As a research scholar, you will rely on information, views, and opinions that are already
available and that you identified during your survey of the literature. Many of these will have to
be quoted in your paper to support and enrich your work. These quotations must be carefully
cited and documented so that readers can retrieve them, if necessary. Direct quotations allow
you to acknowledge a source within your text by providing a reference to exactly where in that
source you found the information. The reader can then follow up on the complete reference in the
Reference List page at the end of your paper.

• Quotations of fewer than 40 words should be incorporated in the text and enclosed
in double quotation marks. Provide the author, publication year and a page number.
Compare the following 2 examples.

She stated, "The 'placebo effect,' ...disappeared when behaviors were studied in this
manner" (Miele, 1993, p. 276), but she did not clarify which behaviors were studied.
[Quotation, truncated, ends within the sentence; page number immediately after the year]

Miele (1993) found that "the 'placebo effect,' which had been verified in previous
studies, disappeared when [only the first group's] behaviors were studied in this
manner" (p. 276).
[Quotation, in full, closes at the end of the sentence; page number at the end]

• When quoting 40 words or more, use a free-standing "block quotation" on
a new line, indented five spaces, and omit the quotation marks. The quotation may consist
of two or more sentences.

Miele (1993) found the following:


The "placebo effect," which had been verified in previous studies, disappeared
when behaviors were studied in this manner. Furthermore, the behaviors were
never exhibited, even when reel [sic] drugs were administered. Earlier studies were
clearly premature in attributing the results to a placebo effect. (p. 276)
[page number at the end]

• For electronic sources such as Web pages, provide a reference to the author, the year
and the page number (if it is a PDF document), or the paragraph number if visible, or a
heading followed by the paragraph number.

"The current system of managed care and the current approach to defining empirically
supported treatments are shortsighted" (Beutler, 2000, Conclusion section, ¶ 1)
[Note how para number is cited]

[B] Indirect Quotations

Sometimes you may have to make use of another author's view or opinion without using the
author’s original words. When using your own words to refer indirectly to another author's work,
you must identify the original source. A complete reference must appear in the Reference List at
the end of your paper.

• In most cases, providing the author's last name and the publication year is sufficient:

Smith (1997) compared reaction times...

Within a paragraph, you need not include the year in subsequent references to the same work.

Smith (1997) compared reaction times. Smith also found that...

• If there are two authors, include the last name of each and the publication year. Join the
names with "and" in running text and with an ampersand inside parentheses:

...as James and Ryerson (1999) demonstrated...

...as has been shown (James & Ryerson, 1999)...

• If there are three to five authors, cite all authors the first time; in subsequent citations,
include only the last name of the first author followed by "et al." and the year:

Williams, Jones, Smith, Bradner, and Torrington (1983) found...

Williams et al. (1983) also noticed that...

• The names of groups that serve as authors (e.g. corporations, associations, government
agencies, and study groups) are usually spelled out each time they appear in a text citation. If
it will not cause confusion for the reader, names may be abbreviated thereafter:

First citation: (National Institute of Mental Health [NIMH], 1999)

Subsequent citations: (NIMH, 1999)

• To cite a specific part of a source, indicate the page, chapter, figure, table or equation at
the appropriate point in the text:

(Czapiewski & Ruby, 1995, p. 10)

(Wilmarth, 1980, Chapter 3)

• For electronic sources that do not provide page numbers, use the paragraph number, if
available, preceded by the ¶ symbol or abbreviation para. If neither is visible, cite the heading
and the number of the paragraph following it to direct the reader to the quoted material.

(Myers, 2000, ¶ 5)
(Beutler, 2000, Conclusion section, para. 1)

• When citing a work which is discussed in another work, include the original author's name
in an explanatory sentence, and then include the source you actually consulted [here,
Andrews, 2007] in your parenthetical reference and in your reference list.

Smith argued that...(as cited in Andrews, 2007)
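
The author-count rules in this section can be summarized as a small Python function. This is a sketch for illustration only: the name in_text_citation and its parameters are hypothetical, it handles parenthetical citations only, and the treatment of six or more authors ("et al." from the first citation) follows the 6th edition of the Publication Manual rather than anything stated in this guide.

```python
def in_text_citation(last_names, year, first_citation=True):
    """Build a parenthetical citation from author last names and a year."""
    count = len(last_names)
    if count == 1:
        authors = last_names[0]
    elif count == 2:
        # Inside parentheses the names are joined with an ampersand.
        authors = f"{last_names[0]} & {last_names[1]}"
    elif count <= 5 and first_citation:
        # Three to five authors: name everyone the first time.
        authors = ", ".join(last_names[:-1]) + f", & {last_names[-1]}"
    else:
        # Three to five authors on later citations. Six or more authors also
        # land here from the first citation (assumption, see lead-in).
        authors = f"{last_names[0]} et al."
    return f"({authors}, {year})"


team = ["Williams", "Jones", "Smith", "Bradner", "Torrington"]
print(in_text_citation(["Smith"], 1997))
print(in_text_citation(["James", "Ryerson"], 1999))
print(in_text_citation(team, 1983))
print(in_text_citation(team, 1983, first_citation=False))
```

Group authors, page or paragraph locators, and secondary sources ("as cited in") are deliberately left out to keep the sketch short.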

REFERENCE LIST

Your reference list should appear at the end of your paper. In some cases, a list of all references
may find a place at the end of the book. It provides the information necessary for a reader to
locate and retrieve any source you cite in the body of the paper. Each source you cite in the
paper must appear in your reference list; likewise, each entry in the reference list must be cited in
your text.

Your references should begin on a new page separate from the text of the essay; label this page
"References" centered at the top of the page (do NOT bold, underline, or use quotation marks for
the title). All text should be double-spaced just like the rest of your essay.

Important Features

• All lines after the first line of each entry in your reference list should be indented one-half
inch from the left margin. This is called hanging indentation.

• Authors' names are inverted (last name first); give the last name and initials for all
authors of a particular work if it has up to seven authors. If the work has more than
seven authors, list the first six authors and then use an ellipsis after the sixth author's
name. After the ellipsis, list the name of the last author of the work.

• Reference list entries should be alphabetized by the last name of the first author of each
work.

• If you have more than one work by the same author, single-author references or
multiple-author references with the exact same authors in the exact same order are listed
in order by the year of publication, starting with the earliest.

• When referring to any work that is NOT a journal, such as a book, article, or Web page,
capitalize only the first letter of the first word of a title and subtitle, the first word after a
colon or a dash in the title, and proper nouns. Do not capitalize the first letter of the
second word in a hyphenated compound word.

• Capitalize all major words in journal titles.

• Italicize titles of longer works such as books and journals.

• Do not italicize, underline, or put quotes around the titles of shorter works such as journal
articles or essays in edited collections.

Below are some examples of the most common types of sources including online sources.

I NON-ELECTRONIC SOURCES

Traditionally, books, articles and papers from print media, i.e., non-electronic sources, are
quoted in the text of a research paper. To make the process systematic and universal,
several citation methods were devised during the early part of the 20th century. The ultimate
aim of these citation methods is to standardize the citation process.

[A] Books

Books form the bulk of the sources cited, and have done so for more than a century.
The following examples cover the various types of citation used in APA style when citing books.

[a] Book with one author

Bernstein, T. M. (1965). The careful writer: A modern guide to English usage (2nd ed.).
New York, NY: Atheneum.

[b] Work with two authors

Beck, C. A. J., & Sales, B. D. (2001). Family mediation: Facts, myths, and future
prospects. Washington, DC: American Psychological Association.

[c] Two or more works by the same author


Arrange by the year of publication, the earliest first.

Postman, N. (1979). Teaching as a conserving activity. New York, NY: Delacorte Press.

Postman, N. (1985). Amusing ourselves to death: Public discourse in the age of show
business. New York, NY: Viking.

If works by the same author are published in the same year, arrange alphabetically by title and
add a letter after the year as indicated below

McLuhan, M. (1970a). Culture is our business. New York, NY: McGraw-Hill.

McLuhan, M. (1970b). From cliche to archetype. New York, NY: Viking Press.
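
The ordering rules for works by the same author can be expressed as a short sorting routine. The sketch below is illustrative only: order_references is a hypothetical helper, entries are simplified to (author, year, title) tuples, and plain string comparison stands in for the finer alphabetizing rules of the Publication Manual.

```python
from itertools import groupby
from string import ascii_lowercase


def order_references(entries):
    """Sort (author, year, title) tuples and add a/b/c to same-year works."""
    ordered = sorted(entries, key=lambda e: (e[0].lower(), e[1], e[2].lower()))
    result = []
    # Works sharing both author and year form one group, already in title order.
    for (author, year), group in groupby(ordered, key=lambda e: (e[0], e[1])):
        works = list(group)
        for index, (_, _, title) in enumerate(works):
            suffix = ascii_lowercase[index] if len(works) > 1 else ""
            result.append(f"{author} ({year}{suffix}). {title}.")
    return result


entries = [
    ("Postman, N.", 1985, "Amusing ourselves to death"),
    ("McLuhan, M.", 1970, "From cliche to archetype"),
    ("Postman, N.", 1979, "Teaching as a conserving activity"),
    ("McLuhan, M.", 1970, "Culture is our business"),
]

for line in order_references(entries):
    print(line)
```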

[d] Anthology or compilation

Gibbs, J. T., & Huang, L. N. (Eds.). (1991). Children of color: Psychological interventions
with minority youth. San Francisco, CA: Jossey-Bass.

[e] Work in an anthology or an essay in a book

Bjork, R. A. (1989). Retrieval inhibition as an adaptive mechanism in human memory. In H.
L. Roediger III, & F. I. M. Craik (Eds.), Varieties of memory & consciousness
(pp. 309-330). Hillsdale, NJ: Erlbaum.

[f] Book by a corporate author


Associations, corporations, agencies, government departments and organizations are considered
authors when there is no single author.

American Psychological Association. (1972). Ethical standards of psychologists.
Washington, DC: American Psychological Association.

[B] Articles

Next to books, articles and papers are the most frequently cited sources. The following
examples should help the research scholar understand the intricacies of citing articles in a
research paper.

[a] Article in a reference book or an entry in an encyclopedia


If the article/entry is signed, include the author's name; if unsigned, begin with the title of the entry.

Guignon, C. B. (1998). Existentialism. In E. Craig (Ed.), Routledge encyclopedia of
philosophy (Vol. 3, pp. 493-502). London, England: Routledge.

[b] Article in a journal

Mellers, B. A. (2000). Choice and the relative pleasure of consequences. Psychological
Bulletin, 126, 910-924.

Note: List only the volume number if the periodical uses continuous pagination throughout a
particular volume. If each issue begins with page 1, then list the issue number as well.

Klimoski, R., & Palmer, S. (1993). The ADA and the hiring process in
organizations. Consulting Psychology Journal: Practice and Research, 45(2), 10-36.

[c] Article in a newspaper or magazine

Semenak, S. (1995, December 28). Feeling right at home: Government residence
eschews traditional rules. Montreal Gazette, p. A4.

Driedger, S. D. (1998, April 20). After divorce. Maclean's, 111(16), 38-43.

[d] Television or radio program

MacIntyre, L. (Reporter). (2002, January 23). Scandal of the Century [Television series
episode]. In H. Cashore (Producer), The fifth estate. Toronto, Canada: Canadian
Broadcasting Corporation.

[e] Film, video recording or DVD

Kubrick, S. (Director). (1980). The Shining [Motion picture]. United States: Warner
Brothers.

II ELECTRONIC SOURCES

We are in the era of ICT. Publishing online, whether an article, a paper or a book, is a
fast-growing trend and often the most popular and useful development in the world of
publishing. This guide provides basic guidelines and examples for citing electronic sources using
the Publication Manual of the American Psychological Association, 6th edition and the APA Style
for Electronic Sources (2008). APA style requires that sources receive attribution in the text by
the use of parenthetical in-text references.

Where available, the doi (digital object identifier) number should be used to provide access
information for electronic materials. URLs may be included for resources that do not have a doi
number. The names of full-text databases are rarely necessary in an APA citation. Retrieval date
information should only be included when the page/site/information is likely to change.

[a] Journal Articles

[a1] Article with DOI

Format:
Author Last, First Initial. (Year of Publication). Article title. Journal Title, volume #(issue number),
start page-end page. doi: alphanumeric string

Citation:

Welch, K.E. (2005). Technical communication and physical location: Topoi and
architecture in computer classrooms. Technical Communication Quarterly, 14(3), 335-344. doi:
10.1207/s15427625tcq1403_12

[a2] Article without DOI

Format:
Author Last, First Initial. (Year of Publication). Article title. Journal Title, volume #(issue number),
start page-end page. Retrieved from URL

Citation:
Fisher, D., Russell, D., Williams, J., & Fisher, D. (2008). Space, time & transfer in
virtual case environments. Kairos, 12(2), 127-165. Retrieved from
http://kairos.technorhetoric.net/12.2/binder.html?topoi/fisher-etal/articleIntro.html
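The two journal-article patterns above differ only in their last element: a doi when one exists, otherwise "Retrieved from" plus the URL. As a minimal sketch of that rule, the Python function below assembles a reference string from its parts. The function name and its parameters are illustrative assumptions and are not part of APA style or of any library; italics for the journal title and volume cannot be shown in plain text and must still be applied by hand.

```python
def apa_journal_reference(authors, year, title, journal, volume, issue, pages,
                          doi=None, url=None):
    """Assemble a journal-article reference in the pattern shown above.

    `authors` is a list of names already written as "Last, F. M.".
    Hypothetical helper: the doi is preferred, the URL is the fallback.
    """
    if len(authors) == 1:
        author_part = authors[0]
    else:
        # APA joins the final author with ", & "
        author_part = ", ".join(authors[:-1]) + ", & " + authors[-1]
    ref = f"{author_part} ({year}). {title}. {journal}, {volume}({issue}), {pages}."
    if doi:
        ref += f" doi: {doi}"
    elif url:
        ref += f" Retrieved from {url}"
    return ref


print(apa_journal_reference(
    ["Welch, K. E."], 2005,
    "Technical communication and physical location: Topoi and architecture "
    "in computer classrooms",
    "Technical Communication Quarterly", 14, 3, "335-344",
    doi="10.1207/s15427625tcq1403_12"))
```

Passing `url=...` instead of `doi=...` yields the [a2] form of the same reference.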

[a3] Advance Online Publication

Format:
Author Last, First Initial. (Year of Publication). Article title. Journal Title,
volume #(issue number), start page-end page. Advance online
publication. doi: alphanumeric string or URL

Note: In the following citation, the source has neither page numbers nor a doi number. Therefore, the
page number component is not included and the URL is substituted for the doi.

Citation:
St. John, J., & Quinn, T.W. (2008). Rapid capture of DNA targets.
Biotechniques, 44(2). Advance online publication. Retrieved from
http://www.biotechniques.com/default.asp?page=aop&subsection=article_display&display=full&id=112633

[a4] In-Press article retrieved from institutional or personal Web site

Format:
Author Last, First Initial. (in press). Article title. Journal Title. Retrieved from doi or URL

Citation:
Papini, P., Adriani, O., Ambriola, M., Barbarino, G.C., Basili, A., Bazilevskaja, G.A., et al. (in
press). In-flight performances of the PAMELA satellite experiment. Nuclear Instruments and
Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors, and Associated
Equipment. Retrieved from http://www.sciencedirect.com

[a5] Manuscript in preparation, retrieved from institutional or personal Web site

Format:
Author Last, First Initial. (n.d.). Manuscript title. Manuscript in preparation. Retrieved from URL

Note: The initials n.d. (no date) are included here in lieu of the publication date.

Citation:
Goggans, H. (n.d.). The “Floating Bear” as zine precursor. Manuscript in preparation. Retrieved
from http://www.heathergoggans.com

[b] Electronic Books

[b1] Entire Book

Format:
Author Last, First Initial. (Year). Title. Retrieved from URL

Citation:
Dickens, C. (1910). A tale of two cities. Retrieved from
http://books.google.com/books?id=Pm0AAAAAYAAJ

[b2] Book Chapter

Format:
Author Last, First Initial. (Year). Chapter title. In First Initial Last Name & First Initial Last Name
(Eds.), Book title (pp. start page-end page). doi: alphanumeric string

Note: Use a doi number if available. If a number is not available, do not provide retrieval
information for book chapters.

Citation:
Shun, I. (1998). The invention of the martial arts: Kanao Jigorao and Kaodaokan judo. In S.
Vlastos (Ed.), Mirror of modernity: Invented traditions of modern Japan (pp. 163-173).

[c] Dissertations and Theses

[c1] Thesis retrieved from Database

Format:
Author Last, First Initial. (Year). Title. Retrieved from database name. (accession number if
available)

Citation:
Houck, A.M. If God is God: Laughter and the divine in ancient Greek and modern Christian
literature. Retrieved from ProQuest Digital Dissertations. (AAI9990560)

[c2] Dissertation Defense

Format:
Author Last, First Initial. (Year, Month Day of Pub). Title [Format of defense] (Dissertation
defense, University Name). Name. Retrieved from URL

Citation:
Boardman, R. (2004, September 24). Improving tool support for personal information
management [PowerPoint slides] (Dissertation defense, Imperial College of London Department of
Electrical and Electronic Engineering). Retrieved from
http://www.slideshare.net/rick/phd-defenseimproving-tool-support-for-personal-information-management/

[d] Abstracts

[d1] Abstract as Original Source

Format:
Author Last, First Initial. (Year, Month Day of Pub). Title. [Abstract]. Retrieved from name of
database.

Note: If a publication number is assigned to the abstract, it may be included in parentheses after
the title.

Citation:
Berman, L.M., & Letellier, B. (1996). Pharaohs: Treasures of Egyptian art from
the Louvre (AEB 1996.0572) [Abstract]. Retrieved from Annual Egyptological Bibliography
database.

[d2] Abstract submitted for Meeting, Symposium or Poster Session

Format:
Author Last, First Initial. Title of Article. (Year, Month Day). Title of abstract. In First Initial Last
Name of authority (Title of Authority), Title of Meeting, Symposium, or Poster Session. Type of
meeting conducted at the name of sponsoring meeting or conference. Abstract retrieved from
URL

Citation:
Miller, C. (2007, June 25). Preserving soil survey data with GIS. In J.J. Meier (Web Editor),
Issues and trends in digital repositories of non-textual information: Support for research and
teaching. Poster session conducted at the ACRL Science and Technology Section conference.
Abstract retrieved from
http://www.ala.org/ala/acrl/aboutacrl/acrlsections/sciencetech/stsconferences/posters07.cfm#poster5

[d3] Abstract from Secondary Source

Format:
Author Last, First Initial. (Year). Article title. Journal Title, volume #(issue number if available),
start page-end page. Abstract retrieved from secondary source name.

Citation:
Chung, D.S., & Kim, S. (2008). Blogging activity among cancer patients and their companions:
Uses, gratifications, and predictors of outcomes. Journal of the American Society for Information
Science and Technology, 59(2), 297-306. Abstract retrieved from Wiley InterScience database.

[e] Bibliographies

[e1] Bibliography from Web site

Format:
Author Last, First Initial. (Year of Pub). Title. Retrieved from Name of Source:URL.

Citation:
de Zepetnek, S.T., Nielsen, W.C., & Aoun, S. (n.d.). Selected bibliography for
work in comparative cultural studies (history, theory, method). Retrieved from CLCWeb:
Comparative Literature and Culture:
http://clcwebjournal.lib.purdue.edu/library/comparativeculturalstudies(biblio).html

[e2] Bibliography from Courseware

Format:
Last Name, First Initials of Author. (Year of Pub). Title of course [Bibliography]. Retrieved from
Name of University and Courseware Product/Site: URL.

Citation:
Johnston, S.L. (2004). French resources on the web [Bibliography]. Retrieved from Trinity
University BlackBoard site: http://bb.trinity.edu

[f] Book Reviews and Journal Article Commentaries

[f1] Book Review

Format:
Author Last, First Initial. (Year of Publication). Title of review [Review of the book Title of book].
Journal Title, volume #(issue # if available), inclusive page numbers or location on the web page.
Retrieved from URL

Citation:
Ferrer, H. (2006). The case of the disappearing genres [Review of the book Best American
mystery stories 2005]. American Book Review, 27(4), 8-9. Retrieved from
http://americanbookreview.org

[f2] Peer Commentary, titled

Format:
Author Last, First Initial. (Year of Pub). Title of commentary [Peer commentary on the journal
article “Title of article”]. Retrieved Month Day, Year, from URL

Note: If the title of the item under review is clear from the title of the review, then the bracketed
explanation is not necessary.

Citation:
Bizzell, P., & Herzberg, B. (1988). A response to Kathleen E. Welch [Peer commentary on the
journal article “A critique of classical rhetoric: The contemporary appropriation of ancient
discourse”]. Rhetoric Review, 6(2), 246. Retrieved from http://www.jstor.org/stable/465942

[g] Curriculum and Course Materials

[g1] Curriculum Guide

Format:
Author Last, First Initial. (Year of Publication). Title. Retrieved from host web site name: URL.

Citation:
National Park Service, U.S. Department of the Interior. (2007). Imagining the corps of discovery:
Visual art of and about the Lewis and Clark expedition. Retrieved from Lewis and Clark Journey
of Discovery web site:
http://www.nps.gov/archive/jeff/LewisClark2/Education/VisualArt/VisualArtLessonPlan.htm

[g2] Lecture Notes

Format:
Author Last, First Initial. (Year of Pub). Title [format of notes]. Retrieved from host web site name:
URL.

Citation:
Johannesson, C. (2008). The mole [PowerPoint slides]. Retrieved from Communication Arts High
School web site: http://www.nisd.net/communicationsarts/pages/chem/ppt/molarconv_pres.ppt

[h] Raw Data

[h1] Data Set

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title. [Format of data].
Available from Name of source site: URL

Citation:
Chris Bell U.S. Congress Committee. (2004). FEC-116877 Form F3 [Data file]. Retrieved from
Federal Election Commission web site: http://www.fec.org/finance/disclosure/efile_search.shtml

[h2] Graphic Representation of Data

Format:
Author Last, First Initial or Corporate Author Name. (Year of Pub). [Description of graphic
representation of data]. Title of source. Retrieved from URL

Citation:
Sullivan, R.D. (2007). [Map depicting 10 different political regions in the United States for the
2008 election year]. Beyond red & blue. Retrieved from
http://massinc.typepad.com/beyondredandblue/2007/09/beyond-red-blue.html

[h3] Qualitative Data

Format:
Author Last, First Initial (Responsibility) and Subject Last, First Initial (Responsibility).
(Year of Publication). Title of collected data [Format of data]. Retrieved from Name of web site:
URL

Note: Interviews conducted one-on-one that have not been preserved in print or other formats
should be cited in text as a personal communication. Data that cannot be retrieved is not included
in the list of references.

Citation:
Quintard, T. (Interviewer) & Monroe, E. (Interviewee). (1974). Ethel Monroe, April
5, 1974 [Audio file]. Retrieved from Black Oral History web site:
http://www.wsulibs.wsu.edu/holland/masc/xblackoralhistory.html

[i] Reference Materials

[i1] Online Encyclopedia

Format:
Author Last, First Initial. (Year of Publication). Title of entry. In First Initial Last Name
of editor (Ed.), Title. Retrieved Month Day, Year, from URL of index or main page of encyclopedia

Note: If no author is listed for an entry, include the title of the entry first in the citation.

Citation:
Kania, A. (2007). Philosophy of music. In E.N. Zalta (Ed.), The Stanford encyclopedia of
philosophy. Retrieved from http://plato.stanford.edu/entries/music/

[i2] Online Dictionary

Format:
Title of entry. (Year of Publication). In Title of dictionary. Retrieved Month Day, Year,
from URL of index or main page of dictionary

Citation:
German shepherd. (n.d.). In Merriam-Webster’s collegiate dictionary. Retrieved from
http://www.britannica.com/dictionary?book=Dictionary

[i3] Online Handbook

Format:
Author Last, First Initial. (Year of Publication). Title of entry. In First Initial Last Name
of editor (Ed.), Title. Retrieved Month Day, Year, from URL of index or main page of handbook

Note: If no author is listed for an entry, include the title of the entry first in the citation.

Citation:
Wallace, E. (n.d.). Fort Stockton, TX. In R.R. Barkley (Ed.), The Handbook of
Texas online. Retrieved from http://www.tshaonline.org/handbook/online/index.html

[i4] Wiki

Format:
Title of entry. (n.d.). Retrieved Month Day, Year, from Title of Wiki: URL

Note: Wikis are collaboratively authored, rarely signed, and always changing. Therefore, author
and publication date are not required.

Citation:
Judo. (n.d.). Retrieved August 29, 2007, from Wikipedia: http://en.wikipedia.org/wiki/Judo
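Because a wiki page can change at any time, the retrieval date is the one element of this pattern that must be generated fresh each time the page is consulted. The short Python sketch below builds the reference from the entry title, wiki name, URL, and retrieval date. The function name and parameters are hypothetical and serve only to illustrate the pattern above.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def wiki_reference(entry, wiki, url, retrieved):
    """Format a wiki entry as: Title. (n.d.). Retrieved Month Day, Year, from Wiki: URL

    `retrieved` is a datetime.date giving the day the page was consulted.
    """
    stamp = f"{MONTHS[retrieved.month - 1]} {retrieved.day}, {retrieved.year}"
    return f"{entry}. (n.d.). Retrieved {stamp}, from {wiki}: {url}"


print(wiki_reference("Judo", "Wikipedia",
                     "http://en.wikipedia.org/wiki/Judo", date(2007, 8, 29)))
```

In practice `date.today()` could be passed as the last argument on the day the page is read.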

[j] Gray Literature

[j1] Annual Report

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title. Retrieved from
URL

Citation:
Honda Motor Co. (2004). Annual report. Retrieved from
http://world.honda.com/investors/annualreport/2004/07.html

[j2] Fact Sheet

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
[Fact sheet]. Retrieved from URL

Citation:
Boy Scouts of America. (n.d.). Merit badge program [Fact sheet]. Retrieved from
http://www.scouting.org/factsheets/02-500.html

[j3] Consumer Brochure

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
[Brochure]. Retrieved from URL

Citation:
First Five Oral Health. (2005). Healthy teeth begin at birth [Brochure]. Retrieved from
http://www.first5oralhealth.org/page.asp?page_id=439

[j4] Public Service Announcement

Format:
Author Last, First Initial or Corporate Author Name (Responsibility of Author).
(Year of Publication). Title [Format of announcement]. Retrieved from URL

Citation:
Lynch, D. (Director). (2007). Clean up New York [Video file]. Retrieved from
http://www.youtube.com/watch?v=ZSWv90msTUc

[j5] Presentation Slides

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title [Format of slides].
Retrieved from URL

Citation:
Rutter, R., & Boulton, M. (2007). Web typography sucks [PowerPoint slides]. Retrieved from
http://webtypography.net/sxsw2007/

[j6] Technical or Research Report

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title (Report No. if
available). Retrieved from URL

Citation:
Miller, D.C., Sen, A., & Malley, L.B. (2007). Comparative indicators of education in the United
States and other G8 countries (Report No. GPO ED003826P). Retrieved from
http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2007006

[j7] Press Release

Format:
Author Last, First Initial or Corporate Author Name. (Year, Month Day of Publication). Title [Press
release]. Retrieved from URL

Citation:
Department of Athletics, Trinity University. (2008, January 7). Trinity wins Pontiac game changing
performance of the year award [Press release]. Retrieved from
http://www.trinity.edu/departments/athletics/Football/Pontiac_GCPOY.htm

[j8] Policy Brief

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
(Policy No. if available). Retrieved from URL

Citation:
Organization for Economic Cooperation and Development. (2007). Climate
change: Meeting the challenge to 2050. Retrieved from
http://www.oecd.org/dataoecd/6/21/39762914.pdf

[j9] Educational Standards

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title. Retrieved from
URL

Citation:
Texas Education Agency. (1998). Texas essential knowledge and skills for social
studies, subchapter c., high school. Retrieved from
http://www.tea.state.tx.us/rules/tac/chapter113/ch113c.pdf

[j10] White Paper

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
[White paper]. Retrieved from URL

Citation:
OCLC Online Computer Library Center. (2002). OCLC white paper on the
information habits of college students [White paper]. Retrieved from
http://www5.oclc.org/downloads/community/informationhabits.pdf

[j11] Newsletter Article

Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title of
Article. Title of Newsletter, volume # (issue #). Retrieved from URL

Citation:
Seiden, P. (2006). Library survey evaluates service. @library.edu, the
Swarthmore College library newsletter, 8(2). Retrieved from
http://www.swarthmore.edu/Documents/library/spring06.pdf

[k] General Interest Media and Alternative Presses

[k1] Newspaper Article

Format:
Author Last, First Initial. (Year, Month Day of Publication). Article title. Newspaper Name.
Retrieved from URL

Citation:
Mapes, L.V. (2005, May 25). Unearthing Tse-whit-zen. Seattle Times. Retrieved from
http://seattletimes.nwsource.com

[k2] Television Feature, Podcast

Format:
Author Last, First Initial, & Author Last, First Initial (Author Responsibility). (Year of Pub). Title of
television feature [Motion picture]. In First Initial Last Name (Role of Presenter), Title of program.
Podcast retrieved from Name of host
site: URL

Citation:
Schultz, D. (Producer/Writer). (2007). Silence of the bees [Motion picture]. In Kaufman, F.
(Executive Producer), Nature. Podcast retrieved from PBS:
http://www.pbs.org/wnet/nature/rss/podcast.xml

[k3] Audio Podcast

Format:
Author Last, First Initial (Author Responsibility). (Year, Month Day of Publication).
Title of podcast [Podcast identification number if available]. Podcast series Name. Podcast
retrieved from URL

Citation:
Hinze, S. (Host). (2007, December 25). Robots! [Show 440]. Fanboy radio. Podcast retrieved
from http://media.libsyn.com/media/fanboyradio/fbr_440.mp3

[k4] Online Magazine content not found in print version

Format:
Author Last, First Initial. (Year of Pub). Title of article [Online exclusive]. Title of
Magazine. Retrieved Month Day, Year, from URL

Citation:
Millet, M. (2005). NextGen: Is this the ninth circle of hell? [Online exclusive]. Library Journal.
Retrieved December 7, 2007, from http://www.libraryjournal.com/article/CA509641.html

[l] Online Communities

[l1] Message posted to a newsgroup, an online forum, or a discussion group

Format:
Author Last, First Initial. (Year, Month Day of posting). Title of post [post number if available].
Message posted to URL

Citation:
Epstein, P. (2005, November 20). Dice manipulation. Message posted to
http://www.bkgm.com/rgb/rgb.cgi?menu

[l2] Message posted to an electronic mailing list

Format:
Author Last, First Initial. (Year, Month Day of Pub). Title of post [Msg. # if available]. Message
posted to Name of List, archived at URL

Note: Since messages in an e-mail list are posted through email, the URL should direct readers
to the web site or page where the messages have been archived.

Citation:
Bennick, T. (2007, December 28). Speedball press [Msg. #00189]. Message posted to Book Arts
Web electronic mailing list, archived at
http://palimpsest.stanford.edu/byform/mailing-lists/bookarts/#archive

[l3] Weblog Post

Format:
Author Last, First Initial. (Year, Month Day of Publication). Title. Message posted to URL

Citation:
Rush, W. (2007, July 12). Four stars! Message posted to
http://wilhelminarush.livejournal.com

[l4] Video Weblog Post

Format:
Author Last, First Initial. (Year, Month Day of Publication). Title [Format of post]. Video posted to
URL

Note: If the author’s name is not provided, the screen name of the posting author may be used
instead.

Citation:
Rjsivey. (2007, July 27). Narcoleptic Chihuahua [Video file]. Video posted to
http://www.youtube.com/watch?v=XyzeCiW-nn0

[m] Computer Programs, Software, and Programming Languages

[m1] Software downloaded from a Website

Format:
Author Last, First Initial. (Year of Publication). Name of product [format of product]. Available from
Name of source: URL

Citation:
Elizabeth Huth Coates Library, Trinity University. (2007). Coates library toolbar [Software].
Retrieved from Coates Library: http://lib.trinity.edu/toolbar/index.shtml

Conclusion

The Internet and other digital sources of information are widely used tools for research, but because
they are still relatively new, the various disciplines are still settling on the correct way to
document electronic sources, and their recommendations change frequently. The conventions are
still at a formative stage.

To ensure accuracy, it is always best to consult the style manual and/or accompanying website
for your discipline before consulting other sources. Another way to determine the style you
should use is to ask your research supervisor for guidelines or resources, or to locate the official
website for publications in your discipline and see whether it offers any guidelines or style manuals.
---------------------------------------------------------------------------------------------------------------------------------
References

01] American Psychological Association. Publication Manual of the American Psychological
Association. 6th ed. Washington DC: APA, 2009. Print.

02] http://www.dianahacker.com/resdoc/home.html

03] http://owl.english.purdue.edu/owl/section/2/

04] http://www.apastyle.org/apa-style-help.aspx

05] http://www.apastyle.org/learn/index.aspx

06] http://www.apastyle.org/products/index.aspx

07] http://owl.english.purdue.edu/owl/section/2/10/

08] http://www.apastyle.org/

09] http://www.lib.usm.edu/help/style_guides/apa.html

10] http://www.nlight.com/Success/Research/8cite.html

“The proper words in the proper places are the true definition of style.”
<< Jonathan Swift


Reflections on Academic Research

Appendix IV
Excel for Statistical Data Analysis

This is a webtext companion site of Business Statistics.

Excel is a widely used package that serves as a tool for understanding
statistical concepts and for checking your hand-worked calculations when
solving your homework problems. The site provides an introduction to the
basics of working with Excel. Redoing the illustrated numerical examples on
this site will improve your familiarity with Excel and, as a result, increase
the effectiveness and efficiency of your work in statistics.

Professor Hossein Arsham


MENU

1. Introduction
2. Entering Data
3. Descriptive Statistics
4. Normal Distribution
5. Confidence Interval for the Mean
6. Test of Hypothesis Concerning the Population Mean
7. Difference Between Mean of Two Populations
8. ANOVA: Analysis of Variances
9. Goodness-of-Fit Test for Discrete Random Variables
10. Test of Independence: Contingency Tables
11. Test Hypothesis Concerning the Variance of Two Populations
12. Linear Correlation and Regression Analysis
13. Moving Average and Exponential Smoothing
14. Applications and Numerical Examples
15. E-Labs to Fully Understand Statistical Concepts
16. Interesting and Useful Sites


Introduction

This site provides illustrative experience in the use of Excel for data summary,
presentation, and other basic statistical analysis. I believe the popular use of
Excel is in the areas where Excel really can excel: organizing data, i.e., basic
data management, tabulation, and graphics. For real statistical analysis one
must learn to use professional commercial statistical packages such as SAS
and SPSS.

Microsoft Excel 2000 (version 9) provides a set of data analysis tools called
the Analysis ToolPak which you can use to save steps when you develop
complex statistical analyses. You provide the data and parameters for each
analysis; the tool uses the appropriate statistical macro functions and then
displays the results in an output table. Some tools generate charts in addition to
output tables.

If the Data Analysis command is selectable on the Tools menu, then the
Analysis ToolPak is installed on your system. However, if the Data Analysis
command is not on the Tools menu, you need to install the Analysis ToolPak
by doing the following:

Step 1: On the Tools menu, click Add-Ins.... If Analysis ToolPak is not listed
in the Add-Ins dialog box, click Browse and locate the drive, folder name, and
file name for the Analysis ToolPak Add-in (Analys32.xll), usually located
in the Program Files\Microsoft Office\Office\Library\Analysis folder. Once
you find the file, select it and click OK.

Step 2: If you don't find the Analys32.xll file, then you must install it.

1. Insert your Microsoft Office 2000 Disk 1 into the CD-ROM drive.
2. Select Run from the Windows Start menu.
3. Browse and select the drive for your CD. Select Setup.exe, click
Open, and click OK.
4. Click the Add or Remove Features button.
5. Click the + next to Microsoft Excel for Windows.
6. Click the + next to Add-ins.
7. Click the down arrow next to Analysis ToolPak.
8. Select Run from My Computer.
9. Select the Update Now button.
10. Excel will now update your system to include Analysis ToolPak.
11. Launch Excel.
12. On the Tools menu, click Add-Ins... and select the Analysis
ToolPak check box.

Step 3: The Analysis ToolPak Add-In is now installed and Data Analysis... will
now be selectable on the Tools menu.

Microsoft Excel is a powerful spreadsheet package available for Microsoft
Windows and the Apple Macintosh. Spreadsheet software is used to store
information in columns and rows which can then be organized and/or
processed. Spreadsheets are designed to work well with numbers but often
include text. Excel organizes your work into workbooks; each workbook can
contain many worksheets; worksheets are used to list and analyze data.

Excel is available on all public-access PCs (e.g., those in the Library and
PC Labs). It can be opened either by selecting Start - Programs - Microsoft
Excel or by clicking on the Excel shortcut, which is either on your desktop or
on the Office toolbar.

Opening a Document:

• Click on File - Open (Ctrl+O) to open/retrieve an existing
workbook; change the directory area or drive to look for files in
other locations.
• To create a new workbook, click on File - New - Blank Document.

Saving and Closing a Document:

To save your document with its current filename, location and file format,
click on File - Save. If you are saving for the first time, click File - Save;
choose/type a name for your document; then click OK. Use File - Save As if
you want to save to a different filename/location.

When you have finished working on a document you should close it. Go to the
File menu and click on Close. If you have made any changes since the file was
last saved, you will be asked if you wish to save them.

The Excel screen

Workbooks and worksheets:

When you start Excel, a blank worksheet is displayed, which consists of a
grid of cells with numbered rows down the page and alphabetically titled
columns across the page. Each cell is referenced by its coordinates (e.g.,
A3 is used to refer to the cell in column A and row 3; B10:B20 is used to refer
to the range of cells in column B and rows 10 through 20).
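The coordinate notation described above can be made concrete with a small Python sketch that splits a reference such as A3 into its column and row, and lists the cells in a single-column range such as B10:B20. The function names are hypothetical, and the range helper deliberately handles only ranges within one column, which is the case the text describes.

```python
import re


def parse_cell(ref):
    """Split a reference such as 'A3' into (column letters, row number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    return match.group(1).upper(), int(match.group(2))


def expand_range(ref):
    """List the cells in a single-column range such as 'B10:B20'."""
    start, end = ref.split(":")
    col, first = parse_cell(start)
    end_col, last = parse_cell(end)
    if col != end_col:
        raise ValueError("this sketch handles single-column ranges only")
    return [f"{col}{row}" for row in range(first, last + 1)]


print(parse_cell("A3"))
print(expand_range("B10:B20"))
```

Note that B10:B20 covers eleven cells, since both end points are included.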

Your work is stored in an Excel file called a workbook. Each workbook may
contain several worksheets and/or charts; the current worksheet is called the
active sheet. To view a different worksheet in a workbook, click the appropriate
Sheet Tab.

You can access and execute commands directly from the main menu or you can
point to one of the toolbar buttons (the display box that appears below the
button, when you place the cursor over it, indicates the name/action of the
button) and click once.

Moving Around the Worksheet:

It is important to be able to move around the worksheet effectively because you
can only enter or change data at the position of the cursor. You can move the
cursor by using the arrow keys or by moving the mouse to the required cell and
clicking. Once selected, the cell becomes the active cell and is identified by a
thick border; only one cell can be active at a time.

To move from one worksheet to another click the sheet tabs. (If your workbook
contains many sheets, right-click the tab scrolling buttons then click the sheet
you want.) The name of the active sheet is shown in bold.

Moving Between Cells:

Here are some keyboard shortcuts to move the active cell:

• Home - moves to the first column in the current row
• Ctrl+Home - moves to the top left corner of the document
• End then Home - moves to the last cell in the document

To move between cells on a worksheet, click any cell or use the arrow keys. To
see a different area of the sheet, use the scroll bars and click on the arrows or
the area above/below the scroll box in either the vertical or horizontal scroll
bars.

Note that the size of a scroll box indicates the proportional amount of the used
area of the sheet that is visible in the window. The position of a scroll box
indicates the relative location of the visible area within the worksheet.

Entering Data

A new worksheet is a grid of rows and columns. The rows are labeled with
numbers, and the columns are labeled with letters. Each intersection of a row
and a column is a cell. Each cell has an address, which is the column letter and
the row number. The arrow on the worksheet to the right points to cell A1,
which is currently highlighted, indicating that it is an active cell. A cell must
be active to enter information into it. To highlight (select) a cell, click on it.

To select more than one cell:

• Click on a cell (e.g. A1), then hold the shift key while you click
on another (e.g. D4) to select all cells between and including A1
and D4.
• Click on a cell (e.g. A1) and drag the mouse across the desired
range, releasing the mouse button on another cell (e.g. D4) to select
all cells between and including A1 and D4.
• To select several cells which are not adjacent, press "control" and
click on the cells you want to select. Click a number or letter
labeling a row or column to select that entire row or column.

One worksheet can have up to 256 columns and 65,536 rows, so it'll be a while
before you run out of space.
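To see what these limits mean in terms of column labels, the Python sketch below converts a column number to its letter label in the usual spreadsheet scheme (A to Z, then AA, AB, and so on). The function name is hypothetical; the limits of 256 columns and 65,536 rows are the ones quoted above for this version of Excel.

```python
def column_letters(index):
    """Convert a 1-based column number to its letter label (1 -> 'A', 27 -> 'AA')."""
    if index < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while index > 0:
        # Shift to 0-based before dividing, because there is no "zero" letter.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


MAX_COLUMNS, MAX_ROWS = 256, 65536   # limits quoted in the text
print(column_letters(MAX_COLUMNS))   # label of the last column
print(MAX_COLUMNS * MAX_ROWS)        # total number of cells on one worksheet
```

Under these limits the last column is labeled IV and a single worksheet holds 16,777,216 cells.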

Each cell can contain a label, value, logical value, or formula.

• Labels can contain any combination of letters, numbers, or symbols.
• Values are numbers. Only values (numbers) can be used in
calculations. A value can also be a date or a time.
• Logical values are "true" or "false."
• Formulas automatically do calculations on the values in other
specified cells and display the result in the cell in which the
formula is entered (for example, you can specify that cell D3 is to
contain the sum of the numbers in B3 and C3; the number
displayed in D3 will then be a function of the numbers entered into
B3 and C3).
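The difference between a stored value and a formula can be illustrated outside Excel with a few lines of Python. In this sketch, which only imitates the idea and is not how Excel is implemented, B3 and C3 hold made-up numbers and D3 holds a rule that is re-evaluated whenever it is displayed. The dictionary names and the `displayed` helper are hypothetical.

```python
# Values typed into the sheet (made-up numbers for illustration).
cells = {"B3": 12.0, "C3": 30.5}

# D3 holds a formula, not a value: the sum of B3 and C3.
formulas = {"D3": lambda c: c["B3"] + c["C3"]}


def displayed(ref):
    """Return what the cell would display: a formula result or a stored value."""
    if ref in formulas:
        return formulas[ref](cells)
    return cells[ref]


print(displayed("D3"))   # sum of the current B3 and C3
cells["B3"] = 20.0       # change an input ...
print(displayed("D3"))   # ... and the displayed result follows
```

The point of the sketch is that D3 never stores a number of its own; what it shows always depends on the current contents of B3 and C3.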

To enter information into a cell, select the cell and begin typing.

Note that as you type information into the cell, the information you enter also
displays in the formula bar. You can also enter information into the formula
bar, and the information will appear in the selected cell.

When you have finished entering the label or value:

• Press "Enter" to move to the next cell below (in this case, A2)
• Press "Tab" to move to the next cell to the right (in this case, B1)
• Click in any cell to select it

Entering Labels

Unless the information you enter is formatted as a value or a formula, Excel
will interpret it as a label and, by default, align the text on the left side of
the cell.

If you are creating a long worksheet and you will be repeating the same label
information in many different cells, you can use the AutoComplete function.
This function will look at other entries in the same column and attempt to
match a previous entry with your current entry. For example, if you have
already typed "Wesleyan" in another cell and you type "W" in a new cell, Excel
will automatically enter "Wesleyan." If you intended to type "Wesleyan" into
the cell, your task is done, and you can move on to the next cell. If you
intended to type something else, e.g. "Williams," into the cell, just continue
typing to enter the term.

To turn on the AutoComplete function, click on "Tools" in the menu bar, then
select "Options," then select "Edit," and click to put a check in the box beside
"Enable AutoComplete for cell values."

Another way to quickly enter repeated labels is to use the Pick List feature.
Right click on a cell, then select "Pick From List." This will give you a menu of
all other entries in cells in that column. Click on an item in the menu to enter it
into the currently selected cell.

Entering Values

A value is a number, date, or time, plus a few symbols if necessary to further
define the numbers [such as: . , + - ( ) % $ / ].

Numbers are assumed to be positive; to enter a negative number, use a minus
sign "-" or enclose the number in parentheses "()".

Dates are stored as MM/DD/YYYY, but you do not have to enter them precisely in
that format. If you enter "jan 9" or "jan-9", Excel will recognize it as January 9
of the current year, and store it with that year (for example, as 1/9/2002 if the
current year is 2002). Enter the four-digit year for a year other than the current
year (e.g. "jan 9, 1999"). To enter the current day's date, press "control" and ";"
at the same time.

Times default to a 24 hour clock. Use "a" or "p" to indicate "am" or "pm" if
you use a 12 hour clock (e.g. "8:30 p" is interpreted as 8:30 PM). To enter the
current time, press "control" and ":" (shift-semicolon) at the same time.

An entry interpreted as a value (number, date, or time) is aligned to the right
side of the cell.

To Apply Colors to Maximum and/or Minimum Values

To apply colors to the maximum and/or minimum values in a column (column F in
this example):

1. Select a cell in the region, and press Ctrl+Shift+* (in Excel 2003,
press this or Ctrl+A) to select the Current Region.
2. From the Format menu, select Conditional Formatting.
3. In Condition 1, select Formula Is, and type =MAX($F:$F)=$F1.
4. Click Format, select the Font tab, select a color, and then click
OK.
5. In Condition 2, select Formula Is, and type =MIN($F:$F)=$F1.
6. Repeat step 4, select a different color than you selected for
Condition 1, and then click OK.

Note: Be sure to distinguish between absolute reference and relative reference
when entering the formulas.

Rounding Numbers that Meet Specified Criteria

Problem: Rounding all the numbers in column A to zero decimal places,
except for those that have "5" in the first decimal place.

Solution: Use the IF, MOD, and ROUND functions in the following formula:
=IF(MOD(A2,1)=0.5,A2,ROUND(A2,0))
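
For readers who want to check the logic outside Excel, the same rule can be sketched in Python. This is only an illustration: the function name round_unless_half and the sample values are assumptions, not part of the original worksheet.

```python
def round_unless_half(x):
    """Sketch of =IF(MOD(A2,1)=0.5, A2, ROUND(A2,0)) for one number."""
    # Like Excel's MOD, Python's % takes the sign of the divisor,
    # so -1.5 % 1 is 0.5 and negative halves are also left alone.
    if x % 1 == 0.5:
        return x
    # Exact halves never reach this line, so Python's round-half-to-even
    # rule cannot disagree with Excel's ROUND here.
    return float(round(x))


column_a = [2.5, 2.4, 2.6, 3.51, -1.5, -1.2]  # hypothetical values
rounded = [round_unless_half(v) for v in column_a]
print(rounded)
```

Note that, like the Excel formula, the sketch only leaves a number alone when its decimal part is exactly 0.5; a value such as 3.51 is still rounded.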

To Copy and Paste All Cells in a Sheet

1. Select the cells in the sheet by pressing Ctrl+A (in Excel 2003,
select a cell in a blank area before pressing Ctrl+A, or from a
selected cell in a Current Region/List range, press Ctrl+A+A).
OR
Click Select All at the top-left intersection of rows and columns.
2. Press Ctrl+C.
3. Press Ctrl+Page Down to select another sheet, then select cell A1.
4. Press Enter.

To Copy the Entire Sheet

Copying the entire sheet means copying the cells, the page setup parameters,
and the defined range Names.

Option 1:

1. Move the mouse pointer to a sheet tab.
2. Hold down Ctrl and drag the sheet tab to a different
location.
3. Release the mouse button and the Ctrl key.

Option 2:

1. Right-click the appropriate sheet tab.
2. From the shortcut menu, select Move or Copy. The Move or Copy
dialog box enables you to copy the sheet either to a different
location in the current workbook or to a different workbook. Be
sure to mark the Create a copy checkbox.

Option 3:

1. From the Window menu, select Arrange.
2. Select Tiled to tile all open workbooks in the window.
3. Use Option 1 (dragging the sheet while pressing Ctrl) to copy or
move a sheet.

Sorting by Columns

The default setting for sorting in Ascending or Descending order is by row. To
sort by columns:

1. From the Data menu, select Sort, and then Options.
2. Select the Sort left to right option button and click OK.
3. In the Sort by option of the Sort dialog box, select the row number
by which the columns will be sorted and click OK.

Descriptive Statistics

The Data Analysis ToolPak has a Descriptive Statistics tool that provides you
with an easy way to calculate summary statistics for a set of sample data.
Summary statistics include Mean, Standard Error, Median, Mode, Standard
Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum,
and Count. This tool eliminates the need to type individual functions to find
each of these results.

Excel includes elaborate and customisable toolbars, for
example the "standard" toolbar shown here:

Some of the icons are useful for mathematical computation:

 The "Autosum" icon enters the formula "=sum()" to add up a range of cells.
 The "FunctionWizard" icon gives you access to all the functions available.
 The "GraphWizard" icon gives access to all graph types available, as shown in this display:

Excel can be used to generate measures of location and variability for a
variable. Suppose we wish to find descriptive statistics for the sample data 2, 4,
6, and 8.

Step 1. Select the Tools pull-down menu. If you see Data Analysis, click on
this option; otherwise, click on the Add-Ins option to install the Analysis ToolPak.

Step 2. Click on the data analysis option.

Step 3. Choose Descriptive Statistics from Analysis Tools list.

Step 4. When the dialog box appears:

Enter A1:A4 in the Input Range box. A1 is the cell in column A and row 1, which
in this case holds the value 2; the other values are entered the same way in
A2, A3, and A4. If a sample consists of 20 numbers entered in cells A1 to A20,
enter A1:A20 as the input range.

Step 5. Select an output range, in this case B1. Click on summary statistics to
see the results.

Select OK.

When you click OK, you will see the result in the selected range.

As you will see, the mean of the sample is 5, the median is 5, the standard
deviation is 2.581989, the sample variance is 6.666667, the range is 6, and so on.
Each of these quantities might be important in the calculations required by
different statistical procedures.
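
As a cross-check on the Excel output, the same summary figures can be sketched with Python's standard statistics module. This is a minimal illustration for the sample 2, 4, 6, 8; the dictionary name summary is an assumption chosen for this example.

```python
import statistics as st
from math import sqrt

sample = [2, 4, 6, 8]
n = len(sample)

# Sample (n - 1) versions of the variance and standard deviation,
# matching what the Descriptive Statistics tool reports.
summary = {
    "Mean": st.mean(sample),
    "Standard Error": st.stdev(sample) / sqrt(n),
    "Median": st.median(sample),
    "Standard Deviation": st.stdev(sample),
    "Sample Variance": st.variance(sample),
    "Range": max(sample) - min(sample),
    "Minimum": min(sample),
    "Maximum": max(sample),
    "Sum": sum(sample),
    "Count": n,
}

for name, value in summary.items():
    print(f"{name:<20}{value:.6f}")
```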

Normal Distribution

Consider the problem of finding the probability of getting less than a certain
value under any normal probability distribution. As an illustrative example, let
us suppose the SAT scores nationwide are normally distributed with a mean
and standard deviation of 500 and 100, respectively. Answer the following
questions based on the given information:

A: What is the probability that a randomly selected student's score will be less
than 600 points?
B: What is the probability that a randomly selected student's score will exceed
600 points?
C: What is the probability that a randomly selected student's score will be
between 400 and 600?

Hint: Using Excel you can find the probability of getting a value
less than or equal to a given value. In a problem where the mean and the
standard deviation of the population are given, you have to use common sense
to find the other probabilities the question asks for, since you know the area
under a normal curve is 1.

Solution:

Step 1. In the worksheet, select the cell where you want the answer to appear.
Suppose you chose cell A1.

Steps 2-3. From the menus, select Insert, then click on the Function option.

Step 4. After clicking on the Function option, the Paste Function dialog appears.
From Function Category, choose Statistical, then choose NORMDIST from
the Function Name box. Click OK.

Step 5. After clicking on OK, the NORMDIST distribution box appears:


i. Enter 600 in X (the value box);
ii. Enter 500 in the Mean box;
iii. Enter 100 in the Standard deviation box;
iv. Type "true" in the cumulative box, then click OK.

As you see, the value 0.84134474 appears in A1, indicating the probability that
a randomly selected student's score is below 600 points. Using common sense
we can answer part "b" by subtracting 0.84134474 from 1. So the part "b"
answer is 1 - 0.84134474 or 0.158655. This is the probability that a randomly
selected student's score is greater than 600 points. To answer part "c", use the
same technique to find the probabilities, or areas, to the left of the values 600
and 400. Since these areas overlap, to answer the
question you should subtract the smaller probability from the larger probability.
The answer equals 0.84134474 - 0.15865526, that is, 0.68269. The screen shot
should look like the following:
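
As a cross-check outside Excel, the three answers can be reproduced with the NormalDist class from Python's standard library, whose cdf method plays the role of NORMDIST with the cumulative box set to "true". The variable names below are assumptions made for this sketch.

```python
from statistics import NormalDist

# SAT scores: mean 500, standard deviation 100 (from the example).
sat = NormalDist(mu=500, sigma=100)

p_below_600 = sat.cdf(600)                # part A
p_above_600 = 1 - p_below_600             # part B
p_between = sat.cdf(600) - sat.cdf(400)   # part C

print(f"A: {p_below_600:.8f}")
print(f"B: {p_above_600:.8f}")
print(f"C: {p_between:.5f}")
```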

Inverse Case

Calculating the value of a random variable often called the "x" value

You can use NORMINV from the function box to calculate a value for the
random variable if the probability to the left side of this value is given.
In practice, you use this function to calculate different percentiles. In this
problem one could ask: what is the score of a student whose percentile is 90?
This means approximately 90% of students' scores are less than this number.
If we were asked to do this problem by hand, we would have had
to calculate the x value using the normal distribution formula x = μ + zσ. Now
let's use Excel to calculate P90. In the Paste Function dialog, click on Statistical,
then click on NORMINV. The screen shot would look like the following:

When you select NORMINV, the dialog box appears.

i. Enter 0.90 for the probability (this means that approximately 90% of students'
scores are less than the value we are looking for)
ii. Enter 500 for the mean (this is the mean of the normal distribution in our
case)
iii. Enter 100 for the standard deviation (this is the standard deviation of the
normal distribution in our case)

At the end of this screen you will see the formula result which is approximately
628 points. This means the top 10% of the students scored better than 628.
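
The 90th percentile can also be sketched in Python, both with the inverse cumulative function (the counterpart of NORMINV) and by hand with x = mean + z times the standard deviation. The names p90 and p90_by_hand are assumptions for this illustration.

```python
from statistics import NormalDist

mean_score, sd_score = 500, 100

# Direct route: inverse of the cumulative normal distribution.
p90 = NormalDist(mu=mean_score, sigma=sd_score).inv_cdf(0.90)

# By-hand route: look up z for 0.90 on the standard normal curve,
# then rescale it to the SAT distribution.
z = NormalDist().inv_cdf(0.90)
p90_by_hand = mean_score + z * sd_score

print(round(p90, 2), round(p90_by_hand, 2))
```

Both routes give a score of roughly 628, in line with the Excel result.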

Confidence Interval for the Mean

Suppose we wish to estimate a confidence interval for the mean of a
population. Depending on the size of your sample, you may use one of the
following cases:

Large Sample Size (n is larger than, say 30):

The general formula for developing a confidence interval for a population
mean is:

x̄ ± Z × S/√n

In this formula x̄ is the mean of the sample; Z is the interval coefficient, which
can be found from the normal distribution table (for example the interval
coefficient for a 95% confidence level is 1.96). S is the standard deviation of
the sample and n is the sample size.

Now we would like to show how Excel is used to develop a confidence
interval for a population mean based on sample information. As you see, in
order to evaluate this formula you need the mean of the sample, x̄, and the
margin of error, Z × S/√n. Excel will automatically calculate these
quantities for you.

The only things you have to do are:

add the margin of error to the mean of the sample to find the
upper limit of the interval, and subtract the margin of error from the mean to
find the lower limit of the interval. To demonstrate how Excel finds these
quantities we will use a data set which contains the hourly incomes of 36
work-study students here at the University of Baltimore. These numbers appear
in cells A1 to A36 on an Excel worksheet.

After entering the data, we followed the descriptive statistics procedure to
calculate the unknown quantities. The only additional step is to click on the
confidence interval in the descriptive statistics dialog box and enter the given
confidence level, in this case 95%.

Here are the above procedures, step by step:

Step 1. Enter data in cells A1 to A36 (on the spreadsheet)
Step 2. From the menus select Tools
Step 3. Click on Data Analysis then choose the Descriptive Statistics option
then click OK.

On the descriptive statistics dialog, click on Summary Statistics. After you have
done that, click on the confidence level box and type 95%, or in other
problems whatever confidence level you desire. In the Output Range box
enter B1 or whatever location you desire.
Now click on OK. The screen shot would look like the following:

As you see, the spreadsheet shows that the mean of the sample is x̄ =
6.902777778 and the absolute value of the margin of error is
0.231678109. This mean is based on this sample information. A 95%
confidence interval for the hourly income of the UB work-study students has an
upper limit of 6.902777778 + 0.231678109 and a lower limit of 6.902777778 -
0.231678109.

In other words, of all the intervals formed this way, 95%
contain the mean of the population. Or, for practical purposes, we can be 95%
confident that the mean of the population is between 6.902777778 -
0.231678109 and 6.902777778 + 0.231678109. We can be at least 95%
confident that the interval [$6.67 and $7.13] contains the average hourly income of
a work-study student.
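
The arithmetic of the interval can be sketched in Python as follows. The first part only adds and subtracts the two numbers reported above. The function z_interval is a hypothetical helper that applies the large-sample Z formula to raw data, and the wage list passed to it is made up purely to show the call, since the original 36 observations are not listed in the text.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_interval(sample, confidence=0.95):
    """Large-sample interval: mean +/- Z * S / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * stdev(sample) / sqrt(len(sample))
    return mean(sample) - margin, mean(sample) + margin


# Limits from the two numbers reported in the Excel output above.
sample_mean = 6.902777778
margin_of_error = 0.231678109
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error
print(f"${lower:.2f} to ${upper:.2f}")

# Hypothetical wages, only to show how z_interval is called.
hypothetical_wages = [6.0, 6.5, 7.0, 7.5, 8.0] * 8
print(z_interval(hypothetical_wages))
```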

Small Sample Size (say, less than 30):

If the sample size n is less than 30, we must
use the small-sample procedure to develop a confidence interval for the mean
of a population. The general formula for developing confidence intervals for
the population mean based on a small sample is:

x̄ ± t(α/2) × S/√n

In this formula x̄ is the mean of the sample. t(α/2) is the interval coefficient
providing an area of α/2 in the upper tail of a t distribution with n-1 degrees of
freedom, which can be found from a t distribution table (for example the
interval coefficient for a 90% confidence level is 1.833 if the sample size is 10). S is
the standard deviation of the sample and n is the sample size.

Now you would like to see how Excel is used to develop a confidence
interval for a population mean based on this small sample information.

As you see, to evaluate this formula you need the mean of the sample, x̄, and
the margin of error, t(α/2) × S/√n. Excel will automatically calculate these
quantities the way it did for large samples.

Again, the only things you have to do are: add the margin of
error to the mean of the sample to find the upper limit of the
interval, and subtract the margin of error from the mean to find the lower
limit of the interval.

To demonstrate how Excel finds these quantities we will use the data set, which
contains the hourly incomes of 10 work-study students here, at the University
of Baltimore. These numbers appear in cells A1 to A10 on an Excel work sheet.

After entering the data we follow the descriptive statistics procedure to calculate
the unknown quantities (exactly the way we found the quantities for the large sample).
Here are the procedures in step-by-step form:

Step 1. Enter data in cells A1 to A10 on the spreadsheet
Step 2. From the menus select Tools
Step 3. Click on Data Analysis then choose the Descriptive Statistics option.
Click OK. On the descriptive statistics dialog, click on Summary Statistics, click
on the confidence level box and type in 90%, or in other problems
whichever confidence level you desire. In the Output Range box, enter B1 or
whatever location you desire. Now click on OK. The screen shot will look like
the following:

Now, as in the calculation of the confidence interval for the large sample,
calculate the confidence interval for the population mean based on this small
sample information. The confidence interval is:

6.8 ± 0.414426102
or
$6.39 to $7.21.

We can be at least 90% confident that the interval [$6.39 and $7.21] contains
the true mean of the population.

Test of Hypothesis Concerning the Population Mean

Again, we must distinguish two cases with respect to the size of your sample.

Large Sample Size (say, over 30):

In this section you wish to know how Excel
can be used to conduct a hypothesis test about a population mean. We will use
the hourly incomes of different work-study students than those introduced
earlier in the confidence interval section. Data are entered in cells A1 to A36.
The objective is to test the following null and alternative hypotheses:

H0: μ = 7
HA: μ ≠ 7

The null hypothesis indicates that the average hourly income of a work-study
student is equal to $7 per hour; however, the alternative hypothesis indicates
that the average hourly income is not equal to $7 per hour.

I will repeat the steps taken in descriptive statistics and at the very end will
show how to find the value of the test statistic, in this case z, using a cell
formula.

Step 1. Enter data in cells A1 to A36 (on the spreadsheet)

Step 2. From the menus select Tools

Step 3. Click on Data Analysis then choose the Descriptive Statistics option,
click OK.
On the descriptive statistics dialog, click on Summary Statistics. Select
the Output Range box, enter B1 or whichever location you desire. Now
click OK.

(To calculate the value of the test statistic, look for the mean of the sample
and then the standard error. In this output, these values are in cells C3 and C4.)

Step 4. Select cell D1 and enter the cell formula =(C3 - 7)/C4. The screen shot
should look like the following:

The value in cell D1 is the value of the test statistic. Since this value falls in
the acceptance range of -1.96 to 1.96 (from the normal distribution table), we fail
to reject the null hypothesis.
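
The cell formula in Step 4 divides the gap between the sample mean and the hypothesized mean by the standard error. A minimal Python sketch of that arithmetic follows; the function names standardized_statistic and decide, and the wage data, are assumptions made for illustration and are not the data used in the worksheet.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_statistic(sample, hypothesized_mean):
    """Same arithmetic as the cell formula =(C3 - 7)/C4."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - hypothesized_mean) / standard_error


def decide(statistic, critical_value):
    """Two-tail rule: reject only outside +/- critical_value."""
    if abs(statistic) > critical_value:
        return "reject H0"
    return "fail to reject H0"


hypothetical_wages = [6.0, 6.5, 7.0, 7.5, 8.0] * 8  # made-up sample, n = 40
z = standardized_statistic(hypothetical_wages, 7)
print(z, decide(z, 1.96))
```

The same two functions serve the small-sample case; only the critical value changes, since it then comes from the t table instead of the normal table.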

Small Sample Size (say, less than 30):

Using the steps taken in the large sample size case, Excel can be used to conduct a
hypothesis test for the small-sample case. Let's use the hourly incomes of 10 work-study
students at UB to test the following hypotheses:

H0: μ = 7
HA: μ ≠ 7

The null hypothesis indicates that the average hourly income of a work-study
student is equal to $7 per hour. The alternative hypothesis indicates that
the average hourly income is not equal to $7 per hour.

I will repeat the steps taken in descriptive statistics and at the very end will
show how to find the value of the test statistic, in this case "t", using a cell
formula.

Step 1. Enter data in cells A1 to A10 (on the spreadsheet)
Step 2. From the menus select Tools
Step 3. Click on Data Analysis then choose the Descriptive Statistics option.
Click OK.
On the descriptive statistics dialog, click on Summary Statistics. Select the
Output Range box, enter B1 or whatever location you choose. Again, click
on OK.
(To calculate the value of the test statistic, look for the mean of the sample
and then the standard error. In this output these values are in cells C3 and C4.)

Step 4. Select cell D1 and enter the cell formula =(C3 - 7)/C4. The screen shot
would look like the following:

Since the value of the test statistic t = -0.66896 falls in the acceptance range -2.262 to
+2.262 (from the t table, where α/2 = 0.025 and the degrees of freedom is 9), we
fail to reject the null hypothesis.

Difference Between Mean of Two Populations

In this section we will show how Excel is used to conduct a hypothesis test
about the difference between two population means, assuming that the populations
have equal variances. The data in this case are taken from various offices here
at the University of Baltimore. I collected the hourly income data of 36
randomly selected work-study students and 36 student assistants. The hourly
income range for work-study students was $6 - $8 while the hourly income
range for student assistants was $6 - $9. The main objective in this hypothesis
testing is to see whether there is a significant difference between the means of
the two populations. The NULL and the ALTERNATIVE hypotheses are that
the means are equal and that the means are not equal, respectively.

Referring to the spreadsheet, I chose A1 and B1 as label cells. The work-study
students' hourly incomes for a sample size of 36 are shown in cells A2:A37,
and the student assistants' hourly incomes for a sample size of 36 are shown in
cells B2:B37.

Data for Work Study Student: 6, 6, 6, 6, 6, 6, 6, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 7,
7, 7, 7, 7, 7, 7, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 8, 8, 8, 8, 8, 8, 8, 8, 8.

Data for Student Assistant: 6, 6, 6, 6, 6, 6.5, 6.5, 6.5, 6.5, 6.5, 7, 7, 7, 7, 7,
7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 8, 8, 8, 8, 8, 8, 8, 8.5, 8.5, 8.5, 8.5, 8.5, 9, 9, 9, 9.

Use the Descriptive Statistics procedure to calculate the variances of the two
samples. The Excel procedure for testing the difference between the two
population means will require information on the variances of the two
populations. Since the variances of the two populations are unknown, they
should be replaced with the sample variances. The descriptive statistics for both
samples show that the variance of the first sample is s1² = 0.55546218, while the
variance of the second sample is s2² = 0.969748.

                      work-study student    student assistant

Mean                  7.05714286            7.471429
Standard Error        0.12597757            0.166454
Median                7                     7.5
Mode                  8                     8
Standard Deviation    0.74529335            0.984758
Sample Variance       0.55546218            0.969748
Kurtosis              -1.38870558           -1.192825
Skewness              -0.09374375           -0.013819
Range                 2                     3
Minimum               6                     6
Maximum               8                     9
Sum                   247                   261.5
Count                 35                    35

To conduct the desired hypothesis test with Excel the following steps can be
taken:

Step 1. From the menus select Tools then click on the Data Analysis option.

Step 2. When the Data Analysis dialog box appears:

Choose z-Test: Two Sample for Means then click OK

Step 3. When the z-Test: Two Sample for Means dialog box appears:

 Enter A1:A36 in the variable 1 range box (work-study students' hourly
income)
 Enter B1:B36 in the variable 2 range box (student assistants' hourly
income)
 Enter 0 in the Hypothesis Mean Difference box (if you desire to test a mean
difference other than 0, enter that value)
 Enter the variance of the first sample in the Variable 1 Variance box
 Enter the variance of the second sample in the Variable 2 Variance box and
select Labels
 Enter 0.05, or whatever level of significance you desire, in the Alpha box
 Select a suitable Output Range for the results (I chose C19), then click OK.

The value of the test statistic z=-1.9845824 appears, in our case, in cell D24. The
rejection rule for this test is z < -1.96 or z > 1.96 from the normal distribution
table. In the Excel output these values for a two-tail test are z<-1.959961082
and z>+1.959961082. Since the value of the test statistic z=-1.9845824 is less
than -1.959961082, we reject the null hypothesis. We can also draw this
conclusion by comparing the p-value for a two-tail test and the alpha value.

Since the p-value 0.047190813 is less than α=0.05, we reject the null hypothesis.
Overall we can say, based on the sample results, that the two populations' means are
different.
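
The z statistic and its two-tail p-value can be re-derived from the summary figures alone. The sketch below, in Python, takes the means, sample variances, and Count of 35 from the descriptive statistics table in this section; the variable names are assumptions, and small differences in the last decimals are expected because the table values are rounded.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures from the descriptive statistics table.
mean1, var1, n1 = 7.05714286, 0.55546218, 35   # work-study students
mean2, var2, n2 = 7.471429, 0.969748, 35       # student assistants
hypothesized_difference = 0
alpha = 0.05

# z = (difference in sample means - hypothesized difference) / standard error
standard_error = sqrt(var1 / n1 + var2 / n2)
z = (mean1 - mean2 - hypothesized_difference) / standard_error

p_two_tail = 2 * NormalDist().cdf(-abs(z))
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

print(f"z = {z:.6f}, critical = {z_critical:.6f}, p = {p_two_tail:.6f}")
print("reject H0" if abs(z) > z_critical else "fail to reject H0")
```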

Small Samples: n1 OR n2 are less than 30

In this section we will show how Excel is used to conduct a hypothesis test
about the difference between two population means, given that the
populations have equal variances, when two small independent samples are
taken from both populations. Similar to the above case, the data in this case are
taken from various offices here at the University of Baltimore. I collected
hourly income data of 11 randomly selected work-study students and 11
randomly selected student assistants. The hourly income ranges for the two groups
were similar, $6 - $8 and $6 - $9. The main objective in this hypothesis
testing is similar too: to see whether there is a significant difference between
the means of the two populations. The NULL and the ALTERNATIVE
hypotheses are that the means are equal and that they are not equal, respectively.

work-study student    student assistant

6                     6
8                     9
7.5                   8.5
6.5                   7
7                     6.5
6                     7
7.5                   7.5
8                     6
6                     8
6.5                   9
7                     7.5

Referring to the spreadsheet, we chose A1 and B1 as label cells. The work-study
students' hourly incomes for a sample size of 11 are shown in cells A2:A12,
and the student assistants' hourly incomes for a sample size of 11 are shown in
cells B2:B12. Unlike the previous case, you do not have to calculate the variances
of the two samples; Excel will automatically calculate these quantities and use
them in the calculation of the value of the test statistic.

The steps are similar to the previous case, with a small difference in step 2. To
conduct the desired hypothesis test with Excel the following steps can be taken:

Step 1. From the menus select Tools then click on the Data Analysis option.

Step 2. When the Data Analysis dialog box appears:

Choose t-Test: Two Sample Assuming Equal Variances then click OK

Step 3. When the t-Test: Two Sample Assuming Equal Variances dialog box
appears:

 Enter A1:A12 in the variable 1 range box (work-study student hourly income)
 Enter B1:B12 in the variable 2 range box (student assistant hourly income)
 Enter 0 in the Hypothesis Mean Difference box (if you desire to test a mean
difference other than zero, enter that value) then select Labels
 Enter 0.05, or whatever level of significance you desire, in the Alpha box
 Select a suitable Output Range for the results (I chose C1), then click OK.

The value of the test statistic t=-1.362229828 appears, in our case, in cell D10.
The rejection rule for this test is t<-2.086 or t>+2.086 from the t distribution
table, where the t value is based on a t distribution with n1+n2-2 degrees of
freedom and where the area of the upper one tail is 0.025 (that is, equal to
alpha/2).

In the Excel output the values for a two-tail test are t<-2.085962478 and
t>+2.085962478. Since the value of the test statistic t=-1.362229828 is in the
acceptance range between -2.085962478 and +2.085962478, we fail to reject the
null hypothesis.

We can also draw this conclusion by comparing the p-value for a two-tail test
and the alpha value.

Since the p-value 0.188271278 is greater than α=0.05, again we fail to reject
the null hypothesis.

Overall we can say, based on sample results, that there is not sufficient evidence
that the two populations' means differ.

                                work-study student    student assistant

Mean                            6.909090909           7.454545455
Variance                        0.590909091           1.172727273
Observations                    11                    11
Pooled Variance                 0.881818182
Hypothesized Mean Difference    0
Df                              20
t Stat                          -1.362229828
P(T<=t) one tail                0.094135639
t Critical one tail             1.724718004
P(T<=t) two tail                0.188271278
t Critical two tail             2.085962478
ANOVA: Analysis of Variances

In this section the objective is to see whether or not the means of three or more
populations are equal, based on random samples taken from those populations.
Assuming independent samples are taken from normally distributed
populations with equal variances, Excel will do this analysis if you choose
Anova: Single Factor from the menus. We can also choose the Anova: Two-Factor
With or Without Replication option and see whether there is a significant
difference between means when different factors are involved.

Single-Factor ANOVA Test

In this case we were interested to see whether there is a significant difference
among the hourly wages of student assistants in three different service departments
here at the University of Baltimore. Six student assistants were randomly
selected from the three departments and their hourly wages were recorded as
follows:

ARC      CSI     TCC

10.00    6.50    9.00
8.00     7.00    7.00
7.50     7.00    7.00
8.00     7.50    7.00
7.00     7.00    6.50

Enter data in an Excel worksheet starting with cell A2 and ending with cell C8.
The following steps should be taken to find the proper output for interpretation.

Step 1. From the menus select Tools and click on the Data Analysis option.
Step 2. When the data analysis dialog appears, choose the Anova: Single Factor
option; enter A2:C8 in the input range box. Select labels in first row.
Step 3. Select any cell as output (here we selected A11). Click OK.

The general form of the Anova table looks like the following:

Source of Variation    SS      df      MS      F           P-value     F crit

Between Groups         SSTR    K-1     MSTR    MSTR/MSE    0.046725    3.682316674
Within Groups          SSE     nt-K    MSE
Total

Suppose the test is done at a level of significance α = 0.05. Since the p-value
0.046725 is less than 0.05, we reject the null
hypothesis. This means there is a significant difference between the means of
hourly incomes of student assistants in these departments.

The Two-way ANOVA Without Replication

In this section, the study involves six students who were offered different
hourly wages in three different department services here at the University of
Baltimore. The objective is to see whether the hourly incomes are the same.
Therefore, we can consider the following:

Factor: Department

Treatment: Hourly payments in the three departments

Blocks: Each student is a block since each student has worked in the three
different departments

Student    ARC      CSI     TCC

1          10.00    7.50    7.00
2          8.00     7.00    6.00
3          7.00     6.00    6.00
4          8.00     6.50    6.50
5          9.00     8.00    7.00
6          8.00     8.00    6.00

The general form of the Anova table would look like:

Source of Variation    Sum of Squares    Degrees of freedom    Mean Squares    F

Treatment              SST               K-1                   MST             F=MST/MSE
Blocks                 SSB               b-1                   MSB
Error                  SSE               (K-1)(b-1)            MSE
Total                  SS(Total)         nt-1

To find the Excel output for the above data the following steps can be
taken:

Step 1. From the menus select Tools and click on the Data Analysis option.

Step 2. When the data analysis box appears, select Anova: Two-Factor Without
Replication, then enter A2:D8 in the input range. Select labels in first row.

Step 3. Select an output range (here we selected A11), then click OK.

SUMMARY    COUNT    SUM     AVERAGE     VARIANCE

1          3        24.5    8.166667    2.583333
2          3        21      7           1
3          3        19.5    6.5         0.25
4          3        21.5    7.166667    0.583333
5          3        23      7.666667    2.333333
6          3        22      7.333333    1.333333

ARC        6        50      8.333333    1.066667
CSI        6        43      7.166667    0.666667
TCC        6        38.5    6.416667    0.241667

ANOVA

Source of Variation    SS          df    MS          F           P-value     F crit

Rows                   4.902778    5     0.980556    1.972067    0.168792    3.325837
Columns                11.19444    2     5.597222    11.25698    0.002752    4.102816
Error                  4.972222    10    0.497222

Total                  21.06944    17

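NOTE: The departments (treatments) are in the columns of the worksheet, so the
treatment test uses the Columns line: F = MST/MSE = 5.597222/0.497222 = 11.25698.
F = 4.10 from table (2 numerator DF and 10 denominator DF).
Since 11.25698 > 4.10 we reject the null.

For the blocks (students, the Rows line), F = MSB/MSE = 0.980556/0.497222 =
1.972067, and F = 3.33 from table (5 numerator DF and 10 denominator DF).
Since 1.972067 < 3.33, the difference among students is not significant.

Conclusion: There is sufficient evidence to conclude that hourly rates
differ for the three departments.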
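
The mean squares and F ratios can be recomputed from the SS and df columns of the ANOVA output. In that output the Rows line refers to the six students (blocks) and the Columns line to the three departments. The Python sketch below uses only numbers printed in the table; the dictionary names are assumptions for this illustration.

```python
# Sums of squares, degrees of freedom and critical values as printed
# in the ANOVA output.
ss = {"Rows": 4.902778, "Columns": 11.19444, "Error": 4.972222}
df = {"Rows": 5, "Columns": 2, "Error": 10}
f_crit = {"Rows": 3.325837, "Columns": 4.102816}

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

# Each F ratio compares a mean square with the error mean square.
f_ratio = {source: ms[source] / ms["Error"] for source in ("Rows", "Columns")}

for source, f in f_ratio.items():
    verdict = "significant" if f > f_crit[source] else "not significant"
    print(f"{source}: F = {f:.5f} vs F crit = {f_crit[source]} ({verdict})")
```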

Two-Way ANOVA with Replication

Referring to the student assistant and the work-study hourly wages here at the
University of Baltimore, the following data show the hourly wages for the two
categories in three different departments:

                     ARC     CSI     TCC

Work Study           6.50    6.10    6.90
                     6.80    6.00    7.20
                     7.10    6.50    7.10
Student Assistant    7.40    6.80    7.50
                     7.50    7.00    7.00
                     8.00    6.60    7.10

Factors

Factor A: Student job category (here two different job categories exist)

Factor B: Departments (here we have three departments)

Replication: The number of students in each experimental condition. In this
case there are three replications.

Interaction: The effect of job category and department acting together, which
is tested separately from the effect of each factor on its own.

SUMMARY              ARC        CSI    TCC    Total

Work Study
Count                3          3      3      9
Sum                  20.4       19     21     60.2
Average              6.8        6.2    7.1    6.69
Variance             0.09       0.1    0      0.19

Student Assistant
Count                3          3      3      9
Sum                  22.9       20     22     64.9
Average              7.63333    6.8    7.2    7.21
Variance             0.10333    0      0.1    0.18

Total
Count                6          6      6
Sum                  43.3       39     43
Average              7.21667    6.5    7.1
Variance             0.28567    0.2    0

ANOVA

Source of Variation    SS         df    MS     F       P-value        F crit

Sample (Factor A)      1.22722    1     1.2    18.6    0.001016557    4.747221
Columns (Factor B)     1.84333    2     0.9    13.9    0.000741998    3.88529
Interaction            0.38111    2     0.2    2.88    0.095003443    3.88529
Within                 0.79333    12    0.1

Total                  4.245      17

Conclusion:
Mean hourly income differs by job category.
Mean hourly income differs by department.
Interaction is not significant.
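
The sums of squares in the ANOVA table can be rebuilt from the 18 wages in the data table. The following Python sketch assumes a balanced design (three replications in every cell, as here); the dictionary layout and variable names are choices made for this illustration, not part of the Excel procedure.

```python
from statistics import mean

# data[job category][department] -> the three replicated hourly wages
data = {
    "Work Study": {
        "ARC": [6.5, 6.8, 7.1], "CSI": [6.1, 6.0, 6.5], "TCC": [6.9, 7.2, 7.1],
    },
    "Student Assistant": {
        "ARC": [7.4, 7.5, 8.0], "CSI": [6.8, 7.0, 6.6], "TCC": [7.5, 7.0, 7.1],
    },
}
rows = list(data)              # Factor A: job category
cols = list(data[rows[0]])     # Factor B: department
r = 3                          # replications per cell

values = [x for a in rows for b in cols for x in data[a][b]]
grand = mean(values)

ss_total = sum((x - grand) ** 2 for x in values)
ss_a = r * len(cols) * sum(
    (mean([x for b in cols for x in data[a][b]]) - grand) ** 2 for a in rows
)
ss_b = r * len(rows) * sum(
    (mean([x for a in rows for x in data[a][b]]) - grand) ** 2 for b in cols
)
ss_within = sum(
    (x - mean(data[a][b])) ** 2 for a in rows for b in cols for x in data[a][b]
)
ss_interaction = ss_total - ss_a - ss_b - ss_within

df_a, df_b = len(rows) - 1, len(cols) - 1
df_interaction = df_a * df_b
df_within = len(rows) * len(cols) * (r - 1)

ms_within = ss_within / df_within
f_a = (ss_a / df_a) / ms_within
f_b = (ss_b / df_b) / ms_within
f_interaction = (ss_interaction / df_interaction) / ms_within

print(f"Factor A: SS = {ss_a:.5f}, F = {f_a:.2f}")
print(f"Factor B: SS = {ss_b:.5f}, F = {f_b:.2f}")
print(f"Interaction: SS = {ss_interaction:.5f}, F = {f_interaction:.2f}")
print(f"Within: SS = {ss_within:.5f}; Total: SS = {ss_total:.3f}")
```

Each F value is then compared with the matching F crit in the table, which is how the three conclusions above are reached.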

Goodness-of-Fit Test for Discrete Random Variables

The CHI-SQUARE distribution can be used in a hypothesis test involving a
population variance. However, in this section we would like to test how
close the sample results are to the expected results.

Example: The Multinomial Random Variable

In this example the objective is to see whether or not, based on information from
a randomly selected sample, the standards set for a population are met. There are
many practical examples that can be used in this situation. For example, assume
the guidelines for hiring people with different ethnic backgrounds for
the US government are set at 70% (White), 20% (African American) and
10% (others), respectively. A randomly selected sample of 1000 US employees
shows the following results, which are summarized in a table.

ETHNIC BACKGROUND    EXPECTED NUMBER OF EMPLOYEES    OBSERVED FROM SAMPLE
White                700 (= 70% of 1000)             750
African American     200 (= 20% of 1000)             170
Others               100 (= 10% of 1000)             80

As you see, the observed sample numbers for groups two and three are lower than their expected values, while group one has an observed value higher than expected. Is this a clear sign of discrimination with respect to ethnic background? That depends on how much lower the observed values are; the shortfall might not be statistically significant. To see whether these differences are significant we can use Excel to carry out a chi-square test. If the test statistic falls within the acceptance region we can assume that the guidelines are met; otherwise they are not. Now let's enter these numbers into an Excel spreadsheet. We used cells B7-B9 for the expected proportions, C7-C9 for the observed values and D7-D9 for the expected frequencies. To calculate the expected frequency for a category, multiply the proportion of that category by the sample size (here 1000). The formula for the first cell of the expected value column, D7, is =1000*B7. To fill in the other entries in the expected value column, copy and paste this formula. These are the important values for the chi-square test. The observed range in this case is C7:C9, while the expected range is D7:D9. The null and the alternative hypotheses for this test are as follows:

$H_0$: $P_W = 0.70$, $P_A = 0.20$ and $P_O = 0.10$

$H_A$: The population proportions are not $P_W = 0.70$, $P_A = 0.20$ and $P_O = 0.10$

Now let's use Excel to calculate the p-value in a chi-square test.

Step 1. Select the worksheet cell where you would like the p-value of the chi-square test to appear. We chose cell D12.

Step 2. From the menus, select Insert, then click on the Function option; the Paste Function dialog box appears.

Step 3. In the function category box choose Statistical, in the function name box select CHITEST, and click on OK.

Step 4. When the CHITEST dialog appears, enter C7:C9 in the actual-range box, then enter D7:D9 in the expected-range box, and finally click on OK.

The p-value will appear in the selected cell, D12.

As you see, the p-value is 0.002392, which is less than the level of significance (in this case $\alpha = 0.10$). Hence the null hypothesis should be rejected. This means that, based on the sample information, the guidelines are not met. Note that typing "=CHITEST(C7:C9,D7:D9)" in the formula bar also makes the p-value appear in the designated cell.

NOTE: Excel can also find the value of the chi-square statistic. First select an empty cell on the spreadsheet, then in the formula bar type "=CHIINV(D12,2)". D12 holds the p-value found previously and 2 is the degrees of freedom (number of categories minus one). The chi-square value in this case is 12.07121. If we refer to the chi-square table we see that the cutoff at $\alpha = 0.10$ with 2 degrees of freedom is 4.60517; since 12.07121 > 4.60517, we reject the null hypothesis.
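
As a cross-check on the CHITEST and CHIINV results, the statistic can be computed directly from its definition. This is a minimal Python sketch, not part of the original Excel walkthrough; it relies on the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability has the closed form exp(-x/2), so it does not generalize to other degrees of freedom.

```python
import math

# Observed counts and guideline proportions from the hiring example.
observed = [750, 170, 80]
proportions = [0.70, 0.20, 0.10]

n = sum(observed)
expected = [p * n for p in proportions]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Upper-tail probability; this closed form holds only when df == 2.
p_value = math.exp(-chi_square / 2)

# Critical value at alpha = 0.10 for df == 2, from the same closed form.
critical = -2 * math.log(0.10)

print(f"chi-square = {chi_square:.4f}, df = {df}")
print(f"p-value = {p_value:.6f}, critical value = {critical:.5f}")
print("reject H0" if chi_square > critical else "fail to reject H0")
```

Direct computation gives a statistic of about 12.0714, which agrees with the CHIINV value of 12.07121 to three decimal places; the small gap comes from CHIINV inverting a rounded p-value numerically.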

Test of Independence: Contingency Tables

The chi-square distribution is also used to test whether two variables are independent. For example, based on sample data you might want to see whether smoking and gender are independent for a certain population. The variables of interest in this case are smoking habit and the gender of an individual. Another example could involve the age range of an individual and his or her smoking habit. As in the goodness-of-fit case, the data appear in a table, but here the table has several columns as well as several rows. The initial table contains the observed values. To find the expected values we set up another table of the same shape. The value of each cell in the new table is the sum of the cell's column multiplied by the sum of the cell's row, divided by the grand total. The grand total is the total number of observations in the study. Now, based on the following table, test whether or not smoking habit and gender are independent in the population from which the sample was taken. In other words, is it true that males in this population smoke more than females?

You can use the formula bar to calculate the expected values for the expected range. For example, the expected value for cell C5 is placed in cell C11: click on C11, type =C6*D5/D6 in the formula bar, and press Enter.

Step 1. Observed range B4:C5

Smoking and gender

          yes    no     total
male      31     69     100
female    45     122    167
total     76     191    267

Step 2. Expected range B10:C11

          yes         no
male      28.46442    71.53558
female    47.53558    119.4644

So the observed range is B4:C5 and the expected range is B10:C11.

Step 3. Click on fx (Paste Function).

Step 4. When the Paste Function dialog box appears, click on Statistical in the function category and CHITEST in the function name, then click OK. When the CHITEST box appears, enter B4:C5 for the actual range, then B10:C11 for the expected range.

Step 5. Click on OK. The p-value, 0.477395, appears.

Conclusion: Since the p-value is greater than the level of significance (0.05), we fail to reject the null hypothesis. The sample gives no evidence that smoking and gender are dependent: based on this sample, one cannot conclude that females smoke more than males, or the other way around.

Step 6. To find the chi-square value, use the CHIINV function. When the CHIINV box appears, enter 0.477395 for the probability, then 1 for the degrees of freedom.

Degrees of freedom = (number of columns - 1) x (number of rows - 1) = (2 - 1) x (2 - 1) = 1

Chi-square = 0.504807
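
The expected range and the test statistic for the smoking-and-gender table can be reproduced with a few lines of Python. This is a sketch under one stated assumption: the p-value uses the closed form erfc(sqrt(x/2)), which is valid only for 1 degree of freedom (a 2 x 2 table), and no continuity correction is applied, which matches what CHITEST reports.

```python
import math

# Observed counts: rows are male/female, columns are yes/no.
observed = [[31, 69],
            [45, 122]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail chi-square probability; this closed form holds only when df == 1.
p_value = math.erfc(math.sqrt(chi_square / 2))

for row in expected:
    print([round(value, 5) for value in row])
print(f"chi-square = {chi_square:.6f}, df = {df}, p-value = {p_value:.6f}")
```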

Test Hypothesis Concerning the Variance of Two Populations

In this section we examine whether or not the variances of two populations are equal. Whenever independent simple random samples of equal or different sizes $n_1$ and $n_2$ are taken from two normal distributions with equal variances, the sampling distribution of $s_1^2/s_2^2$ is an F distribution with $n_1 - 1$ degrees of freedom for the numerator and $n_2 - 1$ degrees of freedom for the denominator. In the ratio $s_1^2/s_2^2$, the numerator $s_1^2$ and the denominator $s_2^2$ are the variances of the first and the second sample, respectively. Unlike the normal distribution, the F distribution (for example, one with 10 degrees of freedom for both the numerator and the denominator) is not symmetric. Its shape is positively skewed and depends on the degrees of freedom for the numerator and the denominator. The value of F is never negative.

Now let's see whether the variances of the hourly income of student assistants and work-study students are equal, based on samples taken from the two populations. Assume that the hypothesis test in this case is conducted at $\alpha = 0.10$. The null and the alternative hypotheses are:

$H_0$: $\sigma_1^2 = \sigma_2^2$
$H_A$: $\sigma_1^2 \neq \sigma_2^2$

Rejection Rule: Reject the null hypothesis if $F < F_{0.95}$ or $F > F_{0.05}$, where F, the value of the test statistic, is equal to $s_1^2/s_2^2$, with 10 degrees of freedom for both the numerator and the denominator. We can find the value of $F_{0.05}$ from the F distribution table. If $s_1^2/s_2^2 > 1$, we do not need to know the value of $F_{0.95}$; otherwise, $F_{0.95} = 1/F_{0.05}$ for equal sample sizes.

A survey of eleven student assistants and eleven work-study students provides the descriptive statistics for the two samples. Our objective is to find the value of $s_1^2/s_2^2$, where $s_1^2$ is the variance of the student assistant sample and $s_2^2$ is the variance of the work-study sample. These values are in cells F8 and D8 of the descriptive statistics output.

To calculate the value of $s_1^2/s_2^2$, select a cell such as A16, enter the cell formula =F8/D8, and press Enter. This is the value of F in our problem. Since this value, F = 1.984615385, is below the upper critical value ($F_{0.05} \approx 2.98$ for 10 and 10 degrees of freedom), it falls in the acceptance region and we fail to reject the null hypothesis. Hence, the sample results are consistent with the hypothesis that the variance of student assistants' hourly income equals the variance of work-study students' hourly income. We can follow the same format for one-tail tests.
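
When no F table is at hand, the tail probability for the variance ratio can be approximated numerically. The following Python sketch integrates the F density with Simpson's rule; the helper names `f_pdf` and `f_cdf` are illustrative, and the integration assumes at least 2 numerator degrees of freedom so that the density is finite at zero. Only the ratio F = 1.984615385 is taken from the example, since the two sample variances themselves are not listed.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    coef = math.exp((d1 / 2) * math.log(d1 / d2) - log_beta)
    return coef * x ** (d1 / 2 - 1) * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)


def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by Simpson's rule on [0, x]; assumes d1 >= 2, steps even."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3


f_ratio = 1.984615385   # s1^2 / s2^2 from the worked example
d1 = d2 = 10            # eleven observations in each sample
alpha = 0.10

lower_tail = f_cdf(f_ratio, d1, d2)
# Two-tail p-value: double the smaller of the two tail areas.
p_two_tail = 2 * min(lower_tail, 1 - lower_tail)
reject_null = p_two_tail < alpha

print(f"two-tail p-value = {p_two_tail:.4f}, reject H0: {reject_null}")
```

A two-tail p-value above 0.10 corresponds to F lying between the two critical values in the rejection rule.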

Linear Correlation and Regression Analysis

In this section the objective is to see whether there is a correlation between two variables and to find a model that predicts one variable in terms of the other. There are many examples we could mention, but we will use a popular one from the world of business. Usually the independent variable is denoted by the letter x and the dependent variable by the letter y. A businessman would like to see whether there is a relationship between the number of cases of soda sold and the temperature on a hot summer day, based on information from the past. He would also like to estimate the number of cases of soda that will be sold on a particular hot summer day at a ball game. He recorded the temperature and the number of cases of soda sold on each day. The following table shows the recorded data from June 1 through June 13. The weatherman predicts a temperature of 94 degrees F for June 14. The businessman would like to meet all demand for the cases of soda ordered by customers on June 14.
DAY       Cases of Soda    Temperature
1-Jun     57               56
2-Jun     59               58
3-Jun     65               63
4-Jun     67               66
5-Jun     75               73
6-Jun     81               78
7-Jun     86               85
8-Jun     88               85
9-Jun     88               87
10-Jun    84               84
11-Jun    82               88
12-Jun    80               84
13-Jun    83               89

Now let's use Excel to find the linear correlation coefficient and the regression line equation. The linear correlation coefficient is a quantity between -1 and +1, denoted by R. The closer R is to +1, the stronger the positive (direct) correlation, and the closer R is to -1, the stronger the negative (inverse) correlation between the two variables. The general form of the regression line is y = mx + b. In this formula, m is the slope of the line and b is the y-intercept. You can find these quantities from the Excel output. In this situation the variable y (the dependent variable) is the number of cases of soda and x (the independent variable) is the temperature. To obtain the Excel output, take the following steps:

Step 1. From the menus choose Tools and click on Data Analysis.

Step 2. When the Data Analysis dialog box appears, click on Correlation.

Step 3. When the Correlation dialog box appears, enter B1:C14 in the input range box. Click on Labels in First Row and enter A16 in the output range box. Click on OK.

                 Cases of Soda    Temperature
Cases of Soda    1
Temperature      0.966598577      1

As you see, there is a very strong positive correlation between the number of cases of soda demanded and the temperature. This means that as the temperature increases, the demand for cases of soda also increases. The linear correlation coefficient is 0.966598577, which is very close to +1.

Now let's follow similar steps to find the regression equation.

Step 1. From the menus choose Tools and click on Data Analysis.

Step 2. When the Data Analysis dialog box appears, click on Regression.

Step 3. When the Regression dialog box appears, enter B1:B14 in the y-range box and C1:C14 in the x-range box. Click on Labels.

Step 4. Enter A19 in the output range box.

Note: The regression equation in general should look like Y = mX + b. In this equation m is the slope of the regression line and b is its y-intercept.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.966598577
R Square             0.934312809
Adjusted R Square    0.928341246
Standard Error       2.919383191
Observations         13

ANOVA
              df    SS             MS             F              Significance F
Regression    1     1333.479989    1333.479989    156.4603497    7.58511E-08
Residual      11    93.75078034    8.522798213
Total         12    1427.230769

               Coefficients    Standard Error    t Stat         P-value        Lower 95%      Upper 95%
Intercept      9.17800767      5.445742836       1.685354587    0.120044801    -2.80799756    21.16401
Temperature    0.879202711     0.07028892        12.50841116    7.58511E-08    0.724497763    1.033908

The relationship between the number of cases of soda and the temperature is: Y = 0.879202711 X + 9.17800767

The number of cases of soda = 0.879202711*(Temperature) + 9.17800767.

Using this expression we can approximately predict the number of cases of soda needed on June 14. The weather forecast for that day is 94 degrees, hence the number of cases of soda needed is 0.879202711*(94) + 9.17800767 = 91.82, or about 92 cases.
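
The correlation coefficient, the regression line and the June 14 forecast can be reproduced from the recorded data with the textbook least-squares formulas. This is a minimal Python sketch; the function name `linear_fit` is illustrative and not part of the Excel procedure.

```python
import math

# Recorded data for June 1 through June 13.
cases = [57, 59, 65, 67, 75, 81, 86, 88, 88, 84, 82, 80, 83]        # y
temperature = [56, 58, 63, 66, 73, 78, 85, 85, 87, 84, 88, 84, 89]  # x


def linear_fit(x, y):
    """Return (slope, intercept, r) for the least-squares line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = linear_fit(temperature, cases)
forecast = slope * 94 + intercept   # forecast temperature for June 14

print(f"R = {r:.6f}")
print(f"cases = {slope:.6f} * temperature + {intercept:.6f}")
print(f"forecast for 94 degrees: {forecast:.2f} cases")
```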

Moving Average and Exponential Smoothing

Moving Average Models: Use the Add Trendline option to analyze a moving average forecasting model in Excel. You must first create a graph of the time series you want to analyze. Select the range that contains your data and make a scatter plot of the data. Once the chart is created, follow these steps:

1. Click on the chart to select it, and click on any point on the line to select the data series. When you click on the chart to select it, a new option, Chart, is added to the menu bar.
2. From the Chart menu, select Add Trendline.

[Figure: moving average of order 4 for weekly sales]

Exponential Smoothing Models: The simplest way to analyze a time series using an exponential smoothing model in Excel is to use the Data Analysis tool. This tool works almost exactly like the one for Moving Average, except that you will need to input the value of $\alpha$ instead of the number of periods, k. Once you have entered the data range and the damping factor, $1 - \alpha$, and indicated what output you want and a location, the analysis is the same as the one for the Moving Average model.
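
The two smoothing models can be sketched in a few lines of Python. The weekly sales figures below are hypothetical, invented only for illustration, since the original series is not listed. The exponential smoothing recursion shown here is the common textbook form, started from the first observation; Excel's tool may align or initialize its output differently.

```python
def moving_average(series, k):
    """Average of each run of k consecutive observations (order-k moving average)."""
    return [sum(series[i - k + 1:i + 1]) / k for i in range(k - 1, len(series))]


def exponential_smoothing(series, alpha):
    """Smoothed value = alpha * observation + (1 - alpha) * previous smoothed value.

    The damping factor entered in Excel corresponds to 1 - alpha.
    """
    smoothed = [series[0]]            # start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical weekly sales, for illustration only.
weekly_sales = [120, 132, 128, 141, 150, 147, 155, 160]

print(moving_average(weekly_sales, 4))
print([round(v, 2) for v in exponential_smoothing(weekly_sales, 0.3)])
```

A larger k, or a smaller alpha, gives a smoother line that reacts more slowly to recent changes.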

Applications and Numerical Examples

Descriptive Statistics: Suppose you have the following, n = 10, data:

1.2, 1.5, 2.6, 3.8, 2.4, 1.9, 3.5, 2.5, 2.4, 3.0

1. Type your n data points into the cells A1 through An.
2. Click on the "Tools" menu. (At the bottom of the "Tools" menu will be a submenu "Data Analysis...", if the Analysis ToolPak has been properly installed.)
3. Clicking on "Data Analysis..." will lead to a menu from which "Descriptive Statistics" is to be selected.
4. Select "Descriptive Statistics" by pointing at it and clicking twice, or by highlighting it and clicking on the "OK" button.
5. Within the Descriptive Statistics submenu,

a. for the "input range" enter "A1:An", assuming you typed the data into cells A1 to An.

b. click on the "output range" button and enter the output range "C1:C16".

c. click on the Summary Statistics box

d. finally, click on "OK."

The Central Tendency: The data can be sorted in ascending order:

1.2, 1.5, 1.9, 2.4, 2.4, 2.5, 2.6, 3.0, 3.5, 3.8

The mean, median and mode are computed as follows:

Mean = (1.2 + 1.5 + 2.6 + 3.8 + 2.4 + 1.9 + 3.5 + 2.5 + 2.4 + 3.0) / 10 = 2.48

Median = (2.4 + 2.5) / 2 = 2.45

The mode is 2.4, since it is the only value that occurs twice.

The midrange is (1.2 + 3.8) / 2 = 2.5.

Note that the mean, median and mode of this set of data are very close to each other. This suggests that the data are roughly symmetrically distributed.

Variance: The variance of a set of data is the average of the squared differences of all the data values from the mean.

The sample variance and the sample-based estimate of the population variance are computed differently. The sample variance is simply the arithmetic mean of the squares of the differences between each data value in the sample and the mean of the sample. The formula for an estimate of the variance in the population is similar, except that the denominator in the fraction is (n-1) instead of n. However, you should not worry about this difference if the sample size is large, say over 30. Compute an estimate for the variance of the population, given the following sorted data:

1.2, 1.5, 1.9, 2.4, 2.4, 2.5, 2.6, 3.0, 3.5, 3.8; mean = 2.48 as computed earlier.

An estimate for the population variance is:

$s^2 = \frac{1}{10-1} [ (1.2 - 2.48)^2 + (1.5 - 2.48)^2 + (1.9 - 2.48)^2 + (2.4 - 2.48)^2 + (2.4 - 2.48)^2 + (2.5 - 2.48)^2 + (2.6 - 2.48)^2 + (3.0 - 2.48)^2 + (3.5 - 2.48)^2 + (3.8 - 2.48)^2 ]$

$= \frac{1}{9} (1.6384 + 0.9604 + 0.3364 + 0.0064 + 0.0064 + 0.0004 + 0.0144 + 0.2704 + 1.0404 + 1.7424) = 0.6684$

Therefore, the standard deviation is $s = (0.6684)^{1/2} = 0.8176$
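
The same summary measures can be checked with Python's standard `statistics` module. This is a small sketch, not part of the Excel procedure; note that `statistics.variance` divides by n - 1, as in the calculation above, while `statistics.pvariance` divides by n.

```python
import statistics

data = [1.2, 1.5, 2.6, 3.8, 2.4, 1.9, 3.5, 2.5, 2.4, 3.0]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
midrange = (min(data) + max(data)) / 2

variance_n_minus_1 = statistics.variance(data)   # denominator n - 1
variance_n = statistics.pvariance(data)          # denominator n
std_dev = statistics.stdev(data)                 # square root of the n - 1 version

print(f"mean = {mean:.2f}, median = {median:.2f}, mode = {mode}, midrange = {midrange:.2f}")
print(f"variance (n-1) = {variance_n_minus_1:.4f}, variance (n) = {variance_n:.4f}")
print(f"standard deviation = {std_dev:.4f}")
```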


Probability and Expected Values: Newsweek reported that the "average take" for bank robberies was $3,244 but 85 percent of the robbers were caught. Assuming 60 percent of those caught lose their entire take and 40 percent lose half, graph the probability mass function using Excel. Calculate the expected take from a bank robbery. Does it pay to be a bank robber?

To construct the probability function for bank robberies, first define the random variable x, the bank robbery take. If the robber is not caught, x = $3,244. If the robber is caught and manages to keep half, x = $1,622. If the robber is caught and loses it all, then x = 0. The associated probabilities for these x values are 0.15 = (1 - 0.85), 0.34 = (0.85)(0.4), and 0.51 = (0.85)(0.6). After entering the x values in cells A1, A2 and A3 and the associated probabilities in B1, B2, and B3, the following steps lead to the probability mass function:

1. Click on ChartWizard. The "ChartWizard Step 1 of 4" screen will appear.
2. Highlight "Column" at "ChartWizard Step 1 of 4" and click "Next."
3. At "ChartWizard Step 2 of 4 Chart Source Data," enter "=B1:B3" for "Data range," and click the "column" button for "Series in." A graph will appear. Click on "series" toward the top of the screen to get a new page.
4. At the bottom of the "Series" page is a rectangle for "Category (X) axis labels:" Click on this rectangle and then highlight A1:A3.
5. At "Step 3 of 4", move on by clicking on "Next," and at "Step 4 of 4", click on "Finish."

The expected value of a robbery is $1,038.08.

E(X) = (0)(0.51) + (1622)(0.34) + (3244)(0.15) = 0 + 551.48 + 486.60 = 1038.08

The expected return on a bank robbery is positive. On average, bank robbers get $1,038.08 per heist. If criminals make their decisions strictly on this expected value, then it pays to rob banks. A decision rule based only on an expected value, however, ignores the risks or variability in the returns. In addition, our expected value calculations do not include the cost of jail time, which could be viewed by criminals as substantial.
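
The expected value, together with a measure of the variability that the expected-value rule ignores, can be computed as follows. This is a minimal Python sketch of the calculation above; the standard deviation is an addition for illustration and is not reported in the original example.

```python
import math

# Probability mass function of the take x (in dollars).
pmf = {
    3244: 1 - 0.85,       # not caught
    1622: 0.85 * 0.40,    # caught, keeps half
    0: 0.85 * 0.60,       # caught, loses everything
}

expected_take = sum(x * p for x, p in pmf.items())

# Variance and standard deviation describe the risk around the expected take.
variance = sum(p * (x - expected_take) ** 2 for x, p in pmf.items())
std_dev = math.sqrt(variance)

print(f"total probability = {sum(pmf.values()):.2f}")
print(f"expected take = {expected_take:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```

The standard deviation comes out larger than the expected take itself, which illustrates how much risk the expected value alone hides.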

Discrete & Continuous Random Variables:

Binomial Distribution Application: A multiple choice test has four unrelated questions. Each question has five possible choices but only one is correct. Thus, a person who guesses randomly has a probability of 0.2 of guessing correctly. Draw a tree diagram showing the different ways in which a test taker could get 0, 1, 2, 3 and 4 correct answers. Sketch the probability mass function for this test. What is the probability that a person who guesses will get two or more correct?

Solution: Letting Y stand for a correct answer and N for a wrong answer, where the probability of Y is 0.2 and the probability of N is 0.8 for each of the four questions, the probability tree diagram is shown in the textbook on page 182. This probability tree diagram shows the "branches" that must be followed to show the calculations captured in the binomial mass function for n = 4 and p = 0.2. For example, the tree diagram shows the six different branch systems that yield two correct and two wrong answers (which corresponds to 4!/(2!2!) = 6). The binomial mass function shows the probability of two correct answers as

$P(x = 2 \mid n = 4, p = 0.2) = 6(0.2)^2(0.8)^2 = 6(0.0256) = 0.1536 = P(2)$

which is obtained from Excel by using the "BINOMDIST" command, where the first entry is x, the second is n, the third is p, and the fourth is mass (0) or cumulative (1); that is, entering

=BINOMDIST(2,4,0.2,0) in any Excel cell yields 0.1536, and

=BINOMDIST(3,4,0.2,0) yields P(x = 3 | n = 4, p = 0.2) = 0.0256
=BINOMDIST(4,4,0.2,0) yields P(x = 4 | n = 4, p = 0.2) = 0.0016
=1-BINOMDIST(1,4,0.2,1) yields P(x ≥ 2 | n = 4, p = 0.2) = 0.1808
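
The BINOMDIST results can be reproduced from the binomial formula itself. This is a short Python sketch; `binom_pmf` and `binom_cdf` are illustrative helper names mirroring the mass (0) and cumulative (1) options of the Excel function.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable: C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(X <= x): sum of the mass function from 0 through x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


n, p = 4, 0.2
for x in range(n + 1):
    print(f"P(x = {x}) = {binom_pmf(x, n, p):.4f}")

p_two_or_more = 1 - binom_cdf(1, n, p)
print(f"P(x >= 2) = {p_two_or_more:.4f}")
```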

Normal Example: If the time required to complete an examination by those with a certain learning disability is believed to be normally distributed, with a mean of 65 minutes and a standard deviation of 15 minutes, then when can the exam be terminated so that 99 percent of those with the disability can finish?

Solution: Because the mean and standard deviation are known, what needs to be established is the amount of time, above the mean time, such that 99 percent of the distribution is lower. This is a distance measured in standard deviations, given by the Z value corresponding to the 0.99 probability found in the body of Appendix B, Table 5, as shown in the textbook. Alternatively, entering =NORMINV(0.99,0,1) into any Excel cell returns this Z value, 2.326342.

The closest cumulative probability that can be found in the table is 0.9901, in the row labeled 2.3 and the column headed .03, giving Z = 2.33, which is only an approximation to the more exact 2.326342 found in Excel. Using this more exact value, the calculation with mean $\mu$ and standard deviation $\sigma$ in the following formula is

$Z = (X - \mu)/\sigma$
That is, Z = (x - 65)/15
Thus, x = 65 + 15(2.32634) = 99.9 minutes.

Alternatively, instead of standardizing with the Z distribution, we can work directly in Excel with the normal distribution with a mean of 65 and standard deviation of 15 and enter "=NORMINV(0.99,65,15)". In general, to obtain the x value below which a proportion $\alpha$ of a normal random variable's values fall, the "NORMINV" command may be used, where the first entry is $\alpha$, the second is $\mu$, and the third is $\sigma$.
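
Python's standard library offers the same inverse-normal calculation through `statistics.NormalDist` (available from Python 3.8). The sketch below shows both routes from the example: standardizing first, and working directly with the distribution of exam times.

```python
from statistics import NormalDist

mean_minutes, sd_minutes = 65, 15

# Route 1: Z value for a cumulative probability of 0.99, then unstandardize.
z = NormalDist().inv_cdf(0.99)
cutoff_via_z = mean_minutes + sd_minutes * z

# Route 2: invert the normal distribution of exam times directly,
# the counterpart of =NORMINV(0.99,65,15).
cutoff_direct = NormalDist(mean_minutes, sd_minutes).inv_cdf(0.99)

print(f"z = {z:.4f}")
print(f"cutoff = {cutoff_via_z:.1f} minutes (direct: {cutoff_direct:.1f})")
```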

Another Example: In the early 1980s, the Toro Company of Minneapolis, Minnesota, advertised that it would refund the purchase price of a snow blower if the following winter's snowfall was less than 21 percent of the local average. If the average snowfall is 45.25 inches, with a standard deviation of 12.2 inches, what is the likelihood that Toro will have to make refunds?

Solution: Within limits, snowfall is a continuous random variable that can be expected to vary symmetrically around its mean, with values closer to the mean occurring most often. Thus, it seems reasonable to assume that snowfall (x) is approximately normally distributed with a mean of 45.25 inches and a standard deviation of 12.2 inches. Nine and one half inches is 21 percent of the mean snowfall of 45.25 inches and, with a standard deviation of 12.2 inches, the number of standard deviations between 45.25 inches and 9.5 inches is Z:

$Z = (x - \mu)/\sigma = (9.50 - 45.25)/12.2 = -2.93$

Using Appendix B, Table 5, the textbook demonstrates the determination of $P(x \le 9.50) = P(z \le -2.93) = 0.0017$, the probability of snowfall less than 9.5 inches. Using Excel, this normal probability is obtained with the "NORMDIST" command, where the first entry is x, the second is the mean $\mu$, the third is the standard deviation $\sigma$, and the fourth is CUMULATIVE (1). Entering

=NORMDIST(9.5,45.25,12.2,1) gives $P(x \le 9.50) = 0.001693$.
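
The same cumulative probability can be obtained in Python with `statistics.NormalDist`; this is a brief sketch of the snowfall calculation, with variable names chosen for illustration.

```python
from statistics import NormalDist

mean_snow, sd_snow = 45.25, 12.2
threshold = 9.5   # about 21 percent of the mean snowfall

z = (threshold - mean_snow) / sd_snow

# Counterpart of =NORMDIST(9.5,45.25,12.2,1): cumulative probability below 9.5.
p_refund = NormalDist(mean_snow, sd_snow).cdf(threshold)

print(f"z = {z:.2f}")
print(f"P(snowfall <= 9.5) = {p_refund:.6f}")
```

A probability of roughly 0.17 percent means that refunds would be expected in fewer than two winters out of a thousand under these assumptions.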

Sampling Distribution and the Central Limit Theorem: A bakery sells an average of 24 loaves of bread per day. Sales (x) are normally distributed with a standard deviation of 4.

1. If a random sample of size n = 1 (day) is selected, what is the probability that this x value will exceed 28?

2. If a random sample of size n = 4 (days) is selected, what is the probability that $\bar{x} \ge 28$?

3. Why does the answer in part 1 differ from that in part 2?

Solutions:

1. With n = 1, the sampling distribution of the sample mean $\bar{x}$ is normal with a mean of 24 and a standard error of the mean of 4. Thus, using Excel, 0.15866 = 1-NORMDIST(28,24,4,1).

2. With n = 4, the sampling distribution of the sample mean $\bar{x}$ is normal with a mean of 24 and a standard error of the mean of $4/\sqrt{4} = 2$. Using Excel, 0.02275 = 1-NORMDIST(28,24,2,1).

3. The answers differ because the mean of four days varies less than a single day's sales: the standard error falls from 4 to 2, so a sample mean of 28 or more is much less likely.
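
The effect of the sample size on the standard error can be seen by wrapping the calculation in a small function. This is a Python sketch of the bakery example; `prob_mean_exceeds` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

mean_sales, sd_sales = 24, 4


def prob_mean_exceeds(threshold, n):
    """P(sample mean > threshold) when averaging n independent days."""
    standard_error = sd_sales / sqrt(n)
    return 1 - NormalDist(mean_sales, standard_error).cdf(threshold)


p_one_day = prob_mean_exceeds(28, 1)
p_four_days = prob_mean_exceeds(28, 4)

print(f"n = 1: {p_one_day:.5f}")
print(f"n = 4: {p_four_days:.5f}")
```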

Regression Analysis: The highway deaths per 100 million vehicle miles and highway speed limits for 10 countries are given below:

(Death, Speed) = (3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55), (4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), and (6.1, 75).

From this we can see that five countries with the same speed limit have very different positions on the safety list. For example, Britain ... with a speed limit of 70 is demonstrably safer than Japan, at 55. Can we argue that speed has little to do with safety? Use regression analysis to answer this question.

Solution: Enter the ten paired y and x values into cells A2 to A11 and B2 to B11, with the "Death" rate label in A1 and the "Speed" limit label in B1. The following steps produce the regression output.

Choose "Regression" from "Data Analysis" in the "Tools" menu. The Regression dialog box will appear.

Note: Use the mouse to move between the boxes and buttons. Click on the desired box or button. The large rectangular boxes require a range from the worksheet. A range may be typed in or selected by highlighting the cells with the mouse after clicking on the box. If the dialog box blocks the data, it can be moved on the screen by clicking on the title bar and dragging.

For the "Input Y Range," enter A1:A11, and for the "Input X Range" enter B1:B11.

Because the Y and X ranges include the "Death" and "Speed" labels in A1 and B1, select the "Labels" box with a click.

Click the "Output Range" button and type the reference cell, which in this demonstration is A13.

To get the predicted values of Y (death rates) and the residuals, select the "Residuals" box with a click.

Clicking "OK" will then give the "SUMMARY OUTPUT," "ANOVA," and "RESIDUAL OUTPUT."

The first section of the Excel printout gives the "SUMMARY OUTPUT." The "Multiple R" is the square root of the "R Square," the computation and interpretation of which we have already discussed. The "Standard Error" of estimate (which will be discussed in the next chapter) is s = 0.86423, which is the square root of "Residual SS" = 5.97511 divided by its degrees of freedom, df = 8, as given in the "ANOVA" section. We will also discuss the adjusted R-square of 0.21325 in the following chapters.

Below the "ANOVA" section are the estimated regression coefficients and related statistics that will be discussed in detail in the next chapter. For now it is sufficient to recognize that the calculated coefficient values for the slope and y-intercept are provided (b = 0.07556 and a = -0.29333). Next to these coefficient estimates is information on the variability in the distribution of the least-squares estimators from which these specific estimates were drawn: the column titled "Std. Error" contains the standard deviations (standard errors) of the intercept and slope distributions; the "T Stat" and "P-value" columns give the calculated values of the t statistics and the associated p-values. As shown in Chapter 13, the t statistic of 1.85458 for the slope has a two-tail p-value of 0.10077, which is marginally above 0.10, so the sample slope (0.07556) is not quite significantly different from zero at the 0.10 two-tail Type I error level. For the one-tail alternative that higher speed limits go with more deaths, the p-value is half of this, about 0.05, and the slope is significant at the 0.10 level. On that one-tail reading the sample gives some evidence of a positive relationship between deaths and speed limits in the population, contrary to the assertion that "speed has little to do with safety," although with only ten observations the evidence is weak.

SUMMARY OUTPUT: Multiple R = 0.54833, R Square = 0.30067, Adjusted R Square = 0.21325, Standard Error = 0.86423, Observations = 10

ANOVA         df    SS         MS         F          P-value
Regression    1     2.56889    2.56889    3.43945    0.10077
Residual      8     5.97511    0.74689
Total         9     8.54400

Coeffs.      Estimate    Std. Error    T Stat      P-value    Lower 95%    Upper 95%
Intercept    -0.29333    2.45963       -0.11926    0.90801    -5.96526     5.37860
Speed        0.07556     0.04074       1.85458     0.10077    -0.01839     0.16950

Residual Output:

Predicted Residuals
3.86222 -0.86222
3.86222 -0.56222
3.86222 -0.46222
4.99556 -1.49556
3.86222 0.23778
4.24000 0.06000
3.86222 0.83778
4.24000 0.66000
4.24000 0.86000
5.37333 0.72667
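
The coefficients, the standard error of estimate, the t statistic for the slope and the residuals in this output can be reproduced from the ten data pairs. The following Python sketch uses the standard simple-regression formulas; it stops short of the p-value, which requires the t distribution and is left to Excel or a table.

```python
import math

# (deaths per 100 million vehicle miles, speed limit) for the 10 countries.
deaths = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]   # y
speed = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]              # x

n = len(speed)
mean_x = sum(speed) / n
mean_y = sum(deaths) / n
sxx = sum((x - mean_x) ** 2 for x in speed)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(speed, deaths))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in speed]
residuals = [y - y_hat for y, y_hat in zip(deaths, predicted)]

# Standard error of estimate: sqrt(residual SS / (n - 2)).
residual_ss = sum(e ** 2 for e in residuals)
std_error = math.sqrt(residual_ss / (n - 2))

# Standard error of the slope and its t statistic (H0: slope = 0).
slope_se = std_error / math.sqrt(sxx)
t_stat = slope / slope_se

print(f"slope = {slope:.5f}, intercept = {intercept:.5f}")
print(f"standard error of estimate = {std_error:.5f}")
print(f"t statistic for slope = {t_stat:.5f} with {n - 2} df")
for y_hat, e in zip(predicted, residuals):
    print(f"{y_hat:.5f} {e:9.5f}")
```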

E-Labs to Fully Understand Statistical Concepts

The Value of Performing Experiments: If the learning environment is focused on background information, knowledge of terms and new concepts, the learner is likely to learn that basic information successfully. However, this basic knowledge may not be sufficient to enable the learner to carry out successfully the on-the-job tasks that require more than basic knowledge. Thus, the probability of making real errors in the business environment is high. On the other hand, if the learning environment allows learners to experience and learn from failures within a variety of situations similar to what they would experience in the "real world" of their job, the probability of having similar failures in their business environment is low. This is the realm of simulations: a safe place to fail.

The appearance of statistical software is one of the most important events in the process of decision making under uncertainty. Statistical software systems are used to construct examples, to understand existing concepts, and to find new statistical properties. In turn, new developments in the process of decision making under uncertainty often motivate new approaches and revisions of existing software systems. Statistical software systems rely on cooperation between statisticians and software developers.

Besides statistical software, Java applets and online statistical computation, the use of a scientific calculator is required for the course. A scientific calculator is one that can give you, say, the square root of 5. Any calculator that goes beyond the four basic operations is fine for this course. These calculators allow you to perform the simple calculations you need in this course, for example taking a square root or raising e to the power of, say, 0.36. These types of calculators are called general scientific calculators. There are also more specific and advanced calculators for mathematical computations in other areas such as finance, accounting, civil engineering, and even statistics. The last kind, for example, computes the mean, variance, skewness, and kurtosis of a sample by simply entering all the data one by one and then pressing the mean, variance, skewness, or kurtosis key.

Without a computer one cannot perform any realistic statistical data analysis. Students who are signing up for the course are expected to know the basics of Excel or another popular spreadsheet. As a starting point, you need to visit the Excel Web site created for this course.

This section is a part of the JavaScript E-labs learning tools for decision making. The following is a classification of the statistical JavaScript tools by their application areas:
MENU

1. Summarizing Data

• Bivariate Sampling Statistics
• Descriptive Statistics
• Determination of the Outliers
• Empirical Distribution Function
• Histogram
• The Three Means

2. Computational probability

• Combinatorial Maths
• Comparing Two Random Variables
• Multinomial Distributions
• P-values for the Popular Distributions

3. Requirements for most tests & estimations

• Removal of the Outliers
• Sample Size Determination
• Test for Homogeneity of Population
• Test for Normality
• Test for Randomness

4. One population & one variable

• Binomial Exact Confidence Intervals
• Goodness-of-Fit for Discrete Variables
• Mean, and Variance Confidence Intervals
• Testing the Mean
• Testing the Medians
• Testing the Variance

5. One population & two or more variables

• The Before-and-After Test for Means and Variances
• The Before-and-After Test for Proportions
• Chi-square Test for Crosstable Relationship
• Multiple Regressions
• Polynomial Regressions
• Quadratic Regression
• Simple Regression with Diagnostic Tools
• Testing the Population Correlation Coefficient

6. Two populations & one variable

• Confidence Intervals for Two Populations
• K-S Test for Equality of Two Populations
• Two Populations Testing Means & Variances

7. Several populations & one or more variables

• ANOVA: Testing Equality of the Means
• Compatibility of Multi-Counts
• Equality of Multi-variances: The Bartlett's Test
• Identical Populations Test for Crosstable Data
• Testing the Proportions
• Testing Several Correlation Coefficients

Interesting and Useful Sites

• Add-ins for Excel
• Analyse-It for Microsoft Excel
• Data analysis and statistical solutions for Excel
• Spreadsheet Page


The Copyright Statement: The fair use, according to the 1996 Fair Use Guidelines for Educational Multimedia, of materials presented on this Web site is permitted for non-commercial and classroom purposes only. This site may be mirrored intact (including these notices), on any server with public access. All files are available at http://www.mirrorservice.org/sites/home.ubalt.edu/ntsbarsh/Business-stat for mirroring.
