PhD THESIS
Practical TIPS for research scholars
Prof Dr S Ramalingam
Head – Management Studies
Dr MGR University
Chennai – 600 095 INDIA
A2Z PhD Thesis
Practical TIPS for research Scholars
ALL RIGHTS RESERVED. No part of this book covered by the copyright herein may be
reproduced, transmitted, stored or used in any form or by any means graphic, electronic, or
mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web
distribution, information networks, or information storage and retrieval systems, without the prior
written permission of the publisher.
© Prof Dr S Ramalingam
Published by:
Na Subbu Reddiar 100 Educational Trust
AD-13, 5th Street, Anna Nagar, Chennai—600 040
Cover Design:
Centenary Committee
Dedicated to:
My Paternal GRANDmother
&
ALL my Teachers
Foreword i
Preface v
Contents
Chapter I
The Ethics of Academic Research 1
Trust is the foundation of scholarship in the academic fraternity. Innovation can continue only in an atmosphere of
confidence and fairness. Scholars will strengthen the foundation of trust within the fraternity by gaining knowledge of their
fields and committing themselves to cultivating collegial relationships. It is always advantageous if the research scholar is
aware of the ethical issues that may arise in the course of academic research.
Chapter II
Journey of a PhD Thesis 8
The journey of academic research is a fascinating and fabulous one, especially when the research scholar is highly committed.
Here, the various stages of academic research and their intricacies are narrated, and some practical
tips and guidelines are provided to make research scholars comfortable.
Chapter III
Research Proposal 18
A PhD thesis proposal is an extremely important document, and much thought and planning should go into crafting this
document. Writing a clear and effective thesis proposal is the first step to a career in research. The major stages in an
academic research are detailed with illustrations and also a suitable format for a research proposal is suggested.
Chapter IV
Selecting a Supervisor 32
Matching a scholar to a supervisor is crucially important for an effective relationship. There are several qualities that
research scholars expect to see in their research supervisor. This chapter makes a serious attempt to indicate surprising
approaches that a scholar would seldom dare to attempt. It is a highly critical chapter indeed.
Chapter V
Finalizing the Topic 40
A topic represents the core subject matter of scholarly communication and the means by which the scholar arrives at other
possible topics of research and discovers new knowledge. It is important to keep in mind that an initial topic may not be the
final topic. This chapter provides a variety of scholarly strategies to design and then finally decide on a research topic.
Chapter VI
Research Problem 46
A research problem is the topic one would like to address, investigate, or study, whether descriptively or experimentally. It
is the focus or reason for engaging in a research study. The research problem should be stated in such a way that it
leads to analytical thinking on the part of the researcher, with the aim of arriving at possible solutions to the stated
problem. This chapter comprehensively illustrates the various stages and intricacies involved in identifying a research
problem.
Chapter VII
Review of Literature 54
The "literature" of a literature review refers to any collection of materials on a topic, not necessarily the great literary texts
of the world. Literature reviews provide the research scholar with a handy guide to a particular topic. A literature review is
usually organized around ideas, not the sources themselves as an annotated bibliography would be organized. This
means that one will not simply list the sources and go into detail about each of them, one at a time. The literature
review, the fulcrum of academic research, is discussed in this chapter in its breadth and depth.
Chapter VIII
Scope of Research Study 69
Scope is simply the boundary of the research. The scope of coverage defines which areas around the subject matter the research
covers and which it does not. In other words, the scope of the study refers to the specific elements and content
that the researcher wants to explore in a study. The scope of a ‘research scope’ is presented here.
Chapter IX
Limitations 72
Without exception, all research is limited in several ways. It is important to remember that all research suffers from
limitations. Even though there may be a large number of limitations in any thesis, it is not necessary to discuss all of these
limitations in the Research Limitations section. The seemingly limitless scope of ‘limitations’ is discussed in this chapter.
Chapter X
Objectives 79
The objectives of the research should be closely related to the research study of the thesis. The main purpose of the
research objectives is to focus on the research problem, avoid the collection of unnecessary data and provide direction to
the research study. Scholars should remember that the objectives of a research study form and define the direction and path
of the research journey. Meticulous attention at the stage of designing the objectives will put
scholars at ease in the later stages of the research study.
Chapter XI
Research Design 84
The research design refers to the strategy a scholar chooses to integrate the different components of the study in a
cohesive and coherent way in order to address the research problem; it constitutes the blueprint for the collection,
measurement, and analysis of data. Throughout the design construction task, it is important to have in mind some
endpoint, some criteria which we should try to achieve before finally accepting a design strategy.
Chapter XII
Sampling 99
The size of the sample depends on the type of research design being used; the desired level of confidence in the results;
the amount of accuracy wanted; and the characteristics of the population of interest. Sample size has little to do with the
size of the population, however. Detailed illustrations of various types of sampling techniques are provided to give
the research scholar a comprehensive understanding of the concept.
Chapter XIII
Designing a Questionnaire 110
Perhaps the most important stage of the survey process is the creation of questions that accurately measure the opinions,
experiences and behaviors of the public. Questionnaire design is a multiple-stage process that requires attention to many
details at the same time. The effects of question wording are one of the least understood areas of questionnaire research.
Chapter XIV
Data Collection 128
Data collection is the process of gathering and measuring information on variables of interest, in an established
systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. Data
collection is to be considered an art as well as a science. It plays a crucial role in any research study. Selecting a
correct method of data collection keeps a research scholar on the right path and yields results of the desired quality.
Chapter XV
Statistical Tools for Research 138
Scholars frequently use statistics to analyze their results. Statistics can help understand a phenomenon by confirming or
rejecting a hypothesis. Statistics is vital to how knowledge is acquired and to most scientific theories. It is the scholar’s primary
responsibility to identify and use the statistical tools that suit the nature of the research study. It is not always
safe to rely entirely on statisticians.
Chapter XVI
Reliability & Validity 147
Measurement experts believe that every measurement device should possess certain qualities. Perhaps the two most
common technical concepts in measurement are reliability and validity. Any kind of assessment, whether traditional or
"authentic," must be developed in a way that gives the assessor accurate information about the performance of the
individual. This chapter provides an interesting discussion of reliability and validity.
Chapter XVII
Data Analysis 152
Data analysis is a body of methods that help to describe facts, detect patterns, develop explanations, and test
hypotheses. It is used in all of the sciences. It is used in business, in administration, and in policy. Data analysis is not
about numbers — it uses them.
Chapter XVIII
Findings of Research Study 160
The value of a scholar’s thesis will stand or fall on the validity and quality of the thesis findings. Identifying and finalizing
the findings is the most critical and significant stage of the thesis. The various components that should appear
in the research findings section are described, and some useful tips for developing good-quality findings are provided in
this chapter.
Chapter XIX
Structure of Thesis 163
This chapter addresses the problem/issues/difficulties involved in designing and structuring a PhD Thesis. A flexible five
chapter structure is comprehensively illustrated. A highly detailed sequence of a PhD thesis – chapterwise, sectionwise
and subsectionwise – is presented to put a research scholar at ease while drafting the thesis. It is a very interesting
chapter in the book.
Chapter XX
Research Discussion 171
The discussion section explains the scholar’s interpretation of the findings as they relate to the research problem already
investigated. This section is comprised of all new information and focuses on the implications of the findings in relation to
the overall scope of other research that has taken place.
Chapter XXI
Writing a Thesis 178
Research scholars encounter many pitfalls when writing a thesis. A well-written thesis is essentially a sustained analysis
of a research topic and even the most careful scholar can succumb to commonly made mistakes in a work of this
magnitude. This chapter discusses the planning of the writing process, the difficulties encountered and, mainly, the
commonly made mistakes in thesis writing, such as the danger of disorganization, the problem of writing a worthy
conclusion and the problem of writing an analytical literature review, and offers some strategies to overcome them.
Chapter XXII
Anatomy of an Abstract 185
An abstract is a condensed version of a longer piece of writing that highlights the major points covered, concisely
describes the content and scope of the writing, and reviews the writing's contents in abbreviated form. Many scholars
struggle to write a good abstract because they know that a poor abstract will wreck their whole thesis. Even if the whole
thesis is perfect, a mere lapse in the quality of the abstract will turn the reader away from the whole thesis.
Thus a scholar should put maximum effort into writing an outstanding thesis abstract, so that it earns
encouragement and constructive suggestions from readers.
Chapter XXIII
Endnotes 189
Endnotes are used: (1) to cite the source of statements quoted or closely paraphrased in the text, (2) to make additional
comments about some point of the text, or (3) to acknowledge someone else for an idea or argument. The quantity and
quality of citations noted in a research thesis reflect the seriousness and curiosity of research scholars. Evaluators or
examiners will definitely take note of the scholarship shown and appreciate the effort put in by the scholars.
Chapter XXIV
Research Conclusion 193
The conclusion should provide a restatement of the thesis, a summary of the author's conclusions, and perhaps a solution
to the problem, if this is the writer's intent. The conclusion of a thesis should close by summarizing everything that has
come before, explaining in simple terms the way in which the research study ended, relating it to the greater environment
of the world at large, and leaving the reader with the ability to draw his or her own conclusions from what you have
described.
Chapter XXV
Editing and Proofreading 198
Proofreading is the act of searching for errors before you hand in your final research thesis. Individualizing your
proofreading process to match weaknesses in your writing will help you proofread more efficiently and effectively.
Chapter XXVI
Writing an Annotated Bibliography 204
An annotated bibliography is a list of citations related to a particular subject area or theme that include a brief, usually not
more than 150 words, descriptive or evaluative summary. As a result, you are better prepared to develop your own point
of view and contributions to the literature. The format of an annotated bibliography can differ depending on its purpose
and the nature of the assignment. It may be arranged alphabetically by author or chronologically by publication date. Ask
your supervisor for specific guidelines in terms of length, focus, and type of annotation as cited.
Chapter XXVII
Research Results 210
The results section of the research paper is where you report the findings of the research study based upon the
information gathered as a result of the methodology [or methodologies] applied in the research study. The results section
should simply state the findings, without bias or interpretation, and arranged in a logical sequence. The results section
should always be written in the past tense. A section describing results is particularly necessary if the research includes
data generated from the study.
Chapter XXVIII
Defending a Thesis 214
The thesis defense or viva voce is like an oral examination in some ways. It is different in many ways, however. The chief
difference is that the candidate usually knows more about the syllabus than do the examiners. It would be a mistake,
however, to underestimate the examiners' knowledge of your subject. Think of your defense as a high-level professional
conversation about a topic of interest. No doctoral dissertation committee worthy of the name assembles for the sole
purpose of publicly humiliating a candidate; faculty members are in the business of supporting successful program
completion whenever possible. Have some confidence in yourself!
Chapter XXIX
Reading a Research Paper 222
The process of reading research papers effectively is challenging. Reading a research paper often requires a special
approach as well as skill. The aim of reading a research paper varies depending upon the needs of the reader. But it
has to be borne in mind, whatever the type of reader, that one has to be familiar with the standard format of a
research paper and has to keep the right perspective while reading the paper. At the end of the day, the author serves
the community provided the reader gets what he/she wants.
Chapter XXX
Evaluating a Research Paper 230
While research papers contribute to the community in general, a well-judged and well-balanced evaluation ensures the
quality of the paper and enriches its value and utility. This chapter discusses in detail the stages, intricacies
and strategies involved in evaluating a research paper, along with the purpose of evaluation and the benefits a scholar
gains from it.
Chapter XXXI
Journal Impact Paper 238
It has become mandatory for academic research scholars to publish a minimum number of research papers in peer-
reviewed national or international reputed journals having a reasonable Impact Factor. There have been many innovative
applications of journal impact factors. The impact factor can be used to provide a gross approximation of the prestige of
journals in which individuals have been published. This is best done in conjunction with other considerations such as peer
review, productivity, and subject specialty citation rates.
Chapter XXXII
Publishing a Research Paper 248
Publishing your work in a peer-reviewed journal is an indication of quality. Researchers need to submit their
articles for review by experts in the field before the article can be approved for publication in a peer-reviewed journal.
Many databases allow the scholar to restrict the search to peer-reviewed journals. The salient feature of this
chapter is that it provides a highly comprehensive set of guidelines [nearly 100] to enable a scholar to publish a paper of
academic quality.
Chapter XXXIII
Plagiarism 263
Plagiarism is the act of taking another person's writing, conversation, song, or even idea and passing it off as one’s
own. This includes information from web pages, books, songs, television shows, email messages, interviews, articles,
artworks or any other medium. Careful notetaking and a clear understanding of the rules for quoting, paraphrasing, and
summarizing sources can help prevent plagiarism.
Glossary 272
Appendices
There are a number of ways of thinking about it. The first thing that comes
immediately to the mind of many PhD students is that it is ‘a contribution to
knowledge’. Other elements to it are that it is: a license to teach in a university, a
signal of expertise and authority, a qualification, the highest degree that can be
awarded in a university.
In many senses, the dissemination of the research results is just as important as
the research activity itself.
There are many ways to disseminate research results, and the production of a
research thesis is one of them. Although a research thesis is a usual requirement
for academic degree programmes that include a research element, it is more
than an instrument for the assessment of the research scholar. It must be written
such that the results presented can be validated and can form the basis for further
research. Procedures adopted must be justified; claims and conclusions must be
supported by experiments or reasoned arguments and deductions. A research
thesis contains elements which distinguish it from other types of reports, and
because it is the culmination of several years of work, the publication can be
quite voluminous. Writing one therefore requires some thought, planning and
organisation.
For a research scholar, thesis writing is a very important aspect of one’s learning
life because passing the course depends on it. Therefore, one should be focused
while working on the thesis. One should make a systematic beginning by
examining which topic to present the thesis on. Then, one has to make sure that the topic
chosen for the research thesis is one that one can readily support. One has to do some
preliminary research on it. This way, a scholar will be able to find out whether the topic is
worth investing one’s time in or not.
When a research scholar is about to begin, writing a thesis seems a long, difficult
task. That is because it is a long, difficult task. Fortunately, it will seem less daunting
once one has a couple of chapters done. Towards the end, one will even find oneself
enjoying it: an enjoyment based on satisfaction in the achievement, pleasure in
the improvement in the technical writing, and of course the approaching end.
Despite the fact that universities have been assessing doctoral theses for many
years, there has been little research done on the processes involved in drafting a
PhD thesis. There are several scholarly books on ‘Research Methodology’ by
reputed authors in the market. Though they deal comprehensively with the
theoretical aspects of academic research and even offer some guidelines,
elaborate and down to earth guidelines are rare. Of course, research supervisor’s
guidance is always available to scholars. Obviously there is a gap between these
two known sources. This book, “A2Z PhD Thesis”, is a serious attempt
to fill the gap.
The set of tips intends to give some ideas and guidelines on how to go about
writing a research thesis. As one reads through this ‘A2Z PhD Thesis’ one will
probably notice that writing a thesis is not as daunting and hard as feared. This is
a general guide for all disciplines, but is most suitable for scholars in social and
behavioral sciences. The text is organized according to the stages of the
research-and-writing process as defined by the authors: preparation, choosing a
topic, collecting information, organizing information, interpreting results, and
presenting the finished product.
The special feature of the book is that it contains [a] practical guidelines for using
SPSS, [b] comprehensive guidelines to statistical tools using Excel, [c] a list of
online research resources and [d] Research 360°, comprising (i) web links to
more than 10 million online papers/articles, (ii) web links to more than 100,000
online journals, (iii) web links to nearly 8,000 online international libraries, (iv) web
links to more than 10,000 international research organizations, (v) web links to
several online TV channels, newspapers, etc., (vi) web links to all countries’ official
websites, WHO, UN, ILO, IMF, World Bank, etc., (vii) web links to online encyclopedias,
dictionaries, etc. and (viii) web links to useful tools for researchers, such as Citation
Style, Writing Skills, Report Writing, Research Methods, etc. All of these are made
available on an accompanying CD.
I congratulate the author on the effort of bringing out resource material for the
benefit of research scholars in the process of thesis writing. I hope
research scholars will make use of this book and become successful in their
research activities.
Preface
which relies on insight, lateral thinking, inspiration and a lot of hard work. Clearly
the purpose of this book is to help the scholar to set out to obtain a PhD.
The ability to conduct research in an area requires deep knowledge in that area,
knowledge about related areas, and the experience of working on research
problems, i.e. problems whose outcomes are not known. To develop these critical
abilities, most PhD programs have three components in them: some course work to
provide the breadth of knowledge, some methods to develop the depth of knowledge
in the chosen area of study, and a thesis that provides the experience of working on
research problems. A PhD is mostly a self-driven and self-taught degree, with
the supervisor gently aiding the process. The program and supervisor help mostly in
creating an atmosphere and environment in which the scholar gets motivated to excel.
Hence, while doing a PhD, the scholar should be self-motivated and committed, and
willing to work hard and long on problems. Research is often a lonely business and a PhD is
a preparation for a career in it. Research is a tough career, but with the tips
provided in this book it can become easier and more satisfying.
There are other things which look simple until a scholar stops and thinks about
them. For instance, how does one choose a topic, and how does one find a good
supervisor? The standard books give quite a lot of good advice about this, but there
will still be quite a lot of things that one is not sure about. So, what does one do
about this? One good step is to read the rest of this book at this point. The
main thing is that it gives a fair idea about which things matter, which things are
well understood and which things are comparatively peripheral. For instance, a brief
note is provided on academic writing as opposed to formal English (because most
scholars are pretty bad at it), and another on feeling lost. Similarly, not much is said
about statistics and experimental design, because these are comprehensively covered by
numerous excellent texts and training courses, so one should have no problem getting
access to them if they are needed for the research. Regarding citation style, a
brief guideline on the APA system is available in the appendix to give relief to
scholars.
In the fast-moving world of ICT, any written material becomes obsolete quickly. To
meet this eventuality, a CD accompanies the book; the CD contains exhaustive
information by way of links to various web sites covering almost all materials that
a scholar may require. For example, the CD contains [a] practical guidelines for
using SPSS, [b] comprehensive guidelines to statistical tools using Excel, [c] a list
of online research resources and [d] Research 360°, comprising links to online
libraries, international research organizations, journals, articles/papers, news
media, etc. These sources will be very helpful and handy to a research scholar,
enabling him/her to stay up to date. Any suggestions to improve the contents
and quality of the book are always welcome.
All the sources, like books, articles, papers, journals, electronic sources like
webpages, etc are duly acknowledged and readers, if necessary, may make use
of them.
A2Z
PhD
Thesis
Chapter I
THE ETHICS OF ACADEMIC RESEARCH
Introduction
Ethics are moral principles that guide behavior; in an academic environment, these
moral principles expand to become the standard rules of scholarly conduct. Academic
ethics involves such concepts as intellectual property, copyright, fair use, plagiarism,
censorship, freedom of speech, and the use of proprietary and non-proprietary
resources.
Trust is the foundation of scholarship in the academic fraternity. Innovation can continue
only in an atmosphere of confidence and fairness. A scholar must be able to trust that
colleagues are honest in presenting their research, and colleagues must be able to place the same trust in
the scholar’s work. The range of research subjects and methods, along with systems of
analysis and data presentation that guide each field, give rise to situations of great moral
complexity. Likewise, relationships between research scholars and supervisors, along
with great opportunity, carry important responsibilities and obligations. Scholars will
strengthen the foundation of trust within the fraternity by gaining knowledge of their fields
and committing themselves to cultivating collegial relationships.
Ethical Issues
It is always advantageous if the research scholar is aware of the ethical issues
that may arise in the course of academic research. The following is a summary of
those possible issues:
Designing research:
Researcher’s right to absence of gatekeeper coercion, Participant’s right to be fully informed,
Participant’s right to privacy, Sponsor’s/Participant’s right to Quality research
Collection of Data:
Researcher’s right to absence of sponsor coercion, Researcher’s right to safety, Participant’s
right to informed consent, Participant’s right to withdraw, Participant’s deception, Participant’s
right to confidentiality/anonymity, Organization’s right to confidentiality/anonymity, Sponsor’s /
Participant’s right to Quality research
Processing of Data:
Participants’ rights as individuals regarding the processing and storing of their personal data
Analysis of Data:
Researcher’s right to absence of sponsor coercion, Organization’s rights to confidentiality /
anonymity, Participant’s right to confidentiality / anonymity, Sponsor’s/Participant’s right to Quality
research
Primary research is conducted all of the time--journalists use it as their primary means of
reporting news and events; national polls and surveys discover what the population
thinks about a particular political figure or proposal; and companies collect data on their
consumer base and market trends. When conducting research in an academic or
professional setting, a scholar needs to be aware of the ethics behind the research
activity.
One should have the permission of the people one will be studying before conducting
research involving them.
Not all types of research require permission—for example, if you are interested in
analyzing something that is available publicly (such as in the case of commercials, public
message boards, etc) you do not necessarily need the permission of the authors.
One should not do anything that would cause physical or emotional harm to one’s
subjects. This could be something as simple as being careful about how one words sensitive
or difficult questions during the interviews.
Objectivity vs. subjectivity in the research is another important consideration. Be sure
one’s own personal biases and opinions do not get in the way of the research.
Many types of research, such as surveys or observations, should be conducted under the
assumption that one will keep the findings anonymous. Many interviews, however, are
not done under the condition of anonymity. One should let one’s subjects know whether
the research results will be anonymous or not.
When doing research, one should be sure not to take advantage of easy-to-access
groups of people (such as children at a daycare) simply because they are easy to
access. One should choose one’s subjects based on what would most benefit the
research.
Some types of research done in a university setting require Institutional Board Approval.
This means that the research has to be approved by an ethics review committee to make
sure the scholar is not violating any of the above considerations.
When reporting the results, the scholar should be sure to represent accurately
what was observed or what was told. Interview responses should not be taken out of context, and
small parts of observations should not be discussed without putting them into the
appropriate context.
Ethical Codes
Given the importance of ethics for the conduct of research, it should come as no surprise
that many different professional associations, government agencies, and universities
have adopted specific codes, rules, and policies relating to research ethics.
The following is a rough and general summary of some ethical principles that various
ethics committees usually address. [Adapted from Shamoo A and Resnik D. 2009. Responsible Conduct
of Research, 2nd ed. (New York: Oxford University Press)].
Honesty
Strive for honesty in all scientific communications. Honestly report data, results, methods and
procedures, and publication status. Do not fabricate, falsify, or misrepresent data. Do not deceive
colleagues, granting agencies, or the public.
Objectivity
Strive to avoid bias in experimental design, data analysis, data interpretation, peer review,
personnel decisions, grant writing, expert testimony, and other aspects of research where
objectivity is expected or required. Avoid or minimize bias or self-deception. Disclose personal or
financial interests that may affect research.
Integrity
Keep your promises and agreements; act with sincerity; strive for consistency of thought and
action.
Carefulness
Avoid careless errors and negligence; carefully and critically examine your own work and the
work of your peers. Keep good records of research activities, such as data collection, research
design, and correspondence with agencies or journals.
Openness
Share data, results, ideas, tools, resources. Be open to criticism and new ideas.
Confidentiality
Protect confidential communications, such as papers or grants submitted for publication,
personnel records, trade or military secrets, and patient records.
Responsible Publication
Publish in order to advance research and scholarship, not to advance just your own career.
Avoid wasteful and duplicative publication.
Responsible Mentoring
Help to educate, mentor, and advise students. Promote their welfare and allow them to make
their own decisions.
Non-Discrimination
Avoid discrimination against colleagues or students on the basis of sex, race, ethnicity, or other
factors that are not related to their scientific competence and integrity.
Competence
Maintain and improve your own professional competence and expertise through lifelong
education and learning; take steps to promote competence in science as a whole.
Legality
Know and obey relevant laws and institutional and governmental policies.
Animal Care
Show proper respect and care for animals when using them in research. Do not conduct
unnecessary or poorly designed animal experiments.
Ethical Dilemmas
There are many other activities that the government does not define as "misconduct" but
which are still regarded by most researchers as unethical. These are called "other
deviations" from acceptable research practices, and these situations create difficult
decisions for researchers, known as ethical dilemmas. The following is a list:
Publishing the same paper in two different journals without telling the editors
Submitting the same paper to different journals without telling the editors
Not informing a collaborator of your intent to file a patent in order to make sure that you are the sole inventor
Including a colleague as an author on a paper in return for a favor even though the colleague did not make a serious contribution to the paper
Discussing with your colleagues confidential data from a paper that you are reviewing for a journal
Trimming outliers from a data set without discussing your reasons in the paper
Using an inappropriate statistical technique in order to enhance the significance of your research
Bypassing the peer review process and announcing your results through a press conference without giving peers adequate information to review your work
Conducting a review of the literature that fails to acknowledge the contributions of other people in the field or relevant prior work
Stretching the truth on a grant application in order to convince reviewers that your project will make a significant contribution to the field
Stretching the truth on a job application or curriculum vitae
Giving the same research project to two graduate students in order to see who can do it the fastest
Overworking, neglecting, or exploiting graduate or post-doctoral students
Failing to keep good research records
Failing to maintain research data for a reasonable period of time
Making derogatory comments and personal attacks in your review of an author's submission
Promising a student a better grade for sexual favors
Using a racist epithet in the laboratory
Not reporting an adverse event in a human research experiment
Wasting animals in research
Exposing students and staff to biological risks in violation of your institution's biosafety rules
Rejecting a manuscript for publication without even reading it
Sabotaging someone's work
Stealing supplies, books, or data
Rigging an experiment so you know how it will turn out
Making unauthorized copies of data, papers, or computer programs
Owning more than a statutory amount of stock in a company that sponsors your research and not disclosing this financial interest
Deliberately overestimating the clinical significance of a new drug in order to obtain economic benefits
Finally, situations frequently arise in research in which different people disagree about
the proper course of action and there is no broad consensus about what should be done.
In these situations, there may be good arguments on both sides of the issue and
different ethical principles may conflict.
Conclusion
If "deviations" from ethical conduct occur in research as a result of ignorance or a failure
to reflect critically on problematic traditions, then a course in research ethics may help
reduce the rate of serious deviations by improving the researcher's understanding of
ethics and by sensitizing him or her to the issues. Finally, training in research ethics
should be able to help researchers grapple with ethical dilemmas by introducing
researchers to important concepts, tools, principles, and methods that can be useful in
resolving these dilemmas. Whistleblowing is one mechanism to help discover
misconduct in research. But apart from these, a supervisor has an ethical role in
advising and guiding the research scholar on these issues and practices, and in ensuring that
scholars do not engage in these highly unethical practices.
“Even the most rational approach to ethics is defenseless if there isn’t the will to do what is
right.”
- Aleksandr Solzhenitsyn
Reflections on Academic Research
A2Z
PhD
Thesis
Chapter II
Journey of a
PhD Thesis
JOURNEY OF A PhD THESIS
SOME TIPS
Introduction
Although every thesis is unique, they all aim to persuade the reader of one 'big idea'.
This central claim is otherwise referred to as the ‘thesis’; hence a research thesis is the
development of one central claim. This is reflected in research degree
requirements that call on candidates to demonstrate a ‘significant original contribution
to knowledge, and/or to the application of knowledge within the field of study’. When you
are about to begin, writing a thesis seems a long, difficult task. That is because it is a
long, difficult task. But it will seem less daunting once a couple of chapters are done.
Towards the end, you will even find yourself enjoying it, an enjoyment based on
satisfaction in the achievement, pleasure in the improvement in your technical writing,
and of course the approaching end. Like many other tasks, thesis writing usually seems
worst before you begin. Of course, every long journey starts with a single step.
Start Early
Begin working on your essay as soon as the assignment is given. Take advantage of the
time at your disposal to do your research and writing to meet the due date. If you wait
until the last minute, you may have difficulty finding library materials, particularly if other
students are researching the same topic, and you may be pressured by other
assignments.
Do a literature survey
Start the journey. Identify several sources that will help you gather
materials, such as research articles, papers, and journals. Go through these materials
seriously and thoroughly, and make a systematic list of them so that you can
refer back to any item during your research. This list will also be useful when you compile your
bibliography.
Focus on a field/area
Keeping in mind the guidelines your supervisor has set down for the assignment in terms
of length, subject matter, types of sources, etc., choose a topic you would be interested
in pursuing. Your next step is to verify at the library that there is sufficient material to
support your choice. If not, discard your topic and adopt a more realistic one.
Do not fall into the trap of selecting a topic that is so broad you would have to write a
book to do it justice. Limit your topic to one particular aspect that you will be able to treat
thoroughly within the prescribed limits of your thesis. Background reading in a general or
specialized encyclopedia will give you a clue as to the subject's natural limits and
divisions. The librarian can direct you to the encyclopedia that will be appropriate to your
particular needs.
Roughly organize your thoughts to produce an outline that will give direction to your
reading and note-taking. Take advantage of the library's varied resources: search the library
catalogue for books (including government publications) on your topic; consult the
databases to locate articles; and ask your supervisor and the librarian, who
may be able to direct you to other sources pertinent to your subject area.
For each source that you have consulted, be sure you have all the information
necessary to cite it in your bibliography. Accuracy at this stage will save you the trouble
of having to re-trace your steps when you are writing your final draft. For a book, mark
down the author, title, place of publication, publisher and copyright date. For an article
from a journal, take note of the author, title of the article, title of the journal, volume and
issue number, date and inclusive page numbers. For a Web document, take down the
author, title, date, URL (Web address) and date consulted.
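The record-keeping advice above can be turned into a simple checklist. The following is only a minimal sketch in Python: the field names, the SourceNote class, and the example entry are hypothetical illustrations, not part of any citation standard. It reports which details are still missing for a source so that they can be noted before the final draft.

```python
from dataclasses import dataclass, field

# Details to record for each kind of source, following the paragraph above.
# The field names themselves are illustrative assumptions.
REQUIRED_FIELDS = {
    "book": ["author", "title", "place", "publisher", "copyright_date"],
    "article": ["author", "article_title", "journal_title", "volume",
                "issue", "date", "pages"],
    "web": ["author", "title", "date", "url", "date_consulted"],
}


@dataclass
class SourceNote:
    kind: str                                    # "book", "article", or "web"
    details: dict = field(default_factory=dict)  # what has been noted so far

    def missing_fields(self):
        """Return the required details that are absent or empty."""
        return [name for name in REQUIRED_FIELDS[self.kind]
                if not self.details.get(name)]


# Made-up entry, for illustration only.
note = SourceNote("book", {"author": "Doe, J.",
                           "title": "An Example Title",
                           "publisher": "Example Press"})
print(note.missing_fields())  # ['place', 'copyright_date']
```

Running such a check each time a source is consulted is one way to avoid re-tracing steps later.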
Map out your approach by composing a detailed sentence outline. First, compose a
thesis statement. This one sentence statement is the most important one of your entire
research paper; so be sure to phrase it carefully. A thesis statement clearly
communicates the subject of your paper and the approach you are going to take to it. It
is the controlling factor to which all information that follows must relate. Secondly, group
and regroup your notes according to the various aspects of your topic until you find a
sequence that seems logical. This can serve as the basis for your outline.
In writing a rough draft you are striving for a flow of ideas. Write non-stop using your final
outline and organized notes as guides. Do not worry about correct spelling or
punctuation at this stage. Remember that the purpose of a rough draft is to see if you
have a logical progression of arguments and sufficient supporting material.
Make the necessary adjustments until you are satisfied your statements flow logically
and your ideas have been fully presented in clear, concise prose. You may need to
review your documentation if some sections of your text need further development.
A bibliography is a listing in alphabetical order according to the author's last name of all
the sources you consulted in preparing your research paper. It is presented on a
separate page at the end and is set up according to a standard format that you will find
described in most style manuals. Examples for the most commonly used citation styles
(APA, MLA, Chicago) are available. RefWorks is a Web-based tool that helps organize
the references you find and prepares a bibliography automatically.
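For a scholar who keeps references in a simple list, the alphabetical ordering described above can be sketched in a few lines of Python. The entries below are made-up placeholders, and the output format is only a loose author-date layout, not the exact rules of APA, MLA, or Chicago; a style manual remains the authority.

```python
# Each entry: (author last name, initials, year, title). Placeholder data only.
entries = [
    ("Zeller", "A.", 2001, "Placeholder title three"),
    ("adams", "B.", 1999, "Placeholder title one"),
    ("Moore", "C.", 2010, "Placeholder title two"),
]


def sort_key(entry):
    """Order by last name (ignoring case), then initials, then year."""
    last, initials, year, _title = entry
    return (last.casefold(), initials.casefold(), year)


def format_entry(entry):
    """Render one entry in a loose author-date layout (an assumption)."""
    last, initials, year, title = entry
    return f"{last}, {initials} ({year}). {title}."


bibliography = [format_entry(e) for e in sorted(entries, key=sort_key)]
for line in bibliography:
    print(line)
```

Sorting with a case-insensitive key keeps an entry typed in lower case from being pushed to the end of the list.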
You are now ready to focus primarily on the style of your essay rather than the content.
Make use of:
Time required
It is strongly recommended that you sit down with your supervisor and make up a timetable
for writing the thesis: a list of dates for when you will give the first and second drafts of each
chapter to your supervisor. This structures your time and provides intermediate targets.
If you merely aim "to have the whole thing done by some distant date", you can deceive
yourself and procrastinate more easily. If you have told your supervisor that you will
deliver a first draft of chapter 2 on Friday, it focuses your attention. You may want to
make your timetable into a chart with items that you can check off as you have finished
them. This is particularly useful towards the end of the thesis when you find there will be
quite a few loose ends here and there.
How much time might it take? Let us hear what Chinneck (1999) has to say: “Longer than
you think. Even after the research itself is all done ‐‐ models built, calculations complete
‐‐ it is wise to allow at least one complete term for writing the thesis. It's not the physical
act of typing that takes so long, it's the fact that writing the thesis requires the complete
organization of your arguments and results. It's during this formalization of your results
into a well‐organized thesis document capable of withstanding the scrutiny of expert
examiners that you discover weaknesses. It's fixing those weaknesses that takes time.”
In general, after completing the preliminary administrative formalities, which
may take up to approximately 18 months, a scholar should allow 18 to 24 months for the research and writing itself.
As this suggestion is time-tested, a scholar may take it as a basis and proceed
accordingly. The following is a breakdown of the phases of the project and the time to be
allotted for each one:
(1) Literature Review: The literature review is in many ways the most difficult and time
consuming part of the thesis project. It is also the most important. The review of the literature
provides the context for your thesis project. You will be building on previous researchers’ work so
it is important that you be thoroughly familiar with it. The review of the literature provides your
hypothesis, your methodology, and your context for analysis and interpretation. Therefore you
should spend considerable time on this part of your project. You should initially allot at least four
to six months for this part of your research. However you should also realize that the literature
review will continue for the duration of the project.
(2) Data Collection: Depending on where you will be doing your data collection and whether you
will be doing it full‐time or part‐time, the data collection phase of your project will take between
three and six months. It is very important that you build in enough time to go back and redo some
of your data collection. Most researchers find that their expertise changes over the course of data
collection and you will need to go back and recheck the data that you initially collected.
(3) Data Analysis: Do not underestimate the time you allot here. Learning a statistics package
takes time. Researching the appropriate statistics and learning how to use and apply them to
your data takes time. Plan on at least three months for this phase of your analysis.
(4) Preparation of the Thesis Drafts: You will be writing in drafts. You should count on writing at
least three to four complete drafts before your thesis is complete. As a general guideline, from
first draft to final draft you should count on at least six months.
(5) Final Submission: Once the final draft is completed, it has to be fine-tuned and printed, the required
number of copies made, and the thesis finally submitted in compliance with the administrative formalities. This
may take another two months.
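The figures in the five phases above can be added up to check them against the suggested 18 to 24 months. The sketch below, in Python, makes two simplifying assumptions that the text does not state: the phases are treated as strictly sequential (although the literature review in fact continues throughout), and where the text gives only a lower bound or a single figure, that figure is used for both the minimum and the maximum.

```python
# (phase, minimum months, maximum months), taken from the list above.
# A single figure is used for both ends where the text gives only one.
PHASES = [
    ("Literature review", 4, 6),
    ("Data collection", 3, 6),
    ("Data analysis", 3, 3),
    ("Thesis drafts", 6, 6),
    ("Final submission", 2, 2),
]


def total_months(phases):
    """Return the (minimum, maximum) total duration in months."""
    low = sum(minimum for _name, minimum, _maximum in phases)
    high = sum(maximum for _name, _minimum, maximum in phases)
    return low, high


def milestones(phases):
    """Month by which each phase ends, using the minimum durations."""
    elapsed = 0
    result = []
    for name, minimum, _maximum in phases:
        elapsed += minimum
        result.append((name, elapsed))
    return result


print(total_months(PHASES))  # (18, 23)
for name, month in milestones(PHASES):
    print(f"{name}: done by month {month}")
```

The total of 18 to 23 months agrees with the recommended allowance, and the milestone list can serve as a first version of the timetable to be agreed with the supervisor.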
Iterative solution
Whenever you sit down to write, it is very important to write something. So write
something, even if it is just a set of notes or a few paragraphs of text that you would
never show to anyone else. Most of us find it easier, however, to improve something that
is already written than to produce text from nothing. So put down a draft (as rough as
you like) for your own purposes, then improve it with the help of your supervisor.
Word-processors are wonderful in this regard: in the first draft you do not have to start at
the beginning, you can leave gaps, you can put in little notes to yourself, and then you
can fill the gaps with relevant materials later.
Your supervisor will want your thesis to be as good as possible, because his/her
reputation as well as yours is affected. Scientific writing is a difficult art, and it takes a
while to learn. As a consequence, there will be many ways in which your first draft can
be improved. So take a positive attitude to all the scribbles with which your supervisor
decorates your text: each comment tells you a way in which you can make your thesis
better.
The process of writing the thesis is like a course in scientific writing, and in that sense
each chapter is like an assignment in which you are taught, but not assessed.
Remember, only the final draft is assessed: the more comments your supervisor adds to
the first or second draft, the better. Before you submit a draft to your supervisor, run a spell
check so that he/she does not waste time on spelling errors. If you have any characteristic
grammatical failings, check for them.
What is a thesis?
Your thesis is a research report. The report concerns a problem or series of problems in
your area of research and it should describe what was known about it previously, what
you did towards solving it, what you think your results mean, and where or how further
progress in the field can be made. The readers of a thesis do not know what the
"answer" is. If the thesis is for a PhD, the university requires that it makes an original
contribution to human knowledge: your research must discover something hitherto
unknown.
Obviously your examiners will read the thesis. They will be experts in the general field of
your thesis but, on the exact topic of your thesis, you are the world expert. Keep this in
mind: you should write to make the topic clear to a reader who has not spent most of the
last three years thinking about it. Your thesis will also be used as a scientific report and
consulted by future workers in your laboratory who will want to know, in detail, what you
did. Theses are occasionally consulted by people from other institutions, and the library
sends microfilm versions if requested (yes, still). More commonly theses are now stored
in an entirely digital form. These may be stored as .pdf files on a server at your
university. The advantage is that your thesis can be consulted much more easily by
researchers around the world.
Practical Tips
A research scholar should take all earnest efforts and steps in preparing/drafting a
thesis. Each stage, and each minor issue/ point have to be minutely taken care of and
meticulously attended to. Throughout the drafting of a thesis, serious thinking in depth
and breadth will help a scholar succeed and derive complete satisfaction. The following
are some of the points one should bear in mind:
1. A thesis is a hypothesis or conjecture.
2. A PhD dissertation is a lengthy, formal document that argues in defense of a particular
thesis.
3. Two important adjectives used to describe a dissertation are "original" and "substantial."
The research performed to support a thesis must be both, and the dissertation must show
it to be so. In particular, a dissertation highlights original contributions.
4. The scientific method means starting with a hypothesis and then collecting evidence to
support or deny it. Before one can write a dissertation defending a particular thesis, one
must collect evidence that supports it. Thus, the most difficult aspect of writing a
dissertation consists of organizing the evidence and associated discussions into a
coherent form.
5. The essence of a dissertation is critical thinking, not experimental data. Analysis and
concepts form the heart of the work.
6. A dissertation concentrates on principles: it states the lessons learned, and not merely
the facts behind them.
7. In general, every statement in a dissertation must be supported either by a reference to
published scientific literature or by original work. Moreover, a dissertation does not repeat
the details of critical thinking and analysis found in published sources; it uses the results
as fact and refers the reader to the source for further details.
8. Each sentence in a dissertation must be complete and correct in a grammatical sense.
Moreover, a dissertation must satisfy the stringent rules of formal grammar. Indeed, the
writing in a dissertation must be crystal clear. Shades of meaning matter; the terminology
and prose must make fine distinctions. The words must convey exactly the meaning
intended, nothing more and nothing less.
9. Each statement in a dissertation must be correct and defensible in a logical and scientific
sense. Moreover, the discussions in a dissertation must satisfy the most stringent rules of
logic applied to mathematics and science.
Conclusion
Remember the following phrase: "No one will ever read your thesis." You'll hear this
phrase a number of times as you finish up, and it's vitally important that you believe it to
be true. The phrase is important because without it you would be tempted to work on
your thesis until everything is perfect, and you would never finish. Writing a thesis is
tough work. It is often said: "You should tell everyone that it's going to be unpleasant,
that it will mess up their lives, that they will have to give up their friends and their social
lives for a while. It's a tough period for almost every student." By the way, there is a key
to success: practice. No one ever learned to write by reading essays like this. Instead,
you need to practise, practise. Every day. On behalf of scholars everywhere, I wish you
good luck!
Some Guidelines for Writing a Thesis
The following are the possible stages involved in the process of pursuing a research
study leading to PhD degree. A research scholar may have to undergo almost all stages,
though situations vary for each scholar. It is better for a scholar to know all these stages,
their importance and intricacies well before he/she becomes involved in full-fledged
research activities. One should have a pre-research discussion with the
supervisor.
[C] Writing the Thesis
“It is better to deserve honors and not have them than to have them
and not to deserve them.” - Mark Twain
Chapter III
Research Proposal
RESEARCH PROPOSAL
A PhD thesis proposal is an extremely important document, and much thought and
planning should go into crafting this document. Its immediate purpose is to secure the
agreement of your thesis committee to allow you to pursue your research, but it has very
long-term implications: not only will the research take several years to complete, it will
have a major impact on your search for grant funding, post-doctoral positions, and the
competition for a tenure-track position. Writing a clear and effective thesis proposal is
the first step you will take on your way to a career in research.
1
“The introduction is the part of the paper that provides readers with the
background information for the research reported in the paper. Its purpose is to
establish a framework for the research, so that readers can understand how it is
related to other research” (Wilkinson, 1991, p. 96).
2
In an introduction, the writer should
4
Theories, theoretical frameworks, and lines of inquiry may be differently handled
in quantitative and qualitative endeavours.
(a) “In quantitative studies, one uses theory deductively and places it toward the
beginning of the plan for a study. The objective is to test or verify theory. One thus
begins the study advancing a theory, collects data to test it, and reflects on whether
the theory was confirmed or disconfirmed by the results in the study. The theory
becomes a framework for the entire study, an organizing model for the research
questions or hypotheses for the data collection procedure” (Creswell, 1994, pp. 87-
88).
(b) In qualitative inquiry, the use of theory and of a line of inquiry depends on the nature
of the investigation. In studies aiming at “grounded theory,” for example, theory and
theoretical tenets emerge from findings. Much qualitative inquiry, however, also aims
to test or verify theory, hence in these cases the theoretical framework, as in
quantitative efforts, should be identified and discussed early on.
1
“The problem statement describes the context for the study and it also identifies
the general analysis approach” (Wiersma, 1995, p. 404).
2
“A problem might be defined as the issue that exists in the literature, theory, or
practice that leads to a need for the study” (Creswell, 1994, p. 50).
3.
It is important in a proposal that the problem stand out—that the reader can
easily recognize it. Sometimes, obscure and poorly formulated problems are
masked in an extended discussion. In such cases, reviewers and/or committee
members will have difficulty recognizing the problem.
4.
A problem statement should be presented within a context, and that context
should be provided and briefly explained, including a discussion of
the conceptual or theoretical framework in which it is embedded. Clearly and
succinctly identify and explain the problem within the framework of the theory or
line of inquiry that undergirds the study. This is of major importance in nearly all
proposals and requires careful attention. It is a key element that associations
such as AERA and APA look for in proposals. It is essential in all quantitative
research and much qualitative research.
5.
State the problem in terms intelligible to someone who is generally sophisticated
but who is relatively uninformed in the area of your investigation.
6.
Effective problem statements answer the question “Why does this research need
to be conducted?” If a researcher is unable to answer this question clearly and
succinctly, and without resorting to hyperspeaking (i.e., focusing on problems of
macro or global proportions that certainly will not be informed or alleviated by the
study), then the statement of the problem will come off as ambiguous and diffuse.
7.
For conference proposals, the statement of the problem is generally incorporated
into the introduction; academic proposals for theses or dissertations should have
this as a separate section.
1
“The purpose statement should provide a specific and accurate synopsis of the
overall purpose of the study” (Locke, Spirduso, & Silverman, 1987, p. 5). If the
purpose is not clear to the writer, it cannot be clear to the reader.
2.
Briefly define and delimit the specific area of the research. You will revisit this in
greater detail in a later section.
3
Foreshadow the hypotheses to be tested or the questions to be raised, as well as
the significance of the study. These will require specific elaboration in subsequent
sections.
4
The purpose statement can also incorporate the rationale for the study. Some
committees prefer that the purpose and rationale be provided in separate
sections, however.
5
Key points to keep in mind when preparing a purpose statement.
(a) Try to incorporate a sentence that begins with “The purpose of this study is ...”
This will clarify your own mind as to the purpose and it will inform the reader directly
and explicitly.
(b) Clearly identify and define the central concepts or ideas of the study. Some
supervisors prefer a separate section to this end. When defining terms, make a judicious choice
between using descriptive or operational definitions.
(c) Identify the specific method of inquiry to be used.
(d) Identify the unit of analysis in the study.
1
“The review of the literature provides the background and context for the
research problem. It should establish the need for the research and indicate that
the writer is knowledgeable about the area” (Wiersma, 1995, p. 406).
2
The literature review accomplishes several important things.
(a) It shares with the reader the results of other studies that are closely related to the
study being reported (Fraenkel & Wallen, 1990).
(b) It relates a study to the larger, ongoing dialogue in the literature about a topic, filling
in gaps and extending prior studies (Marshall & Rossman, 1989).
(c) It provides a framework for establishing the importance of the study, as well as a
benchmark for comparing the results of a study with other findings.
(d) It “frames” the problem earlier identified.
3
Demonstrate to the reader that you have a comprehensive grasp of the field and
are aware of important recent substantive and methodological developments.
4
Delineate the “jumping-off place” for your study. How will your study refine,
revise, or extend what is now known?
5
Avoid statements that imply that little has been done in the area or that what has
been done is too extensive to permit easy summary. Statements of this sort are
usually taken as indications that the writer is not really familiar with the literature.
6
In a proposal, the literature review is generally brief and to the point. Be
judicious in your choice of exemplars—the literature selected should be pertinent
and relevant (APA, 2009). Select and reference only the more appropriate
citations. Make key points clearly and succinctly.
7
Doctoral Committees may want a section outlining your search strategy—the
procedures you used and sources you investigated (e.g., databases, journals,
test banks, experts in the field) to compile your literature review. Check with your
supervisor.
1
Questions are relevant to normative or census type research (How many of them
are there? Is there a relationship between them?). They are most often used in
qualitative inquiry, although their use in quantitative inquiry is becoming more
prominent. Hypotheses are relevant to theoretical research and are typically used
only in quantitative inquiry. When a writer states hypotheses, the reader is
entitled to have an exposition of the theory that led to them (and of the
assumptions underlying the theory). Just as conclusions must be grounded in the
data, hypotheses must be grounded in the theoretical framework.
2
A research question poses a relationship between two or more variables but
phrases the relationship as a question; a hypothesis represents a declarative
statement of the relations between two or more variables (Kerlinger, 1979;
Krathwohl, 1988).
3
Deciding whether to use questions or hypotheses depends on factors such as
the purpose of the study, the nature of the design and methodology, and the
audience of the research (at times even the taste and preference of committee
members, particularly the Chair).
4
The practice of using hypotheses was derived from using the scientific method in
social science inquiry. They have philosophical advantages in statistical testing,
as researchers should be and tend to be conservative and cautious in their
statements of conclusions (Armstrong, 1974).
5
Hypotheses can be couched in four kinds of statements.
(a) Literary null—a “no difference” form in terms of theoretical constructs. For
example, “There is no relationship between support services and academic
persistence of nontraditional-aged college women.” Or, “There is no difference in
school achievement for high and low self-regulated students.”
(b) Operational null—a “no difference” form in terms of the operation required to test
the hypothesis. For example, “There is no relationship between the number of
hours nontraditional-aged college women use the student union and their
persistence at the college after their freshman year.” Or, “There is no difference
between the mean grade point averages achieved by students in the upper and
lower quartiles of the distribution of the Self-regulated Inventory.” The operational
null is generally the preferred form of hypothesis-writing.
(c) Literary alternative—a form that states the hypothesis you will accept if the null
hypothesis is rejected, stated in terms of theoretical constructs. In other words,
this is usually what you hope the results will show. For example, “The more that
nontraditional-aged women use support services, the more they will persist
academically.” Or, “High self-regulated students will achieve more in their classes
than low self-regulated students.”
6
In general, the null hypothesis is used if theory/literature does not suggest a
hypothesized relationship between the variables under investigation; the
alternative is generally reserved for situations in which theory/research suggests
a relationship or directional interplay.
7
Be prepared to interpret any possible outcomes with respect to the questions or
hypotheses. It will be helpful if you visualize in your mind's eye the tables (or
other summary devices) that you expect to result from your research (Guba,
1961).
8
Questions and hypotheses are testable propositions deduced and directly
derived from theory (except in grounded theory studies and similar types of
qualitative inquiry).
9
Make a clear and careful distinction between the dependent and independent
variables and be certain they are clear to the reader. Be excruciatingly consistent
in your use of terms. If appropriate, use the same pattern of wording and word
order in all hypotheses.
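To make the operational null form concrete, consider the example given earlier: no difference between the mean grade point averages of students in the upper and lower quartiles. The Python sketch below tests such a null with an exact permutation test. This choice of test is an assumption made for illustration (the text does not prescribe a procedure), and the grade point averages are invented numbers, not data from any study.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation test for a difference in means.

    Under the null of no difference, group labels are interchangeable,
    so every way of splitting the pooled values is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Invented grade point averages, for illustration only.
upper_quartile = [3.6, 3.8, 3.5, 3.9, 3.7]
lower_quartile = [2.9, 3.1, 2.8, 3.0, 3.2]

p_value = exact_permutation_p(upper_quartile, lower_quartile)
print(p_value)  # 2/252, about 0.008
```

With these invented values only 2 of the 252 possible splits are as extreme as the observed one, so the operational null would be rejected at the conventional 0.05 level. For real samples of realistic size, a statistics package and a test chosen with the supervisor would be the normal route.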
1
“The methods or procedures section is really the heart of the research proposal.
The activities should be described with as much detail as possible, and the
continuity between them should be apparent” (Wiersma, 1995, p. 409).
2
Indicate the methodological steps you will take to answer every question or to
test every hypothesis illustrated in the Questions/Hypotheses section.
3
All research is plagued by the presence of confounding variables (the noise that
covers up the information you would like to have). Confounding variables should
be minimized by various kinds of controls or be estimated and taken into account
by randomization processes (Guba, 1961). In the design section, indicate
(a) the variables you propose to control and how you propose to control them,
experimentally or statistically, and
(b) the variables you propose to randomize, and the nature of the randomizing
unit (students, grades, schools, etc.).
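The idea of a randomizing unit can be illustrated briefly. In the Python sketch below, schools are taken as the unit and are assigned at random to two conditions in equal numbers. The school names, the condition labels, and the fixed seed are all hypothetical choices made for illustration; a fixed seed is used only so that the assignment can be reproduced and reported.

```python
import random


def randomize_units(units, conditions, seed=None):
    """Randomly assign each unit to one condition, in near-equal numbers."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {condition: [] for condition in conditions}
    # Deal the shuffled units out to the conditions in turn.
    for position, unit in enumerate(shuffled):
        condition = conditions[position % len(conditions)]
        assignment[condition].append(unit)
    return assignment


# Hypothetical randomizing units: eight schools.
schools = [f"school_{number}" for number in range(1, 9)]
groups = randomize_units(schools, ["treatment", "control"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because whole schools are assigned, every student in a school receives the same condition, which is what it means for the school, rather than the student, to be the randomizing unit.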
4
Be aware of possible sources of error to which your design exposes you. You will
not produce a perfect, error free design (no one can). However, you should
anticipate possible sources of error and attempt to overcome them or take them
into account in your analysis. Moreover, you should disclose to the reader the
sources you have identified and what efforts you have made to account for them.
5
Sampling
(a) The key reason for being concerned with sampling is that of validity—the
extent to which the interpretations of the results of the study follow from the
study itself and the extent to which results may be generalized to other
situations with other people (Shavelson, 1988).
(c) Another reason for being concerned with sampling is that of internal
validity—
the extent to which the outcomes of a study result from the variables that
were manipulated, measured, or selected rather than from other variables
not systematically treated. Without probability sampling, error estimates
cannot be constructed (Shavelson, 1988).
(d) Perhaps the key word in sampling is representative. One must ask oneself,
“How representative is the sample of the survey population (the group from
which the sample is selected) and how representative is the survey
population of the target population (the larger group to which we wish to
generalize)?”
(f) If available, outline the characteristics of the sample (by gender, race /
ethnicity, socioeconomic status, or other relevant group membership).
(g) Detail procedures to follow to obtain informed consent and ensure anonymity
and/or confidentiality.
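One common way of pursuing a representative sample is proportional stratified random sampling, in which each subgroup is sampled in proportion to its share of the survey population. The text above does not prescribe this technique; the sketch below is only one possible approach. The sampling frame, the gender field, the 20 percent sampling fraction, and the helper name stratified_sample are all invented for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum of the sampling frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in frame:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        # At least one person per stratum, so small groups are never dropped.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical survey population of 100 people: 60 women and 40 men.
frame = [{"id": i, "gender": "F" if i % 5 < 3 else "M"} for i in range(100)]
sample = stratified_sample(frame, lambda p: p["gender"], fraction=0.2, seed=7)

counts = defaultdict(int)
for person in sample:
    counts[person["gender"]] += 1
print(len(sample), dict(counts))
```

With these invented numbers the sample of 20 keeps the 60/40 split of the survey population (12 women and 8 men). Whether the survey population itself represents the target population remains a separate question that the proposal must still address.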
6. Instrumentation
(a) Outline the instruments you propose to use (surveys, scales, interview
protocols, observation grids). If instruments have previously been
used, identify previous studies and findings related to reliability and
validity. If instruments have not previously been used, outline
procedures you will follow to develop and test their reliability and
validity. In the latter case, a pilot study is nearly essential.
7. Data Collection
(a) Outline the general plan for collecting the data. This may include survey
administration procedures, interview or observation procedures. Include an
explicit statement covering the field controls to be employed. If appropriate,
discuss how you obtained entrée.
(b) Provide a general outline of the time schedule you expect to follow.
8. Data Analysis
(a) Specify the procedures you will use, and label them accurately (e.g., ANOVA,
MANCOVA, HLM, ethnography, case study, grounded theory). If coding
procedures are to be used, describe in reasonable detail. If you triangulated,
carefully explain how you went about it. Communicate your precise intentions
and reasons for these intentions to the reader. This helps you and the reader
evaluate the choices you made and procedures you followed.
(b) Indicate briefly any analytic tools you will have available and expect to use
(e.g., Ethnograph, NUDIST, AQUAD, SAS, SPSS, SYSTAT).
(c) Provide a well thought-out rationale for your decision to use the design,
methodology, and analyses you have selected.
1. A limitation identifies potential weaknesses of the study. Think about your
analysis, the nature of self-report, your instruments, the sample. Think about
threats to internal validity that may have been impossible to avoid or minimize—
explain.
2. A delimitation addresses how a study will be narrowed in scope, that is, how it is
bounded. This is the place to explain the things that you are not doing and why
you have chosen not to do them—the literature you will not review (and why not),
the population you are not studying (and why not), the methodological
procedures you will not use (and why you will not use them). Limit your
delimitations to the things that a reader might reasonably expect you to do but
that you, for clearly explained reasons, have decided not to do.
1. Indicate how your research will refine, revise, or extend existing knowledge in the
area under investigation. Note that such refinements, revisions, or extensions
may have either substantive, theoretical, or methodological significance. Think
pragmatically (i.e., cash value).
2. Most studies have two potential audiences: practitioners and professional peers.
Statements relating the research to both groups are in order.
3. This can be a difficult section to write. Think about implications: how results of
the study may affect scholarly research, theory, practice, educational
interventions, curricula, counseling, policy.
4. When thinking about the significance of your study, ask yourself the following
questions.
i What will results mean to the theoretical framework that framed the study?
ii What suggestions for subsequent research arise from the findings?
iii What will the results mean to the practicing educator?
iv Will results influence programs, methods, and/or interventions?
v Will results contribute to the solution of educational problems?
vi Will results influence educational policy decisions?
vii What will be improved or changed as a result of the proposed research?
viii How will results of the study be implemented, and what innovations will
come about?
I : References
1. Follow APA (2009) guidelines regarding use of references in text and in the
reference list. This is the requirement of the Department of Management Studies,
Dr MGR University.
2. Only references cited in the text are included in the reference list; however,
exceptions can be found to this rule. For example, committees may require
evidence that you are familiar with a broader spectrum of literature than that
immediately relevant to your research. In such instances, the reference list may
be called a bibliography.
3. Some committees require that reference lists and/or bibliographies be
“annotated,” which is to say that each entry be accompanied by a brief
description, or an abstract. Check with your Supervisor.
J : Appendices
The following materials are appropriate for an appendix. Consult with your
Supervisor.
1. Describe the theoretical framework for the dissertation. This section describes the foundations of
the research, especially if the thesis is heavily indebted to a particular approach to a topic, or if it
tests the validity of a given theory.
2. Describe the research problem itself, placed in this theoretical framework. You may choose to
include the principal studies that are relevant to your research proposal, although a fuller
literature review will be included below.
3. Describe the hypothesis. This is basically your statement of what you believe the research might
indicate. You should have some preliminary basis for this statement, in order to demonstrate that
a more thorough investigation is merited.
4. Describe the purpose of the study. This is perhaps the most important section of the PhD thesis
proposal: why should this study proceed? A dissertation at the doctoral level is intended to add to
the body of world knowledge; will this study achieve that goal? You should show how it will
change the direction of current research, confirm current research in a novel manner, or in some
other way contribute positively to what we know about the world.
5. Demonstrate a solid grasp of the parameters of the literature to be reviewed. This need not be
exhaustive for the proposal stage of the dissertation, but the major sources should be identified.
6. Describe the methodology of the study. This is especially important for quantitative proposals in
the sciences and social sciences.
7. Describe the limitations of the study. It may seem like your dissertation is incredibly huge in
scope, but every research project has its self-defined limitations.
8. Conclude with the significance of the proposal. This should be a briefer statement of the ideas
described earlier in the section describing the purpose of the study, but with a broader focus. Why
would the conclusions of this dissertation matter? Would they deepen our understanding of a
topic in a major way, or would they lead to some material benefit for humanity?
9. Attach any relevant appendices, including a preliminary bibliography, samples of survey
instruments, and the like.
10. Plan the proposal meeting well. If graphic presentations are necessary to help the committee
understand your work, prepare them carefully so that they look good. A well-planned meeting will help
your committee understand that you are prepared to move forward with well-planned research.
Your presentation style at the meeting should not belittle your committee members (make it
sound like you know they have read your proposal) but you should not assume too much (go
through each of the details with an assumption that maybe one of the members skipped over that
section).
“Let not our proposal be disregarded on the score of our youth.” - Virgil
Chapter IV
SELECTING A SUPERVISOR
Introduction
Eggleston and Delamont (1983) found that the matching of student to supervisor for
effective relationships is crucially important. The question that arises is how this
match between student and supervisor can be made. In a doctoral level program, the
student chooses a supervisor and has to develop a relationship with this individual. This
relationship is different in many ways from the relationships that students have had with
the lecturers who delivered most of the courses. For example, research students do
need guidance, but they also need to develop sufficient autonomy and freedom to
design and execute their own projects (Cornwall, Schmithals, & Jaques, 1977; Harding,
1973). Clearly, there are several qualities that a student expects to see in her research
supervisor, all of which may or may not be of equal significance to the student.
Consequently, the process of selection of the supervisor becomes one of the critical
factors in determining the degree of fit between the student and her supervisor.
Interviewing a Research Supervisor
Professors enjoy talking to prospective students about their research, and this process is
an excellent opportunity to meet the faculty and to discover their current research
interests. Before you talk to each one, read their selected publications again and think of
the questions you would like to ask them. Some important questions you should ask
everyone you interview are:
Element: Description
Freedom to work: The professor is open to ideas and is flexible about adopting alternative approaches.
Time conscious: The professor is conscious of the time taken for completion and is generally willing to work towards it.
Job prospect: The professor's ability to help the candidate obtain a suitable job after completion of the thesis.
Convergence of interest: The match between the interests of the student and the professor.
Reputation/Subject knowledge/Publications: The reputation of the professor in his or her field.
Personal relationship with the professor: A cordial and understanding relationship with the professor.
Social networks: The professor's social network and relationships with other professors inside and outside the institute.
Can take a stand: The extent to which the professor will support the student in contentious situations and defend his or her stand once it has been agreed upon.
Number of theses guided: The number of theses guided by the professor; the more the better.
Commitment and involvement: The professor's enthusiasm in guiding the thesis.
Support
Supportiveness is the quality that PhD students value most highly in supervisors.
This involves supervisors being encouraging, mentoring, and aware that students'
lives extend beyond the PhD. Supportive supervisors make an effort to understand
how the student prefers to work. In addition, such supervisors attend to the student
as a whole person, rather than purely as a research student.
Availability
Students value availability in their supervisors. This involves supervisors meeting
with students regularly, setting aside adequate time for students, and being
contactable through several media (e.g., email, phone) – particularly if they are not
physically present.
Knowledge and Expertise in the Field Surrounding the PhD
Ideal supervisors are those who have expertise in the field surrounding the
student's research. Students value highly a supervisor who can use their
knowledge of the area to understand and demonstrate how the student's research
topic fits within the wider field. Students do not necessarily expect the supervisor to
have expertise in the precise topic of their research, however. Having a supervisor
with expertise in the methodologies required in their research is particularly
important.
Good Communication
Ideal supervisors have good communication skills. In particular: good listening
skills; the tendency to maintain an open dialogue about the project, its progress and
problems; the ability to communicate in an open, honest, and fair manner about
issues that arise as they arise; and making expectations clear with regard to
matters such as the process of completing a PhD or Master's thesis, budget
considerations, and the role each party must play in performing the project
research.
Constructive Feedback
Students see an ideal supervisor as one who provides feedback and criticism of
their work that is constructive and prompt. In addition, students value consistency in
the feedback given. Some value consistency across time. This is often a sign that
the supervisor and student share the same focus regarding the project. In addition,
where more than one supervisor is responsible for providing feedback, consistency
between supervisors is important.
Poor Feedback
Feedback which conflicts with previous feedback given, too little feedback, delayed and
infrequent feedback, illegible feedback, and too much negative feedback relative to
encouraging and positive comments are all problematic issues for students.
Personality Clashes
Students find clashes of personality with their supervisors to be problematic for all
concerned. The majority of students saw a personality clash as the reason most likely to
drive them to abandon their studies or to change supervisors.
Conclusion
do circulate of institutions which need fees so badly that they accept research students
without giving the designated supervisors the opportunity to have an input into the
decision process.
Chapter V
FINALIZING THE TOPIC
“It is really important to do the right research as well as to do the research right.”
A topic is the major organizing principle guiding the analysis of any research study.
Topics offer the scholar an occasion for writing and a focus which governs what is likely
to be said. Topics represent the core subject matter of scholarly communication and the
means by which the scholar arrives at other possible topics of research and discovers
new knowledge.
Thinking early leads to starting early. If the scholar begins thinking about possible topics
when the assignment is given, he has already begun the arduous, yet rewarding, task of
planning and organization. Once he has made the assignment a priority in his mind, he
may begin to have ideas throughout the day. Brainstorming is often a successful way for
scholars to get some of these ideas down on paper. Seeing one's ideas in writing is
often an impetus for the writing process. Though brainstorming is particularly effective
when a topic has been chosen, it can also benefit the scholar who is unable to narrow a
topic. It consists of a timed writing session during which the scholar jots down—often in
list or bulleted form—any ideas that come to his mind. At the end of the timed period, the
scholar will peruse his list for patterns of consistency. If it appears that something seems
to be standing out in his mind more than others, it may be wise to pursue this as a topic
possibility.
It is important for the scholar to keep in mind that an initial topic may not be the exact
topic about which he ends up writing. Research topics are often fluid, and dictated more
by the scholar's ongoing research than by the original chosen topic. Such fluidity is
common in research, and should be embraced as one of its many characteristics.
Choosing a research topic is not as easy as a scholar imagines. One should be thinking
about it right from the start of the research study, as a serious and continuous process, until
the topic is finalized. There are generally three ways you are asked to write about a
research problem: (a) the scholar's supervisor provides a general topic from which
the scholar picks a particular aspect to study; (b) the supervisor provides a list of
possible topics; or (c) he/she leaves it up to the scholar to choose a topic and, at a later
stage, the scholar has to obtain his/her permission to write about it before beginning the
formal investigation. Following are some strategies for getting started for each scenario.
Step 2: Review related literature to help refine how you will approach focusing on the topic and
finding a way to analyze it. Use the main concept terms already developed in Step 1 to retrieve
relevant articles. This will help refine and refocus the analytical approach. Of course, this exercise
has to be done several times before you finalize how to approach writing about the topic.
Step 3: Since social science research studies are generally designed to get you to develop your
own ideas and arguments, look for sources that can help broaden, modify, or strengthen your
initial ideas and arguments [for example, you have decided to argue that newly entering foreign
universities are ill prepared to take on the responsibility of providing genuinely cost-effective
higher education]. There are at least four appropriate roles your related literature plays in helping
you formulate how to begin your analysis:
Sources for Historical Context--another role your related literature plays in helping
you formulate how to begin your analysis is to place issues and events in proper
historical context. This can help to demonstrate familiarity with developments in relevant
scholarship about your topic, provide a means of comparing historical versus
contemporary issues and events, and identify key people, places, and things that had
an important role related to the topic.
law journals. Another role of related literature is to provide a means of approaching a
topic from multiple perspectives rather than the perspective offered by just one discipline.
NOTE: Always review the references cited by the authors in footnotes, endnotes, or a bibliography to help
locate additional research on the topic. Also, remember to keep careful notes at every stage. You may think
you will remember what you have searched and where you found things, but it’s easy to forget.
Step 4: Assuming you've done a good job of synthesizing and thinking about the results of your
initial search for related literature, you're ready to prepare a detailed outline for your paper that
lays the foundation for a more in-depth and focused review of relevant research literature. [after
consulting with the supervisor, if needed!].
Step 1: You may begin by asking which topic from this list is the easiest to find the most information on.
An intelligent supervisor should never include a topic that is so obscure or complex that no
research is available to review and begin to design a study. Instead of trying to find the path of
least resistance, begin by choosing a topic that you find interesting in some way, that is
controversial or that you have an opinion about, or that has some personal meaning for you. You're
going to be working on your topic for quite some time, so choose one that interests you or that makes
you want to take a position.
Once you’ve settled on a topic of interest from the list, follow Steps 1 - 4 listed above to further
develop it into a research study.
NOTE: It may be reasonable to review related literature to help refine how you will approach
analyzing a topic, and then discover that the topic is not all that interesting after all. In that case,
you can choose another from the list. Just don’t wait too long to make a switch and be sure to
consult with your supervisor first.
Step 1: The key process here is turning an idea or general thought into a topic that can be cast
as a research problem. When given an assignment where you choose the research topic, don't
begin by thinking about what to write about, but rather, ask yourself the question, "What do I want
to know?" Treat an open-ended assignment as an opportunity to learn about something that's
new or exciting to you.
Step 2: If you lack any more ideas, or wish to gain focus, try some or all of the following
strategies:
Review your course readings, particularly the suggested readings, for topic
ideas. Don't just review what you've already read but jump ahead in the syllabus to readings
that have not been covered yet in the course.
Browse through some current journals in your subject discipline. Even if most
of the articles are not relevant, you can skim through the contents quickly. You only need one
to be the spark that begins the process of wanting to learn more about a topic.
Think about essays and other coursework you have taken or lectures and/or
programs you have attended. Thinking back, what most interested you? What would you like to
know more about?
Search online resources, to see if your idea has been covered. Use this
coverage to refine your idea into something that you'd like to investigate further but in a more
deliberate, scholarly way based on a problem to research.
Step 3: To build upon your initial idea, use the suggestions below to help narrow,
broaden, or increase the timeliness of your idea so you can write it out as a research problem.
Once you are comfortable with having turned your idea into a topic, follow Steps 1 - 4
listed in Part [A] above to further develop it into a research paper.
Here are 11 critical points to consider in finding and developing a research topic:
Any topic will be difficult to research if it is too broad. A great way to fine-tune a topic is to use the
method traditionally used by newspaper reporters: Who?-What?-Where?-When?-Why?
Who is involved?
A particular age group, occupation, ethnic group, men, women, etc. For example, if you are
interested in writing about the environment, you might focus on the effects of air pollution on
infants and children.
What is the problem?
What is the issue facing the "who" in your topic: health concerns, job and economic trends,
contaminated drinking water? Try stating your topic as a question. For example, if you’re
interested in finding out about drinking water, you might ask: Are there preventive measures that
government can take to keep the drinking water supply from being contaminated?
Where is it happening?
A specific country, region, city, physical environment, rural vs. urban? For example: What
environmental issues are most important in the southern plains area of the U.S.?
When is this happening?
Is this a current issue or an historical event? Will you discuss the historical development of a
current problem? Example: How does environmental awareness affect business practices today?
Why is it happening / Why is this a problem?
You may want to focus on causes, or argue the importance of this problem by outlining historical
or current ramifications. Or you may want to persuade your instructor or class why they should
care about the issue. Example: Why are some states seriously investigating wind power
opportunities now? Be flexible. It is common to modify your topic during the research process.
“I never really know the title of a book until it's finished.” - Mary Wesley
Chapter VI
RESEARCH PROBLEM
Introduction
A research problem is the situation that causes the researcher to feel apprehensive,
confused and ill at ease. It is the demarcation of a problem area within a certain context
involving the WHO or WHAT, the WHERE, the WHEN and the WHY of the problem
situation.
There are many problem situations that may give rise to research. Three sources
usually contribute to problem identification. Your own experience or the experience of others
may be a source of problems. A second source could be scientific literature. You
may read about certain findings and notice that a certain field was not covered. This
could lead to a research problem. Theories could be a third source. Shortcomings in
theories could be researched.
Definition
Identification of the Problem
The prospective researcher should think about what caused the need to do the research
(problem identification). The question that he/she should ask is: Are there questions
about this problem to which answers have not been found up to the present?
Research originates from a need that arises. A clear distinction between the PROBLEM
and the PURPOSE should be made. The problem is the aspect the researcher worries
about, thinks about, and wants to find a solution for. The purpose is to solve the problem, i.e.,
to find answers to the question(s). If there is no clear problem formulation, the purpose
and methods are meaningless.
The research problem should be stated in such a way that it would lead to analytical
thinking on the part of the researcher with the aim of arriving at possible solutions to the
stated problem. Research problems can be stated in the form of either questions or
statements.
Systematic Approach
Step 2: Do some preliminary research.
A. Search Digital Dissertations and read the abstracts to identify research questions
that have already been pursued in your field
B. Search an index in your field to find relevant articles and focus on the last
sections covering recommended further research
C. Search an index in the proposed field of research that includes conference
proceedings and review titles and abstracts
A. Be careful about research feasibility: If you are doing empirical research, your
research question must be appropriate for the subjects you have to work with
B. Be careful about broad questions: Your research question cannot be so broad
that you can't adequately cover it
Step 5: Through this process, reach an agreement with your supervisor on an initial set of
research questions to guide the research. Note that as your research progresses, you
might refine your questions.
1. Introduces the reader to the importance of the topic being studied. The reader is
oriented to the significance of the study and the research questions or hypotheses to
follow.
2. Places the problem into a particular context that defines the parameters of what is
to be investigated.
3. Provides the framework for reporting the results, indicates what is probably
necessary to conduct the study, and explains how the findings will present this
information.
“So What!”
In the social sciences, the research problem establishes the means by which you must
answer the "so what?" question. The "so what?" question refers to a research problem
surviving the relevancy test. Note that this question requires a commitment on your part
to not only show that you have researched the material, but that you have thought about
its significance.
To survive the "so what" question, Hernon and Schwartz [Library & Information Science
Research 29 (2007): 307-309] noted that problem statements should possess the
following attributes:
There are two broad conceptualizations of a research problem in the social sciences:
This research problem fails the "so what?" test because it does not reveal
the relevance of why you are investigating the problem of having no hospital in the
community [maybe there's a hospital in the community ten miles away] and does not
elucidate the significance of why one should study the fact that no hospital exists in the
community [maybe it's because the hospital in the community ten miles away has no
emergency room].
1. Is the problem of current interest? Will the research results have social, educational or scientific value?
2. Will it be possible to apply the results in practice?
3. Does the research contribute to the science of education?
4. Will the research open up new problems and lead to further research?
5. Is the research problem important? Will you be proud of the result?
6. Is there enough scope left within the area of research (field of research)?
7. Can you find an answer to the problem through research? Will you be able to handle the research problem?
8. Will it be practically possible to undertake the research?
9. Will it be possible for another researcher to repeat the research?
10. Is the research free of any ethical problems and limitations?
11. Will it have any value?
12. Do you have the necessary knowledge and skills to do the research? Are you qualified to undertake the research?
13. Is the problem important to you and are you motivated to undertake the research?
14. Is the research viable in your situation? Do you have enough time and energy to complete the project?
15. Do you have the necessary funds for the research?
16. Will you be able to complete the project within the time available?
17. Do you have access to the administrative, statistical and computer facilities the research necessitates?
Some Guidelines for Writing the Research Problem Statement
1. First select your research topic, which is the issue or subject area that you intend to
investigate.
2. Describe the business or management problem based on your topic that you intend to
research. Do this right at the beginning of your research proposal or report as laid out in
the templates (remember to reference any facts that you are basing your research on).
This will set the scene for your Research Problem statement, so that you can write a
clear, stand alone Research Problem.
5. Verbs such as “understand”, “explore”, “investigate”, “examine” and “discuss” are poor
verbs as they describe processes, not outcomes, e.g., you can discuss something
endlessly without ever having to make recommendations, draw conclusions or offer a
result. You might be exploring, examining or discussing as part of your process, but they
cannot be the end result of your research, which should be more tangible.
6. If your Research Problem contains two or more concepts / ideas, then break it down
into sub‐problems, so that each sub‐problem consists of one idea only. Each
sub‐problem should contain key words that you can use in your literature search (using
the electronic library databases and Google Scholar) on that sub‐problem.
For example:
Sub‐problem 1
Analyse and evaluate the role of entrepreneurship in establishing SMMEs in emerging markets.
(Here your key search terms for your literature review could be “entrepreneurship”, “SMME” and
“emerging markets”).
Sub‐problem 2
Evaluate the economic contribution of SMMEs to growth and development in emerging markets.
(Here your search terms could be “economic contribution”, “economic growth” and “emerging
market development”).
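The key terms from each sub‐problem can be combined into a search string before they are entered into a database. The short Python sketch below shows one way of doing this. The helper name build_query is hypothetical, and the use of AND with quoted phrases is an assumption about search syntax: the databases you use may expect different operators, so check their help pages.

```python
def build_query(terms):
    """Join key terms into a Boolean AND query, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return " AND ".join(quoted)


# Key search terms taken from the two example sub-problems above.
sub_problems = {
    "Sub-problem 1": ["entrepreneurship", "SMME", "emerging markets"],
    "Sub-problem 2": ["economic contribution", "economic growth",
                      "emerging market development"],
}

queries = {name: build_query(terms) for name, terms in sub_problems.items()}
for name, query in queries.items():
    print(f"{name}: {query}")
```

Keeping one query per sub‐problem makes it easier to record what was searched and where, and to rerun the same search later.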
Chapter VII
REVIEW OF LITERATURE
The format of a review of literature may vary from discipline to discipline and from
assignment to assignment. A review may be a self-contained unit -- an end in itself -- or
a preface to and rationale for engaging in primary research. A review is a required part
of grant and research proposals and often a chapter in theses. Generally, the purpose of
a review is to analyze critically a segment of a published body of knowledge through
summary, classification, and comparison of prior research studies, reviews of literature,
and theoretical articles.
Introduction
A literature review can stand alone or appear as part of a longer work, such as a
research proposal. The quality of the literature review is dependent upon (a) the
thoroughness of the writer's search, (b) the quality and reliability of the writer's sources,
(c) the ability of the writer to relate research studies to one another and to the writer's
own thesis or purpose, (d) the objectivity of the writer in selecting, interpreting,
organizing, and summarizing the research he or she has reviewed.
Literature reviews provide the research scholar with a handy guide to a particular topic.
If a scholar is short of time, literature reviews can provide
an overview or act as a stepping stone. For professionals, they are useful
reports that keep them up to date with what is current in the field. For scholars, the depth
and breadth of the literature review emphasize the credibility of the writer in his or her
field. Literature reviews also provide a solid background for a research paper's
investigation. Comprehensive knowledge of the literature of the field is essential to most
research papers.
In most cases it is better to seek guidance or clarification from one’s supervisor
regarding the comprehensiveness of the review.
Look for other literature reviews in your area of interest or in the discipline and read them
to get a sense of the types of themes you might want to look for in your own research or
ways to organize your final review. You can simply put the word "review" in your search
engine along with your other topic terms to find articles of this type on the Internet or in
an electronic database. The bibliography or reference section of a source you've already
read is also an excellent entry point into your own research.
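The search tip above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_query` is hypothetical, the topic terms are the ones suggested earlier for the SMME example, and the quoting and `AND` syntax is an assumption, since every search engine and database has its own query rules.

```python
def build_query(terms, find_reviews=False):
    """Join topic terms into one search string (hypothetical helper)."""
    # Quote multi-word terms so that they are searched as phrases.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    if find_reviews:
        # Add the word "review" to surface existing literature reviews.
        quoted.append("review")
    return " AND ".join(quoted)


topic_terms = ["entrepreneurship", "SMME", "emerging markets"]
print(build_query(topic_terms))
print(build_query(topic_terms, find_reviews=True))
```

The second call adds the word "review" to the same terms, which is the adjustment the paragraph above recommends.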
There are hundreds or even thousands of articles and books on most areas of study.
The narrower one’s topic, the easier it will be to limit the number of sources one needs
to read in order to get a good survey of the material. Your research supervisor will probably
not expect you to read everything that's out there on the topic, but it is always better to
limit the scope of the research at this stage.
Some disciplines require that one uses information that is as current as possible. In the
sciences, for instance, treatments for medical problems are constantly changing
according to the latest studies. Information even two years old could be obsolete.
However, if a researcher is writing a review in the humanities, history, or social sciences,
a survey of the history of the literature may be what is needed, because what is
important is how perspectives have changed through the years or within a certain time
period. Try sorting through some other current bibliographies or literature reviews in the
field to get a sense of what one’s discipline expects. Also it is important to consider what
is currently of interest to scholars in this field and what is not.
A literature review is usually organized around ideas, not the sources themselves as an
annotated bibliography would be organized. This means that one will not just simply list
the sources and go into detail about each one of them, one at a time. Not at all. As the
scholar reads widely but selectively in the selected topic area, he/she should consider
identifying what themes or issues connect the sources together. Do they present one or
different solutions? Is there an aspect of the field that is missing? How well do they
present the material and do they portray it according to an appropriate theory? Do they
reveal a trend in the field? One of these themes should be chosen to focus the
organization of the review.
You've got a focus, and you've narrowed it down to a thesis statement. Now what is the
most effective way of presenting the information? What are the most important topics,
subtopics, etc., that your review needs to include? And in what order should you present
them? Develop an organization for your review at both a global and local level.
Basic categories
Just like most academic papers, literature reviews also must contain at least three basic
elements: an introduction or background information section; the body of the review
containing the discussion of sources; and, finally, a conclusion and/or future scope for
potential research scholars.
Introduction
Define or identify the general topic, issue, or area of concern, thus providing an
appropriate context for reviewing the literature.
Point out overall trends in what has been published about the topic; or conflicts in theory,
methodology, evidence, and conclusions; or gaps in research and scholarship; or a
single problem or new perspective of immediate interest.
Establish the writer's reason (point of view) for reviewing the literature; explain the
criteria to be used in analyzing and comparing literature and the organization of the
review (sequence); and, when necessary, state why certain literature is or is not included
(scope).
Body
Group research studies and other types of literature (reviews, theoretical articles, case
studies, etc.) according to common denominators such as qualitative versus quantitative
approaches, conclusions of authors, specific purpose or objective, chronology, etc.
Summarize individual studies or articles with as much or as little detail as each merits
according to its comparative importance in the literature, remembering that space
(length) denotes significance.
Once the basic categories are in place, consider the order and manner in which the sources
themselves are arranged within the body of the review.
The following illustration gives a feel for the problem and shows three typical ways of
organizing the sources.
You've decided to focus your literature review on materials dealing with sperm whales.
This is because you've just finished reading Moby Dick, and you wonder if that whale's
portrayal is realistic. You start with some articles about the physiology of sperm whales
in biology journals written in the 1980s. But these articles refer to some British biological
studies performed on whales in the early 18th century. So you check those out. Then you
look up a book written in 1968 with information on how sperm whales have been portrayed
in other forms of art, such as in Alaskan poetry, in French painting, or on whale bone, as
the whale hunters in the late 19th century used to do. This makes you wonder about
American whaling methods during the time portrayed in Moby Dick, so you find some
academic articles published in the last five years on how accurately Herman Melville
portrayed the whaling scene in his novel.
Chronological
If your review follows the chronological method, you could write about the materials
above according to when they were published. For instance, first you would talk about
the British biological studies of the 18th century, then about Moby Dick, published in
1851, then the book on sperm whales in other art (1968), and finally the biology articles
(1980s) and the recent articles on American whaling of the 19th century. But there is
little continuity among subjects here. And notice that even though the sources on
sperm whales in other art and on American whaling were written recently, they are about
other subjects/objects that were created much earlier. Thus, the review loses its
chronological focus.
By publication
Order your sources by publication chronology, then, only if the order demonstrates a
more important trend. For instance, you could order a review of literature on biological
studies of sperm whales if the progression revealed a change in dissection practices of
the researchers who wrote and/or conducted the studies.
By trend
A better way to organize the above sources chronologically is to examine the sources
under another trend, such as the history of whaling. Then your review would have
subsections according to eras within this period. For instance, the review might examine
whaling from pre-1600-1699, 1700-1799, and 1800-1899. Under this method, you would
combine the recent studies on American whaling in the 19th century with Moby Dick
itself in the 1800-1899 category, even though the authors wrote more than a century apart.
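The difference between ordering by publication and ordering by trend can be sketched in Python. This is a minimal illustration, not a prescribed tool. The 1851 publication date of Moby Dick comes from the text; the publication year of the recent whaling articles and both `subject_year` values are placeholders assumed for the example, and the helper `era` and the "1900 onwards" label are hypothetical.

```python
from collections import defaultdict


def era(year):
    """Map a year to one of the era labels used in the text."""
    if year < 1700:
        return "pre-1600-1699"
    if year < 1800:
        return "1700-1799"
    if year < 1900:
        return "1800-1899"
    return "1900 onwards"  # not an era from the text; catch-all for later years


# "published" is when the source appeared; "subject_year" is the period it is about.
# 2020 and both subject years are assumed placeholders, not data from the text.
sources = [
    {"title": "Recent articles on American whaling", "published": 2020, "subject_year": 1850},
    {"title": "Moby Dick", "published": 1851, "subject_year": 1851},
]

# Chronological by publication: sort on the publication year.
by_publication = sorted(sources, key=lambda s: s["published"])

# By trend: group on the era of the subject matter instead.
by_era = defaultdict(list)
for source in by_publication:
    by_era[era(source["subject_year"])].append(source["title"])

print([s["title"] for s in by_publication])
print(dict(by_era))
```

Both sources land in the same 1800-1899 group even though they were published far apart, which is the point the paragraph above makes.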
Thematic
Thematic reviews of literature are organized around a topic or issue, rather than the
progression of time. However, progression of time may still be an important factor in a
thematic review. For instance, the sperm whale review could focus on the development
of the harpoon for whale hunting. While the study focuses on one topic, harpoon
technology, it will still be organized chronologically. The only difference here between a
"chronological" and a "thematic" approach is what is emphasized the most: the
development of the harpoon or the harpoon technology.
Note: But more authentic thematic reviews tend to break away from chronological order. For
instance, a thematic review of material on sperm whales might examine how they are portrayed
as "evil" in cultural documents. The subsections might include how they are personified, how their
proportions are exaggerated, and how their behaviors are misunderstood. A review organized in this
manner would shift between time periods within each section according to the point made.
Methodological
A methodological approach differs from the two above in that the focusing factor usually
does not have to do with the content of the material. Instead, it focuses on the "methods"
of the researcher or writer. For the sperm whale project, one methodological approach
would be to look at cultural differences between the portrayal of whales in American,
British, and French art work. Or the review might focus on the economic impact of
whaling on a community. A methodological scope will influence either the types of
documents in the review or the way in which these documents are discussed.
Once you've decided on the organizational method for the body of the review, the
sections you need to include in the paper should be easy to figure out. They should arise
out of your organizational strategy. In other words, a chronological review would have
subsections for each vital time period. A thematic review would have subtopics based
upon factors that relate to the theme or issue.
Conclusion
Evaluate the current "state of the art" for the body of knowledge reviewed, pointing out
major methodological flaws or gaps in research, inconsistencies in theory and findings,
and areas or issues pertinent to future study.
Conclude by providing some insight into the relationship between the central topic of the
literature review and a larger area of study such as a discipline, a scientific endeavor, or
a profession.
Meticulous Revision
Now it is time for a thorough revision. Spending a lot of time revising is a wise idea,
because your main objective is to present the material, not the argument. So check over
your review again to make sure it follows the assignment and/or your outline. Then, just
as you would for most other academic forms of writing, rewrite or rework the language of
your review so that you've presented your information in the most concise manner
possible. Be sure to use terminology familiar to your audience; get rid of unnecessary
jargon or slang. Finally, double check that you've documented your sources and
formatted the review appropriately for your discipline.
"I have seen great literature reviews with 100 pages and 100 references and I have seen
poor literature reviews with 100 pages and 100 references. I have seen great literature
reviews with 20 pages and 20 references and I have seen poor literature reviews with 20
pages and 20 references."
This endeavor will be a significant part of your research study. You will learn a great deal
from this exercise, not only about your topic, but also about how to learn from a serious
research study. I doubt that there are many topics that you could choose that would
have fewer than 50-100 recently published articles. Most have several hundred or
thousands. It is your task to find the relevant articles and make sense of them.
References
Anson, Chris M. and Robert A. Schwegler (2000) The Longman Handbook for Writers and
Readers. Second edition. New York: Longman.
Jones, Robert, Patrick Bizzaro, and Cynthia Selfe (1997) The Harcourt Brace Guide to Writing in
the Disciplines. New York: Harcourt Brace.
Lamb, Sandra E. (1998) How to Write It: A Complete Guide to Everything You'll Ever Write.
Berkeley, Calif.: Ten Speed Press.
Rosen, Leonard J. and Laurence Behrens (2000) The Allyn and Bacon Handbook. Fourth edition.
Boston: Allyn and Bacon.
Troyka, Lynn Quitman (2002) Simon and Schuster Handbook for Writers. Upper Saddle River,
N.J.: Prentice Hall.
These guidelines are adapted primarily from Galvan (2006). Galvan outlines a very
clear, step-by-step approach that is very useful as you write your review. I have
integrated some other tips within this guide, particularly in suggesting different
technology tools that you might want to consider in helping you organize your review. For
Steps 6 to 9, I have included the outline of those steps exactly as
described by Galvan. I also provide links at the end of this guide to resources that you
should use in order to search the literature and as you write your review.
In addition to using the step-by-step guide that I have provided below, I also recommend
that you (a) locate examples of literature reviews in your field of study and skim over
these to get a feel for what a literature review is and how these are written (I have also
provided links to a couple of examples at the end of these guidelines), and (b) read over other
guides to writing literature reviews so that you see different perspectives and
approaches.
Read APA guidelines so that you become familiar with the common core elements of
how to write in APA style: in particular, pay attention to general document guidelines
(e.g. font, margins, and spacing), title page, abstract, body, text citations, and
quotations.
It will help you considerably if your topic for your literature review is the one on which
you intend to do your research study, or is in some way related to the topic of your
research paper. However, you may pick any scholarly topic.
Once you have identified and located the articles for your review, you need to analyze
them and organize them before you begin writing:
1. Overview the articles: Skim the articles to get an idea of the general purpose
and content of each article (focus your reading here on the abstract, the introduction
and first few paragraphs, and the conclusion of each article). Tip: as you skim the
articles, you may want to record the notes that you take on each directly into
RefWorks in the box for User 1. You can take notes onto note cards or into a
word processing document instead or as well as using RefWorks, but having
your notes in RefWorks makes it easy to organize your notes later.
2. Group the articles into categories (e.g. into topics and subtopics and
chronologically within each subtopic). Once again, it's useful to enter this
information into your RefWorks record. You can record the topics in the same
box as before (User 1) or use User 2 box for the topic(s) under which you have
chosen to place this article.
3. Take notes:
(a) Decide on the format in which you will take notes as you read the articles (as
mentioned above, you can do this in RefWorks). You can also do this using a
word processor, a concept mapping program like Inspiration, a database
program (e.g. Access or FileMaker Pro), an Excel spreadsheet, or the "old-fashioned"
way of using note cards. Be consistent in how you record notes.
(b) Define key terms: look for differences in the way key terms are defined (note
these differences).
(c) Note key statistics that you may want to use in the introduction to your review.
(d) Select useful quotes that you may want to include in your review. Important: If
you copy the exact words from an article, be sure to cite the page number as you
will need this should you decide to use the quote when you write your review (as
direct quotes must always be accompanied by page references). To ensure that
you have quoted accurately (and to save time in note taking), if you are
accessing the article in a format that allows this, you can copy and paste using
your computer "edit --> copy --> paste" functions. Note: although you may collect
a large number of quotes during the note taking phase of your review, when you
write the review, use quotes very sparingly. The rule I follow is to quote only
when some key meaning would be lost in translation if I were to paraphrase the
original author's words, or if using the original words adds special emphasis to a
point that I am making.
(e) Note emphases, strengths & weaknesses: Since different research studies focus
on different aspects of the issue being studied, each article that you read will
have different emphases, strengths, and weaknesses. Your role as a
reviewer is to evaluate what you read, so that your review is not a mere
description of different articles, but rather a critical analysis that makes sense of
the collection of articles that you are reviewing. Critique the research
methodologies used in the studies, and distinguish between assertions (the
author's opinion) and actual research findings (derived from empirical evidence).
(f) Identify major trends or patterns: As you read a range of articles on your topic,
you should make note of trends and patterns over time as reported in the
literature. This step requires you to synthesize and make sense of what you read,
since these patterns and trends may not be spelled out in the literature, but
rather become apparent to you as you review the big picture that has emerged
over time. Your analysis can make generalizations across a majority of studies,
but should also note inconsistencies across studies and over time.
(g) Identify gaps in the literature, and reflect on why these might exist (based on the
understandings that you have gained by reading literature in this field of study).
These gaps will be important for you to address as you plan and write your
review.
(h) Identify relationships among studies: note relationships among studies, such as
which studies were landmark ones that led to subsequent studies in the same
area. You may also note that studies fall into different categories (categories that
you see emerging or ones that are already discussed in the literature). When you
write your review, you should address these relationships and different
categories and discuss relevant studies using this as a framework.
(i) Keep your review focused on your topic: make sure that the articles you find are
relevant and directly related to your topic. As you take notes, record which
specific aspects of the article you are reading are relevant to your topic (as you
read you will come up with key descriptors that you can record in your notes that
will help you organize your findings when you come to write up your review). If
you are using an electronic form of note taking, you might note these descriptors
in a separate field (e.g. in RefWorks, put these under User 2 or User 3; in Excel
have a separate column for each descriptor; if you use Inspiration, you might
attach a separate note for key descriptors).
(j) Evaluate your references for currency and coverage: Although you can always
find more articles on your topic, you have to decide at what point you are finished
with collecting new resources so that you can focus on writing up your findings.
However, before you begin writing, you must evaluate your reference list to
ensure that it is up to date and has reported the most current work. Typically a
review will cover the last five years, but should also refer to any landmark studies
prior to this time if they have significance in shaping the direction of the field. If
you include studies prior to the past five years that are not landmark studies, you
should defend why you have chosen these rather than more current ones.
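The currency check in point (j) can be sketched as a small Python filter. This is an illustrative sketch only: the reference entries are hypothetical, the `landmark` flag is a judgment the reviewer supplies, and counting "the last five years" as a difference of at most five calendar years is an assumption.

```python
def needs_justification(ref, current_year, window=5):
    """True if a reference is older than the window and is not a landmark study."""
    too_old = current_year - ref["year"] > window
    return too_old and not ref["landmark"]


# Hypothetical reference list; titles, years and flags are made up for illustration.
references = [
    {"title": "Study A", "year": 2021, "landmark": False},
    {"title": "Study B", "year": 2010, "landmark": True},
    {"title": "Study C", "year": 2012, "landmark": False},
]

# The current year is passed in explicitly so the result does not depend on today's date.
flagged = [r["title"] for r in references if needs_justification(r, current_year=2024)]
print(flagged)
```

Here only Study C is flagged: Study A falls inside the five-year window and Study B is marked as a landmark, so Study C is the one whose inclusion would need to be defended.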
1. Galvan (2006) recommends building tables as a key way to help you overview,
organize, and summarize your findings, and suggests that including one or more
of the tables that you create may be helpful in your literature review. If
you do include tables as part of your review, each must be accompanied by an
analysis that summarizes, interprets and synthesizes the literature that you have
charted in the table. You can plan your table or do the entire summary chart of
your literature using a concept map.
(a) You can create the table using the table feature within Microsoft Word, or can
create it initially in Excel and then copy and paste/import the Excel sheet into
Word once you have completed the table in Excel. The advantage of using Excel
is that it enables you to sort your findings according to a variety of factors (e.g.
sort by date, and then by author; sort by methodology and then date).
(b) Examples of tables that may be relevant to your review:
(i) Definitions of key terms and concepts.
(ii) Research methods
(iii) Summary of research results
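For scholars who keep their summary table in code rather than in Excel, the two sort orders mentioned above can be sketched in Python. The rows below are hypothetical and the column names `author`, `year` and `method` are assumptions made for the example.

```python
from operator import itemgetter

# Hypothetical summary table: one row per study reviewed.
rows = [
    {"author": "Author B", "year": 2015, "method": "survey"},
    {"author": "Author A", "year": 2015, "method": "case study"},
    {"author": "Author C", "year": 2012, "method": "survey"},
]

# Sort by date, and then by author within each year.
by_date_then_author = sorted(rows, key=itemgetter("year", "author"))

# Sort by methodology, and then by date within each methodology.
by_method_then_date = sorted(rows, key=itemgetter("method", "year"))

print([r["author"] for r in by_date_then_author])
print([r["author"] for r in by_method_then_date])
```

Passing two field names to `itemgetter` sorts on the first field and uses the second only to break ties, which mirrors a two-level sort in a spreadsheet.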
Step 6: Synthesize the literature prior to writing your review
Using the notes that you have taken and your summary tables, develop an outline of your final
review. The following are the key steps as outlined by Galvan (2006: 71-79).
1. Consider your purpose and voice before beginning to write. Your initial purpose
is to provide an overview of the topic that is of interest to you, demonstrating your
understanding of key works and concepts within your chosen area of focus. You
are also developing skills in reviewing and writing, to provide a foundation for
your final thesis. In your final thesis your literature review should demonstrate
your command of your field of study and/or establish the context for a study that
you have done.
2. Consider how you reassemble your notes: plan how you will organize your
findings into a unique analysis of the picture that you have captured in your
notes. Important: A literature review is not a series of annotations (like an annotated
bibliography). Galvan (2006:72) captures the difference between an annotated
bibliography and a literature review very well: "...in essence, like describing trees
when you really should be describing a forest. In the case of a literature review,
you are really creating a new forest, which you will build by using the trees you
found in the literature you read."
3. Create a topic outline that traces your argument: first explain to the reader your
line of argument (or thesis); then your narrative that follows should explain and
justify your line of argument.
4. Reorganize your notes according to the path of your argument
5. Within each topic heading, note differences among studies.
6. Within each topic heading, look for obvious gaps or areas needing more
research.
7. Plan to describe relevant theories.
8. Plan to discuss how individual studies relate to and advance theory
9. Plan to summarize periodically and, again near the end of the review
10. Plan to present conclusions and implications
11. Plan to suggest specific directions for future research near the end of the review
12. Flesh out your outline with details from your analysis
1. If your review is long, provide an overview near the beginning of the review
2. Near the beginning of a review, state explicitly what will and will not be covered
3. Specify your point of view early in the review: this serves as the thesis statement
of the review.
4. Aim for a clear and cohesive essay that integrates the key details of the literature
and communicates your point of view (a literature review is not a series of annotated
articles).
5. Use subheadings, especially in long reviews
6. Use transitions to help trace your argument
7. If your topic reaches across disciplines, consider reviewing studies from each
discipline separately
8. Write a conclusion for the end of the review: Provide closure so that the path of
the argument ends with a conclusion of some kind. How you end the review,
however, will depend on your reason for writing it. If the review was written to
stand alone, as is the case of a term paper or a review article for publication, the
conclusion needs to make clear how the material in the body of the review has
supported the assertion or proposition presented in the introduction. On the other
hand, a review in a thesis, dissertation, or journal article presenting original
research usually leads to the research questions that will be addressed.
9. Check the flow of your argument for coherence.
Reference:
Galvan, J. (2006). Writing literature reviews: a guide for students of the behavioral sciences (3rd
ed.). Glendale, CA: Pyrczak Publishing.
Model: Review of Literature
Just like most academic papers, literature reviews also must contain at least three basic
elements: an Introduction or background information section; the Body of the review
containing the discussion of sources; and, finally, a Conclusion detailing the future
scope for potential research scholars.
Introduction
Define or identify the general topic, issue, or area of concern, thus providing an appropriate
context for reviewing the literature.
Point out overall trends in what has been published about the topic; or conflicts in theory,
methodology, evidence, and conclusions; or gaps in research and scholarship; or a single
problem or new perspective of immediate interest.
Establish the writer's reason (point of view) for reviewing the literature; explain the criteria to be
used in analyzing and comparing literature and the organization of the review (sequence); and,
when necessary, state why certain literature is or is not included (scope).
Body
Design the review of literature in any ONE of three broad sections:
(a) Chronological [either by publication or by trend]
(b) Thematic
(c) Methodological
Conclusion
Summarize major contributions of significant studies and articles to the body of knowledge under
review, maintaining the focus established in the introduction.
Evaluate the current "state of the art" for the body of knowledge reviewed, pointing out major
methodological flaws or gaps in research, inconsistencies in theory and findings, and areas or
issues pertinent to future study.
Conclude by providing some insight into the relationship between the central topic of the literature
review and a larger area of study such as a discipline, a scientific endeavor, or a profession.
“Review your goals twice every day in order to be focused on achieving them.”
<< Les Brown
Chapter VIII
Scope of Research Study
What should be in the scope of study? Scope is simply the boundary of the research. What
can be bounded are coverage, data, analytical method and applicability of
output. Hence the scope of study should have at least four paragraphs, one for each of
the four types of research boundary above. Scope of coverage defines what areas around
the subject matter the research covers and what it does not.
The scope of the study means the specific areas that the researcher wants to
cover in his/her study. Searching for resources on the
researcher’s subject will also be more effective if he/she has already defined the scope of the
research. The questions to consider in the research scope should be:
This means that the scope of the study refers to the specific elements and
content that the researcher wants to explore in his/her study. He/she is responsible
for setting the scope realistically: too broad a scope will make the study take longer,
while too narrow a scope can make the study not worthwhile. The scope of study is
usually one of the first sections of the thesis. It sets out the scope of the work and its
limitations. The research scholar should add as much detail as possible when describing
what is under research, why the research is being done and how it is being done. The
examiners will want to know why a particular area falls under the research topic and what the
scholar proposes to find out or expects to discover. This will convey the purpose of the
research and give the reader the scholar's expectations of the thesis.
Difference between limitation and scope
What are the scope and limitations of a study, and what is the difference between
them? Walonick (2005) explained that all research studies
have limitations and a finite scope. Limitations are the restraints or
obstacles that the study may encounter. In most
research, limitations are imposed by time and budget constraints, which are
the most common restraining factors and can have a significant impact on
the study. If the limitations are not clearly defined and tackled carefully, the study may end
up being invalid. The researcher should list the limitations of the study precisely and
describe the extent to which he/she believes they degrade the quality of the
research. This will help the researcher and the audience understand the actual situation
that the study encounters.
We can conclude that the differences between the limitations and the scope of the research are:
(a) The research scope is the specific area that the researcher wants to cover in his/her study,
while the research limitations are the constraints and obstacles that the researcher expects to
encounter during the study; and
(b) The research limitations are beyond the researcher’s control since they involve external factors
outside the researcher’s authority, while the research scope is under the researcher’s control and
manageable by him/her. The researcher determines what is to be covered and what is to
be left out.
Chapter IX
LIMITATIONS
Introduction
Without exception, all research is limited in several ways. There are internal or formal
limitations, such as the materials and procedures used, the ways in which critical terms
are defined, the scope of the problem explored and of the applicability of the results. And
there are external limitations as well, governed by constraints upon one’s time or
pocketbook; the inability to travel to special collections, museums, or libraries, or to
speak or read other languages; or to consider an evolving political situation beyond a
certain date. These limitations should be acknowledged; indeed, identifying them may
help the scholar to focus the topic. However, problems such as time and money
difficulties do not relieve the scholar of the responsibility of designing a study that can
adequately test the hypothesis and measure its results. Proposals that include no
mention of limitations suggest that the scholar has not really gone beyond a superficial
consideration of the subject. This section of the thesis, therefore, will require
considerable thought. But close attention now to these and related questions will save
the scholar much time and discomfort in later stages of research and writing.
There is no ‘one best way’ to structure the Research Limitations section of the thesis.
However, a structure based on three ‘moves’ is suggested: [a] the announcing, [b]
reflecting, and [c] forward looking move. The announcing move immediately allows the
scholar to identify the limitations of the thesis and explain how important each of these
limitations is. The reflecting move provides greater depth, helping to explain the nature
of the limitations and justify the choices that the scholar made during the research
process. Finally, the forward looking move enables the scholar to suggest how such
limitations could be overcome in future. The collective aim of these three moves is to
help the scholar walk the reader through the Research Limitations section in a succinct
and structured way. This will make it clear to the reader that the scholar (a) recognises
the limitations of his/her own research, (b) understands why such factors are limitations,
and (c) can point to ways of combating these limitations if future research were
carried out.
[a] Announcing Move
Identifying limitations: What they are and how important they are
Overall, the announcing move should be around 10-15% of the total word count of
the Research Limitations section.
There are many possible limitations that the research may have faced. Four
main types of research limitation are:
Even though there may be a large number of limitations in any thesis, it is not necessary to
discuss all of these limitations in the Research Limitations section. After all, the scholar is not
writing a 2000 word critical review of the limitations of the thesis, just a 400-700 word critique
that is just one section long (i.e. the Research Limitations section within
the Conclusions chapter). Therefore, in this first announcing move, it is recommended that
the scholar identify only those limitations that had the greatest potential impact on the findings
and on the ability to effectively answer the research questions and/or hypotheses.
We use the word potential when we talk about the impact that these research limitations
could have had on the thesis because we often do not know the degree to which
different factors limited the findings or our ability to effectively answer the research
questions and/or hypotheses.
For example, we know that when adopting a quantitative research design, a failure to
use a probability sampling technique significantly limits our ability to make
broader generalisations from our results (i.e. our ability to make statistical
inferences from our sample to the population being studied). However, the degree to
which this reduces the quality of our findings is a matter of debate. Also, whilst the lack
of a probability sampling technique when using a quantitative research design is a very
obvious example of a research limitation, other limitations are far less clear.
Therefore, the key point is to focus on those limitations that you feel had the greatest
impact on your findings, as well as your ability to effectively answer your research
questions and/or hypotheses.
You may already know which of these limitations applies to your dissertation. However, if
you are unsure about the potential weaknesses in your research, we would recommend
that you read:
[b] Reflecting Move
Having identified the most important limitations to the thesis in the announcing move, the
reflecting move focuses on explaining the nature of these limitations and justifying the
choices that the scholar made during the research process. This part should be around
60-70% of the total word count of the Research Limitations section.
It is important to remember at this stage that all research suffers from limitations,
whether it is performed by undergraduate or postgraduate dissertation students
or by seasoned academics. Acknowledging such limitations should not be viewed as
exposing a weakness to the person evaluating the thesis. Instead, the reader
is more likely to accept that the scholar recognises the limitations of the research if
the scholar writes a high-quality reflecting move. This is because explaining the
limitations of the research and justifying the choices the scholar made during the
research process demonstrates the command that the scholar had over the research.
We talk about explaining the nature of the limitations in the thesis because such
limitations are highly research specific. Let’s take the example of potential limitations to
the sampling strategy.
Whilst the scholar may have a number of potential limitations in sampling strategy, let’s
focus on the lack of probability sampling; that is, of all the different types of sampling
technique that one could have used, the scholar chose not to use a probability
sampling technique (e.g. simple random sampling, systematic random sampling,
stratified random sampling). As mentioned, if the scholar used a quantitative research
design in the thesis, the lack of probability sampling is an important, obvious limitation to
one’s research. This is because it prevents the scholar from making generalisations
about the population under study (e.g. Facebook usage at a single university of 20,000
students) from the data the scholar has collected (e.g. a survey of 400 students at the
same university). Since an important component of quantitative research is such
generalisation, this is a clear limitation. However, the lack of a probability sampling
technique is not viewed as a limitation if one used a qualitative research design. In
qualitative research designs, a non-probability sampling technique is typically selected
over a probability sampling technique.
Even if the scholar used a quantitative research design but did not employ a probability
sampling technique, there may still be perfectly justifiable reasons for
that choice. For example, it may have been impossible (or nearly impossible)
to get a list of the population being studied (e.g. a list of all 20,000 students at
the single university of interest). Since probability sampling is only
possible when we have such a list, the lack of such a list or the inability to obtain one is
a perfectly justifiable reason for not using a probability sampling technique, even if such
a technique is the ideal.
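To make the three probability sampling techniques named above concrete, the following is a minimal sketch in Python. It assumes that a complete list of the population (a sampling frame) is available. The student IDs, the made-up "year of study" rule and all function names are hypothetical illustrations, not part of any particular study.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Simple random sampling: every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """Systematic random sampling: random start, then every k-th unit."""
    rng = random.Random(seed)
    k = len(frame) // n            # sampling interval
    start = rng.randrange(k)       # random starting point within the first interval
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n, seed=0):
    """Stratified random sampling, proportional to each stratum's size.

    Rounding means the total can differ slightly from n when strata are uneven.
    """
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))
        sample.extend(rng.sample(units, share))
    return sample

def year_of(student_id):
    """Hypothetical stratum: year of study (1 to 4) derived from the ID."""
    return student_id % 4 + 1

# Hypothetical sampling frame: 20,000 student IDs, as in the example in the text.
frame = list(range(20_000))

srs = simple_random_sample(frame, 400)
sys_s = systematic_sample(frame, 400)
strat = stratified_sample(frame, year_of, 400)
print(len(srs), len(sys_s), len(strat))
```

All three functions take the full frame as input, which is exactly why the absence of such a list rules these techniques out.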
[c] Forward Looking Move
Finally, the forward looking move builds on the reflecting move by suggesting how the
limitations that have been discussed could be overcome through future research. Whilst
a lot could be written in this part of the Research Limitations section, we would
recommend that it is only around 10-20% of the total word count for this section.
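The suggested proportions for the three moves can be turned into rough word budgets. The sketch below is only an arithmetic aid, assuming the percentage ranges given in this chapter; the function and variable names are hypothetical.

```python
# Percentage ranges suggested in the text for each move of the section.
MOVE_SHARES = {
    "announcing": (10, 15),
    "reflecting": (60, 70),
    "forward_looking": (10, 20),
}

def word_budget(total_words):
    """Return (min_words, max_words) per move for a section of total_words."""
    return {
        move: (total_words * low // 100, total_words * high // 100)
        for move, (low, high) in MOVE_SHARES.items()
    }

# Example: a 500-word Research Limitations section.
for move, (low, high) in word_budget(500).items():
    print(f"{move}: {low}-{high} words")
```

The ranges are guides rather than rules, and the upper bounds add up to slightly more than the whole section, so the scholar should pick values within each range that sum to the intended length.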
Case studies may be viewed as having the most limitations. We cannot draw
causal conclusions from case studies because we cannot rule out
alternative explanations. The generality of the findings of a
case study is also always unclear. A case study involves the behavior of one person, and the behavior of one
person may not reflect the behavior of most people. Thus, we do not know how others
may behave.
Correlational research also has the same limitations as case studies. Correlational
research merely demonstrates that we can predict a variable from another variable. It is
demonstrating that two variables are associated. However, two variables can be
associated without there being a causal relationship between the variables. We cannot
make causal conclusions from correlational findings because we cannot rule out all
alternative explanations for correlational findings. Thus, making causal conclusions from
correlational findings is a logical error. If we find that A is associated with B, it could
mean that A caused B, B caused A, or some third variable caused both A and B without
there being any causal relationship between A and B. Even if we could rule out one of
the possible relationships (e.g., B caused A), we cannot rule out all alternative
explanations from correlational studies. For every correlational study, there is the
possibility that some third variable caused the two variables without there being a causal
relationship between the variables. Correlational research may also have limitations with
respect to the generality of the findings. Perhaps the study involved a specific group of
people, or the relation between the variables was only investigated in some situations.
Thus, it may be uncertain whether the correlational findings may generalize to other
people or situations.
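The third-variable problem described above can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are artificial, the variable Z is constructed to cause both A and B, and A and B have no causal link to each other. The helper names are hypothetical.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def residuals(y, z):
    """Remove from y the part that is linearly predicted by z."""
    mz, my = statistics.fmean(z), statistics.fmean(y)
    slope = (sum((c - mz) * (v - my) for c, v in zip(z, y))
             / sum((c - mz) ** 2 for c in z))
    return [v - my - slope * (c - mz) for c, v in zip(z, y)]

rng = random.Random(42)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]        # the third variable
a = [zi + rng.gauss(0, 1) for zi in z]         # A depends only on Z
b = [zi + rng.gauss(0, 1) for zi in z]         # B depends only on Z

r_ab = pearson(a, b)                                        # clearly positive
r_ab_given_z = pearson(residuals(a, z), residuals(b, z))    # close to zero
print(round(r_ab, 2), round(r_ab_given_z, 2))
```

A and B are strongly associated, yet the association largely disappears once Z is accounted for. In real correlational studies the third variable is often unmeasured, which is why it cannot be ruled out.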
situations, and only some of the possible conceptualizations of variables. Thus, we may
not know whether the findings will generalize to other people, situations, or
conceptualizations of the variables.
Once a statement of limitations has been prepared, the question about where in the
proposal to place it arises. A logical place is near the end of the problem statement
section, somewhere after the statement of purpose. Elsewhere in the proposal, the
researcher may have repeated a general statement of purpose, "the purpose of this
project is...", which presents another opportunity for including the limitations and
delimitations of the study. Again, that may have been at the end of the problem
statement, preceding a justification for selecting the problem in the first place. Another
juncture may have occurred somewhere in the proximity of the section devoted to the
conceptual framework (design of the study).
A2Z
PhD
Thesis
Reflections on Academic Research
Chapter X
Objectives
OBJECTIVES
Introduction
The objectives of the research should be closely related to the research study of the
thesis. The main purpose of the research objectives is to focus on the research problem,
avoid the collection of unnecessary data and provide direction to the research study.
The research itself expresses the aspiration; the objectives are the battle-plan for achieving it.
Objectives should be specific, measurable, achievable, realistic and timely, so that
the research problem can be explored effectively.
Specific: Objectives should be clear and well defined. This helps to specify the research problem
and provides a proper guideline for finding its solution (Alexander, 2008). Specific
objectives identify the methods of collecting the necessary information related to the research
problem.
Achievable: Objectives should be achievable within the available time and should yield accurate
results from the resources available in that time frame. This is related to the effective
measurement of the research problem (Atkinson, 2001). Achievable objectives help ensure that every stage
of the research is finished on time, which in turn helps to achieve the goals.
Realistic: Objectives should be realistic, so that available resources such as men, money and
machines can be used effectively. Objectives are most useful when they accurately define the
problem and set out steps that can be implemented within a specific time period.
Timely: Objectives should be measurable and achievable within the time frame. Research takes
time to find the solution to the research problem. A timeline indicates when each objective will
be accomplished (Frey & Osterloh, 2002).
Properly formulated, specific objectives will facilitate the development of your research
methodology and will help to orient the collection, analysis, interpretation and utilization
of data.
It is important that the objectives, especially in a research study, are stated
well. Ensure that the objectives of the research study:
Cover the different aspects of the problem and its contributing factors in a
coherent way and in a logical sequence;
Are clearly phrased in operational terms, specifying exactly what you are going to
do, where, and for what purpose;
Are realistic considering local conditions;
Use action verbs that are specific enough to be evaluated (Examples of action
verbs are: to determine, to compare, to verify, to calculate, to describe, and to
establish). Avoid the use of vague non-action verbs (Examples of non-action
verbs: to appreciate, to understand, or to study).
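The check on action verbs can be sketched as a small script. It is only an illustration: the two verb sets contain just the examples listed above, the sample objectives are invented, and the function name is hypothetical. A real review of objectives still needs the scholar's judgement.

```python
# Verb lists taken from the examples given in the text; both are incomplete.
ACTION_VERBS = {"determine", "compare", "verify", "calculate", "describe", "establish"}
VAGUE_VERBS = {"appreciate", "understand", "study"}

def classify_objective(objective):
    """Return 'action', 'vague', or 'unknown' based on the leading verb."""
    words = objective.lower().split()
    if words and words[0] == "to":      # skip the infinitive marker
        words = words[1:]
    if not words:
        return "unknown"
    verb = words[0].strip(".,;:")
    if verb in ACTION_VERBS:
        return "action"
    if verb in VAGUE_VERBS:
        return "vague"
    return "unknown"

# Invented objectives, for illustration only.
objectives = [
    "To compare Facebook usage between first-year and final-year students",
    "To understand student attitudes",
    "To explore campus life",
]
for text in objectives:
    print(classify_objective(text), "-", text)
```

Objectives flagged as "vague" are candidates for rewording with a verb whose outcome can be evaluated. "unknown" simply means the verb is in neither list.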
It is worth remembering that when the thesis is evaluated, the results will be
compared to the objectives. If the objectives have not been spelled out clearly or are stated
ambiguously, the thesis will suffer during the evaluation, leading in some cases to
rejection of the thesis.
There are four components of an objective: (a) the action verb, (b) conditions, (c)
standard, and (d) the intended audience (always the research scholars). The action verb
is the most important element of an objective and can never be omitted. The action verb
states precisely what the scholars will do following instruction. Verbs are categorized by
domains of learning and various hierarchies. The three domains of learning are the
cognitive domain that emphasizes thinking; the affective domain highlighting attitudes
and feelings; and the psychomotor domain featuring doing.
The cognitive domain is further divided into six levels or hierarchies. They are:
Knowledge
Comprehension
Application
Analysis
Synthesis
Evaluation
Sometimes these six hierarchies or levels listed above are grouped into three
categories:
Recall objectives are at the basic taxonomic level and involve recall or description of
information. Interpretation is a higher level of learning and involves application and
examination of knowledge. Problem-solving skills test the highest level of learning and
involve construction and assessment of knowledge.
Scholars should remember that the objectives of a research study form and define the
direction and path of the research journey. Careful and meticulous attention given here
will put the scholars at ease in the later stages of the research study.
Objectives should:
a. the scope of the research study must be consistent with the time frame
and level of effort available for research study.
4. provide the scholar and at a later stage, thesis evaluators with indicators of how
the scholar:
“Failure comes only when we forget our ideals and objectives and principles.”
- Jawaharlal Nehru
Chapter XI
Research Design
RESEARCH DESIGN
Introduction
The research design refers to the strategy a scholar chooses to integrate the different
components of the study in a cohesive and coherent way in order to address the
research problem; it constitutes the blueprint for the collection, measurement, and
analysis of data.
Note: Research problem determines the type of design one can use, not the other way around!
I - Action Research Design
The essentials of action research design follow a characteristic cycle whereby initially an
exploratory stance is adopted, where an understanding of a problem is developed and
plans are made for some form of intervention strategy. Then the intervention is
carried out (the action in Action Research), during which time pertinent observations are
collected in various forms. New intervention strategies are then carried out, and the
cyclic process repeats, continuing until a sufficient understanding of (or implementable
solution for) the problem is achieved. The protocol is iterative or cyclical in nature and is
intended to foster deeper understanding of a given situation, starting with
conceptualizing and particularizing the problem and moving through several
interventions and evaluations.
Advantages
1. A collaborative and adaptive research design that lends itself to use in work or
community situations.
2. Design focuses on pragmatic and solution-driven research rather than testing theories.
3. When practitioners use action research it has the potential to increase the amount they
learn consciously from their experience. The action research cycle can also be regarded
as a learning cycle.
4. Action research studies often have direct and obvious relevance to practice.
5. There are no hidden controls or preemption of direction by the researcher.
Disadvantages
II - Case Study Design
Advantages
Disadvantages
1. A single or small number of cases offers little basis for establishing reliability or to
generalize the findings to a wider population of people.
2. The intense exposure to study of the case may bias a researcher's interpretation of the
findings.
3. Design does not facilitate assessment of cause and effect relationships.
4. Vital information may be missing, making the case hard to interpret.
5. The case may not be representative or typical of the larger problem being investigated.
6. If a case is selected because it represents a very unusual or unique
phenomenon or problem for study, then the interpretation of the findings can only apply
to that particular case.
III - Causal Design
Advantages
1. Causality research designs help researchers understand why the world works the way it
does through the process of proving a causal link between variables and eliminating
other possibilities.
2. Replication is possible.
3. There is greater confidence the study has internal validity due to the systematic subject
selection and equity of groups being compared.
Disadvantages
1. Not all relationships are causal! The possibility always exists that, by sheer coincidence,
two unrelated events appear to be related [e.g., Punxsutawney Phil could accurately
predict the duration of Winter for five consecutive years but, the fact remains, he's just a
big, furry rodent].
2. Conclusions about causal relationships are difficult to determine due to a variety of
extraneous and confounding variables that exist in a social environment. This means
causality can only be inferred, never proven.
3. If two variables are correlated, the cause must come before the effect. However, even
though two variables might be causally related, it can sometimes be difficult to determine
which variable comes first and therefore to establish which variable is the actual cause
and which is the actual effect.
IV - Cohort Design
Often used in the medical sciences, but also found in the applied social sciences, a
cohort study generally refers to a study conducted over a period of time involving
members of a population which the subject or representative member comes from, and
who are united by some commonality or similarity. Using a quantitative framework, a
cohort study makes note of statistical occurrence within a specialized subgroup, united
by same or similar characteristics that are relevant to the research problem being
investigated, rather than studying statistical occurrence within the general population.
Using a qualitative framework, cohort studies generally gather data using methods of
observation. Cohorts can be either "open" or "closed."
Advantages
1. The use of cohorts is often mandatory because a randomized control study may be
unethical. For example, you cannot deliberately expose people to asbestos, you can only
study its effects on those who have already been exposed. Research that measures risk
factors often relies on cohort designs.
2. Because cohort studies measure potential causes before the outcome has occurred, they
can demonstrate that these “causes” preceded the outcome, thereby avoiding the debate
as to which is the cause and which is the effect.
3. Cohort analysis is highly flexible and can provide insight into effects over time and related
to a variety of different types of changes [e.g., social, cultural, political, economic, etc.].
4. Either original data or secondary data can be used in this design.
Disadvantages
1. In cases where a comparative analysis of two cohorts is made [e.g., studying the effects
of one group exposed to asbestos and one that has not], a researcher cannot control for
all other factors that might differ between the two groups. These factors are known as
confounding variables.
2. Cohort studies can end up taking a long time to complete if the researcher must wait for
the conditions of interest to develop within the group. This also increases the chance that
key variables change during the course of the study, potentially impacting the validity of
the findings.
3. Because of the lack of randomization in the cohort design, its external validity is lower
than that of study designs where the researcher randomly assigns participants.
V - Cross-Sectional Design
Advantages
Disadvantages
1. Finding people, subjects, or phenomena to study that are very similar except in one
specific variable can be difficult.
2. Results are static and time bound and, therefore, give no indication of a sequence of
events or reveal historical contexts.
3. Studies cannot be utilized to establish cause and effect relationships.
4. Provide only a snapshot of analysis so there is always the possibility that a study could
have differing results if another time-frame had been chosen.
5. There is no follow up to the findings.
VI - Descriptive Design
Descriptive research designs help provide answers to the questions of who, what, when,
where, and how associated with a particular research problem; a descriptive study
cannot conclusively ascertain answers to why. Descriptive research is used to obtain
information concerning the current status of the phenomena and to describe "what
exists" with respect to variables or conditions in a situation.
Advantages
Disadvantages
1. The results from descriptive research cannot be used to discover a definitive answer or
to disprove a hypothesis.
2. Because descriptive designs often utilize observational methods [as opposed to
quantitative methods], the results cannot be replicated.
3. The descriptive function of research is heavily dependent on instrumentation for
measurement and observation.
VII - Experimental Design
A blueprint of the procedure that enables the researcher to maintain control over all
factors that may affect the result of an experiment. In doing this, the researcher attempts
to determine or predict what may occur. Experimental Research is often used where
there is time priority in a causal relationship (cause precedes effect), there is consistency
in a causal relationship (a cause will always lead to the same effect), and the magnitude
of the correlation is great. The classic experimental design specifies an experimental
group and a control group. The independent variable is administered to the experimental
group and not to the control group, and both groups are measured on the same
dependent variable. Subsequent experimental designs have used more groups and
more measurements over longer periods. True experiments must have control,
randomization, and manipulation.
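The classic two-group design can be sketched as a simulation. Everything here is assumed for illustration: the participants are artificial baseline scores, the dependent variable is simulated, and the size of the treatment effect is chosen by hand. The function name is hypothetical.

```python
import random
import statistics

def run_experiment(participants, treatment_effect, seed=0):
    """Randomly assign participants to two groups and compare group means.

    participants: list of baseline scores (simulated).
    Returns the experimental group mean minus the control group mean.
    """
    rng = random.Random(seed)
    shuffled = participants[:]
    rng.shuffle(shuffled)                       # randomization
    half = len(shuffled) // 2
    experimental, control = shuffled[:half], shuffled[half:]

    def measure(baseline, treated):
        # Simulated dependent variable: baseline, noise, plus effect if treated.
        return baseline + rng.gauss(0, 1) + (treatment_effect if treated else 0.0)

    exp_scores = [measure(p, True) for p in experimental]    # manipulation
    ctl_scores = [measure(p, False) for p in control]        # control group
    return statistics.fmean(exp_scores) - statistics.fmean(ctl_scores)

rng = random.Random(1)
participants = [rng.gauss(50, 5) for _ in range(2000)]
diff = run_experiment(participants, treatment_effect=5.0)
print(round(diff, 2))   # should land near the simulated effect of 5
```

Because assignment is random, baseline differences between the groups average out, so the difference in means can be attributed to the manipulated variable rather than to who ended up in which group.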
Advantages
1. Experimental research allows the researcher to control the situation. In so doing, it allows
researchers to answer the question, “what causes something to occur?”
2. Permits the researcher to identify cause and effect relationships between variables and to
distinguish placebo effects from treatment effects.
3. Experimental research designs support the ability to limit alternative explanations and to
infer direct causal relationships in the study.
4. Approach provides the highest level of evidence for single studies.
Disadvantages
1. The design is artificial, and results may not generalize well to the real world.
2. The artificial settings of experiments may alter subject behaviors or responses.
3. Experimental designs can be costly if special equipment or facilities are needed.
4. Some research problems cannot be studied using an experiment because of ethical or
technical reasons.
5. It is difficult to apply ethnographic and other qualitative methods to experimentally designed
research studies.
VIII - Exploratory Design
An exploratory design is conducted about a research problem when there are few or no
earlier studies to refer to. The focus is on gaining insights and familiarity for later
investigation, or it is undertaken when problems are in a preliminary stage of investigation.
The goals of exploratory research are intended to produce the following possible insights:
Advantages
Disadvantages
1. Exploratory research generally utilizes small sample sizes and, thus, findings are typically
not generalizable to the population at large.
2. The exploratory nature of the research inhibits an ability to make definitive conclusions
about the findings.
3. The research process underpinning exploratory studies is flexible but often unstructured,
leading to only tentative results that have limited value in decision-making.
4. Design lacks rigorous standards applied to methods of data gathering and analysis
because one of the areas for exploration could be to determine what method or
methodologies best fit the research problem.
IX - Historical Design
The purpose of a historical research design is to collect, verify, and synthesize evidence
from the past to establish facts that defend or refute your hypothesis. It uses secondary
sources and a variety of primary documentary evidence, such as, logs, diaries, official
records, reports, archives, and non-textual information [maps, pictures, audio and visual
recordings]. The limitation is that the sources must be both authentic and valid.
Advantages
1. The historical research design is unobtrusive; the act of research does not affect the
results of the study.
2. The historical approach is well suited for trend analysis.
3. Historical records can add important contextual background required to more fully
understand and interpret a research problem.
4. There is no possibility of researcher-subject interaction that could affect the findings.
5. Historical sources can be used over and over to study different research problems or to
replicate a previous study.
Disadvantages
1. The ability to fulfill the aims of your research is directly related to the amount and quality
of documentation available to understand the research problem.
2. Since historical research relies on data from the past, there is no way to manipulate it to
control for contemporary contexts.
3. Interpreting historical sources can be very time consuming.
4. The sources of historical materials must be archived consistently to ensure access.
5. Original authors bring their own perspectives and biases to the interpretation of past
events and these biases are more difficult to ascertain in historical resources.
6. Due to the lack of control over external variables, historical research is very weak with
regard to the demands of internal validity.
7. It is rare that the entirety of historical documentation needed to fully address a research
problem is available for interpretation; therefore, gaps need to be acknowledged.
X - Longitudinal Design
A longitudinal study follows the same sample over time and makes repeated
observations. With longitudinal surveys, for example, the same group of people is
interviewed at regular intervals, enabling researchers to track changes over time and to
relate them to variables that might explain why the changes occur. Longitudinal research
designs describe patterns of change and help establish the direction and magnitude of
causal relationships. Measurements are taken on each variable over two or more distinct
time periods. This allows the researcher to measure change in variables over time. It is a
type of observational study and is sometimes referred to as a panel study.
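Measuring change across waves of a panel can be sketched in a few lines. The panel below is invented for illustration (three respondents, three waves), and the function names are hypothetical.

```python
import statistics

# Invented panel: the same three respondents measured at three waves.
panel = {
    "r01": [12, 15, 19],
    "r02": [20, 20, 23],
    "r03": [8, 11, 10],
}

def change_scores(waves):
    """Differences between consecutive waves for one respondent."""
    return [later - earlier for earlier, later in zip(waves, waves[1:])]

def mean_change_per_wave(panel):
    """Average wave-to-wave change across all respondents."""
    per_person = [change_scores(waves) for waves in panel.values()]
    return [statistics.fmean(column) for column in zip(*per_person)]

print(change_scores(panel["r01"]))
print(mean_change_per_wave(panel))
```

Because the same people are followed, each change score is a within-person change. A series of separate cross-sectional samples could not provide this.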
Advantages
Disadvantages
XI - Observational Design
This type of research design draws a conclusion by comparing subjects against a control
group, in cases where the researcher has no control over the experiment. There are two
general types of observational designs. In direct observations, people know that you are
watching them. Unobtrusive measures involve any method for studying behavior where
individuals do not know they are being observed. An observational study allows a useful
insight into a phenomenon and avoids the ethical and practical difficulties of setting up a
large and cumbersome research project.
Advantages
1. Observational studies are usually flexible and do not necessarily need to be structured
around a hypothesis about what you expect to observe (data is emergent rather than pre-
existing).
2. The researcher is able to collect a depth of information about a particular behavior.
3. Can reveal interrelationships among multifaceted dimensions of group interactions.
4. You can generalize your results to real life situations.
5. Observational research is useful for discovering what variables may be important before
applying other methods like experiments.
6. Observational research designs account for the complexity of group behaviors.
Disadvantages
1. Reliability of data is low because seeing behaviors occur over and over again may be a
time consuming task and difficult to replicate.
2. In observational research, findings may only reflect a unique sample population and,
thus, cannot be generalized to other groups.
3. There can be problems with bias as the researcher may only "see what they want to
see."
4. There is no possibility to determine “cause and effect” relationships since nothing is
manipulated.
5. Sources or subjects may not all be equally credible.
6. Any group that is studied is altered to some degree by the very presence of the
researcher, therefore, skewing to some degree any data collected (the Heisenberg
Uncertainty Principle).
XII - Philosophical Design
Ontology -- the study that describes the nature of reality; for example, what is
real and what is not, what is fundamental and what is derivative?
Epistemology -- the study that explores the nature of knowledge; for example, on
what do knowledge and understanding depend and how can we be certain of
what we know?
Axiology -- the study of values; for example, what values does an individual or
group hold and why? How are values related to interest, desire, will, experience, and
means-to-end? And, what is the difference between a matter of fact and a matter of
value?
Advantages
Disadvantages
1. Limited application to specific research problems [answering the "So What?" question in
social science research].
2. Analysis can be abstract, argumentative, and limited in its practical application to real-life
issues.
3. While a philosophical analysis may render problematic that which was once simple or
taken-for-granted, the writing can be dense and subject to unnecessary jargon,
overstatement, and/or excessive quotation and documentation.
4. There are limitations in the use of metaphor as a vehicle of philosophical analysis.
5. There can be analytical difficulties in moving from philosophy to advocacy and between
abstract thought and application to the phenomenal world.
XIII - Sequential Design
Sequential research is that which is carried out in a deliberate, staged approach [i.e.
serially] where one stage will be completed, followed by another, then another, and so
on, with the aim that each stage will build upon the previous one until enough data is
gathered over an interval of time to test your hypothesis. The sample size is not
predetermined. After each sample is analyzed, the researcher can accept the null
hypothesis, accept the alternative hypothesis, or select another pool of subjects and
conduct the study once again. This means the researcher can obtain a limitless number
of subjects before finally making a decision whether to accept the null or alternative
hypothesis. Using a quantitative framework, a sequential study generally utilizes
sampling techniques to gather data and applies statistical methods to analyze the
data. Using a qualitative framework, sequential studies generally utilize samples of
individuals or groups of individuals [cohorts] and use qualitative methods, such as
interviews or observations, to gather information from each sample.
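The "accept the null, accept the alternative, or sample again" logic can be illustrated with Wald's sequential probability ratio test (SPRT). The text does not name a specific procedure, so the SPRT is used here only as one well-known formalisation of the idea. The sketch assumes independent yes/no observations, two candidate success rates p0 and p1, and error rates alpha and beta; all of these values are illustrative choices, not recommendations.

```python
import math

def sprt(observations, p0, p1, alpha=0.05, beta=0.20):
    """Wald's SPRT for a success rate: H0 says p = p0, H1 says p = p1.

    observations: iterable of 1 (success) or 0 (failure), in collection order.
    Returns a (decision, number_of_observations_used) tuple.
    """
    upper = math.log((1 - beta) / alpha)    # reaching this boundary accepts H1
    lower = math.log(beta / (1 - alpha))    # reaching this boundary accepts H0
    llr = 0.0                               # running log-likelihood ratio
    used = 0
    for used, success in enumerate(observations, start=1):
        if success:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", used
        if llr <= lower:
            return "accept H0", used
    return "continue sampling", used

print(sprt([1, 1, 1, 1, 1, 1, 1, 1], p0=0.5, p1=0.8))
print(sprt([1, 0], p0=0.5, p1=0.8))
```

After each observation, the evidence is compared with two boundaries. Only when neither is reached does the researcher go back for more data, which is why the sample size is not fixed in advance.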
Advantages
1. The researcher has a limitless option when it comes to sample size and the sampling
schedule.
2. Due to the repetitive nature of this research design, minor changes and adjustments can
be done during the initial parts of the study to correct and hone the research method.
Useful design for exploratory studies.
3. There is very little effort on the part of the researcher when performing this technique. It is
generally not expensive, time consuming, or workforce extensive.
4. Because the study is conducted serially, the results of one sample are known before the
next sample is taken and analyzed.
Disadvantages
1. The sampling method is not representative of the entire population. The only possibility of
approaching representativeness is when the researcher chooses to use a very large
sample size significant enough to represent a significant portion of the entire population.
In this case, moving on to study a second or more sample can be difficult.
2. Because the sampling technique is not randomized, the design cannot be used to create
conclusions and interpretations that pertain to an entire population. Generalizability from
findings is limited.
3. Difficult to account for and interpret variation from one sample to another over time,
particularly when using qualitative methods of data collection.
Throughout the design construction task, it is important to have in mind some endpoint,
some criteria which we should try to achieve before finally accepting a design strategy.
The criteria discussed below are only meant to be suggestive of the characteristics
found in good research design. It is worth noting that all of these criteria point to the
need to individually tailor research designs rather than accepting standard textbook
strategies as is.
1. Theory-Grounded. Good research strategies reflect the theories which are being
investigated. Where specific theoretical expectations can be hypothesized these are
incorporated into the design. For example, where theory predicts a specific treatment
effect on one measure but not on another, the inclusion of both in the design improves
discriminant validity and demonstrates the predictive power of the theory.
2. Situational. Good research designs reflect the settings of the investigation. This was
illustrated above where a particular need of teachers and administrators was explicitly
addressed in the design strategy. Similarly, intergroup rivalry, demoralization, and
competition might be assessed through the use of additional comparison groups who are
not in direct contact with the original group.
3. Feasible. Good designs can be implemented. The sequence and timing of events are
carefully thought out. Potential problems in measurement, adherence to assignment,
database construction and the like, are anticipated. Where needed, additional groups or
measurements are included in the design to explicitly correct for such problems.
4. Redundant. Good research designs have some flexibility built into them. Often, this
flexibility results from duplication of essential design features. For example, multiple
replications of a treatment help to ensure that failure to implement the treatment in one
setting will not invalidate the entire study.
5. Efficient. Good designs strike a balance between redundancy and the tendency to
overdesign. Where it is reasonable, other, less costly, strategies for ruling out potential
threats to validity are utilized.
This is by no means an exhaustive list of the criteria by which we can judge good
research design. Nevertheless, goals of this sort help to guide the researcher toward a
final design choice and emphasize important components which should be included.
“Design is what you do when you don't [yet] know what you are doing.” - George Stiny
Chapter XII
Sampling
SAMPLING
In the language of sampling:
-a population is the entire collection of people or things you are interested in;
-a census is a measurement of all the units in the population;
-a population parameter is a number that results from measuring all the units in
the population;
-a sampling frame is the specific data from which the sample is drawn, e.g., a
telephone book;
-a unit of analysis is the type of object of interest, e.g., arsons, fire departments,
firefighters;
-a sample is a subset of some of the units in the population;
-a statistic is a number that results from measuring all the units in the sample;
-statistics derived from samples are used to estimate population parameters.
For example, to find out the average age of all motor vehicles in the state in 2011:
Why Sample?
Types of Samples:
Non-probability (non-random) samples: These samples focus on volunteers, easily
available units, or those that just happen to be present when the research is done.
Non-probability samples are useful for quick and cheap studies, for case studies, for
qualitative research, for pilot studies, and for developing hypotheses for future research.
Purposive sample: the researcher selects the units with some purpose in mind, for
example, students who live in dorms on campus, or experts on urban development.
Quota sample: the researcher constructs quotas for different types of units. For example,
to interview a fixed number of shoppers at a mall, half of whom are male and half of
whom are female.
Other samples that are usually constructed with non-probability methods include library
research, participant observation, marketing research, consulting with experts, and
comparing organizations, nations, or governments.
Simple random sample: Each unit in the population is identified, and each unit has an
equal chance of being in the sample. The selection of each unit is independent of the
selection of every other unit. Selection of one unit does not affect the chances of any
other unit.
For example, to select a sample of 25 people who live in your college dorm, make a list of all the
250 people who live in the dorm. Assign each person a unique number, between 1 and 250. Then
refer to a table of random numbers. Starting at any point in the table, read across or down and
note every number that falls between 1 and 250. Use the numbers you have found to pull the
names from the list that correspond to the 25 numbers you found. These 25 people are your
sample. This is called the table of random numbers method.
Another way to select this simple random sample is to take 250 ping-pong balls and number them
from 1 to 250. Put them into a large barrel and mix them up, and then grab 25 balls. Read off the
numbers. Those are the 25 people in your sample. This is called the lottery method.
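The lottery method can also be carried out on a computer. The following is a minimal sketch in Python, assuming the 250 residents have already been numbered 1 to 250; the function name and the seed value are illustrative only and are not part of any prescribed procedure.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every unit has the same chance of selection, and no unit
    can be picked twice (like drawing balls from a barrel).
    """
    rng = random.Random(seed)  # a seed is used only so the draw can be repeated
    return rng.sample(population, sample_size)

# Dorm residents numbered 1..250, as in the example above.
residents = list(range(1, 251))
sample = simple_random_sample(residents, 25, seed=42)
print(sorted(sample))
```

The numbers printed are the list positions of the 25 people who make up the sample.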
Systematic random sampling: Each unit in the population is identified, and each unit has
an equal chance of being in the sample.
For example, to select a sample of 25 dorm rooms in your college dorm, make a list of all the
room numbers in the dorm. Say there are 100 rooms. Divide the total number of rooms (100) by
the number of rooms you want in the sample (25). The answer is 4. This means that you are
going to select every fourth dorm room from the list. But you must first consult a table of random
numbers. Pick any point on the table, and read across or down until you come to a number
between 1 and 4. This is your random starting point. Say your random starting point is "3". This
means you select dorm room 3 as your first room, and then every fourth room down the list (3, 7,
11, 15, 19, etc.) until you have 25 rooms selected.
This method is useful for selecting large samples, say 100 or more. It is less
cumbersome than a simple random sample using either a table of random numbers or a
lottery method. For example, you might have to sample files in a large filing cabinet. It is
easier to select every 17th file than to pull out all the files and number them, etc.
However, you must be aware of problems that can arise in systematic random sampling.
If the selection interval matches some pattern in the list (e.g., each 4th dorm room is a
single unit, where all the others are doubles) you will introduce systematic bias into your
sample.
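The dorm-room procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the code assumes the list length is a whole multiple of the sample size, as in the 100-room example.

```python
import random

def systematic_sample(units, sample_size, seed=None):
    """Select every k-th unit after a random starting point."""
    interval = len(units) // sample_size      # e.g. 100 // 25 = 4
    rng = random.Random(seed)
    start = rng.randint(1, interval)          # random start between 1 and the interval
    # Positions are counted from 1 in the text, so subtract 1 for list indexing.
    positions = range(start, len(units) + 1, interval)
    return [units[p - 1] for p in positions][:sample_size]

rooms = list(range(1, 101))                   # room numbers 1..100
picked = systematic_sample(rooms, 25, seed=7)
print(picked)
```

Before using such a routine, inspect the list for any repeating pattern whose period matches the selection interval, for the reason given above.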
Stratified random sampling: Each unit in the population is identified, and each unit has a
known, non-zero chance of being in the sample. This is used when the researcher
knows that the population has sub-groups (strata) that are of interest.
For example, if you wanted to find out the attitudes of students on your campus about
immigration, you may want to be sure to sample students who are from every region of the
country as well as foreign students. Say your student body of 10,000 students is made up of
8,000 - West; 1,000 - East; 500 - Midwest; 300 - South; 200 - Foreign.
If you select a simple random sample of 500 students, you might not get any from the
Midwest, South, or Foreign. To make sure that you get some students from each group,
you can divide the students into these five groups, and then select the same percentage
of students from each group using a simple random sampling method. This is
proportional stratified random sampling.
However, you may still have too few of some types of students. Instead, you may divide
students into the five groups and then select the same number of students from each
group using a simple random sampling method. This is disproportionate stratified
random sampling. This allows you to have enough students in each sub-group so that
you can perform some meaningful statistical analyses of the attitudes of students in each
sub-group. In order to say something about the attitudes of the total student population
of the university, however, you will have to apply weights to the findings for each sub-
group, proportional to its presence in the total student body.
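The two allocation schemes and the weighting step can be sketched as follows. The stratum sizes are those of the example above; the per-stratum survey results in `stratum_results` are invented purely for illustration, and the function names are hypothetical.

```python
# Stratum sizes from the example (student body of 10,000).
strata_sizes = {"West": 8000, "East": 1000, "Midwest": 500, "South": 300, "Foreign": 200}

def proportional_allocation(sizes, total_sample):
    """Same percentage from every stratum."""
    population = sum(sizes.values())
    return {name: round(total_sample * size / population) for name, size in sizes.items()}

def equal_allocation(sizes, total_sample):
    """Same number from every stratum (disproportionate sampling)."""
    per_stratum = total_sample // len(sizes)
    return {name: per_stratum for name in sizes}

def weighted_estimate(results, sizes):
    """Combine stratum results, weighting each by its share of the population."""
    population = sum(sizes.values())
    return sum(results[name] * sizes[name] / population for name in sizes)

proportional = proportional_allocation(strata_sizes, 500)
equal = equal_allocation(strata_sizes, 500)

# Hypothetical share of each sub-group holding a given attitude (made-up numbers).
stratum_results = {"West": 0.40, "East": 0.60, "Midwest": 0.50, "South": 0.30, "Foreign": 0.70}
overall = weighted_estimate(stratum_results, strata_sizes)

print(proportional)
print(equal)
print(round(overall, 3))
```

With proportional allocation the Foreign stratum contributes only 10 students, which is why equal allocation followed by weighting may be preferred when each sub-group is to be analyzed separately.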
Cluster sampling: cluster sampling views the units in a population not only as
members of the total population but also as members of naturally occurring clusters
within the population. For example, city residents are also residents of neighborhoods,
blocks, and housing structures. Cluster sampling is used in large geographic samples
where no list is available of all the units in the population but the population boundaries
can be well-defined. For example, to obtain information about the drug habits of all high
school students in a state, you could obtain a list of all the school districts in the state
and select a simple random sample of school districts. Then, within each selected
school district, list all the high schools and select a simple random sample of high
schools. Within each selected high school, list all high school classes, and select a
simple random sample of classes. Then use the high school students in those classes
as your sample. Cluster sampling must use a random sampling method at each stage.
This may result in a somewhat larger sample than using a simple random sampling
method, but it saves time and money. It is also cheaper to administer than a statewide
sample of high school seniors, because there are many fewer sites to obtain information
from.
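The district, school and class stages described above can be sketched as a multi-stage draw. The nested dictionary `frame` below is a toy sampling frame built only for illustration (4 districts, 3 schools each, 5 classes each, 20 students each); a real study would build it from the lists obtained at each stage.

```python
import random

def multistage_cluster_sample(districts, n_districts, n_schools, n_classes, seed=None):
    """Randomly select districts, then schools, then classes; keep all students in them."""
    rng = random.Random(seed)
    students = []
    for district in rng.sample(sorted(districts), n_districts):
        schools = districts[district]
        for school in rng.sample(sorted(schools), n_schools):
            classes = schools[school]
            for class_name in rng.sample(sorted(classes), n_classes):
                students.extend(classes[class_name])   # every student in a chosen class
    return students

# Toy frame: district -> school -> class -> list of students (hypothetical names).
frame = {
    f"district-{d}": {
        f"school-{d}-{s}": {
            f"class-{d}-{s}-{c}": [f"student-{d}-{s}-{c}-{i}" for i in range(20)]
            for c in range(5)
        }
        for s in range(3)
    }
    for d in range(4)
}

sample = multistage_cluster_sample(frame, n_districts=2, n_schools=2, n_classes=2, seed=1)
print(len(sample))
```

Note that a list is needed only for the clusters actually selected at each stage, never for the whole population.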
The size of the sample depends on the type of research design being used; the desired
level of confidence in the results; the amount of accuracy wanted; and the characteristics
of the population of interest. Sample size has little to do with the size of the population,
however.
Random sampling procedures are based on probability theory; this is why they are also
called probability sampling methods. Say we are interested in knowing what is the
average monthly income of all the full-time students at our university. There are 5 full-
time students each with a different monthly income as follows: Rs.500; Rs.650; Rs.400;
Rs.700; Rs.600. This is our population of students. Say we take a simple random
sample of 2 students and figure the average for the sample.
It is entirely possible that we could take a simple random sample of 2 students from the 5
students above and get an average as low as Rs.450 per month. It is equally possible
that we could take a different simple random sample of 2 students and get an average
as high as Rs.675 per month. Try it with the following figures. There are 10 possible
samples of two students:
We know from probability theory that if we took all possible combinations of samples of 2
full-time students from our population of 5, found the average monthly income for all
possible samples, and took the average of all those averages, we would find the exact
average monthly income of all 5 students.
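The ten possible samples of two students can be listed by computer. The short Python sketch below uses the five incomes given above; it shows that the sample averages range from Rs.450 to Rs.675 and that the average of all ten sample averages equals the population average of Rs.570.

```python
from itertools import combinations
from statistics import mean

incomes = [500, 650, 400, 700, 600]          # monthly incomes of the 5 students (Rs.)

samples = list(combinations(incomes, 2))     # every possible sample of 2 students
sample_means = [mean(pair) for pair in samples]

for pair, average in zip(samples, sample_means):
    print(pair, average)

print("lowest sample average :", min(sample_means))
print("highest sample average:", max(sample_means))
print("average of averages   :", mean(sample_means))
print("population average    :", mean(incomes))
```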
Now in this example, of course it would be easier to just find the average monthly wage
for all five students in the population. However, we can apply this same principle to much
larger populations, where it would be nearly impossible to measure every unit in the
population.
Say we wanted to find the average monthly wage of all 10,000 full-time students at our
university. We can take a simple random sample of 150 students, find the average
monthly wage for the 150 students in the sample, and then use that number (a sample
statistic) to estimate the average monthly wage for the entire population of students (a
population parameter).
We know from probability theory that if we took a very large number of simple random
samples of 150 students from our student population, and found the average monthly
wage for each sample, that those averages would tend to distribute themselves in the
pattern of a "bell-shaped" curve, also called "the normal curve." That curve has well-
established properties.
For example, approximately 68% of the sample averages would fall within plus or minus
one standard deviation of the true population average. We also know that approximately
95% of the sample averages would fall within plus or minus two standard deviations of
the true population average. And finally, we know that approximately 99.7% of the sample
averages would fall within plus or minus three standard deviations of the true population
average.
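These proportions can be checked by simulation. The sketch below is illustrative only: the population of 10,000 monthly wages is generated artificially (uniformly between Rs.300 and Rs.900), and the "standard deviation" used is the standard deviation of the sample averages, estimated here as the population standard deviation divided by the square root of the sample size.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)                                        # fixed seed for repeatability
population = [rng.uniform(300, 900) for _ in range(10_000)]   # made-up wages

n = 150
true_mean = fmean(population)
# Standard deviation of the sample averages (often called the standard error).
standard_error = pstdev(population) / n ** 0.5

# Draw many simple random samples of 150 and record each sample average.
sample_means = [fmean(rng.sample(population, n)) for _ in range(2_000)]

def share_within(k):
    """Fraction of sample averages within k standard errors of the true average."""
    hits = sum(abs(m - true_mean) <= k * standard_error for m in sample_means)
    return hits / len(sample_means)

for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```

The three printed shares should come out close to the theoretical proportions, with small differences due to chance.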
Using these established principles, we do not have to take repeated simple random
samples (fortunately!). Instead, we can use these principles to estimate how well our
sample statistic estimates the population parameter. We can also use these principles to
select an adequate sample size for our research.
Say we want to know what proportion of students at our university support
the death penalty. To calculate sample size, we must make four decisions:
Second, how sure do we want to be that we could get the same results if we did the
study multiple times? Do we want to be 50% sure, 90% sure, 95% sure, or 99% sure?
This is called the confidence level. The more sure we want to be, the larger the sample
size needs to be. In this case, we want a confidence level of 95%.
Fourth, how is the population distributed on the variable of interest? That is, in a yes/no
situation, how many do we think will say yes? How many will say no? The most
conservative way to approach this is to guess that the population is split 50/50 on the
question. In this case we guess that 50% of the students will support the death penalty,
and 50% will oppose it.
If we are doing a survey of a population, and are not interested in sub-samples within the
population, and will accept a 95% confidence level, and a 4% margin of error, and
assume a probability of .5 on the variable (.5 will say yes), then the formula for sample
size is as follows:
the square root of sample size = [the square root of (.5) x (1-.5)] x 1.96/.04
the square root of sample size = [the square root of .25] x 49
the square root of sample size = .5 x 49
the square root of sample size = 24.5
Squaring both sides, we have
the sample size = 24.5 squared
the sample size = 600.25 (round off to 600)
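The same calculation, for a 95% confidence level (z = 1.96), a 4% margin of error and a 50/50 split, can be sketched in Python. The function name is hypothetical; the result is rounded to the nearest whole number as in the text, although some researchers prefer always to round up.

```python
def sample_size_for_proportion(margin_of_error, z=1.96, p=0.5):
    """Sample size = p x (1 - p) x (z / margin of error) squared."""
    return p * (1 - p) * (z / margin_of_error) ** 2

size_4_percent = sample_size_for_proportion(0.04)   # the worked example above
size_3_percent = sample_size_for_proportion(0.03)
size_5_percent = sample_size_for_proportion(0.05)

print(round(size_4_percent))
print(round(size_3_percent))
print(round(size_5_percent))
```

Changing `z` or `p` shows how the confidence level and the assumed split in the population affect the required sample size.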
As the margin of error decreases, the sample size will need to increase (and vice versa).
If we wanted to change the margin of error to plus or minus 3%, (keeping the confidence
level at 95%), the required sample size increases to 1,067. If we could afford to use a
margin of error of plus or minus 5%, the sample size would decrease to 384.
Similarly, if the confidence level increases, the sample size will need to increase. If we
increase the confidence level to 99%, the sample size increases to 1,036 (with the
margin of error remaining at 4%). If the confidence level decreases to 90%, the sample
size decreases to about 423.
If you have a fixed sample size, you can increase the confidence level and decrease the
accuracy, or you can increase the accuracy and decrease the confidence level, but you
cannot do both.
As the variability in the population on the variable of interest increases, the sample size
increases. A probability of 50/50 demonstrates the greatest variability in the population.
If the variability decreases to 60/40, or 70/30, then a smaller sample size will result.
The following table summarizes the calculations for sample sizes for survey research,
assuming a probability of 50/50 on a dichotomous question, and no sub-populations.
Margin of error (+/- %)   90% confidence   95% confidence   99% confidence
10                        68               96               166
20                        17               24               41
If the researcher wants to study sub-populations as well as the whole population, then
larger sample sizes will be needed. In addition, if more than one variable is being studied
at the same time, then the rule of thumb is to have a total of at least 10 cases per
variable.
If the research is to be a controlled experiment, then smaller sample sizes can be used.
However, it is recommended to use samples of no smaller than 30 for each group in the
experiment (e.g., experimental and control groups). Many common statistics are based
on sample sizes of a minimum of 30; for sample sizes of less than 30, other special
statistics must be used.
Sample Quality
Sampling error arises from two principal sources: random error, and non-random error.
Random error results from taking a sample from a population, instead of measuring the
entire population. It is predictable, using probability theory. It is the reason that sample
statistics only provide estimates of population parameters, but the amount of random
error is known.
Non-random error results from bias being introduced into the sample from some flaw in
the design or implementation of the sample. For example, using a telephone book as the
sampling frame for all the residents of a city will result in some bias, because some
people are not listed in the directory or do not have telephones. People who refuse to
take part in a study (which is their right) also may introduce bias into the sample. Some
people may provide erroneous information, which also biases the results. Finally,
mistakes in computing the required sample size, in identifying the actual units to be
included in the sample, or other errors can introduce bias into the sample.
Adequate Sample?
To assess whether an adequate sample was used in a piece of research, ask the
following questions:
Size--was the size adequate for the purpose of the study, especially if there were many
sub-groups included in the analysis, or many variables used simultaneously?
Once the scholar has all the information, the following formula can be used to calculate
the minimum sample size:
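$$ n = p\% \times q\% \times \left( \frac{z}{e\%} \right)^2 $$
Where
n is the minimum sample size required
p% is the percentage of the population belonging to the specified category
q% is the percentage not belonging to the specified category (100 - p%)
z is the z value corresponding to the level of confidence required (see table)
e% is the margin of error required
Table
Level of confidence   z value
90 % certain          1.65
95 % certain          1.96
99 % certain          2.57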
Where your population is less than 10000, a smaller sample size can be used without
affecting the accuracy. This is called the adjusted minimum sample size. It is calculated
using the following formula
$$ n' = \frac{n}{1 + \frac{n}{N}} $$
Where
n' is the adjusted minimum sample size
n is the minimum sample size (as calculated above)
N is the total population
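Reading the two formulas above as n = p% x q% x (z / e%) squared, and adjusted size = n / (1 + n / N), the calculation can be sketched as follows. The function names and the population of 2,000 are assumptions made only for this illustration.

```python
def minimum_sample_size(p_percent, z, e_percent):
    """n = p% x q% x (z / e%)^2, with q% = 100 - p%."""
    q_percent = 100 - p_percent
    return p_percent * q_percent * (z / e_percent) ** 2

def adjusted_sample_size(n, population_size):
    """Adjusted minimum sample size for a small population: n / (1 + n / N)."""
    return n / (1 + n / population_size)

# 50/50 split, 95% certain (z = 1.96), 5% margin of error.
n = minimum_sample_size(50, 1.96, 5)
# Hypothetical total population of 2,000.
n_adjusted = adjusted_sample_size(n, 2000)

print(round(n, 2))
print(round(n_adjusted, 2))
```

The adjusted figure is always smaller than the unadjusted one, and the difference shrinks as the total population grows.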
Chapter XIII
Designing a Questionnaire
DESIGNING A QUESTIONNAIRE
fulcrum of research
Introduction
This is the information age. More information has been published in the last decade than in all
previous history. Everyone uses information to make decisions about the future. If our
information is accurate, we have a high probability of making a good decision. If our information
is inaccurate, our ability to make a correct decision is diminished. Better information usually
leads to better decisions. The most widely used instrument for collecting information is the questionnaire. Ask
yourself, why should I use a questionnaire? It is worth being self-reflective when beginning to
construct your own questionnaire, by writing down your reasons for choosing such a research
instrument rather than another (say interviews or observation), for inventing your own rather
than using one already available in the literature, and for posing the sorts of questions you want
to use. Such notes may be useful when you come to write the ‘methods’ chapter/section of your
research report. The fundamental question that must then be asked is, what are you trying to
find out? Every questionnaire must have a purpose, i.e., it must draw from some underlying
hypotheses about what are the important facts or opinions and even make some predictions
about which facts may be relevant in explaining the opinions expressed.
Questionnaire Design
Perhaps the most important stage of the survey process is the creation of questions that
accurately measure the opinions, experiences and behaviors of the public. Accurate random
sampling and high response rates will be wasted if the information gathered is built on a shaky
foundation of ambiguous or biased questions. Creating good measures involves both writing
good questions and organizing them to form the questionnaire.
Questionnaire design is a multiple-stage process that requires attention to many details at the
same time. Designing the questionnaire is a complicated process because surveys can ask
about topics in varying degrees of detail, questions can be asked in different ways, and
questions asked earlier in a survey may influence how people respond to later questions.
Researchers are also often interested in measuring change over time and therefore must be
attentive to how opinions or behaviors have been measured in prior surveys. Surveyors may
conduct pilot tests or focus groups in the early stages of questionnaire development in order to
better understand how people think about an issue or comprehend a question. Finally,
pretesting a survey to evaluate how people respond to the overall questionnaire and specific
questions is an essential step in the questionnaire design process.
Principles of Wording
The wording of a question is extremely important. Researchers strive for objectivity in surveys
and, therefore, must be careful not to lead the respondent into giving a desired answer.
Unfortunately, the effects of question wording are one of the least understood areas of
questionnaire research.
Many investigators have confirmed that slight changes in the way questions are worded can
have a significant impact on how people respond (Arndt and Crane, 1975; Belkin and
Lieverman, 1967; Cantril, 1944; Kalton, Collins, and Brook, 1978; Petty, Rennier and Cacioppo,
198; Rasinski, 1989; Schuman and Presser, 1977, 1981). Several authors have reported that
minor changes in question wording can produce more than a 25 percent difference in people's
opinions (Payne, 1951; Rasinski, 1989).
One important area of question wording is the effect of the interrogation and assertion question
formats. The interrogation format asks a question directly, where the assertion format asks
subjects to indicate their level of agreement or disagreement with a statement. Schuman and
Presser (1981) reported no significant differences between the two formats; however, other
researchers hypothesized that the interrogation format is more likely to encourage subjects to
think about their answers (Burnkrant and Howard, 1984; Petty, Cacioppo, and Heesacker, 1981;
Swasy and Munch, 1985; Zillman, 1972). Petty, Rennier and Cacioppo (1987) found that the
interrogation format caused greater polarization in subjects' responses, suggesting that it
prompted greater cognition than the assertion format.
Other investigators have looked at the effects of modifying adjectives and adverbs (Bradburn
and Miles, 1979; Hoyt, 1972; Schaeffer, 1991). Words
like usually, often, sometimes, occasionally, seldom, and rarely are "commonly" used in
questionnaires, although it is clear that they do not mean the same thing to all people. Simpson
(1944), and a replication by Hakel (1968), looked at twenty modifying adjectives and adverbs.
These researchers found that the precise meanings of these words varied widely between
subjects, and between the two studies. However, the correlation between the two studies with
respect to the relative ranking of the words was .99. Some adjectives have high variability and
others have low variability. The following adjectives have highly variable meanings and should
be avoided in surveys: a clear mandate, most, numerous, a substantial majority, a minority of, a
large proportion of, a significant number of, many, a considerable number of, and several. Other
adjectives produce less variability and generally have more shared meaning. These are: lots,
almost all, virtually all, nearly all, a majority of, a consensus of, a small number of, not very
many of, almost none, hardly any, a couple, and a few.
There are good and bad questions. The qualities of a good question are as follows:
1. Evokes the truth. Questions must be non-threatening. When a respondent is concerned about the
consequences of answering a question in a particular manner, there is a good possibility that the answer
will not be truthful. Anonymous questionnaires that contain no identifying information are more likely to
produce honest responses than those identifying the respondent. If your questionnaire does contain
sensitive items, be sure to clearly state your policy on confidentiality.
2. Asks for an answer on only one dimension. The purpose of a survey is to find out information. A
question that asks for a response on more than one dimension will not provide the information you are
seeking. For example, a researcher investigating a new food snack asks "Do you like the texture and
flavor of the snack?" If a respondent answers "no", then the researcher will not know if the respondent
dislikes the texture or the flavor, or both. Another questionnaire asks, "Were you satisfied with the quality
of our food and service?" Again, if the respondent answers "no", there is no way to know whether the
quality of the food, service, or both were unsatisfactory. A good question asks for only one "bit" of
information.
3. Can accommodate all possible answers. Multiple choice items are the most popular type of survey
questions because they are generally the easiest for a respondent to answer and the easiest to analyze.
Asking a question that does not accommodate all possible responses can confuse and frustrate the
respondent.
4. Has mutually exclusive options. A good question leaves no ambiguity in the mind of the respondent.
There should be only one correct or appropriate choice for the respondent to make.
5. Produces variability of responses. When a question produces no variability in responses, we are left
with considerable uncertainty about why we asked the question and what we learned from the
information. If a question does not produce variability in responses, it will not be possible to perform any
statistical analyses on the item.
6. Follows comfortably from the previous question. Writing a questionnaire is similar to writing anything
else. Transitions between questions should be smooth. Grouping questions that are similar will make the
questionnaire easier to complete, and the respondent will feel more comfortable. Questionnaires that
jump from one unrelated topic to another feel disjointed and are not likely to produce high response rates.
7. Does not presuppose a certain state of affairs. Among the most subtle mistakes in questionnaire
design are questions that make an unwarranted assumption.
8. Does not imply a desired answer. The wording of a question is extremely important. We are striving for
objectivity in our surveys and, therefore, must be careful not to lead the respondent into giving the answer
we would like to receive. Leading questions are usually easily spotted because they use negative
phraseology.
9. Does not use emotionally loaded or vaguely defined words. This is one of the areas overlooked by both
beginners and experienced researchers. Quantifying adjectives (e.g., most, least, majority) are frequently
used in questions. It is important to understand that these adjectives mean different things to different
people.
10. Does not use unfamiliar words or abbreviations. Remember who your audience is and write your
questionnaire for them. Do not use uncommon words or compound sentences. Write short sentences.
Abbreviations are okay if you are absolutely certain that every single respondent will understand their
meanings. If there is any doubt at all, do not use the abbreviation.
11. Is not dependent on responses to previous questions. Branching in written questionnaires should be
avoided. While branching can be used as an effective probing technique in telephone and face-to-face
interviews, it should not be used in written questionnaires because it sometimes confuses respondents.
12. Does not ask the respondent to order or rank a series of more than five items. Questions asking
respondents to rank items by importance should be avoided. This becomes increasingly difficult as the
number of items increases, and the answers become less reliable. This becomes especially problematic
when asking respondents to assign a percentage to a series of items. In order to successfully complete
this task, the respondent must mentally continue to re-adjust his answers until they total one hundred
percent. Limiting the number of items to five will make it easier for the respondent to answer.
Question Hierarchy
Each question should follow comfortably from the previous question. Writing a questionnaire is
similar to writing anything else. Transitions between questions should be smooth.
Questionnaires that jump from one unrelated topic to another feel disjointed and are not likely to
produce high response rates.
Most investigators have found that the order in which questions are presented can affect the
way that people respond. One study reported that questions in the latter half of a questionnaire
were more likely to be omitted, and contained fewer extreme responses. Some researchers
have suggested that it may be necessary to present general questions before specific ones in
order to avoid response contamination. Other researchers have reported that when specific
questions were asked before general
questions, respondents tended to exhibit greater interest in the general questions. It is not clear
whether or not question-order affects response. A few researchers have reported that question-
order does not affect responses, while others have reported that it does. Generally, it is believed
that question-order effects exist in interviews, but not in written surveys.
Types of Questionnaire
There are two types of questionnaires: structured and unstructured. The design of a
questionnaire differs according to how it is administered, in particular the amount of contact
the researcher has with respondents.
Structured questionnaires contain concrete, definite and predetermined questions. Additional questions may be
thought of and asked only when some clarification is needed or additional information is
sought from the respondents. Answers to these questions are usually very precise without any
vagueness and ambiguity. The structured questionnaire is divided into two categories:
[a] Closed-ended questionnaires: Questions are set in such a manner that leaves only a few alternative
answers. For example, yes or no, with a limited number of answers for a respondent to choose from.
[b] Open-ended questionnaires: Respondents have the choice of using their own style, diction, expression
of language, length and perception. The respondents are not restricted in their replies to the question and
their answers may be free and spontaneous. Though ample freedom is available to the respondents, it
creates problems of proper classifications, tabulation and analysis.
In a pictorial questionnaire, alternative answers in the form of pictures are given and the
respondents are required to tick the picture concerned to indicate their selection. This type of
questionnaire is useful for illiterate and less knowledgeable respondents.
Generally, it has been observed that long questionnaires get less response than short
questionnaires. However, some studies have shown that the length of a questionnaire does not
necessarily affect response (Berdie, 1973; Champion and Sear, 1979; Childers and Ferrell,
1979; Duncan, 1979; Layne and Thompson, 1981; Mason, Dressel, and Bain, 1961). "Seemingly
more important than length is question content." (Berdie, Anderson, and Niebuhr, 1986, p. 53) A
subject is more likely to respond if they are involved and interested in the research topic (Bauer,
1947; Brown and Wilkins, 1978; Reid, 1942; Speer and Zold, 1971). Questions should be
meaningful and interesting to the respondent. Finally, simple, short questions are preferable to
long ones. As a rule of thumb, a question or a statement in the questionnaire should not exceed
20 words, or exceed one full line in print.
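The 20-word rule of thumb is easy to check mechanically when a draft questionnaire is held as text. The following is a small illustrative helper, not an established tool; the function name and the two draft items are invented for the example.

```python
def overlong_items(items, max_words=20):
    """Return (word_count, item) pairs for items longer than max_words."""
    return [(len(item.split()), item) for item in items if len(item.split()) > max_words]

# Two invented draft items: one short, one far too long.
draft_items = [
    "How satisfied are you with the library opening hours?",
    ("Thinking about all of the occasions on which you have visited the university "
     "library during the current academic year, how satisfied or dissatisfied were you "
     "with its opening hours?"),
]

flagged = overlong_items(draft_items)
for word_count, item in flagged:
    print(word_count, "words:", item)
```

A word count is only a first screen; a flagged item still has to be reworded by hand and pretested.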
The physical appearance of a written survey may largely determine if the respondent will return
it (Levine and Gordon, 1958). Therefore, it is important to use professional production methods
for the questionnaire--either desktop publishing or typesetting and keylining (Robinson and
Agisim, 1951; Robinson, 1952; Sletto, 1940; Toops, 1937). Every questionnaire should have a
title that is short and meaningful to the respondent (Berdie, Anderson, and Niebuhr, 1986). The
rationale is that a questionnaire with a title will be perceived as more credible than one without
a title.
Well-designed questionnaires include clear and concise instructions on how they should be
completed. These must be very easy to understand, so use short sentences and basic
vocabulary. The questionnaire itself should have the return address printed on it since
questionnaires often get separated from the reply envelopes (Berdie, Anderson, and Niebuhr,
1986).
Questionnaires should use simple and direct language (Norton, 1930). The questions must be
clearly understood by the respondent, and have the same meaning that the researcher intended
(Freed, 1964; Huffman, 1948). The wording of a question should be simple, to the point, and
familiar to the target population (Freed, 1964; Moser and Kalton, 1971). Surprisingly, several
researchers (Blair et al., 1977; Laurent, 1972) have found that longer questions elicit more
information than shorter ones, and that the information tends to be more accurate. However, it is
generally accepted that questionnaire items should be simply stated and as brief as possible
(Payne, 1951). The rationale is that this will reduce misunderstandings and make the
questionnaire appear easier to complete. One way to eliminate misunderstandings is to
emphasize crucial words in each item by using bold, italics or underlining (Berdie, Anderson,
and Niebuhr, 1986).
Uncommon words, jargon, and abbreviations may be included in a questionnaire provided that
they are familiar to the population being investigated (Bartholomew, 1963). Slang is often
ambiguous, and should be excluded from all questionnaires (Payne, 1951). Questionnaires
should leave adequate space for respondents to make comments. One criticism of
questionnaires is their inability to retain the "flavor" of a response. Leaving space for comments
will provide valuable information not captured by the response categories. Leaving white space
also makes the questionnaire look easier and this might increase response (Berdie, Anderson,
and Niebuhr, 1986).
Researchers should design the questionnaire so it holds the respondent's interest. The goal is
to make the respondent want to complete the questionnaire. One way to keep a questionnaire
interesting is to provide variety in the type of items used. Varying the questioning format will also
prevent respondents from falling into "response sets". If a questionnaire is more than a few
pages and is held together by a staple, include some identifying data on each page (such as a
respondent ID number). Pages often accidentally separate (Berdie, Anderson, and Niebuhr,
1986).
people. Nearly everyone has had some experience completing questionnaires and they
generally do not make people apprehensive.
Questionnaires reduce bias. There is uniform question presentation and no middle-man bias.
The researcher's own opinions will not influence the respondent to answer questions in a certain
manner. There are no verbal or visual clues to influence the respondent. Questionnaires are
less intrusive than telephone or face-to-face surveys. When a respondent receives a
questionnaire in the mail, he is free to complete the questionnaire on his own time-table. Unlike
other research methods, the respondent is not interrupted by the research instrument.
One major disadvantage of written questionnaires is the possibility of low response rates. Low
response is the curse of statistical analysis. It can dramatically lower our confidence in the
results. Response rates vary widely from one questionnaire to another (10% - 90%); however,
well designed studies consistently produce high response rates. Another disadvantage of
questionnaires is the inability to probe responses.
Questionnaires are structured instruments. They allow little flexibility to the respondent with
respect to response format. In essence, they often lose the "flavor of the response" (i.e.,
respondents often want to qualify their answers). By allowing frequent space for comments, the
researcher can partially overcome this disadvantage. Comments are among the most helpful of
all the information on the questionnaire, and they usually provide insightful information that
would have otherwise been lost.
Nearly ninety percent of all communication is visual. Gestures and other visual cues are not
available with written questionnaires. The lack of personal contact will have different effects
depending on the type of information being requested. A questionnaire requesting factual
information will probably not be affected by the lack of personal contact. A questionnaire
probing sensitive issues or attitudes may be severely affected.
When returned questionnaires arrive in the mail, it's natural to assume that the respondent is
the same person you sent the questionnaire to. This may not actually be the case. Many times
business questionnaires get handed to other employees for completion. Housewives sometimes
respond for their husbands. Kids respond as a prank. For a variety of reasons, the respondent
may not be who you think it is. It is a confounding error inherent in questionnaires.
Finally, questionnaires are simply not suited for some people. For example, a written survey
to a group
of poorly educated people might not work because of reading skill problems. More frequently,
people are turned off by written questionnaires because of misuse.
An anonymous study is one in which nobody (not even the researcher) can identify who
provided data. It is difficult to conduct an anonymous questionnaire through the mail because of
the need to follow-up on non-responders. The only way to do a follow-up is to mail another
survey or reminder postcard to the entire sample. However, it is possible to guarantee
confidentiality, where those conducting the study promise not to reveal the information to
anyone. For the purpose of follow-up, identifying numbers on questionnaires are generally
preferred to using respondents' names. It is important, however, to explain why the number is
there and what it will be used for.
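The follow-up procedure described above, in which each questionnaire carries an identifying number instead of a name, can be sketched as a simple set difference. This is an illustrative Python sketch only; the function name `non_responders` and the ID values are hypothetical.

```python
def non_responders(mailed_ids, returned_ids):
    """Return the identifying numbers that were mailed out but have not come back."""
    return sorted(set(mailed_ids) - set(returned_ids))


# Hypothetical identifying numbers printed on the questionnaires.
mailed = [101, 102, 103, 104, 105, 106]
returned = [102, 105, 106]

# Only these respondents need a reminder postcard or a second copy.
print(non_responders(mailed, returned))
```

Keeping the list that links numbers to addresses separate from the completed questionnaires is one way to honour the promise of confidentiality while still allowing targeted follow-up.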
Some studies have shown that response rate is affected by the anonymity/confidentiality policy
of a study (Jones, 1979; Dickson et al., 1977; Epperson and Peck, 1977). Klein, Maher, and
Dunnington (1967) reported that responses became more distorted when subjects felt
threatened that their identities would become known. Others have found that
anonymity/confidentiality issues do not affect response rates or responses (Butler, 1973; Fuller,
1974; Futrell and Swan, 1977; Skinner and Childers, 1980; Watkins, 1978; Wildman, 1977).
One researcher reported that the lack of anonymity actually increased response (Fuller, 1974).
Pre-notification Letters
Many researchers have studied pre-notification letters to determine if they increase response
rate. A meta-analysis of these studies revealed an aggregate increase in response rate of 7.7
percent. Pre-notification letters might help to establish the legitimacy of a survey, thereby
contributing to a respondent's trust. Another possibility is that a pre-notification letter builds
expectation and reduces the possibility that a potential respondent might disregard the survey
when it arrives. Pre-letters are seldom used in marketing research surveys. They are an
excellent (but expensive) way to increase response. The researcher needs to weigh the
additional cost of sending out a pre-letter against the probability of a lower response rate. When
sample sizes are small, every response really counts and a pre-letter is highly recommended.
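The trade-off between the cost of a pre-letter and the gain in response can be made concrete with a small calculation. The sketch below is in Python and rests on stated assumptions: the sample size, the base response rate and both costs are invented for illustration, and the 7.7 percent figure is read as an increase of 7.7 percentage points, which the text does not specify.

```python
def expected_responses(sample_size, response_rate):
    """Expected number of returned questionnaires."""
    return sample_size * response_rate


def cost_per_response(sample_size, response_rate, cost_per_mailing):
    """Total mailing cost divided by the expected number of responses."""
    return (sample_size * cost_per_mailing) / expected_responses(sample_size, response_rate)


# All figures below are hypothetical, chosen only for illustration.
SAMPLE_SIZE = 200
BASE_RATE = 0.30            # assumed response rate without a pre-letter
UPLIFT = 0.077              # the 7.7 percent read as percentage points (assumption)
QUESTIONNAIRE_COST = 20.0   # assumed cost of mailing one questionnaire, in rupees
PRELETTER_COST = 5.0        # assumed extra cost of one pre-letter, in rupees

without_letter = cost_per_response(SAMPLE_SIZE, BASE_RATE, QUESTIONNAIRE_COST)
with_letter = cost_per_response(
    SAMPLE_SIZE, BASE_RATE + UPLIFT, QUESTIONNAIRE_COST + PRELETTER_COST
)

print(f"Expected responses: {expected_responses(SAMPLE_SIZE, BASE_RATE):.0f} "
      f"without a pre-letter, {expected_responses(SAMPLE_SIZE, BASE_RATE + UPLIFT):.0f} with one")
print(f"Cost per response: {without_letter:.2f} without, {with_letter:.2f} with")
```

With other cost assumptions the comparison can go either way, which is why the decision has to be weighed study by study.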
Cover Letters
The cover letter is an essential part of the survey. To a large degree, the cover letter will affect
whether or not the respondent completes the questionnaire. It is important to maintain a friendly
tone and keep it as short as possible. The importance of the cover letter should not be
underestimated. It provides an opportunity to persuade the respondent to complete the survey.
If the questionnaire can be completed in less than five minutes, the response rate can be
increased by mentioning this in the cover letter. Flattering the respondent in the cover letter
does not seem to affect response. Altruism or an appeal to the social utility of a study has
occasionally been found to increase response, but more often, it is not an effective motivator.
The signature of the person signing the cover letter has been investigated by several
researchers. Ethnic sounding names and the status of the researcher (professor or graduate
student) do not affect response (Friedman and Goldstein, 1975; Horowitz and Sedlacek, 1974).
One investigator found that a cover letter signed by the owner of a marina produced better
response than one signed by the sales manager (Labrecque, 1978). The literature is mixed
regarding whether a hand-written signature works better than one that is mimeographed. Two
researchers (Blumenfeld, 1973; Kawash and Aleamoni, 1971) reported that mimeographed
signatures worked as well as hand-written ones, while another reported that hand-written
signatures produced better response (Reeder, 1960). Another investigator (Smith, 1977) found
that cover letters signed with green ink increased response by over 10 percent.
Even after the researcher has proceeded along the lines suggested, the draft questionnaire is a
product evolved by one or two minds only. Until it has actually been used in interviews and with
respondents, it is impossible to say whether it is going to achieve the desired results. For this
reason it is necessary to pre-test the questionnaire before it is used in a full-scale survey, to
identify any mistakes that need correcting.
The pre-test should establish:
· whether the questions as they are worded will achieve the desired results
· whether additional or specifying questions are needed or whether some questions should be eliminated
Usually a small number of respondents are selected for the pre-test. The respondents selected
for the pilot survey should be broadly representative of the type of respondent to be interviewed
in the main survey.
If the questionnaire has been subjected to a thorough pilot test, the questions
and the questionnaire will have evolved into their final form. All that remains to be done is the
mechanical process of laying out and setting up the questionnaire in its final form. This will
involve grouping and sequencing questions into an appropriate order, numbering questions, and
inserting interviewer instructions.
Response Rate
A common criticism of mail surveys is that they often have low response rates (Benson, 1946;
Phillips, 1941; Robinson, 1952). Low response is the curse of statistical analysis, and it can
dramatically lower confidence in the results. While response rates vary widely from one
questionnaire to another, well-designed studies consistently produce high response rates.
When returned questionnaires arrive in the mail, it's natural to assume that the respondent is
the same person you sent the questionnaire to. A number of researchers have reported that this
may not actually be the case (Clausen and Ford, 1947; Franzen and Lazarsfeld, 1945; Moser
and Kalton, 1971; Scott, 1961). Many times business questionnaires get handed to other
employees for completion. Housewives sometimes respond for their husbands. Kids respond as
a prank. For a variety of reasons, the respondent may not be who you think it is. In a summary
of five studies sponsored by the British Government, Scott (1961) reports that up to ten percent
of the returned questionnaires had been completed by someone other than the intended person.
Conclusion
While collecting primary data, selecting the right tools for collecting data is of the utmost
importance. For that, the researcher must have a clear understanding of the context in which
different tools are used. Not only is it important to address issues of wording and measurement
in questionnaire design, but it is also necessary to pay attention to how the questionnaire looks.
An attractive and neat questionnaire with an appropriate introduction and a well-arrayed set of
questions and response alternatives will make it easier for the respondents to answer them.
The questionnaire is especially useful and economical in situations where the geographical
dispersal of respondents is wide. Questionnaires also isolate respondents from external
influence. The respondents are totally free to express their views according to their knowledge,
views and attitudes in an unbiased manner. Data obtained without external influence is more
valid and reliable.
By asking easy, non-threatening questions at the beginning of the questionnaire, you will put the
respondent at ease, establish interest and build rapport.
02] BE BRIEF
Long, complex questions can confuse respondents and produce inaccurate results. Generally,
the more words to a question, the more likely that the wording itself will influence the response.
Try breaking up a long question into two shorter ones.
Word questions that everyone can understand easily. For example, some people have trouble
understanding double negatives ("Are you against not requiring tests?"). Be careful, however, not
to talk down to people.
04] DO NOT ASSUME KNOWLEDGE
If you were to ask the question, "Do you approve or disapprove of changing from letter grades
to a portfolio system of assessment?" some (or perhaps many) people will answer without really
understanding what that means. It doesn't help to give examples because people will probably
respond only to the examples you give them. It is better to ask about attitudes only toward
specific and clearly identified proposals.
In many cases, two alternatives cannot adequately measure the range of opinion on a subject.
Instead of asking "Are you satisfied with the food in the school cafeteria?" ask "How satisfied
are you with the food in the school cafeteria -- very satisfied, fairly satisfied, not too satisfied or
not satisfied at all?" This makes it easier for respondents to answer and allows you to measure
the various gradations of opinion. You may sometimes want to use a scale to measure intensity
of opinion: "With a +3 being the highest ranking and -3 the lowest, how would you rate the
following...?"
When you design your answer categories, whether in words or in numbers, there is no fixed rule
about whether you should allow people to choose from among four (forcing them to choose
whether they are more positive or more negative), or whether you should give them five choices
(providing them with neutral ground). Think about how you will use the results, and then provide
an even or odd number of choices, depending on those uses.
07] PLACE SENSITIVE QUESTIONS, SUCH AS ASKING ABOUT ACADEMIC ACHIEVEMENT OR FAMILY INCOME AT THE
END OF THE QUESTIONNAIRE.
Some people are uncomfortable being asked about their family income or how well they are
doing in school and other personal questions. By saving these questions until last, you have a
better chance of getting answers since the respondent feels more comfortable after answering
other questions. It is also helpful to explain that this information is asked for statistical purposes
only. This information is useful 1) in describing and understanding the characteristics of your
sample; 2) in examining responses by background categories such as gender or age; 3) in
determining whether those who completed the survey are representative of the sample as a
whole.
09] USE "FILTER" QUESTIONS TO SEPARATE INFORMED FROM UNINFORMED OPINIONS ON COMPLEX SUBJECTS
Ask a question like "Have you heard or read about Plan X?" and ask follow-up questions only to
those who answer "Yes."
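In a computer-administered questionnaire, a filter question of this kind becomes a simple piece of skip logic. The following Python sketch is illustrative only; the function name `questions_to_ask` and the follow-up wordings are hypothetical.

```python
def questions_to_ask(filter_answer, follow_ups):
    """Return the follow-up questions only for respondents who answered "Yes"."""
    return list(follow_ups) if filter_answer == "Yes" else []


# Hypothetical follow-up questions about "Plan X".
plan_x_follow_ups = [
    "Do you approve or disapprove of Plan X?",
    "How strongly do you hold that view?",
]

print(questions_to_ask("Yes", plan_x_follow_ups))  # informed respondent: both follow-ups
print(questions_to_ask("No", plan_x_follow_ups))   # uninformed respondent: skipped
```

On a paper questionnaire the same logic is expressed as an instruction such as "If No, go to Question 12".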
10] TRY NOT TO INCLUDE MANY "OPEN-ENDED" QUESTIONS
It's always tempting to ask "open-ended" questions; that is, instead of including a list of
responses from which respondents have to choose, the respondent is asked to explain his or
her position. However, you should try not to include many such questions, but try to limit
yourself to one or two. There are several reasons for this advice:
First, including too many such questions will seriously lower your response rate. Answering
open-ended questions requires more time and thought than selecting answers from a pre-
existing list of alternatives. With many open-ended questions, more people will decide that it is
too much trouble to complete the interview.
If your questionnaire is self-administered, including too many open-ended questions will change
the mix of people who complete the questionnaire. For example, you discourage those with poor
verbal skills, and you may encourage those with more free time.
Finally, processing the answers from open-ended questions is very time-consuming. Including
too many such questions transforms your survey from an interesting project into a major
enterprise.
A leading question suggests an answer. For example, you might want to ask: "In order to
improve the quality of education, should teachers be paid higher salaries?" However, this
question presents a widely accepted goal (improving the quality of education) accompanied by
the assumption that the means suggested (raising teachers' salaries) will accomplish the goal --
thus influencing the respondent to answer "Yes."
A "double-barrelled" question contains two or more distinct questions but allows only one
answer. For example, if you ask, "Should the school reduce paperwork required of teachers by
hiring more administrators?" you don't know whether a "Yes" answer means the respondent
favors reducing paperwork or hiring more administrators or both.
A question containing ambiguous terms can be easily misunderstood and misinterpreted. For
example, if you ask, "Do you think equipment safety could be improved?" you will not know
whether people interpret this to mean reduction of damage in transit and workshop or that more
care should be taken in operating the equipment in a factory.
When a question contains a value-laden term, respondents may answer emotionally, without
regard for the context in which the term is used. If you ask people's attitudes toward a specific
government social program and characterize it as "liberal" or "conservative," people are likely to
react to their feelings about "liberal" or "conservative," and not about the program itself.
15] AVOID QUESTIONS WITH "SOCIALLY ACCEPTABLE" ANSWERS
When people are asked about their participation in generally approved activities such as
attending classes, they tend to give socially acceptable answers that may or may not be true.
You should only ask questions such as these when you understand their limitations and when
you have a control to make the answers more meaningful. For example, if you ask students
about their attendance in class, you might ask them how many regularly scheduled classes they
missed last week for reasons other than illness.
16] TELL RESPONDENTS HOW SPECIFIC THEIR ANSWERS SHOULD BE; USE RANGES WHERE APPROPRIATE
If you ask, "How long have you lived in this community?" you may get answers ranging from "all
my life" to "13 weeks." To improve efficiency, provide ranges so people can know how specific
to make their answers. For example, you might ask, "How long have you lived in this community
-- less than one year, one to five years, six to ten years, or more than ten years?"
In this set of answers, someone with an income of Rs.10,000 could choose either of two
categories, as could those with incomes of Rs.15,000 and Rs.25,000. Set up the answers so
that they are separate: for example, Rs.10,001-Rs.15,000, Rs.15,001-Rs.25,000.
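Whether a set of numeric answer categories is mutually exclusive can be verified before the questionnaire is printed. The sketch below is in Python; the function name `overlapping_pairs` is hypothetical, and because the overlapping answer set itself is not reproduced in the text, the first list of categories is an assumed reconstruction. The second list extends the corrected ranges given above.

```python
def overlapping_pairs(categories):
    """Return neighbouring (low, high) ranges that share at least one value.

    After sorting by lower bound, any overlap in the set shows up
    between two neighbouring ranges, so checking neighbours is enough.
    """
    ordered = sorted(categories)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] <= a[1]]


# Assumed example of overlapping income categories (in rupees).
overlapping = [(5000, 10000), (10000, 15000), (15000, 25000)]

# Separate categories in the style recommended in the text.
separate = [(5001, 10000), (10001, 15000), (15001, 25000)]

print(overlapping_pairs(overlapping))  # two problem pairs are reported
print(overlapping_pairs(separate))     # an empty list: no respondent fits two categories
```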
Probably the most difficult question to frame is one that gives the respondent several
alternatives. It is difficult to find mutually exclusive alternatives or to provide enough to cover an
entire range of options. You must also be careful not to word the alternatives in such a way that
it makes one alternative appear better than others.
Unlike many other professionals, survey researchers encourage plagiarism when it comes to
copying question wordings; however, they call it "replication." Indeed, the researcher considers
it a compliment when peers use his or her questions in their own studies and for very good
reason.
You will gain a number of benefits from this replication. First, if you use the exact same question
wording, then at least you do not have to worry if differing results have been caused by the
inconsistencies in the wordings of questions.
Secondly, reusing questions means you may be able to compare the opinions of your group
with another group -- a previous survey in the same community or perhaps a national group for
additional insight. So, by all means, build upon the efforts of others who have asked similar
questions.
The physical appearance of a questionnaire can have a significant effect upon both the quantity
and quality of marketing data obtained. The quantity of data is a function of the response rate.
Ill-designed questionnaires can give an impression of complexity, tedium and too big a time
commitment. Data quality can also be affected by the physical appearance of the questionnaire
with unnecessarily confusing layouts making it more difficult for interviewers, or respondents in
the case of self-completion questionnaires, to complete this task accurately. Attention to just a
few basic details can have a disproportionately advantageous impact on the data obtained
through a questionnaire.
Use of booklets: The use of booklets, in place of loose or stapled sheets of paper, makes it
easier for the interviewer or respondent to progress through the document. Moreover, fewer pages
tend to get lost.
Simple, clear formats: The clarity of questionnaire presentation can also help to improve the
ease with which interviewers or respondents are able to complete a questionnaire.
Creative use of space and typeface: In their anxiety to reduce the number of pages of a
questionnaire, researchers tend to put too much information on a page. This is
counter-productive since it gives the questionnaire the appearance of being complicated.
Questionnaires that make use of blank space appear easier to use, enjoy higher response rates
and contain fewer errors when completed.
Use of colour coding: Colour coding can help in the administration of questionnaires. It is
often the case that several types of respondents are included within a single survey (e.g.
wholesalers and retailers). Printing the questionnaires on two different colours of paper can
make the handling easier.
Interviewer instructions: Interviewer instructions should be placed alongside the questions to
which they pertain. Instructions on where the interviewers should probe for more information or
how replies should be recorded are placed after the question.
Chapter XIV
DATA COLLECTION
Introduction
Regardless of the field of study or preference for defining data (quantitative, qualitative),
accurate data collection is essential to maintaining the integrity of research. Both the
selection of appropriate data collection instruments (existing, modified, or newly
developed) and clearly delineated instructions for their correct use reduce the likelihood
of errors occurring.
While the degree of impact from faulty data collection may vary by discipline and the
nature of investigation, there is the potential to cause disproportionate harm when these
research results are used to support public policy recommendations.
Issues related to maintaining integrity of data collection:
The primary rationale for preserving data integrity is to support the detection of errors in
the data collection process, whether they are made intentionally (deliberate falsifications)
or not (systematic or random errors).
Most, Craddick, Crawford, Redican, Rhodes, Rukenbrod, and Laws (2003) describe
‘quality assurance’ and ‘quality control’ as two approaches that can preserve data
integrity and ensure the scientific validity of study results. Each approach is implemented
at different points in the research timeline (Whitney, Lind, and Wahl, 1998):
1. Quality assurance - activities that take place before data collection begins
2. Quality control - activities that take place during and after data collection
Quality Assurance
Since quality assurance precedes data collection, its main focus is 'prevention' (i.e.,
forestalling problems with data collection). Prevention is the most cost-effective activity
to ensure the integrity of data collection. This proactive measure is best demonstrated by
the standardization of protocol developed in a comprehensive and detailed procedures
manual for data collection. Poorly written manuals increase the risk of failing to identify
problems and errors early in the research endeavor. These failures may be
demonstrated in a number of ways:
· Uncertainty about the timing, methods, and identity of person(s) responsible for reviewing
data
· Partial listing of items to be collected
· Vague description of data collection instruments to be used in lieu of rigorous step-by-step
instructions on administering tests
· Failure to identify specific content and strategies for training or retraining staff members
responsible for data collection
· Obscure instructions for using, making adjustments to, and calibrating data collection
equipment (if appropriate)
· No identified mechanism to document changes in procedures that may evolve over the
course of the investigation
Quality Control
While quality control activities (detection/monitoring and action) occur during and after
data collection, the details should be carefully documented in the procedures manual. A
clearly defined communication structure is a necessary pre-condition for establishing
monitoring systems. There should not be any uncertainty about the flow of information
between principal investigators and staff members following the detection of errors in
data collection. A poorly developed communication structure encourages lax monitoring
and limits opportunities for detecting errors.
Detection or monitoring can take the form of direct staff observation during site visits,
conference calls, or regular and frequent reviews of data reports to identify
inconsistencies, extreme values or invalid codes. While site visits may not be
appropriate for all disciplines, failure to regularly audit records, whether quantitative or
qualitative, will make it difficult for investigators to verify that data collection is
proceeding according to procedures established in the manual. In addition, if the
structure of communication is not clearly delineated in the procedures manual,
transmission of any change in procedures to staff members can be compromised.
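A routine review of data reports for extreme values and invalid codes can be partly automated. The following Python sketch shows one possible form of such an audit; the field names, the set of valid codes and the plausible age range are all assumptions made for illustration and would in practice come from the procedures manual.

```python
# Assumed coding scheme: satisfaction is recorded as 1 to 4.
VALID_SATISFACTION_CODES = {1, 2, 3, 4}
# Assumed plausible age range for an adult sample.
AGE_RANGE = (18, 99)


def audit(records):
    """Return (record id, problem) pairs for invalid codes and extreme values."""
    problems = []
    for record in records:
        if record["satisfaction"] not in VALID_SATISFACTION_CODES:
            problems.append((record["id"], "invalid code"))
        if not AGE_RANGE[0] <= record["age"] <= AGE_RANGE[1]:
            problems.append((record["id"], "extreme value"))
    return problems


# Hypothetical data report with two faulty records.
data_report = [
    {"id": 1, "age": 34, "satisfaction": 2},
    {"id": 2, "age": 340, "satisfaction": 3},  # probable keying error
    {"id": 3, "age": 27, "satisfaction": 7},   # code outside the scheme
]

for record_id, problem in audit(data_report):
    print(f"Record {record_id}: {problem}")
```

A flagged record is a prompt to go back to the source document or the data collector, not a licence to alter the value.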
Quality control also identifies the required responses, or ‘actions’ necessary to correct
faulty data collection practices and also minimize future occurrences. These actions are
less likely to occur if data collection procedures are vaguely written and the necessary
steps to minimize recurrence are not implemented through feedback and education
(Knatterud et al., 1998).
In the social/behavioral sciences where primary data collection involves human subjects,
researchers are taught to incorporate one or more secondary measures that can be
used to verify the quality of information being collected from the human subject. For
example, a researcher conducting a survey might be interested in gaining a better
insight into the occurrence of risky behaviors among young adults as well as the social
conditions that increase the likelihood and frequency of these risky behaviors.
To verify data quality, respondents might be queried about the same information but
asked at different points of the survey and in a number of different ways. Measures of
‘Social Desirability’ might also be used to get a measure of the honesty of responses.
There are two points that need to be raised here: (a) cross-checks within the data
collection process and (b) data quality being as much an observation-level issue as it is
a complete data set issue. Thus, data quality should be addressed for each individual
measurement, for each individual observation, and for the entire data set.
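A cross-check of the kind described above, where the same information is asked in two different ways, can be evaluated observation by observation. The Python sketch below assumes, purely for illustration, that age is asked directly near the start of the survey and year of birth near the end; the field names, the tolerance of one year and the sample responses are hypothetical.

```python
def inconsistent_respondents(responses, survey_year, tolerance=1):
    """Return ids of respondents whose stated age and year of birth disagree."""
    flagged = []
    for response in responses:
        implied_age = survey_year - response["birth_year"]
        # A one-year difference is expected when the birthday has not yet passed.
        if abs(implied_age - response["stated_age"]) > tolerance:
            flagged.append(response["id"])
    return flagged


# Hypothetical responses to the two versions of the question.
responses = [
    {"id": "R01", "stated_age": 24, "birth_year": 2000},
    {"id": "R02", "stated_age": 31, "birth_year": 2001},  # answers disagree
    {"id": "R03", "stated_age": 19, "birth_year": 2004},
]

print(inconsistent_respondents(responses, survey_year=2024))
```

The share of flagged respondents gives a data-set-level indicator, while the individual flags identify the observations that need review.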
Each field of study has its preferred set of data collection instruments. The hallmark of
laboratory sciences is the meticulous documentation of the lab notebook while social
sciences such as sociology and cultural anthropology may prefer the use of detailed field
notes. Regardless of the discipline, comprehensive documentation of the collection
process before, during and after the activity is essential to preserving data integrity.
Quantitative and Qualitative Data collection methods
Quantitative data collection methods rely on random sampling and structured data
collection instruments that fit diverse experiences into predetermined response
categories. They produce results that are easy to summarize, compare, and generalize.
Quantitative research is concerned with testing hypotheses derived from theory and/or
being able to estimate the size of a phenomenon of interest. Depending on the research
question, participants may be randomly assigned to different treatments. If this is not
feasible, the researcher may collect data on participant and situational characteristics in
order to statistically control for their influence on the dependent, or outcome, variable. If
the intent is to generalize from the research participants to a larger population, the
researcher will employ probability sampling to select participants.
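Simple random sampling and random assignment to treatments can both be sketched with Python's standard `random` module. This is a minimal illustration rather than a complete sampling design: the sampling frame of 500 numbered units, the sample size and the two treatment labels are hypothetical, and simple random sampling is only one form of probability sampling.

```python
import random


def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)


def randomly_assign(participants, treatments, seed=None):
    """Shuffle participants, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {
        person: treatments[index % len(treatments)]
        for index, person in enumerate(shuffled)
    }


# Hypothetical sampling frame: 500 numbered units.
sampling_frame = list(range(1, 501))

# A fixed seed makes the draw reproducible and therefore auditable.
sample = simple_random_sample(sampling_frame, 20, seed=42)
groups = randomly_assign(sample, ["treatment", "control"], seed=42)

print(sorted(sample))
print(groups)
```

Recording the seed in the procedures manual allows the selection to be reproduced later if it is questioned.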
Typical quantitative data-gathering strategies include:
· Experiments/clinical trials.
· Observing and recording well-defined events (e.g., counting the number of patients
waiting in emergency at specified times of the day).
· Obtaining relevant data from management information systems.
· Administering surveys with closed-ended questions (e.g., face-to-face and telephone
interviews, questionnaires, etc.). (http://www.achrn.org/quantitative_methods.htm)
Interviews
Face-to-face interviews have the distinct advantage of enabling the researcher to
establish rapport with potential participants and therefore gain their cooperation. These
interviews yield the highest response rates in survey research. They also allow the
researcher to clarify ambiguous answers and, when appropriate, seek follow-up
information. Their disadvantages are that they are impractical when large samples are
involved, and that they are time-consuming and expensive (Leedy and Ormrod, 2001).
Telephone interviews are less time-consuming and less expensive, and the researcher
has ready access to anyone on the planet who has a telephone. The disadvantages are that
the response rate is not as high as for the face-to-face interview, although it is considerably
higher than for the mailed questionnaire, and that the sample may be biased to the extent that
people without phones are part of the population about whom the researcher wants to draw
inferences.
Questionnaires
Web based questionnaires
A new and inevitably growing methodology is the use of Internet based research. The
respondent receives an e-mail containing a link to a secure web-site where the
questionnaire can be filled in. This type of research is often quicker and less detailed.
Disadvantages of this method include the exclusion of people who do not have a computer
or are unable to access one. The validity of such surveys is also in question, as people
might be in a hurry to complete them and so might not give accurate responses.
(http://www.statcan.ca/english/edu/power/ch2/methods/methods.htm)
Questionnaires often make use of checklists and rating scales. These devices help
simplify and quantify people's behaviors and attitudes. A checklist is a list of
behaviors, characteristics, or other entities that the researcher is looking for. Either the
researcher or survey participant simply checks whether each item on the list is observed,
present or true, or vice versa. A rating scale is more useful when a behavior needs to be
evaluated on a continuum. Rating scales are also known as Likert scales (Leedy and Ormrod,
2001).
Qualitative data collection methods differ from quantitative ones in several respects:
· they tend to be open-ended and have less structured protocols (i.e., researchers may
change the data collection strategy by adding, refining, or dropping techniques or
informants)
· they rely more heavily on interactive interviews; respondents may be interviewed several
times to follow up on a particular issue, clarify concepts or check the reliability of data
· they use triangulation to increase the credibility of their findings (i.e., researchers rely on
multiple data collection methods to check the authenticity of their results)
· generally their findings are not generalizable to any specific population; rather, each case
study produces a single piece of evidence that can be used to seek general patterns
among different studies of the same issue
Sources of Data
The sources of data may be classified into (a) primary sources and (b) secondary
sources.
Primary Sources
Primary sources are original sources from which the researcher directly collects data
that have not been previously collected, e.g., collection of data directly by the researcher
on brand awareness, brand preference, brand loyalty and other aspects of consumer
behaviour from a sample of consumers by interviewing them. Primary data are first-hand
information collected through various methods such as observation, interviewing, mailing
etc.
Secondary Sources
These are sources containing data that have been collected and compiled for another
purpose. The secondary sources consist of readily available compendia and already
compiled statistical statements and reports whose data may be used by researchers for
their studies, e.g., census reports, annual reports and financial statements of companies,
statistical statements, reports of Government Departments, Annual Reports on
currency and finance published by the National Bank of Ethiopia, Statistical Statements
relating to Cooperatives, Federal Cooperative Commission, Commercial Banks and
Micro Finance Credit Institutions published by the National Bank of Ethiopia, Reports of
the National Sample Survey Organisation, Reports of trade associations, publications of
international organisations such as UNO, IMF, World Bank, ILO, WHO, etc., Trade and
Financial Journals, newspapers, etc.
Secondary sources consist of not only published records and reports, but also
unpublished records. The latter category includes various records and registers
maintained by firms and organisations, e.g., accounting and financial records, personnel
records, register of members, minutes of meetings, inventory records, etc.
Conclusion
Chapter XV
STATISTICAL TOOLS FOR RESEARCH
Introduction
Scholars frequently use statistics to analyze their results. Why do researchers use
statistics? Statistics can help us understand a phenomenon by confirming or rejecting a
hypothesis. It is vital to how we acquire knowledge and underpins most scientific theories.
Statistical calculations
When analyzing data, your goal is simple: You wish to make the strongest possible
conclusion from limited amounts of data. To do this, you need to overcome two
problems:
Statistical analyses are necessary when observed differences are small compared to
experimental imprecision and biological variability. When you work with experimental
systems with no biological variability and little experimental error, heed this aphorism:
If you need statistics to analyze your experiment, then you've done the wrong experiment.
But in many fields, scientists can't avoid large amounts of variability, yet care about
relatively small differences. Statistical methods are necessary to draw valid conclusions
from such data.
Probability
What is probability? Probability can be a complex field of mathematics, but in its simplest
definition it is the number of ways an event can occur divided by the total number of
equally likely possibilities. For example, flipping a coin has two possibilities: heads or
tails. There is only one way for a coin to land on heads, so the probability of heads is
1/2 (one in two).
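The coin example can be written out as a small calculation. The sketch below is only an illustration: the function name probability and the extra die example are assumptions added here, and the formula applies only when every outcome is equally likely.

```python
from fractions import Fraction

def probability(favourable: int, total: int) -> Fraction:
    """Classical probability: favourable outcomes / equally likely outcomes."""
    if total <= 0 or not 0 <= favourable <= total:
        raise ValueError("need 0 <= favourable <= total and total > 0")
    return Fraction(favourable, total)

p_heads = probability(1, 2)   # one way to land heads, two possible outcomes
p_six = probability(1, 6)     # hypothetical extra example: a six on a fair die

print(p_heads, float(p_heads))
print(p_six)
```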
Why Is the Study of Probability Important?
Probability and statistics are useful to us in many ways. Knowing the likelihood of an
event happening is important information in decision making that is used in nearly every
field. For example, research studies use probability in determining whether or not a new
drug is worth putting on the market. Does the effectiveness of the drug outweigh the
harm it causes to a patient's body? Probability can help answer that question.
Choosing the right test to compare measurements is a bit tricky, as you must choose
between two families of tests: parametric and nonparametric. Many statistical tests are
based upon the assumption that the data are sampled from a Gaussian distribution.
These tests are referred to as parametric tests. Commonly used parametric tests are
listed in the first column of the table and include the t test and analysis of variance.
Tests that do not make assumptions about the population distribution are referred to as
nonparametric tests. You've already learned a bit about nonparametric tests in previous
chapters. All commonly used nonparametric tests rank the outcome variable from low to
high and then analyze the ranks. These tests are listed in the second column of the table
and include the Wilcoxon, Mann-Whitney, and Kruskal-Wallis tests. These tests are
also called distribution-free tests.
Choosing between parametric and nonparametric tests is sometimes easy. You should
definitely choose a parametric test if you are sure that your data are sampled from a
population that follows a Gaussian distribution (at least approximately). You should
definitely select a nonparametric test in three situations:
• The outcome is a rank or a score and the population is clearly not Gaussian. Examples
include class ranking of students, the Apgar score for the health of newborn babies
(measured on a scale of 0 to 10 and where all scores are integers), the visual analogue
score for pain (measured on a continuous scale where 0 is no pain and 10 is unbearable
pain), and the star scale commonly used by movie and restaurant critics (* is OK, ***** is
fantastic).
• Some values are "off the scale," that is, too high or too low to measure. Even if the
population is Gaussian, it is impossible to analyze such data with a parametric test since
you don't know all of the values. Using a nonparametric test with these data is simple.
Assign values too low to measure an arbitrary very low value and assign values too high
to measure an arbitrary very high value. Then perform a nonparametric test. Since the
nonparametric test only knows about the relative ranks of the values, it won't matter that
you didn't know all the values exactly.
• You are sure that the population is not distributed in a Gaussian manner. If the data are
not sampled from a Gaussian distribution, consider whether you can transform the
values to make the distribution become Gaussian. For example, you might take the
logarithm or reciprocal of all values. There are often biological or chemical reasons (as
well as statistical ones) for performing a particular transform.
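The point about "off the scale" values can be checked directly. The following is a minimal sketch with hypothetical readings: it ranks the pooled values and computes the Mann-Whitney U statistic by hand, then shows that the statistic is the same whichever arbitrary high value stands in for the unmeasurable readings. The helper names are assumptions introduced for illustration.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """U statistic for group_a, computed from the pooled ranks."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[: len(group_a)])
    return rank_sum_a - len(group_a) * (len(group_a) + 1) / 2

control = [3.1, 4.6, 2.7, 3.6]
# Two treated readings were above the instrument's range; any value larger
# than every measured value can stand in for them.
treated_v1 = [4.4, 5.2, 1e3, 1e3]
treated_v2 = [4.4, 5.2, 1e9, 1e9]

u1 = mann_whitney_u(control, treated_v1)
u2 = mann_whitney_u(control, treated_v2)
print(u1, u2)
```

Because only the ranks enter the calculation, both runs give the same U statistic.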
It is not always easy to decide whether a sample comes from a Gaussian population.
Consider these points:
• If you collect many data points (over a hundred or so), you can look at the distribution of
data and it will be fairly obvious whether the distribution is approximately bell shaped. A
formal statistical test (Kolmogorov-Smirnov test, not explained in this book) can be used
to test whether the distribution of the data differs significantly from a Gaussian
distribution. With few data points, it is difficult to tell whether the data are Gaussian by
inspection, and the formal test has little power to discriminate between Gaussian and
non-Gaussian distributions.
• You should look at previous data as well. Remember, what matters is the distribution of
the overall population, not the distribution of your sample. In deciding whether a
population is Gaussian, look at all available data, not just data in the current experiment.
• Consider the source of scatter. When the scatter comes from the sum of numerous
sources (with no one source contributing most of the scatter), you expect to find a roughly
Gaussian distribution. When in doubt, some people choose a parametric test (because
they aren't sure the Gaussian assumption is violated), and others choose a
nonparametric test (because they aren't sure the Gaussian assumption is met).
Does it matter whether you choose a parametric or nonparametric test? The answer
depends on sample size. There are four cases to think about:
• Large sample. What happens when you use a parametric test with data from a
non-Gaussian population? The central limit theorem (discussed in Chapter 5) ensures that
parametric tests work well with large samples even if the population is non-Gaussian. In
other words, parametric tests are robust to deviations from Gaussian distributions, so
long as the samples are large. The snag is that it is impossible to say how large is large
enough, as it depends on the nature of the particular non-Gaussian distribution. Unless
the population distribution is really weird, you are probably safe choosing a parametric
test when there are at least two dozen data points in each group.
• Large sample. What happens when you use a nonparametric test with data from a
Gaussian population? Nonparametric tests work well with large samples from Gaussian
populations. The P values tend to be a bit too large, but the discrepancy is small. In other
words, nonparametric tests are only slightly less powerful than parametric tests with large
samples.
• Small samples. What happens when you use a parametric test with data from
non-Gaussian populations? You can't rely on the central limit theorem, so the P value may
be inaccurate.
• Small samples. When you use a nonparametric test with data from a Gaussian
population, the P values tend to be too high. The nonparametric tests lack statistical
power with small samples.
Thus, large data sets present no problems. It is usually easy to tell if the data come from
a Gaussian population, but it doesn't really matter because the nonparametric tests are
so powerful and the parametric tests are so robust. Small data sets present a dilemma. It
is difficult to tell if the data come from a Gaussian population, but it matters a lot. The
nonparametric tests are not powerful and the parametric tests are not robust.
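The role of sample size can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential (strongly skewed), the seed and the sample sizes 5 and 50 are arbitrary choices, and skewness is used as a rough measure of how far a distribution is from the symmetric bell shape.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed SD."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(42)                      # fixed seed so the run is repeatable
population_draws = [rng.expovariate(1.0) for _ in range(5000)]

def sample_means(n, replicates=2000):
    """Means of `replicates` samples of size n from the same skewed population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(replicates)]

skew_raw = skewness(population_draws)        # individual values: strongly skewed
skew_n5 = skewness(sample_means(5))          # means of small samples: still skewed
skew_n50 = skewness(sample_means(50))        # means of larger samples: nearly symmetric
print(round(skew_raw, 2), round(skew_n5, 2), round(skew_n50, 2))
```

The means of larger samples are far closer to symmetric than the raw values, which is why parametric tests tolerate non-Gaussian populations when samples are large but not when they are small.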
With many tests, you must choose whether you wish to calculate a one- or two-sided P
value (same as one- or two-tailed P value). Let's review the difference in the context of a
t test. The P value is calculated for the null hypothesis that the two population means are
equal, and any discrepancy between the two sample means is due to chance. If this null
hypothesis is true, the one-sided P value is the probability that two sample means would
differ as much as was observed (or further) in the direction specified by the hypothesis
just by chance, even though the means of the overall populations are actually equal. The
two-sided P value also includes the probability that the sample means would differ that
much in the opposite direction (i.e., the other group has the larger mean). The two-sided
P value is twice the one-sided P value.
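The relationship between the two P values can be seen in a short calculation. As a simplifying assumption, the sketch uses a z statistic and the standard normal distribution rather than the t distribution discussed above; the value 1.8 is a hypothetical test statistic.

```python
from statistics import NormalDist

def p_values_from_z(z: float):
    """One- and two-sided P values for a z statistic under the null hypothesis.

    The one-sided value assumes the hypothesis predicted a positive difference.
    """
    std_normal = NormalDist()                 # mean 0, SD 1
    one_sided = 1.0 - std_normal.cdf(z)       # chance of a z this large or larger
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return one_sided, two_sided

one_sided, two_sided = p_values_from_z(1.8)   # hypothetical test statistic
print(round(one_sided, 4), round(two_sided, 4))
```

With this statistic the one-sided P value falls below 0.05 while the two-sided value does not, which shows why the choice has to be made before the data are collected.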
A one-sided P value is appropriate when you can state with certainty (and before
collecting any data) that there either will be no difference between the means or that the
difference will go in a direction you can specify in advance (i.e., you have specified
which group will have the larger mean). If you cannot specify the direction of any
difference before collecting data, then a two-sided P value is more appropriate. If in
doubt, select a two-sided P value.
If you select a one-sided test, you should do so before collecting any data and you need
to state the direction of your experimental hypothesis. If the data go the other way, you
must be willing to attribute that difference (or association or correlation) to chance, no
matter how striking the data. If you would be intrigued, even a little, by data that goes in
the "wrong" direction, then you should use a two-sided P value. For reasons discussed
in Chapter 10, I recommend that you always calculate a two-sided P value.
Paired or Unpaired Test?
When comparing two groups, you need to decide whether to use a paired test. When
comparing three or more groups, the term paired is not apt and the term repeated
measures is used instead.
Use an unpaired test to compare groups when the individual values are not paired or
matched with one another. Select a paired or repeated-measures test when values
represent repeated measurements on one subject (before and after an intervention) or
measurements on matched subjects. The paired or repeated-measures tests are also
appropriate for repeated laboratory experiments run at different times, each with its own
control.
You should select a paired test when values in one group are more closely correlated
with a specific value in the other group than with random values in the other group. It is
only appropriate to select a paired test when the subjects were matched or paired before
the data were collected. You cannot base the pairing on the data you are analyzing.
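To see why pairing matters, the sketch below computes both t statistics by hand for the same hypothetical before-and-after scores. The data and function names are assumptions for illustration; the unpaired version uses the usual pooled-variance formula.

```python
import math
import statistics

def unpaired_t(a, b):
    """Student's t statistic for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    diff = statistics.fmean(b) - statistics.fmean(a)
    return diff / math.sqrt(pooled_var * (1 / na + 1 / nb))

def paired_t(before, after):
    """Paired t statistic: one-sample t on the within-pair differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical scores for the same five subjects before and after an intervention.
before = [52, 61, 70, 78, 90]
after = [55, 63, 74, 80, 94]

t_unpaired = unpaired_t(before, after)
t_paired = paired_t(before, after)
print(round(t_unpaired, 2), round(t_paired, 2))
```

The subjects differ widely from one another, so the unpaired statistic is small; the paired statistic is much larger because it looks only at the consistent change within each subject.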
When analyzing contingency tables with two rows and two columns, you can use either
Fisher's exact test or the chi-square test. Fisher's test is the best choice as it always
gives the exact P value. The chi-square test is simpler to calculate but yields only an
approximate P value. If a computer is doing the calculations, you should choose Fisher's
test unless you prefer the familiarity of the chi-square test. You should definitely avoid
the chi-square test when the numbers in the contingency table are very small (any
number less than about six). When the numbers are larger, the P values reported by the
chi-square and Fisher's test will be very similar.
The chi-square test calculates approximate P values, and the Yates' continuity correction
is designed to make the approximation better. Without the Yates' correction, the P
values are too low. However, the correction goes too far, and the resulting P value is too
high. Statisticians give different recommendations regarding Yates' correction. With large
sample sizes, the Yates' correction makes little difference. If you select Fisher's test, the
P value is exact and Yates' correction is not needed and is not available.
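The difference between the exact and approximate P values can be checked on a small table. This is a minimal sketch, not a replacement for a statistical package: the counts are hypothetical, the two-sided Fisher value is computed by summing the probabilities of all tables no more likely than the observed one (one common convention), and the chi-square P value uses the closed form for one degree of freedom.

```python
import math

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def table_prob(x):                      # x = count in the top-left cell
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    observed = table_prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(table_prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))

def chi_square_p(a, b, c, d, yates=False):
    """Approximate P value from the chi-square test with 1 degree of freedom."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)       # Yates' continuity correction
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square, 1 df

table = (3, 1, 1, 3)                        # hypothetical small counts
p_fisher = fisher_exact_two_sided(*table)
p_chi2 = chi_square_p(*table)
p_yates = chi_square_p(*table, yates=True)
print(p_fisher, p_chi2, p_yates)
```

For these small counts the uncorrected chi-square P value is well below the exact value, and the Yates-corrected value lies much closer to it.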
Regression or Correlation?
Linear regression and correlation are similar and easily confused. In some situations it
makes sense to perform both calculations. Calculate linear correlation if you measured
both X and Y in each subject and wish to quantify how well they are associated. Select
the Pearson (parametric) correlation coefficient if you can assume that both X and Y are
sampled from Gaussian populations. Otherwise choose the Spearman nonparametric
correlation coefficient. Don't calculate the correlation coefficient (or its confidence
interval) if you manipulated the X variable.
Calculate linear regressions only if one of the variables (X) is likely to precede or cause
the other variable (Y). Definitely choose linear regression if you manipulated the X
variable. It makes a big difference which variable is called X and which is called Y, as
linear regression calculations are not symmetrical with respect to X and Y. If you swap
the two variables, you will obtain a different regression line. In contrast, linear correlation
calculations are symmetrical with respect to X and Y. If you swap the labels X and Y,
you will still get the same correlation coefficient.
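The symmetry of correlation and the asymmetry of regression can be verified numerically. The sketch below uses hypothetical paired measurements and hand-written helper functions (both are assumptions for illustration). If regression were symmetrical, the slope of Y on X would equal the reciprocal of the slope of X on Y.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def regression_slope(x, y):
    """Least-squares slope of the line predicting y from x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical paired measurements of two variables in five subjects.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy, r_yx = pearson_r(x, y), pearson_r(y, x)
slope_y_on_x = regression_slope(x, y)
slope_x_on_y = regression_slope(y, x)
print(r_xy, r_yx)                        # identical: correlation is symmetrical
print(slope_y_on_x, 1 / slope_x_on_y)    # different lines: regression is not
```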
Type of Data

Goal | Measurement (from Gaussian Population) | Rank, Score, or Measurement (from Non-Gaussian Population) | Binomial (Two Possible Outcomes) | Survival Time
-----|-----|-----|-----|-----
Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan-Meier survival curve
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or Binomial test |
Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher's test (chi-square for large samples) | Log-rank test or Mantel-Haenszel
Compare two paired groups | Paired t test | Wilcoxon test | McNemar's test | Conditional proportional hazards regression
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression
Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochran's Q | Conditional proportional hazards regression
Quantify association between two variables | Pearson correlation | Spearman correlation | Contingency coefficients |
Predict value from another measured variable | Simple linear regression or Nonlinear regression | Nonparametric regression | Simple logistic regression | Cox proportional hazard regression
Predict value from several measured or binomial variables | Multiple linear regression or Multiple nonlinear regression | | Multiple logistic regression | Cox proportional hazard regression
SPSS
SPSS is the statistical package most widely used by social scientists. There are several
reasons:
1. Force of habit: SPSS has been around since the late 1960s.
2. Of the major packages, it seems to be the easiest to use for the most widely used statistical
techniques;
3. One can use it with either a Windows point-and-click approach or through syntax (i.e.,
writing out SPSS commands). Each has its own advantages, and the user can switch
between the approaches;
4. Many of the widely used social science data sets come with an easy method to translate
them into SPSS; this significantly reduces the preliminary work needed to explore new data.
At the same time, SPSS has two limitations worth noting:
1. SPSS users have less control over statistical output than, for example, Stata or Gauss
users. For novice users, this hardly causes a problem. But, once a researcher wants greater
control over the equations or the output, she or he will need to either choose another
package or learn techniques for working around SPSS’s limitations;
2. SPSS has problems with certain types of data manipulations, and it has some built in quirks
that seem to reflect its early creation. The best known limitation is its weak lag functions, that
is, how it transforms data across cases. For new users working off of standard data sets, this
is rarely a problem. But, once a researcher begins wanting to significantly alter data sets, he
or she will have to either learn a new package or develop greater skills at manipulating
SPSS.
Overall, SPSS is a good first statistical package for people wanting to perform
quantitative research in social science because it is easy to use and because it can be a
good starting point to learn more advanced statistical packages.
Conclusion
It is the scholar’s primary responsibility to identify and use the statistical tools
that suit the nature of the research study. Once these are finalized, standard
statistical software (e.g., SPSS) can take care of the analysis and further processing.
However, the scholar should have at least a preliminary knowledge of the salient
features of the software. It is not always safe to rely entirely on statisticians.
“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.”
- Aaron Levenstein
Chapter XVI
RELIABILITY AND VALIDITY
Measurement experts (and many educators) believe that every measurement device
should possess certain qualities. Perhaps the two most common technical concepts in
measurement are reliability and validity. Any kind of assessment, whether traditional or
"authentic," must be developed in a way that gives the assessor accurate information
about the performance of the individual.
A. Reliability:
Definition
• The degree of consistency between two measures of the same thing. (Mehrens and Lehman,
1987).
• The measure of how stable, dependable, trustworthy, and consistent a test is in measuring the
same thing each time (Worthen et al., 1993)
The idea behind reliability is that any significant results must be more than a one-off
finding and be inherently repeatable. Other researchers must be able to perform exactly
the same experiment, under the same conditions and generate the same results. This
will reinforce the findings and ensure that the wider scientific community will accept
the hypothesis. Without this replication of statistically significant results,
the experiment and research have not fulfilled all of the requirements of testability. This
prerequisite is essential to a hypothesis establishing itself as an accepted scientific truth.
For example, if you are performing a time critical experiment, you will be using some
type of stopwatch. Generally, it is reasonable to assume that the instruments are
reliable and will keep true and accurate time. However, diligent scientists
take measurements many times, to minimize the chances of malfunction and maintain
validity and reliability. At the other extreme, any experiment that uses human judgment is
always going to come under question.
For example, if observers rate certain aspects, like in Bandura’s Bobo Doll Experiment,
then the reliability of the test is compromised. Human judgment can vary wildly
between observers, and the same individual may rate things differently depending upon
time of day and current mood.
This means that such experiments are more difficult to repeat and are inherently less
reliable. Reliability is a necessary ingredient for determining the overall validity of a
scientific experiment and enhancing the strength of the results.
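Disagreement between observers can be quantified. The sketch below uses hypothetical codes from two observers and computes simple percent agreement together with Cohen's kappa. Kappa is not mentioned in the text above; it is included here as one common way of correcting agreement for chance.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters gave the same rating."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical codes from two observers: 1 = aggressive act seen, 0 = not seen.
observer_1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
observer_2 = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(agreement, round(kappa, 2))
```

The observers agree on most items, yet kappa is noticeably lower than the raw agreement because some of that agreement would be expected by chance alone.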
B. Validity
Definition:
• Truthfulness: Does the test measure what it purports to measure? The extent to which certain
inferences can be made from test scores or other measurements. (Mehrens and Lehman, 1987)
• The degree to which they accomplish the purpose for which they are being used. (Worthen et
al., 1993)
Validity encompasses the entire experimental concept and establishes whether the
results obtained meet all of the requirements of the scientific research method.
For example, there must have been randomization of the sample groups and appropriate
care and diligence shown in the allocation of controls. Internal validity dictates how an
experimental design is structured and encompasses all of the steps of the scientific
research method. Even if your results are great, sloppy and inconsistent design will
compromise your integrity in the eyes of the scientific community. Internal validity and
reliability are at the core of any experimental design. External validity is the process of
examining the results and questioning whether there are any other
possible causal relationships. Control groups and randomization will lessen external
validity problems but no method can be completely successful. This is why the statistical
proofs of a hypothesis are called significant, not absolute truth. Any scientific research
design only puts forward a possible cause for the studied effect. There is always the
chance that another unknown factor contributed to the results and findings. This
extraneous causal relationship may become more apparent, as techniques are refined
and honed.
Reliability & Validity
We often think of reliability and validity as separate ideas but, in fact, they're related to
each other. Here, the following example illustrates the point:
The favorite metaphor for the relationship between reliability and validity is that of the target. Think
of the center of the target as the concept that you are trying to measure. Imagine that for
each person you are measuring, you are taking a shot at the target. If you measure the
concept perfectly for a person, you are hitting the center of the target. If you don't, you
are missing the center. The more you are off for that person, the further you are from the
center.
The figure above shows four possible situations. In the first one, you are hitting the
target consistently, but you are missing the center of the target. That is, you are
consistently and systematically measuring the wrong value for all respondents. This
measure is reliable, but not valid (that is, it's consistent but wrong). The second shows
hits that are randomly spread across the target. You seldom hit the center of the target
but, on average, you are getting the right answer for the group (but not very well for
individuals). In this case, you get a valid group estimate, but you are inconsistent. Here,
you can clearly see that reliability is directly related to the variability of your measure.
The third scenario shows a case where your hits are spread across the target and you
are consistently missing the center. Your measure in this case is neither reliable nor
valid. Finally, we see the "Robin Hood" scenario -- you consistently hit the center of the
target. Your measure is both reliable and valid.
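The four situations can be expressed in numbers. In the sketch below, which uses hypothetical readings and an assumed true value of 50, the bias (average distance from the centre) stands in for validity and the spread (standard deviation) stands in for reliability.

```python
import statistics

def describe_measure(readings, true_value):
    """Return (bias, spread): mean error and SD of repeated readings."""
    bias = statistics.fmean(readings) - true_value
    spread = statistics.stdev(readings)
    return bias, spread

TRUE_VALUE = 50.0   # hypothetical centre of the target

scenarios = {
    "reliable, not valid": [58.0, 58.5, 57.5, 58.2, 57.8],  # tight cluster, off-centre
    "valid, not reliable": [38.0, 62.0, 45.0, 57.0, 48.0],  # scattered, centred on average
    "neither":             [55.0, 75.0, 48.0, 70.0, 62.0],  # scattered and off-centre
    "reliable and valid":  [50.2, 49.8, 50.1, 49.9, 50.0],  # tight cluster on the centre
}
summary = {name: describe_measure(vals, TRUE_VALUE)
           for name, vals in scenarios.items()}
for name, (bias, spread) in summary.items():
    print(f"{name:22s} bias={bias:6.2f} spread={spread:5.2f}")
```

A small spread with a large bias corresponds to the first situation, and a small spread with almost no bias to the "Robin Hood" scenario.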
Conclusion
Always remember that your ability to answer your research question is only as good as
the instruments you develop or your data collection procedure. Well-trained and
motivated observers or a well-developed survey instrument will better provide you with
quality data with which to answer a question or solve a problem. Finally, be aware that
reliability is necessary but not sufficient for validity. That is, for something to be valid it
must be reliable but it must also measure what it is intended to measure.
Chapter XVII
DATA ANALYSIS
Introduction
Before you decide what to wear in the morning, you collect a variety of data: the
season of the year, what the forecast says the weather is going to be like, which
clothes are clean and which are dirty, and what you will be doing during the day. You
then analyze that data. Perhaps you think, “It’s summer, so it’s usually warm.” That
analysis helps you determine the best course of action, and you base your apparel
decision on your interpretation of the information. You might choose a t-shirt and shorts
on a summer day when you know you’ll be outside, but bring a sweater with you if you
know you’ll be in an air-conditioned building.
Though this example may seem simplistic, it reflects the way scientists, or any
researchers for that matter, pursue data collection, analysis, and interpretation. Data
(the plural form of the word datum) are scientific observations and measurements that,
once analyzed and interpreted, can be developed into evidence to address a question.
Data lie at the heart of any research study, and all researchers collect data in one form
or another. The weather forecast that helped you decide what to wear, for example,
was an interpretation made by a meteorologist who analyzed data collected by
satellites. Data may take the form of the number of bacteria colonies growing in soup
broth, a series of drawings or photographs of the different layers of rock that form a
mountain range, a tally of lung cancer victims in populations of cigarette smokers and
non-smokers, or the changes in average annual temperature predicted by a model of
global climate. Scientific data collection involves more care than you might use in a
casual glance at the thermometer to see what you should wear. Because scientists
build on their own work and the work of others, it is important that they are systematic
and consistent in their data collection methods and make detailed records so that
others can see and use the data they collect. The thoughtful and systematic collection,
analysis, and interpretation of data allow it to be developed into evidence that supports
scientific ideas, arguments, and hypotheses.
Definition
The numerical results provided by a data analysis are usually simple: It finds the number
that describes a typical value and it finds differences among numbers. Data analysis
finds averages, like the average income or the average temperature, and it finds
differences like the difference in income from group to group or the differences in
average temperature from year to year. Fundamentally, the numerical answers provided
by data analysis are that simple. But data analysis is not about numbers — it uses them.
Data analysis is about the world, asking, always asking, “How does it work?” And that’s
where data analysis gets tricky. Carefully study the following two examples:
Example:
Between 1790 and 1990 the population of the United States increased by 245 million people,
from 4 million to 249 million people. Those are the facts. But if I were to interpret those numbers
and report that the population grew at an average rate of 1.2 million people per year, 245 million
people divided by 200 years, the report would be wrong. The facts would be correct and the
arithmetic would be correct — 245 million people divided by 200 years is approximately 1.2
million people per year. But the interpretation “grew at an average rate of 1.2 million people per
year” would be wrong, dead wrong. The U.S. population did not grow that way, not even
approximately.
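The two readings of the same figures can be compared side by side. The sketch below uses only the numbers quoted in the example; the constant-percentage growth model is an assumption introduced here as one plausible alternative to the straight-line reading, not a claim about the actual census record.

```python
start_year, end_year = 1790, 1990
start_pop, end_pop = 4.0, 249.0            # millions, as quoted in the example
years = end_year - start_year

linear_increase = (end_pop - start_pop) / years           # millions per year
growth_rate = (end_pop / start_pop) ** (1 / years) - 1    # constant yearly rate

def linear_model(year):
    """Population if the same number of people were added every year."""
    return start_pop + linear_increase * (year - start_year)

def compound_model(year):
    """Population if it grew by the same percentage every year."""
    return start_pop * (1 + growth_rate) ** (year - start_year)

# Both models match the two known end points but disagree widely in between.
for year in (1790, 1890, 1990):
    print(year, round(linear_model(year), 1), round(compound_model(year), 1))
```

The constant-rate model gives growth of roughly two percent a year, and at the midpoint the straight-line reading implies a population several times larger than the compound one, even though both agree at 1790 and 1990.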
Example:
The average number of students per class at my university is 16. That is a fact. It is also a fact
that the average number of classmates a student will find in his or her classes is 37. That too is a
fact. The numerical results are correct in both cases; both 16 and 37 are correct even though one
number is more than twice the magnitude of the other, with no tricks. But the two different numbers
respond to two subtly different questions about how the world (my university) works: subtly
different questions that lead to large differences in the result.
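How one set of classes can yield two different averages is easier to see with a toy case. The enrolments below are hypothetical (they are not the author's university data); they are chosen so that the average class size comes out at 16.

```python
import statistics

# Hypothetical university with five classes (enrolment per class).
class_sizes = [10, 10, 10, 10, 40]

# Question 1: how big is the average class? Each class counts once.
mean_per_class = statistics.fmean(class_sizes)

# Question 2: how big is the class the average student sits in?
# Each class counts once per enrolled student, so big classes weigh more.
mean_per_student = sum(s * s for s in class_sizes) / sum(class_sizes)

print(mean_per_class, mean_per_student)
```

Here the average class has 16 students, but the average student sits in a class of 25, because half of all students are in the one large class.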
By the time you get to the analysis of your data, most of the really difficult work has been
done. It's much more difficult to: define the research problem; develop and implement a
sampling plan; conceptualize, operationalize and test your measures; and develop a
design structure. If you have done this work well, the analysis of the data is usually a
fairly straightforward affair.
In most social research the data analysis involves three major steps, done in roughly this
order:
Data Preparation involves checking or logging the data in; checking the data for
accuracy; entering the data into the computer; transforming the data; and developing
and documenting a database structure that integrates the various measures.
Descriptive Statistics are used to describe the basic features of the data in a study. They
provide simple summaries about the sample and the measures. Together with simple
graphics analysis, they form the basis of virtually every quantitative analysis of data.
With descriptive statistics you are simply describing what is, what the data show.
Inferential Statistics investigate questions, models and hypotheses. In many cases, the
conclusions from inferential statistics extend beyond the immediate data alone. For
instance, we use inferential statistics to try to infer from the sample data what the
population thinks. Or, we use inferential statistics to make judgments of the probability
that an observed difference between groups is a dependable one or one that might have
happened by chance in this study. Thus, we use inferential statistics to make inferences
from our data to more general conditions; we use descriptive statistics simply to describe
what's going on in our data.
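The distinction between the last two steps can be shown on one small data set. This is a sketch with hypothetical survey scores; the normal-approximation confidence interval is a simplifying assumption chosen for brevity.

```python
import math
import statistics

# Hypothetical sample: satisfaction scores (1-10) from 12 respondents.
scores = [7, 8, 6, 9, 7, 5, 8, 7, 6, 8, 9, 7]

# Descriptive statistics: summarise the sample itself.
mean = statistics.fmean(scores)
median = statistics.median(scores)
sd = statistics.stdev(scores)

# Inferential statistics: estimate the population mean from the sample.
# A normal-approximation 95% interval is used here for brevity; with so few
# observations a t-based interval would be somewhat wider.
standard_error = sd / math.sqrt(len(scores))
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * standard_error, mean + z * standard_error

print(f"mean={mean:.2f} median={median} sd={sd:.2f}")
print(f"95% CI for the population mean: {ci_low:.2f} to {ci_high:.2f}")
```

The first three numbers describe only the twelve respondents; the interval is a statement about the wider population they were drawn from.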
In most research studies, the analysis section follows these three phases of analysis.
Descriptions of how the data were prepared tend to be brief and to focus on only the
more unique aspects to your study, such as specific data transformations that are
performed. The descriptive statistics that you actually look at can be voluminous. In most
write-ups, these are carefully selected and organized into summary tables and graphs
that only show the most relevant or important information. Usually, the researcher links
each of the inferential analyses to specific research questions or hypotheses that were
raised in the introduction, or notes any models that were tested that emerged as part of
the analysis. In most analysis write-ups it's especially critical to not "miss the forest for
the trees." If you present too much detail, the reader may not be able to follow the
central line of the results. Often extensive analysis details are appropriately relegated to
appendices, reserving only the most critical analysis summaries for the body of the
report itself.
1. Look at the Data / Think About the Data / Think About the Problem / Ask What It Is You
Want to Know. Think about the data. Think about the problem. Think about what it is you
are trying to discover. That would seem obvious, “Think.” But, it is the most important
step and often omitted as if, somehow, human intervention in the processes of science
were a threat to its objectivity and to the solidity of the science. But, no, thinking is
required: You have to interpret evidence in terms of your experience. You have to
evaluate data in terms of your prior expectations (and you had better have some
expectations). You have to think about data in terms of concepts and theories, even
though the concepts and theories may turn out to be wrong.
2. Estimate the Central Tendency of the Data. The “central tendency” can be something
as simple as an average: The average weight of these people is 150 pounds. Or it can
be something more complicated like a rate: The rate of growth of the population is two
percent per annum. Or it can be something sophisticated, something based on a theory:
The orbit of this planet is an ellipse. And why would you have thought to estimate
something as specific as a rate of growth or the trace of an ellipse? Because you
thought about the data, about the problem, and about where you were going (Rule 1).
3. Look at the Exceptions to the Central Tendency. If you’ve measured a median, look at
the exceptions that lie above and below the median. If you’ve estimated a rate, look at
the data that are not described by the rate. The point is that there is always, or almost always,
variation: You may have measured the average but, almost always, some of the cases
are not average. You may have measured a rate of change but, almost always, some
numbers are large compared to the average rate, and some are small. And these
exceptions are not usually just the result of embarrassingly human error or regrettable
sloppiness: On the contrary, often the exceptions contain information about the process
that generated the data. And sometimes they tell you that the original idea (to which the
variations are the exception) is wrong, or in need of refinement. So, look at the
exceptions which, as you can see, brings us back to rule 1, except that this time the data
we look at are the exceptions. That circle of three rules describes one of the constant
practices of analysis, cycling between the central tendencies and the exceptions as you
revise the ideas that are guiding your analysis.
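The cycle between Rules 2 and 3 can be sketched in a few lines of Python. The sketch is only an illustration, not a method prescribed in this chapter: the weights are hypothetical, the function name describe_with_exceptions is made up for the example, and the rule used to flag exceptions (values lying more than 1.5 times the interquartile range beyond the quartiles) is one common convention assumed here.

```python
from statistics import mean, median


def describe_with_exceptions(values, k=1.5):
    """Return the mean, the median, and the values outside the k * IQR fences."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    # Split into halves, leaving out the middle value when n is odd.
    lower_half = ordered[:mid]
    upper_half = ordered[mid + 1:] if n % 2 else ordered[mid:]
    q1, q3 = median(lower_half), median(upper_half)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    exceptions = [v for v in ordered if v < low or v > high]
    return {"mean": mean(ordered), "median": median(ordered), "exceptions": exceptions}


# Hypothetical weights in pounds (assumed data, not taken from any study).
weights = [148, 150, 152, 149, 151, 150, 147, 153, 210]
summary = describe_with_exceptions(weights)
print(summary)
```

With these hypothetical figures the single weight of 210 pounds is flagged, and it pulls the mean above the median. Whether that case is an error or a clue about the process behind the data is a question only Rule 1 can answer.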
Seen from another angle, the rules of evidence can be organized around three key
words: falsifiability, validity, and parsimony.
[a] Falsifiability
Falsifiability requires that there be some sort of evidence which, had it been found,
would have forced your conclusions to be judged false. Even though it’s your theory and
your evidence, it’s up to you to go the additional step and formulate your ideas so they
can be tested, and falsified if they are false. Moreover, you yourself have to look for
the counterevidence. This is another way to describe one of the previous rules, which was
“Look at the Exceptions”.
[b] Validity
Validity, in the scientific sense, requires that conclusions be more than computationally
correct. Conclusions must also be “sensible” and true statements about the world. For
example, it would be wrong to report that the population of the United States had grown
at an average rate of 1.2 million people per year. It would be wrong even though the
population grew by 245 million people over an interval of 200 years, and wrong even
though 245 divided by 200 is (approximately) 1.2. It is wrong because it is neither
sensible nor true that the population of 4 million people in the United States in
1790 could have increased to more than 5 million people in just twelve months. That would
have been a thirty percent increase in one year, which is not likely
(and didn’t happen). It would be closer to the truth, more valid, to describe the annual
growth using a percentage, stating that the population increased by an average of 2
percent per year — 2 percent per year when the population was 4 million (as it was in
1790), 2 percent per year when the population was 250 million (as it was in 1990). That’s
better.
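The arithmetic behind this example can be checked with a short Python sketch. It uses the rounded figures quoted in the text (4 million in 1790, 250 million in 1990) and assumes a constant compound rate over the whole period; the function names are hypothetical and chosen only for this illustration.

```python
def average_annual_change(start, end, years):
    """Absolute growth per year: computationally correct, but not valid here."""
    return (end - start) / years


def compound_annual_rate(start, end, years):
    """Constant yearly percentage rate that carries start to end over the period."""
    return (end / start) ** (1 / years) - 1


# Rounded population figures from the text (treated as approximate).
start_pop, end_pop, years = 4_000_000, 250_000_000, 200

absolute = average_annual_change(start_pop, end_pop, years)
rate = compound_annual_rate(start_pop, end_pop, years)

print(f"Average absolute change: {absolute:,.0f} people per year")
print(f"Implied first-year increase: {absolute / start_pop:.0%}")
print(f"Compound annual growth rate: {rate:.2%}")
```

The absolute figure implies an increase of roughly thirty percent in the first year, whereas the compound rate comes out at about two percent per year, which matches the more valid description given above.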
[c] Parsimony
Parsimony is the analyst’s version of the phrase “Keep It Simple.” It means getting the
job done with the simplest tools, provided that they work. In military terms you might
think about weapons that provide the maximum “bang for the buck”. In the sciences our
“weapons” are ideas and we favor simple ideas with maximum effect. This means that
when we choose among equations that predict something or use them to describe facts,
we choose the simplest equation that will do the job. When we construct explanations or
theories we choose the most general principles that can explain the detail of particular
events. That’s why sociologists are attracted to broad concepts like social class and why
economists are attracted to theories of rational individual behavior — except that a
simple explanation is no explanation at all unless it is also falsifiable and valid.
Conclusion
But make no mistake, it is these broad and not-well-specified principles that generate the
specific rules we follow: Think about the data. Look for the central tendency. Look for the
variation. Strive for falsifiability, validity, and parsimony. Perhaps the most powerful rule
is the first one, “Think”. The data are telling us something about the real world, but what?
Think about the world behind the numbers and let good sense and reason guide the
analysis.
The following tips and questions from Calhoun (1994), Mills (1999), and Padak and
Padak (1994) are helpful for assisting with data analysis:
Continue to ask questions: who, what, where, when, why, and how?
Identify or look for themes: Sort data into piles so that each pile shares a broad
characteristic. Write a summary statement for each pile.
Determine how data from various sources (test scores, grades, surveys, interviews,
observations, and documents) compare or contrast.
Chapter XVIII
A2Z
PhD
Thesis
Reflections on Academic Research
FINDINGS OF RESEARCH STUDY
Once the literature review, data collection and data analysis are complete, the findings
become the heart of the research thesis. The value of a scholar’s thesis will stand or
fall on the validity and quality of its findings. Identifying and finalizing the findings is
therefore the most critical and significant stage of the thesis. The following steps
may be considered when the scholar takes up the findings section:
Understand what thesis findings are. Thesis findings consist of two broad categories. One is
the aggregate data you collect, such as totals from surveys in social science or
observations of plant populations in botany. The other is the results of data analysis, such
as statistics generated from the raw data. Thesis findings do not include interpretation of
the results to draw conclusions or formulate theoretical explanations. These are vital
parts of the thesis, but they are distinct and separate from thesis findings.
Operationalize the hypotheses. This is the process of devising a specific test or tests to
obtain data that either support or fail to support a hypothesis. Consider carefully whether
the test is valid (does it measure what you want it to measure?) and reliable (under the
test conditions, will you get consistent results?). Whenever possible, do preliminary runs to
verify the validity and reliability of the testing procedure before collecting the data.
Collect the data. The watchword here is quality. Adhere strictly to the procedures you've
established. If a deviation is unavoidable, record it. Take the time to be thorough and
meticulous. Careless execution of your observational procedures will result in invalid data
and can ruin a thesis.
Perform data analysis. You will need to organize your observations and compile them
into totals, percentages, and other basic information. Follow this up with the more
detailed data analysis, such as generating statistics like standard deviations and
regression analysis. Review your findings and look for gaps in your data. If you are doing
genuinely original research, your findings at this point will almost certainly bring out new
questions you need to answer.
Revisit the data collecting phase of your research if needed. Gather more data to address
questions brought out in your data analysis, then repeat the data collection and data
analysis steps to process the additional data.
Present your findings. Using tables, graphs and text, write up your findings. Remember
that thesis findings go in a section of your written thesis separate from your literature
review, discussion and other sections. Keep your writing clearly defined and focused.
You should prepare to present your findings to audiences in your department and at
conferences. Finally, discuss your findings with faculty and prepare to answer objections
and challenges to your work.
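The data analysis step above (totals and percentages first, then statistics such as standard deviations and a regression) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the survey responses, study hours and test scores are invented for the illustration, the helper name least_squares is hypothetical, and a simple one-variable least-squares line stands in for whatever regression model a real study would justify.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical survey responses (assumed data, for illustration only).
responses = ["agree", "agree", "neutral", "disagree", "agree"]
counts = Counter(responses)
percentages = {answer: 100 * n / len(responses) for answer, n in counts.items()}

# Hypothetical paired observations: hours of study and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]


def least_squares(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = least_squares(hours, scores)
spread = stdev(scores)  # sample standard deviation of the scores

print(counts, percentages)
print(f"score = {intercept:.1f} + {slope:.1f} * hours; standard deviation = {spread:.2f}")
```

The totals and percentages belong in summary tables, while the standard deviation and the fitted line are examples of the more detailed results that follow. None of these numbers is interpreted at this stage; interpretation belongs in the discussion.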
1. Describe the findings in a manner that allows the reader to gain a clear
understanding of the type of study that was involved in the research. It should be clear to
the reader whether the study was a case study, a correlational study, or an experiment.
It would be best to state the type of study when describing the findings. For example, if
it was an experiment, a sentence could start with the words, "In the experiment..."
2. If the findings are from a correlational study, the description of the findings could
involve a brief description of how the variables were measured. For example, if the study
addressed the relationship between empathy and helping behavior, the description of
the findings could involve a description of how empathy was measured in the study.
3. If the findings are from an experiment, the description of the study could involve a
description of the conditions in the experiment. For example, imagine that an experiment
addressed the influence of listening to music on productivity, and there were two
conditions: experimental condition with music and control condition without music. In
this example, it would be well to describe both the experimental condition and the control
condition.
Chapter XIX
Structure of Thesis
STRUCTURE OF THESIS
Introduction
Ideally, the topic of a PhD thesis should:
cover a field which fascinates the candidate sufficiently for him or her to endure
years of hard and solitary work;
build on the candidate's previous studies, for example, his or her course work in
a Master's degree;
be in an area of `warm' research activity rather than in a `cold', overworked area
or in a `hot', too-competitive, soon-to-be extinguished area;
be in an area near the main streams of a discipline and not at the margins of a
discipline or straddling two disciplines - being near the main streams makes it
easier to find thesis examiners, to gain academic positions, and to get
acceptance of journal articles about the research;
be manageable, producing interesting results and a thesis in the shortest time
possible;
have accessible sources of data;
open into a program of research projects after the PhD is completed; and
provide skills and information for obtaining a job in a non-research field.
Delimitations
One limitation of the approach is that it is restricted to presenting the final thesis.
This chapter does not address the techniques of actually writing a thesis. Moreover, the
approach discussed here does not refer to the actual sequence of writing the thesis, nor
is it meant to imply that the issues of each chapter have to be addressed by the scholar
in the order shown. For example, the hypotheses at the end of Chapter 2 are meant to
appear to be developed as the chapter progresses, but the scholar might have a good
idea of what they will be before he or she starts to write the chapter. And although the
methodology of Chapter 3 must appear to be selected because it was appropriate for
the research problem identified and carefully justified in Chapter 1, the candidate may
have actually selected a methodology very early in his or her candidature and then
developed an appropriate research problem and justified it.
Moreover, after a scholar has sketched out a draft table of contents for each chapter, he
or she should begin writing the `easiest parts' of the thesis first as they go along,
whatever those parts are - and usually introductions to chapters are the last to be
written. But it should be borne in mind that the research problem, limitations and
research gaps in the literature must be identified and written down before other parts of
the thesis can be written.
Flexibility of Structure
A five chapter structure can be used to effectively present a PhD Thesis, and the thesis
should have a unified structure.
Chapter I : introduces the core research problem and then `sets the scene' and outlines
the path which the examiner will travel towards the thesis' conclusion.
Chapter II : the research problem and hypotheses arising from the body of knowledge
developed by the global research community.
Chapter III : methods used in this research to collect data about the hypotheses.
Chapter IV : results of applying those methods in this research.
Chapter V : conclusions about the hypotheses and research problem based on the
results of Chapter 4, including their place in the body of knowledge outlined previously in
Chapter 2.
This five chapter structure can be justified. Firstly, the structure is unified and focussed,
and so addresses a major fault observed in postgraduate theses: the evaluators’ difficulty
in discerning what the `thesis' of the thesis was. Supervisors need to emphasise
throughout the research process that the candidate is striving in the thesis to
communicate one big idea, and that this one big idea is the research
problem stated in the earlier pages of the thesis and explicitly solved in Chapter 5.
Easterby-Smith et al. (1991) also emphasise the importance of consistency in a PhD
thesis, and Phillips and Pugh (1987, p. 38) confirm that a thesis must have a thesis or a
`position'. The proposed structure is explicitly or implicitly followed by many writers of
articles in prestigious academic journals such as The Academy of Management Journal
and Strategic Management Journal (for example, Datta et al. 1992). Above all, the
proposed structure is akin to a standard proposal much like that which will be used by
the scholars later in their career, to apply for research grants (Krathwohl 1977; Poole
1993). Finally, by reducing time wasted on unnecessary tasks or on trying to demystify
the PhD process, the five chapter structure provides a mechanism to shorten the time
taken to complete a PhD, an aim becoming desired in many countries (Cude 1989).
A salient feature of this five chapter structure is its inherent flexibility. For example,
a research scholar may find it convenient to expand the number of chapters to six or
seven because of unusual characteristics of the analysis in his or her research study.
A PhD might consist of two stages: some qualitative research could be positioned in
Chapters 3 and 4 of the thesis described below, followed by some quantitative research
to refine the initial findings, positioned in Chapters 5 and 6; the Chapter 5
described below would then become Chapter 7. Ultimately, PhD research must
remain an essentially creative exercise. However, the five chapter structure proposed
here is a solid starting point for understanding what a PhD
thesis should set out to achieve, and also provides a basis for communication between a
research scholar and others, viz. his or her supervisor, examiners and the entire
academic fraternity. Academic research scholars, and business researchers for that
matter, may find this flexible structure suitable and may save a lot of time and energy in
their research study.
As any PhD thesis should have a unified structure, great care should be taken to ensure
that each chapter (whether there are five, six or seven) stands alone and, at the same
time, is seamlessly linked to the others without compromising the quality and merit of
any chapter. Each
chapter (except the first) should have an introductory section linking the chapter to the
main idea of the previous chapter and outlining the aim and the organisation of the
chapter. The introductory section of chapter 5 (that is, section 5.1) will be longer than
those of other chapters, for it will summarise all earlier parts of the thesis prior to making
conclusions about the research described in those earlier parts; that is, section 5.1 will
repeat the research problem and the research questions/hypotheses. Each chapter
should also have a concluding summary section which outlines major themes
established in the chapter, without introducing new material. The five chapters may have
these respective percentages of the thesis' words: 5, 30, 15, 25 and 25 percent. For six
chapters, these may be 5, 20, 15, 15, 20 and 25 percent; for seven chapters, 5, 20, 10,
15, 15, 15 and 20 percent. [Note: The percentages are highly flexible and vary depending
upon the area, scope, depth and nature of the research study undertaken.]
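The percentages can be turned into a rough word budget per chapter. The Python sketch below is only an illustration: the total of 80,000 words is a hypothetical figure, since the actual limit depends on the university, and the names CHAPTER_SHARES and word_budget are invented for the example.

```python
# Suggested percentage shares per chapter, as listed in the text above.
CHAPTER_SHARES = {
    5: [5, 30, 15, 25, 25],
    6: [5, 20, 15, 15, 20, 25],
    7: [5, 20, 10, 15, 15, 15, 20],
}


def word_budget(total_words, chapters=5):
    """Split a total word count across chapters using the suggested shares."""
    shares = CHAPTER_SHARES[chapters]
    if sum(shares) != 100:
        raise ValueError("chapter shares must add up to 100 percent")
    return [round(total_words * share / 100) for share in shares]


# Hypothetical total length; check the limit set by your own university.
budget = word_budget(80_000, chapters=5)
for number, words in enumerate(budget, start=1):
    print(f"Chapter {number}: about {words:,} words")
```

Because the shares are only a guide, the list for any one thesis can be adjusted, provided that the adjusted shares still add up to 100 percent.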
Style
Within each of the chapters of the thesis, the chosen spelling, styles and formats should
be followed scrupulously, so that the scholar uses consistent styles from the first draft and
throughout the thesis for processes such as using bold type, underlining with italics,
indenting quotations, single and double inverted commas, making references, spaces
before and after side headings and lists, and gender conventions. Moreover, using the
authoritative APA Style Manual provides a defensive shield against an examiner who
may criticise the thesis from the viewpoint of his or her own idiosyncratic style. A PhD
thesis has some style rules of its own. Chapter 1 is usually written in the present tense
with references to literature in the past tense; the rest of the thesis is written in the past
tense as it concerns the research after it has been done, except for the findings in
Chapter 5 which are presented in the present tense. More precisely for Chapters 2 and
3, schools of thought and procedural steps are written of in the present tense and
published researchers and the candidate's own actions are written of in the past tense.
Further, value judgements and words should not be used in the objective pursuit of truth
that a thesis reports. For example, `it is unfortunate', `it is interesting', `it is believed', and
`it is welcome' are inappropriate. Although first person words such as `I' and `my' are
now acceptable in a PhD thesis, their use should be meticulously controlled or preferably
totally avoided. If the research scholar would like to call an authority’s
opinion ‘wrong’, it could instead be worded as ‘misleading’. In short, the research
scholar should always be trying to communicate with the examiners in an easily-followed
way.
Another important aspect to be considered is that the word `etc' is too imprecise to be
used in a thesis. Furthermore, words such as `this', `these', `those' and `it' should not be
left dangling - they should always refer to an object; for example, `This rule should be
followed' is preferred to `This should be followed'. Also, brackets should be rarely used
or if possible, totally avoided. Paragraphs should be short; as a rule of thumb, two to
three paragraphs should start on each page if the preferred line spacing of 1.5 and Arial
12 point font is used to provide adequate structure and complexity of thought on each
page. Margins should be those suggested by the university.
The above observations about structure and style imply that a PhD thesis
with its readership of two or three examiners is different from a book which has a very wide
readership (Derricourt 1992), and from shorter conference papers and journal articles
which do not require the burden of proof and references to broader bodies of knowledge
required in PhD theses. Candidates should be aware of these differences and could
therefore consider concentrating on completing the thesis before adapting parts of it for
other purposes.
The thesis will have to go through many drafts (Zuber-Skerritt & Knight 1986). The first
draft will be started early in the research process, after initial mindmapping and a
tentative table of contents for a chapter and a section. It will be crafted through the
`right', creative side of the brain and will emphasise basic ideas without much concern
for detail or precise language. By facilitating the creative first drafts of sections, the
relatively visible and structured `process' of the five chapter structure allows the
candidate to be more creative and rigorous with the `content' of the thesis than he or she
would otherwise be. After the first rough drafts, later drafts will be increasingly crafted through
the `left', analytical side of the brain and emphasise fine tuning of arguments, justification
of positions and further evidence gathering from other research literature.
Title page
Abstract (with keywords)
Table of contents
List of tables
List of figures
Abbreviations
Statement of original authorship
Acknowledgments
2.1 Introduction
2.2 Parent disciplines and classification models
2.3 Developing and Current Literature
2.4 Earlier Literature
2.5 Immediate discipline and analytical models
2.6 Research Gap in the available Literature
2.7 Area identified for the Research Study
2.8 Conclusion
3 Methodology [maximum 5 sections, each having 2 or 3 subsections]
3.1 Introduction
3.2 Justification of Methodology
3.3 Details of Research Procedures
3.4 Ethical considerations
3.5 Conclusion
4.1 Introduction
4.2 Statistical Tools used
4.3 Data about subjects
4.4 Detailed Pattern of data
4.5 Conclusion
5.1 Introduction
5.2 Discussion about each research question
5.3 Discussion about the research problem
5.4 Implications for theory
5.5 Delimitations
5.6 Major suggestions
5.7 Major recommendations
5.8 Scope for Further Research
Bibliography
Appendices
Note: This five chapter model forms the basis of the structure of the thesis and, if necessary,
could be extended to six or seven chapters depending upon the circumstances. In such cases, the
number of chapters, number of sections, number of subsections and also the order of sections may
vary significantly. However, this five chapter model remains a very strong foundation.
“Study without desire spoils the memory, and it retains nothing that it takes in.”
<< Leonardo da Vinci
Chapter XX
Research Discussion
RESEARCH DISCUSSION
Introduction
A Discussion section should not be simply a summary of the results the scholar has
found; at this stage he or she will have to demonstrate original thinking. First, the
scholar should highlight and discuss how the research has reinforced what is already
known about the area. Many research scholars make the mistake of thinking that they
should have found something new; in fact, very few research studies have findings that
are unique. Instead, the scholar is likely to have a number of findings that reinforce what
is already known about the field, and the scholar needs to highlight these.
Second, the research scholar may have discovered something different and, if this is the
case, he or she will have plenty to discuss. The scholar should outline what is new and how
this compares to what is already known, and should also attempt to provide an explanation
as to why the research identified these differences. Third, the scholar needs to consider
how the results extend knowledge about the field. Even if there are similarities between
the results and the existing work of others, the research extends knowledge of the area by
reinforcing current thinking. It is important that this section is comprehensive and well
structured, making clear links back to the literature reviewed earlier in the project.
This will allow the scholar the opportunity to demonstrate the value of the research study,
and it is therefore very important to discuss the research work thoroughly.
Definition
The discussion section explains your interpretation of the findings as they relate to the
research problem you have investigated. This section is comprised of all new information
and focuses on the implications of your findings in relation to the overall scope of other
research that has taken place. The significance of the research findings should be
clearly described.
This section is often considered the most important part of a research paper because it
most effectively demonstrates your ability as a researcher to think critically about an
issue, to develop creative solutions to problems based on the findings, and to formulate
a deeper, more profound understanding of the issues you are studying.
The discussion section is where you explore the underlying meaning of your research, its
possible implications for other areas of study, and the possible improvements that can
be made in order to further develop the concerns of your research. This is also the
section where you need to present the importance of your study and how it may be able
to contribute to the field.
This part of the paper is not strictly governed by objective reporting of information but,
rather, it is where you can engage in creative thinking about issues through evidence-
based interpretation of findings.
Contents
1. Explanation of results: comment on whether or not the results were expected and
present explanations for the results; go into greater depth when explaining
findings that were unexpected or especially profound.
2. References to previous research: compare your results with those reported in the
literature, or use the literature to support a claim. This can include revisiting
key studies already cited in your literature review section, or citing studies that
you have saved for the discussion section.
3. Deduction: a claim for how the results can be applied more generally. For
example, describing lessons learned or proposing recommendations that can
help improve a situation.
4. Hypothesis: a more general claim or possible conclusion arising from the results
[which may be proved or disproved in subsequent research].
[a] Reiterate the Research Problem and State the Major Findings
Briefly reiterate for your readers the research problem or problems you are investigating
and the methods you used to investigate them, and then move quickly to describe the
major findings of the study. You should write a direct, declarative, and succinct
proclamation of the study results.
[b] Explain the Meaning of the Findings
No one has thought as long and hard about your study as you have. Systematically
explain the meaning of the findings and why you believe they are important. After
reading the discussion section, you want the reader to think about the results [“why
hadn’t I thought of that?”]. You don’t want to force the reader to go through the paper
multiple times to figure out what it all means.
[c] Relate the Findings to Similar Studies
The discussion section should relate the research study findings to those of other
studies, particularly if questions raised by previous studies served as the motivation for
the research study, if the findings of other studies support the findings [which
strengthens the importance of the research study results], or if they point out how the
current study differs from other similar studies.
[d] Consider Alternative Explanations of the Findings
It is important to remember that the purpose of research is to discover and not to prove.
When writing the discussion section, the scholar should carefully consider all possible
explanations for the study results, rather than just those that fit the prior assumptions or
biases.
[e] Acknowledge the Study’s Limitations
It is far better for you to identify and acknowledge your study’s limitations than to have
them pointed out by the supervisor! Describe the generalizability of the results to other
situations, if applicable to the method chosen, then describe in detail any problems
encountered in the method(s) used to gather information. Note any
unanswered questions or issues the study did not address.
[f] Make Suggestions for Further Research
Although the research study may offer important insights about the research problem,
other questions related to the problem likely remain unanswered. Moreover, some
unanswered questions may have become more focused because of the current study.
The scholar should make suggestions for further research in the discussion section.
[Note: Recommendations for further research can be included in your conclusion instead, but don't repeat in
both.]
Background information
A strong relationship between X and Y has been reported in the literature.
Prior studies that have noted the importance of ......
In reviewing the literature, no data was found on the association between X and Y.
As mentioned in the literature review, ......
Very little was found in the literature on the question of .....
This study set out with the aim of assessing the importance of X in ......
The third question in this research was ......
It was hypothesized that participants with a history of ......
The present study was designed to determine the effect of ......
Statements of result
The results of this study show/indicate that .......
This experiment did not detect any evidence for ......
On the question of X, this study found that ......
The current study found that ......
The most interesting finding was that ......
Another important finding was that .....
The results of this study did not show that ....../did not show any significant increase in ......
In the current study, comparing X with Y showed that the mean degree of ......
In this study, Xs were found to cause .....
X provided the largest set of significant clusters of ......
It is interesting to note that in all seven cases of this study......
Unexpected outcome
Surprisingly, X was found to .......
Surprisingly, no differences were found in ......
One unanticipated finding was that .....
It is somewhat surprising that no X was noted in this condition ......
What is surprising is that ......
Contrary to expectations, this study did not find a significant difference between .......
However, the observed difference between X and Y in this study was not significant.
However, the ANOVA (one way) showed that these results were not statistically significant.
This finding was unexpected and suggests that ......
Although these results differ from some published studies (Smith, 1992; Jones, 1996), they are
consistent with those of ......
These results differ from X's 2003 estimate of Y, but they are broadly consistent with earlier .....
Noting implications
This finding has important implications for developing .....
An implication of this is the possibility that ......
One of the issues that emerges from these findings is ......
Some of the issues emerging from this finding relate specifically to ......
This combination of findings provides some support for the conceptual premise that .....
Commenting on findings
However, these results were not very encouraging.
These findings are rather disappointing.
The test was successful as it was able to identify students who ......
The present results are significant in at least two major respects.
The results of this study do not explain the occurrence of these adverse events.
Conclusion
Apart from the literature review section, most references to sources in the
research paper should be in the discussion section. A few historical references may be
helpful for perspective, but most of the references should be relatively recent and
included to aid in the interpretation of the results or to link to similar studies. If a study
already cited disagrees with the findings, it should not be ignored; the scholar should
clearly explain why that study's findings differ from those of the present research study.
“Discussion is just a tool. You have to aim; the final goal must be a decision.”
<< Harri Holkeri
Chapter XXI
WRITING A THESIS
Research scholars encounter many pitfalls when writing a thesis. A well-written thesis is
essentially a sustained analysis of a research topic and even the most careful scholar
can succumb to commonly made mistakes in a work of this magnitude. The primary
problems that research scholars encounter when writing a thesis are related to the
matters of clarity and organization. In an analysis of this length and breadth, it is easy to
lose focus and direction. Because of the substantial research that goes into producing a
thesis, one can veer off track and lose stamina. This chapter discusses the planning of
the writing process, the difficulties encountered and, mainly, the mistakes commonly made
in writing theses, such as the danger of disorganization, the problem of writing a worthy
conclusion and the problem of writing an analytical literature review, and offers some
strategies to overcome them.
Two important adjectives used to describe a thesis are "original" and "substantial." The
research performed to support a thesis must be both.
The scientific method means starting with a hypothesis and then collecting evidence to
support or refute it. Before one can write a thesis, one must collect evidence that supports
it. Thus, the most difficult aspect of writing a thesis consists of organizing the evidence
and associated discussions into a coherent form.
The essence of a thesis is critical thinking, not experimental data. Analysis and concepts
form the heart of the work.
A thesis concentrates on principles: it states the lessons learned, and not merely the
facts behind them.
Each statement in a thesis must be correct and defensible in a logical and scientific
sense. Moreover, the discussions in a thesis must satisfy the most stringent rules of logic
applied to mathematics and science.
One of the main problems of writing a thesis is maintaining organized trains of thought. It
is all too easy to fail to define concepts clearly and to waste time and energy on only
marginally related topics. A good thesis defines important concepts clearly and concisely
and uses the same terminology and its attendant definitions consistently throughout the
entire thesis. Do not use different words for the same concept, and do not define a
term one way in one place and a different way in another. The writer must be
consistent with definitions; otherwise, the reader will not be able to follow the
definitions presented in the thesis.
The literature review is an important part of any thesis, and research
scholars can all too easily fall into the trap of writing summaries of articles. This method
prevents the research scholar from showcasing his/her critical thinking and analytical
skills and may also cause the reader to lose interest in the research topic. The literature
review is essentially a chance for the research scholar to demonstrate his/her knowledge
of the research topic and to provide evidence for the arguments presented in the thesis.
Because the thesis demands a substantial amount of research, the student, being tired
and frustrated, may resort to writing summaries of the research materials. This is a
mistake. The purpose of the literature review is to provide a perspective on the research
topic, to introduce and discuss important theoretical frameworks, to define key concepts
and to point out connections between main ideas. In short, it provides the reader with a
model of what is going on in the thesis. By organizing the literature review by categories
of analysis, the scholar can avoid a review that is merely a string of summaries.
By the time the student reaches the point of writing a conclusion,
he/she is often drained of energy and willpower. It is important to remember that the
conclusion matters to the reader because it ties together all of the ideas and
concepts analyzed in the thesis. It reaffirms what the reader has learned
from the thesis and explains key inferences. It essentially brings home what the thesis is
all about. The writer should avoid repeating the thesis and should concentrate on
explaining what can be inferred from the evidence presented and on
discussing the implications of the points made. A good conclusion leaves the reader
with the feeling that he/she has grasped the main ideas of the thesis.
adverbs -- Mostly, they are very often overly used. Use strong words instead.
For example, one could say, "Writers abuse adverbs."
jokes or puns -- They have no place in a formal document.
"bad", "good", "nice", "terrible", "stupid" -- A scientific dissertation does not make
moral judgements. Use "incorrect/correct" to refer to factual correctness or errors. Use
precise words or phrases to assess quality (e.g., "method A requires less computation
than method B"). In general, one should avoid all qualitative judgements.
"true", "pure" -- In the sense of "good" (it is judgemental).
1. Write up a preliminary version of the background section first. This will serve as
the basis for the introduction in your final paper.
2. As you collect data, write up the methods section. It is much easier to do this
right after you have collected the data. Be sure to include a description of the
research equipment and relevant calibration plots.
3. When you have some data, start making plots and tables of the data. These will
help you to visualize the data and to see gaps in your data collection. If time
permits, you should go back and fill in the gaps. You are finished when you have
a set of plots that show a definite trend (or lack of a trend). Be sure to make
adequate statistical tests of your results.
4. Once you have a complete set of plots and statistical tests, arrange the plots and
tables in a logical order. Write figure captions for the plots and tables. As much
as possible, the captions should stand alone in explaining the plots and tables.
Many scientists read only the abstract, figures, figure captions, tables, table
captions, and conclusions of a paper. Be sure that your figures, tables and
captions are well labeled and well documented.
5. Once your plots and tables are complete, write the results section. Writing this
section requires extreme discipline. You must describe your results, but you must
NOT interpret them. (If good ideas occur to you at this time, save them at the
bottom of the page for the discussion section.) Be factual and orderly in this
section, but try not to be too dry.
6. Once you have written the results section, you can move on to the discussion
section. This is usually fun to write, because now you can talk about your ideas
about the data. If you can come up with a good cartoon/schematic showing your
ideas, do so. Many papers are cited in the literature because they have a good
cartoon that subsequent authors would like to use or modify.
7. In writing the discussion section, be sure to adequately discuss the work of other
authors who collected data on the same or related scientific questions. Be sure to
discuss how their work is relevant to your work. If there were flaws in their
methodology, this is the place to discuss them.
8. After you have discussed the data, you can write the conclusions section. In this
section, you take the ideas that were mentioned in the discussion section and try
to come to some closure. If some hypothesis can be ruled out as a result of your
work, say so. If more work is needed for a definitive answer, say that.
9. The final section in the paper is a recommendation section. This is really the end
of the conclusion section in a scientific paper. Make recommendations for further
research or policy actions in this section. If you can make predictions about what
will be found if X is true, then do so. You will get credit from later researchers for
this.
10. After you have finished the recommendation section, look back at your original
introduction. Your introduction should set the stage for the conclusions of the
paper by laying out the ideas that you will test in the paper. Now that you know
where the paper is leading, you will probably need to rewrite the introduction.
“We do not write because we want to; we write because we have to.”
<< Somerset Maugham
Chapter XXII
ANATOMY OF AN ABSTRACT
A dissertation, or detailed discourse, is a document that presents the author's research
and findings in a particular field and is submitted in support of the author's candidature
for a degree or professional qualification. The abstract of a dissertation
explains what the dissertation contains. It must be written in such a way
that the reader finds it interesting and is encouraged to read the whole dissertation.
Arousing interest in one's dissertation is important so that it will be
well received afterwards. For this, it is crucial to have a well-crafted thesis abstract,
written in the form that your institution expects.
Many people struggle to write a good abstract because they know that a poor abstract
can wreck their whole dissertation. Even if the dissertation itself is perfect, a mediocre
abstract can turn the reader away from the whole
thesis. You should therefore put your maximum effort into writing an outstanding thesis
abstract, so that readers respond with encouragement and useful suggestions.
What is an abstract?
An abstract is a condensed version of a longer piece of writing that highlights the major
points covered, concisely describes the content and scope of the writing, and reviews
the writing's contents in abbreviated form.
1. Descriptive Abstracts
o tell readers what information the report, article, or paper contains.
o include the purpose, methods, and scope of the report, article, or paper.
o do not provide results, conclusions, or recommendations.
o are always very short, usually under 100 words.
o introduce the subject to readers, who must then read the report, article, or paper
to find out the author's results, conclusions, or recommendations .
2. Informative Abstracts
o communicate specific information from the report, article, or paper.
o include the purpose, methods, and scope of the report, article, or paper.
o provide the report, article, or paper's results, conclusions, and recommendations.
o are short -- from a paragraph to a page or two, depending upon the length of the
original work being abstracted. Usually informative abstracts are 10% or less of
the length of the original piece.
o allow readers to decide whether they want to read the report, article, or paper.
uses one or more well developed paragraphs: these are unified, coherent, concise, and
able to stand alone.
uses an introduction/body/conclusion structure which presents the article, paper, or
report's purpose, results, conclusions, and recommendations in that order.
follows strictly the chronology of the article, paper, or report.
provides logical connections (or transitions) between the information included.
adds no new information, but simply summarizes the report.
is understandable to a wide audience.
oftentimes uses passive verbs to downplay the author and emphasize the information.
Reread the article, paper, or report with the goal of abstracting in mind.
o Look specifically for these main parts of the article, paper, or report: purpose,
methods, scope, results, conclusions, and recommendation.
o Use the headings, outline heads, and table of contents as a guide to writing your
abstract.
o If you're writing an abstract about another person's article, paper, or report, the
introduction and the summary are good places to begin. These areas generally
cover what the article emphasizes.
After you've finished rereading the article, paper, or report, write a rough draft without
looking back at what you're abstracting.
o Don't merely copy key sentences from the article, paper, or report: you'll put in
too much or too little information.
o Don't rely on the way material was phrased in the article, paper, or report:
summarize information in a new way.
Revise your rough draft to
o correct weaknesses in organization.
o improve transitions from point to point.
o drop unnecessary information.
o add important information you left out.
o eliminate wordiness.
o fix errors in grammar, spelling, and punctuation.
Print your final copy and read it again to catch any glitches that you find.
A good abstract explains in one line why the paper is important. It then goes on
to give a summary of your major results, preferably couched in numbers with
error limits. The final sentences explain the major implications of your work. A
good abstract is concise, readable, and quantitative.
Length should be ~ 2-3 paragraphs, approx. 150-250 words [1 Page in A4].
Abstracts generally do not have citations.
Information in title should not be repeated.
Be explicit.
Use numbers where appropriate.
Answers to these questions should be found in the abstract:
1. What did you do?
2. Why did you do it? What question were you trying to answer?
3. How did you do it? State methods.
4. What did you learn? State major results.
5. Why does it matter? Point out at least one significant implication.
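The mechanical parts of this guidance (the length limit and the advice to use numbers) can be checked automatically before submission. The following is a minimal Python sketch, assuming the abstract is available as plain text with paragraphs separated by blank lines. The function name `abstract_report` is a hypothetical name introduced here, and the default limits simply mirror the 150-250 word guideline given earlier; your institution's limits may differ.

```python
import re


def abstract_report(text, min_words=150, max_words=250):
    """Summarize simple, mechanical properties of an abstract."""
    words = text.split()
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return {
        "words": len(words),
        "paragraphs": len(paragraphs),
        "within_length": min_words <= len(words) <= max_words,
        # "Use numbers where appropriate": flag abstracts with no digits at all.
        "has_number": any(ch.isdigit() for ch in text),
    }


if __name__ == "__main__":
    draft = "We surveyed 120 scholars.\n\nMost reported delays in writing."
    print(abstract_report(draft))
```

A check of this kind cannot judge whether the abstract answers the five questions above; it only catches drafts that are too long, too short, or entirely free of figures.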
“When I examine myself and my methods of thought, I come to the conclusion that the gift of
fantasy has meant more to me than any talent for abstract, positive thinking.” << Albert
Einstein
Chapter XXIII
ENDNOTES
Introduction
Millions of researchers, scholarly writers, students, and librarians use EndNote to search
online bibliographic databases, organize their references, images and PDFs in any
language, and create bibliographies and figure lists instantly. Instead of spending hours
typing bibliographies, or using index cards to organize their references, they do it the
easy way, by using EndNote! An endnote is a source citation that refers readers to a
specific place at the end of the paper where they can find the source of the
information or words quoted or mentioned in the paper. Endnotes are used: (1) to cite
the source of statements quoted or closely paraphrased in the text, (2) to make
additional comments about some point of the text, or (3) to acknowledge someone else
for an idea or argument.
Endnotes serve only one purpose: to allow the reader to access with ease and
confidence the source that you have used. Any citation form that does this well is
appropriate, but most disciplines insist on their own particular way of citing information,
and you must follow those preferences. There is nothing magical about these forms;
they all do the same thing. You should, however, get used to the fact that different
disciplines require different citation forms. In the case of either footnotes or endnotes,
the only indication that goes in the text itself is the note number, the small superscript
number after the text you wish to reference.
An Illustration
Example:
Let's say that you have quoted a sentence from Lloyd Eastman's history of Chinese
social life. You have written this sentence:
According to Eastman, "The family was the central core of the Chinese social system."1
The superscript number corresponds to a note placed at the end of the paper (which is
called an endnote). Your word-processor will create a note number and a space at the
end of your paper, where you then fill in the citation. This endnote lets the reader know
where you found your information.
Note numbers are sequential: the first note in your paper is numbered 1, the second note is
2 (even if you are quoting the same source as in #1), etc.
AGAIN, even if you are repeating a reference to the same source, your numbers must
continue in sequence (1, 2, 3, 4, 5). You must use "Arabic" numbers (1, 2, 3...), not
Roman numerals (i, ii, iii...)!
What do I put in the endnote (the part that appears at the end of the paper) the first time I
refer to a source?
The first time you have a citation to a particular source, the note at the end of the paper
must include the following information in the following order:
Author’s first name then last name, Title of Book (City of publication: Publishing
company’s name, Date of Publication), Page Number of quoted, paraphrased, or
summarized material.
Example:
According to Eastman, "The family was the central core of the Chinese social system."1
At the end of the paper (in the space set aside for this note by your word-processing
software), you would put the following information in the following order:
1 Lloyd E. Eastman, Family, Field, and Ancestors: Constancy and Change in
China's Social and Economic History, 1550-1949 (New York: Oxford University Press,
1988), 53.
In addition to including this information at the end of the paper, this source of information
should also be included in the Bibliography at the end of the book.
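The order of elements listed above for a first reference (author, title, then city, publisher and date in parentheses, then page) can be sketched in Python. This is a minimal illustration, not a citation manager: the `Book` class and the `first_reference` function are hypothetical names introduced here, and the punctuation is an assumption based on the order stated above. Always follow the citation form your discipline requires.

```python
from dataclasses import dataclass


@dataclass
class Book:
    author: str      # first name, then last name
    title: str
    city: str
    publisher: str
    year: int


def first_reference(number, book, page):
    """Format the first endnote for a source, in the order listed above."""
    return (
        f"{number} {book.author}, {book.title} "
        f"({book.city}: {book.publisher}, {book.year}), {page}."
    )


eastman = Book(
    author="Lloyd E. Eastman",
    title=("Family, Field, and Ancestors: Constancy and Change in "
           "China's Social and Economic History, 1550-1949"),
    city="New York",
    publisher="Oxford University Press",
    year=1988,
)

if __name__ == "__main__":
    print(first_reference(1, eastman, 53))
```

Keeping the bibliographic details in one record and generating the note from it means the same data can be reused for the Bibliography entry without retyping.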
If you cite the same source again in your paper, use a short form for all subsequent
citations to that source:
OR
Example:
You have already cited the Eastman, but then you cite it again in note #3:
Conclusion
The endnote is a very useful tool for academic researchers, who should use it
to enrich the depth and quality of their research and to
gain ample knowledge and experience in their own fields of
speciality. The quantity and quality of citations in a research thesis reflect
the seriousness and curiosity of the research scholar. Evaluators and examiners
will certainly take note of this scholarship and appreciate the
effort the scholar has put in.
Chapter XXIV
RESEARCH CONCLUSION
Introduction
An effective concluding paragraph should provide closure for a paper, leaving the reader
feeling satisfied that the thesis has been fully explained. Probably the shortest paragraph
of an essay, the conclusion should be brief and to the point. The conclusion should
provide a restatement of the thesis, a summary of the author's conclusions, and perhaps
a solution to the problem, if this is the writer's intent. However, a good writer avoids a
blatant repetition of the thesis statement, which can leave a reader feeling annoyed and
disappointed after reading an otherwise interesting paper. Repeating the thesis, word
for word, in the conclusion seems lazy and is not very interesting. It is best to restate the
ideas using different language, perhaps even to create a sort of dramatic effect that
comes from repetition. Good conclusions might have a dramatic quality -- rather like a
grand finale. The conclusion should leave the reader with an overall sense of how the
writer feels about the subject. Concluding statements which refer back to the
introductory paragraph are appropriate here. Frequently, the ideas in the body of an
essay lead to some significant conclusion that can be stated and explained in this final
paragraph. Finally, this is not the place to introduce ideas you forgot to mention in the
body of the paper! For most essays, one well-developed paragraph is sufficient for a
conclusion, although in some cases, a two-or-three paragraph conclusion may be
required.
Definition
The conclusion is intended to help the reader understand why your research should
matter to them after they have finished reading the paper. A conclusion is not merely a
summary of your points or a re-statement of your research problem but a synthesis of
key points.
Significance of a Conclusion
When writing the conclusion to your paper, follow these general rules:
The function of your paper's conclusion is to restate the main argument. It reminds the
reader of the strengths of your main argument(s) and reiterates the most important
evidence supporting the argument(s). Make sure, however, that your conclusion is not
simply a repetitive summary of the findings because this reduces the impact of the
argument(s) you have developed in your essay. Consider the following points to help
ensure your conclusion is appropriate:
1. If the argument or point of your paper is complex, you may need to summarize the
argument for your reader.
2. If, prior to your conclusion, you have not yet explained the significance of your findings or
if you are proceeding inductively, use the end of your paper to describe your main points
and explain their significance.
3. Move from a detailed to a general level of consideration that returns the topic to the
context provided by the introduction or within a new context that emerges from the data.
4. Suggest what aspects of this topic need further research.
The conclusion also provides a place for you to persuasively and succinctly restate your
research problem given that the reader has now been presented with all the information
about the topic. Depending on the discipline you are writing in, the concluding paragraph
may contain your reflections on the evidence presented, or on the essay's central
research problem. However, the nature of being introspective about the research you
have done will depend on your topic and whether your professor wants you to express
your observations in this way.
Some Strategies
If your essay deals with a contemporary problem, warn readers of the possible
consequences of not attending to the problem.
Common Problems
Failure to be concise
The conclusion section should be concise and to the point. Conclusions that are too long
often have unnecessary detail. The conclusion section is not the place for details about
your methodology or results. Although you should give a summary of what was learned
from your research, this summary should be relatively brief, since the emphasis in the
conclusion section is on the implications, evaluations, insights, etc. that you make.
challenges, etc. encountered during your research study can be included as a way of
qualifying your conclusions.
Conclusion
Finally, the conclusion of a thesis should close by summarizing everything that has
come before, explaining in simple terms the way in which the research study ended,
relating it to the greater environment of the world at large, and leaving the reader with
the ability to draw his or her own conclusions from what you have described. Concluding
statements which refer back to the introductory paragraph are appropriate here.
Frequently, the ideas in the body of an essay lead to some significant conclusion that
can be stated and explained in this final paragraph. Finally, this is not the place to
introduce ideas you forgot to mention in the body of the paper!
“The only possible conclusion the social sciences can draw is: some do, some don't.”
<< Ernest Rutherford
Chapter XXV
EDITING AND PROOFREADING
Definition
Proofreading is the act of searching for errors before you hand in your final research
thesis. Errors can be both grammatical and typographical in nature, but proofreading can
also be used to identify problems with the flow of your paper [i.e., the logical sequence of
thoughts and ideas] and to find any word processing errors [e.g., different font types,
indented paragraphs, line spacing, etc.].
Getting started
Be sure you've revised the larger aspects of your text. Don't make corrections at
the sentence and word level if you still need to work on the focus, development, and
arrangement of the whole paper, of sections, or of paragraphs.
Set your text aside for a while between writing and proofreading. Some distance
between writing your paper and proofreading it will help you identify mistakes more
easily.
Eliminate unnecessary words before looking for mistakes. Throughout your paper,
you should try to avoid using inflated diction if a simpler phrase works equally well.
Simpler, more precise language is easier to proofread than overly complex sentence
construction and vocabulary.
Know what to look for. Based upon the comments of your professors on previous
drafts of your paper, make a list of mistakes you need to watch for.
1. Work from a printout, not a computer screen. Besides sparing your eyes the strain
of glaring at a computer screen, proofreading from a printout allows you to easily
skip around to where errors might have been repeated in multiple places
throughout the research paper.
2. Read out loud. This is especially helpful for spotting run-on sentences, but you'll
also hear other problems that you may not pick up when reading silently.
Reading your paper out loud also helps you play the role of the reader, thereby
encouraging you to understand the paper as your audience might.
3. Use a blank sheet of paper to cover up the lines below the one you're reading. This
technique keeps you from skipping ahead of possible mistakes.
4. Use the search function of the computer to find mistakes you're likely to make. For
example, search for "it" if you confuse "its" and "it's"; search for "-ing" if
dangling modifiers are a problem; search for opening parentheses or quote
marks if you tend to leave out the closing ones.
5. If you tend to make many mistakes, check separately for each kind of error,
moving from the most to the least important, and following whatever technique
works best for you to identify that kind of mistake. For instance, read through
once (backwards, sentence by sentence) to check for fragments; read through
again (forward) to be sure subjects and verbs agree, and again (perhaps using a
computer search for "this," "it," and "they") to trace pronouns to antecedents.
6. End by using a computer spell checker or by reading backwards word by word. But
remember that a spelling checker won't catch mistakes with homonyms (e.g.,
"they're," "their," "there") or certain typos (like "he" for "the").
7. Leave yourself enough time. Since many errors are made and overlooked by
speeding through writing and proofreading, taking the time to look carefully
over your writing will help you catch errors you might otherwise miss. Always
read through your writing slowly. If you read through the paper at a normal
speed, you won't give your eyes sufficient time to spot errors.
8. Ask a friend to read your paper. Offer to proofread a friend's paper if they will
review yours. Another set of eyes will often spot errors that
you would otherwise have missed.
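Tip 4 above (using the computer's search function) can be sketched as a small Python script. This is a minimal illustration under stated assumptions: the function names and the `PATTERNS` table are hypothetical, the patterns cover only a few of the trouble spots mentioned in this chapter, and the parenthesis check works line by line, so a parenthesis that legitimately closes on the next line will also be flagged. Every hit still needs a human decision.

```python
import re

# Hypothetical table of trouble spots; extend it with your own habitual errors.
PATTERNS = {
    "its / it's": r"\bit'?s\b",
    "homonym": r"\b(their|there|they're)\b",
    "repeated word": r"\b(\w+)\s+\1\b",
}


def find_trouble_spots(text, patterns=PATTERNS):
    """Return (line_number, label, line) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in patterns.items():
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((number, label, line.strip()))
    return hits


def unbalanced_parentheses(text):
    """Return line numbers where the counts of '(' and ')' differ."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.count("(") != line.count(")")
    ]


if __name__ == "__main__":
    draft = "Its colour faded.\nSearch for for errors (see below.\nAll fine here."
    for number, label, line in find_trouble_spots(draft):
        print(f"line {number}: check {label}: {line}")
    print("unbalanced parentheses on lines:", unbalanced_parentheses(draft))
```

The script only locates candidates; as tip 6 notes for spell checkers, it cannot tell whether "its" or "their" is actually wrong in context.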
Find out what errors you typically make. Review instructors' comments about your
writing and/or review your paper with a tutor.
Learn how to fix those errors. Talk with your professor about helping you
understand why you make the errors you do make so that you can learn to avoid
them.
Use specific strategies. Use the strategies detailed below to find and correct your
particular errors in usage, sentence structure, and spelling and punctuation.
Given the rules and the multiple exceptions to every rule that characterize the English
language, there are many, many sites on the web that discuss avoiding grammar
mistakes. Listed below are the most common and, thus, the ones you should focus on
locating and removing while proofreading your research paper.
bring about," or "to accomplish." "Affect" is almost always a verb and generally
means "to influence." However, affect can be used as a noun when you're
talking about the mood that someone appears to have. [Ugh!]
2. Apostrophes -- the position of an apostrophe depends on whether the noun is singular
or plural. For singular words, add an apostrophe and an "s" to the end, even if the final letter is an
"s." For contractions, replace missing letters with an apostrophe; but remember
that it is where the letters no longer are, which is not always where the words
are joined [e.g., "is not" and "isn't"].
3. Capitalization -- a person’s title is capitalized when it precedes the name and is,
thus, seen as part of the name [e.g., President Zachary Taylor]; once the title
occurs, further references to the person holding the title appear in lowercase
[e.g., the president]. For groups or organizations, the name is capitalized when
it is the full name [e.g., the Department of Justice]; further references should be
written in lowercase [e.g., the department]. Note that, in general, the use of
capital letters should be minimized as much as possible.
4. Colorless verbs and bland adjectives –- passive voice, use of the to be verb, is
a lost opportunity to use a more interesting and accurate verb when you can.
Adjectives can also be used very specifically to add to the sentence. Try to
avoid generic or bland adjectives and be specific. Use adjectives that add to
the meaning of the sentence.
5. Comma splices -- a comma splice is the incorrect use of a comma to connect
two independent clauses (an independent clause is a phrase that is
grammatically and conceptually complete: that is, it can stand on its own as a
sentence). To correct the comma splice, you can: replace the comma with a
period, forming two sentences; replace the comma with a semicolon; or, join
the two clauses with a conjunction such as "and," "because," "but," etc.
6. Compared with vs. compared to -- compare to is to point out or imply
resemblances between objects regarded as essentially of a different order;
compare with is mainly to point out differences between objects regarded as
essentially of the same order [e.g., life has been compared to a journey;
Congress may be compared with the British Parliament].
7. Confusing singular possessive and plural nouns –- singular possessive nouns
always take an apostrophe, with few exceptions, and plural nouns never take
an apostrophe. Omitting an apostrophe or adding one where it does not belong
makes the sentence unclear.
8. Coordinating conjunctions -- words, such as but, and, yet, join grammatically
similar elements (i.e., two nouns, two verbs, two modifiers, two independent
clauses). Be sure that the elements they join are equal in importance and in
structure.
9. Dangling participle -- a participial phrase at the beginning of a sentence must
refer to the grammatical subject of the sentence.
10. Dropped commas around clauses–-place commas around words, phrases, or
clauses that interrupt a sentence. Do not use commas around restrictive
clauses, which provide essential information about the subject of the sentence.
11. The Existential "this" -- always include a referent with "this," such as "this
theory..." or "this approach to understanding the...." With no referent, "this" can
confuse the reader.
12. The Existential "it" -- the "existential it" gives no reference for what "it" is. Be
specific!
13. Its / it's--"its" is the possessive form of "it." "It's" is the contraction of "it is." They
are not interchangeable.
14. Interrupting clause –- this clause or phrase interrupts a sentence, such as,
"however." Place a comma on either side of the interrupting clause.
15. Know your non-restrictive clauses –- this clause or phrase modifies the subject
of the sentence but is not essential to understanding the sentence. The word
“which” is the relative pronoun usually used to introduce the nonrestrictive
clause.
16. Know your restrictive clauses –- this clause limits the meaning of the nouns it
modifies. The restrictive clause introduces information that is essential to
understanding the meaning of the sentence. The word “that” is the relative
pronoun normally used to introduce this clause. Without this clause or phrase,
the meaning of the sentence changes.
17. Lonely quotes –- quotes cannot stand on their own as a sentence. Integrate
them into a sentence.
18. Misuse and abuse of semicolons –- semicolons are used to separate two
related independent clauses or to separate items in a list that contains
commas. Do not abuse semicolons by using them often. They are best used
sparingly.
19. Overuse of unspecific determinates -- words such as "super" [as in super
strong] or "very" [as in very strong] are unspecific determinates. How
many/much is "very"? How big is "super"? If you ask ten people how cold "very
cold" is, you would get ten different answers. Academic writing should be
precise, so eliminate as many unspecific determinates as possible.
20. Sentence fragments –- these occur when a dependent clause is punctuated as
a complete sentence. Dependent clauses must be used together with an
independent clause.
21. Singular words that sound plural -- when using words like "each," "every,"
"everybody," "nobody," or "anybody" in a sentence, we're likely thinking about
more than one person or thing. But all these words are grammatically singular:
they refer to just one person or thing at a time. And unfortunately, if you change
the verb to correct the grammar, you create a pedantic phrase like "he or she"
or "his or her."
22. Split Infinitive -- an infinitive is the form of a verb that begins with "to." Splitting
an infinitive means placing another word or words between the "to" and the
infinitive verb. This is considered bad by purists, but it is nowadays considered
a matter of style and not bad grammar.
23. Subject/pronoun disagreement –- there are two types of subject/pronoun
disagreement. Shifts in number refer to the shifting between singular and plural
in the same sentence. Be consistent. Shifts in person occur when the person
shifts within the sentence from first to second person, from second to third
person, etc.
24. That vs. which -- that clauses (called restrictive) are essential to the meaning of
the sentence; which clauses (called nonrestrictive) merely add additional
information. In general, most nonrestrictive clauses in academic writing are
incorrect or superfluous. While proofreading, go on a "which" hunt and turn
most of them into restrictive clauses!
25. Verb Tense Agreement -- do not switch verbs from present to past or from past
to present without a good reason.
26. Who / whom -- who is used as the subject of the clause it introduces; whom is
used as the object of a preposition, as a direct object, or as an indirect object.
A key to remembering which word to use is to simply substitute who or whom
with a pronoun. If you can substitute he, she, we, or they in the clause, and it
still sounds okay, then you know that who is the correct word to use. If,
however, him, her, us, or them sounds more appropriate, then whom is the
correct choice for the sentence.
Source: http://libguides.usc.edu/writingguide
A2Z
PhD
Thesis
Reflections on Academic Research
Chapter XXVI
WRITING AN ANNOTATED BIBLIOGRAPHY
Definition
In lieu of writing a formal research thesis, your supervisor may ask you to create an
annotated bibliography. You may be assigned this for a number of reasons: to show
that you understand the literature underpinning your research problem, to
demonstrate that you can conduct an effective review of pertinent literature, or to share
sources among your peer-scholars so that, collectively, everyone obtains a
comprehensive understanding of key research on the subject. Think of an annotated
bibliography as a more deliberate, in-depth review of the literature than what is normally
conducted for a research paper.
To learn about your topic: Writing an annotated bibliography is excellent preparation for a
research project. Just collecting sources for a bibliography is useful, but when you have
to write annotations for each source, you're forced to read each source more carefully.
You begin to read more critically instead of just collecting information. At the professional
level, annotated bibliographies allow you to see what has been done in the literature and
where your own research or scholarship can fit.
To help you formulate a thesis: Every
good research paper is an argument. The purpose of research is to state and support a
thesis. So a very important part of research is developing a thesis that is debatable,
interesting, and current. Writing an annotated bibliography can help you gain a good
perspective on what is being said about your topic. By reading and responding to a
variety of sources on a topic, you'll start to see what the issues are, what people are
arguing about, and you'll then be able to develop your own point of view.
The Process
1. Descriptive: This type of annotation describes the source without summarizing the actual
argument, hypothesis, or message in the content. Like an abstract, it describes what the
source addresses, what issues are investigated, and any special features, such as
appendices or bibliographies that supplement the main text. What it does not include is
any evaluation or criticism of the content. This type of annotation seeks to answer the
question: Does this source cover or address the topic one is researching?
Your method for selecting which sources to annotate depends upon the purpose of the
assignment and the research problem you select. For example, if the research problem
is to compare the social factors that led to protests in Egypt with the social factors that
led to protests against the government of the Philippines in the 1980s, you will have to
include non-U.S. and historical sources in your bibliography.
The strategies used to narrow a research topic are the same you can use to define what to include in your bibliography.
These are:
Aspect--choose one lens through which to view your topic, or look at just one facet of your topic
(e.g., rather than writing a bibliography of sources about the role of food in religious rituals, create a
bibliography on the role of food in Hindu ceremonies).
Time--the shorter the time period, the narrower the focus.
Geography--the smaller the area of analysis, the narrower the focus (e.g., rather than cite
sources about trade relations in West Africa, include only sources that examine trade relations between
Niger and Cameroon).
Relationship--review sources that examine how two or more different topics relate to one another
(e.g., cause/effect, compare/contrast).
Type--focus your bibliography on a specific type or class of people or things (e.g.,
research on health care provided to elderly men in Japan).
Source--your bibliography includes specific types of materials (e.g., only books, only scholarly
journal articles, only films, etc.).
Combination--use two or more of the above strategies to focus your bibliography very narrowly or
broaden coverage of a very specific research problem.
Introduction
Your bibliography should include a brief introduction that explains the rationale for selecting the
sources that you did and notes, if appropriate, which sources were excluded and why.
Citation
This first part of your entry contains the bibliographic information written in a standard
documentation style, such as MLA, Chicago, or APA. Be consistent!
Annotation
The second part should summarize, in paragraph form, the material contained in the source.
What you say about the source is dictated by the type of annotation you are asked to write (see
above). In most cases, your annotation should provide critical commentary that evaluates the
source and its usefulness for your topic and for your paper. Things to think about when writing
include: Does the essay offer a good introduction on the issue? Does the source deal with a
particular aspect of the issue? Would novices find the piece accessible or is it intended for an
audience already familiar with the topic? What limitations, if any, does the source have [reading
level, timeliness, reliability, etc.]? What is your overall reaction to the source?
Length
Annotations can vary significantly in length, from a couple of sentences to a couple of pages.
However, they are normally about 150 words. The length will depend on the purpose. If you're
just writing summaries of your sources, the annotations may not be very long. However, if you are
writing an extensive analysis of each source, you'll need to devote more space.
The following example uses the APA format for the journal citation.
Waite, L. J., Goldscheider, F. K., & Witsberger, C. (1986). Nonfamily living and the erosion
of traditional family orientations among young adults. American Sociological
Review, 51(4), 541-554.
The authors, researchers at the Rand Corporation and Brown University, use data from
the National Longitudinal Surveys of Young Women and Young Men to test their
hypothesis that nonfamily living by young adults alters their attitudes, values, plans, and
expectations, moving them away from their belief in traditional sex roles. They find their
hypothesis strongly supported in young females, while the effects were fewer in studies
of young males. Increasing the time away from parents before marrying increased
individualism, self-sufficiency, and changes in attitudes about families. In contrast, an
earlier study by Williams cited below shows no significant gender differences in sex role
attitudes as a result of nonfamily living.
o Read the book, paper, article, etc. carefully and purposefully to ensure that you
understand it correctly.
o Ensure the annotation is precise and stays within about 150 words.
“Some books are to be tasted, others to be swallowed, and some few to be chewed and
digested.”
-- Sir Francis Bacon
Chapter XXVII
RESEARCH RESULTS
Definition
The results section of the research paper is where you report the findings of the
research study based upon the information gathered as a result of the methodology [or
methodologies] applied in the research study. The results section should simply state
the findings in a logical sequence, without bias or interpretation. The
results section should always be written in the past tense. A section describing results is
particularly necessary if the research includes data generated from the study.
In formulating the results section, it is useful to note that the results of a study do not
prove anything. Research results can only confirm or reject the research problem
underpinning the research study. However, articulating the results helps the scholar to
understand the problem from within, to break it into pieces, and to view the research
problem from various perspectives.
The page length of this section is set by the amount and types of data to be reported. Be
concise, using non-textual elements, such as figures and tables, if appropriate, to
present results more effectively. In deciding what data to describe in the results section,
one must clearly distinguish material that would normally be included in a research study
from any raw data or other material that could be included as an appendix. In fact, raw
data should not be included at all unless the supervisor requests it.
Avoid providing data that is not critical to answering the research question. The
background information already described in the Introduction section should provide the
reader with any additional context or explanation needed to understand the results. A
good rule is to always re-read the background section of the study paper after the
scholar has written the results section to ensure that the reader has enough context to
understand the results.
Structure and Writing Style
For most research paper formats, there are two ways of presenting and organizing the
results. The first method is to present the results followed by a short explanation of the
findings. For example, you may have noticed an unusual correlation between two
variables during the analysis of your findings. It is correct to point this out in the results
section. However, speculating why this correlation exists, and offering a hypothesis about
what may be happening, belongs in the discussion section.
The other approach is to present a section and then discuss it, before presenting the
next section and then discussing it, and so on. This is more common in longer papers
because it helps the reader to better understand each finding. In this model, it can be
helpful to provide a brief conclusion in the results section that ties each of the findings
together and links to the discussion. Note that the discussion part of your paper will
generally follow the same structure.
Either place figures, tables, charts, etc. within the text of the result, or include
them in the back of the report--do one or the other but never do both.
In the text, refer to each non-textual element in numbered order [e.g., table 1,
table 2; chart 1, chart 2].
If you place non-textual elements at the end of the report, make sure they are
clearly distinguished from any attached appendix materials, such as raw data.
Regardless of placement, each non-textual element must be numbered
consecutively and complete with caption [caption goes under the figure, table, chart,
etc.]
Each non-textual element must be titled, numbered consecutively, and complete
with a heading [title with description goes above the figure, table, chart, etc.].
In proofreading your results section, be sure that each non-textual element is
sufficiently complete so that it could stand on its own, separate from the text.
When writing the results section, avoid doing the following:
1. Discussing or interpreting your results. Save all this for the next section of the
thesis.
3. Ignore negative results. If some of your results fail to support your hypothesis, do
not ignore them. Document them, and then state in your discussion section why
you believe they emerged from your study. Note that negative results, and how
you handle them, often provide the makings of a great discussion
section, so don't be afraid to highlight them.
4. Include raw data or intermediate calculations. Ask your supervisor if you
need to include any raw data generated by your study, such as transcripts from
interviews or data files. If raw data is to be included, place it in an appendix or set
of appendices.
5. Present the same data or repeat the same information more than once. If you feel
the need to highlight something, you will have a chance to do that in the
discussion section.
6. Confuse figures with tables. Be sure to properly label any non-textual elements in
your paper. If you are not sure, look up the term in a dictionary.
Chapter XXVIII
DEFENDING A THESIS
presentation with flair
Introduction
The thesis defense or viva voce is like an oral examination in some ways. It is different in
many ways, however. The chief difference is that the candidate usually knows more
about the syllabus than do the examiners.
Be ready
You're nearly ready for the final act. After several years of research and the hard work of
writing up your results, you have submitted your magnum opus to your viva committee
and now face the final step. Ready or not, it's time to put yourself and your work under
critical examination in the public spotlight.
A variety of formal procedures and regulations, which vary by institution, dictate how and
where your thesis defense is conducted. Usually, a scholar is interrogated endlessly by a
committee of experts, and there is a small but finite chance the candidate will fail. You
want to perform well and bring your Ph.D. studies to the best possible conclusion. To
ensure a successful thesis defense, you need to do three things: prepare, prepare, and
prepare.
Defense consists of four stages:
Full Coverage
Your presentation (and thesis) needs to address the following:
Time Management
Generally, the whole defense will not take more than two hours, and should take
considerably less time. Part of the challenge of a defense is to convince the committee
that you can summarize the important points of your work in a very limited time.
Keep to your allotted time. If you've been given 20 minutes for your talk, then talk for 20
minutes. Fifteen minutes is even better so that you can allow some time at the end of
your presentation for questions and/or discussion. For many people, the question-and-
answer session is the most nerve-racking part of the presentation. After all, you have no
control over the questions asked, so you can't really prepare the answers. Or can you? A
good exercise is to try to anticipate the questions you may be asked and prepare the
answers in advance.
Meticulous Preparation
So in the week or two before your thesis defense, read your thesis all the way through
with a critical eye and a highlighter in hand to refresh your memory about experimental
details, protocols, results, and your conclusions. Years have passed since you did some
of that work, so it's important to remind yourself of the fine points.
As you read, put yourself in the role of an examiner. What would you ask the author of
this thesis? Where are the trouble spots, the unresolved issues, the shaky conclusions?
If you can predict some of the questions and prepare the answers, you will be in much
better shape during the defense itself. Even if you don't get those questions, the exercise
will give you confidence and reacquaint you with the fundamentals.
As you stand in the spotlight, you may even realize, to your discomfort, that it's been
quite some time since you thought about those thorny issues. Consequently, some
questions will be sincere questions: the examiner asks because he/she doesn't know
and expects that the candidate will be able to rectify this. Scholars often expect
questions to be difficult and attacking, and answer them accordingly. Often the questions
will be much simpler than they expect.
This may be a crucial stage of preparation. This meeting might feel
more like a negotiation than a discussion. At one end of the table is you, the hard-
working Ph.D. scholar who wants to wrap up the work in a reasonable amount of time. At
the other end of the table is your supervisor, possibly hungry for more research results
that can be used in a future presentation or publication. Hopefully, some disagreements
about how much work you need to do can be curtailed by giving your supervisor your
proposed table of contents and your countdown list in advance of the meeting. These
documents will show your progress and demonstrate how much you already have
accomplished.
If there is a major disagreement, try not to get angry. Instead, summarize the issues you
don't agree about and ask for some time to reflect on your supervisor's point of view.
Sometimes these planning discussions can't be finished in a single meeting and may
extend to two or three meetings; it's worth doing properly. You will save yourself quite
a bit of thesis-preparation stress if you can structure a countdown plan on which all
parties can agree.
Rehearsal of Presentation
You've structured your talk and made your slides. Now for the fun part: It's time to
rehearse your presentation out loud. First do a self-presentation (this will feel funny at
first, but it is very effective for putting yourself at ease and for getting used to the sound
of your voice in a quiet room). Then practise your talk in front of a few fellow students or
other trusted colleagues. Use these practice sessions to rehearse the pacing of your talk
and to master the effective use of visual aids. Ask your colleagues for their comments
and honest assessment of your performance at the end of the presentation. Productive
criticism from friends is useful for making improvements, and it's better to hear it from
them.
Be ready for a 'free kick'. It is relatively common that a panel will ask one (or more)
questions that, whatever the actual wording may be, are essentially an invitation to you
to tell them (briefly) what is important, new and good in your thesis. Here you should be
in your comfort zone, so rehearse this. You should be able to produce on
demand (say) a one-minute speech and a five-minute speech, and be prepared to
extend them if invited by further questions. Do not try to recite your abstract: written and
spoken styles should be rather different. Rather, rehearse answers to the questions:
"What is your thesis about, what are the major contributions and what have you done
that merits a PhD?"
Remember that the advice of the German philosopher Goethe, "Do not hurry; do not wait,"
is doubly applicable here.
Anticipate the Questions
Some questions may be just too difficult to answer right away, or you may be caught off-
guard. You may be tempted to bluff your way through, but a better solution is to admit
that you don't know but discuss the issues raised by the query intelligently. Examiners
will recognize the distinction between a candidate who prevaricates and one who makes
a real attempt to address the question, even if there's no complete answer.
Don't be surprised if some examiner asks you to get more specific with a question such
as, "If you had this study to do over again, what would you do differently?" or "Is this a
line of research you care to pursue beyond the dissertation and, if so, how?"
Question-Answer Session
1. Listen to the question carefully. Too often, Ph.D. scholars stop listening
halfway through because they believe they know what the question is about, or
they are so nervous they start preparing the answer in their heads while the
question is still being asked. But sometimes the real question comes only at the
very end of a long exposé (in which the examiner may be trying to show off), and
it may not be the question you anticipate. So it is suggested that you listen
attentively the whole time the examiner is speaking. To help you maintain your
concentration, you might want to take simple notes or jot down key words to
remind you what was said. Just don't let the note-taking distract you from careful
listening.
3. Finally, answer the question. This might seem obvious; but too often the
candidate will make no serious attempt to answer the question properly,
launching instead into a related or unrelated tangent or long-winded explication
that--it is hoped--seems like an answer, but isn't.
Public speaking is an art. Some people are great at it, others less so. The good news is
that many of the necessary skills can be learned. Everyone loves to listen to a great
speaker, so aim to be the kind of speaker whose talks you have enjoyed. During your
presentation, pay attention to your voice, facial expressions, and body language, which
are your most important attributes:
Be conscious of how you use your voice. Note that it's not just what you say that
counts; it's how you say it. Speak clearly and be audible to the last row. Don't
rush. Use a natural pace, but don't be conversational. Speaking in a monotone is
boring and will put people to sleep, so be sure to vary the speed and pitch of your
voice.
Look at the audience throughout your talk. You will create a rapport with the
audience by establishing eye contact with as many people as possible. At the
same time, be aware of your facial expressions. If you look bored, the audience
will be bored. If you are animated and alert, the audience will be interested in
what you have to say.
Be receptive to the audience. Pay attention to the audience's body language and
nonverbal reactions to your remarks. Know when to stop and when to leave out
part of your presentation if you begin to sense that people in the audience are
losing their ability to pay attention.
Conclusion
After all is said and done, a thesis is an elaborate exercise. It is a demonstration that you
are capable of conceptualizing, conducting, and reporting research in a reasonably
independent way. Only a tiny fraction of the theses written are published, and even then,
they require extensive editing. Why? First of all, theses are written by novices. As a
result, they initiate your career as a scholar rather than define it.
"You can tell whether a man is clever by his answers.
You can tell whether a man is wise by his questions."
-- Naguib Mahfouz
Chapter XXIX
READING A RESEARCH PAPER
Introduction
The process of reading research papers effectively is challenging. These papers are
very often written in a very condensed style because of limitations of pages and the
expected audience, which usually knows the area well. Moreover, the reasons for writing
the paper may be different from the reasons the paper has been assigned, meaning you
have to work harder to find the content that interests you. Finally, your time is very
limited, so you may not have time to read every word of the paper or read it several
times to extract all the nuances. For all these reasons, reading a research paper often
requires a special approach as well as skill.
To develop an effective reading style for research papers, it can help to know what you
want to get out of the paper, and where that information is located in the paper. Typically,
the introduction will state not only the motivations behind the work, but also indicate the
solution. Often this may be all that an expert requires from the paper. The body of the paper states the
authors' solution to the problem in detail, and should also describe a detailed evaluation
of the solution in terms of arguments or an empirical evaluation, such as a case study
or an experiment. Finally, the paper concludes with a recapitulation, including a
discussion of the primary contributions. A paper also discusses related work to
some degree. Papers are often repetitive because they present information at different
levels of detail and from different perspectives. As a result, it may be desirable to read
the paper out-of-order or to skip certain sections.
Parameters
Before learning how to read a paper, one needs to begin with a few
preliminaries:
[A] How are papers organized?
In most scientific or other journals, papers almost always follow a standard format. They are
divided into several sections, and each section serves a specific purpose in the paper.
Let us first briefly describe the standard format.
The next section of the paper is the Introduction. As its name implies, this section
presents the background knowledge necessary for the reader to understand why the
findings of the paper are an advance on the knowledge in the field. Typically, the
Introduction describes first the accepted state of knowledge in a specialized field; then it
focuses more specifically on a particular aspect, usually describing a finding or set of
findings that led directly to the work described in the paper. If the authors are testing a
hypothesis, the source of that hypothesis is spelled out, and findings are given. Papers
more descriptive or comparative in nature may begin with an introduction to an area
which interests the authors, or the need for a broader database.
The next section in most papers is the Materials and Methods or Research
Methodology. In some journals this section is the last one. Its purpose is to describe the
materials used in the experiments and the methods by which the experiments were
carried out. In principle, this description should be detailed enough to allow other
researchers to replicate the work. In practice, however, these descriptions are
often highly compressed, and they often refer back to previous papers by the authors.
The third section is usually Results. This section describes the experiments and the
reasons they were done. Generally, the logic of the Results section follows directly from
that of the Introduction. That is, the Introduction poses the questions addressed in the
early part of Results. Beyond this point, the organization of Results differs from one
paper to another. In some papers, the results are presented without extensive
discussion, which is reserved for the following section. This is appropriate when the data
in the early parts do not need to be interpreted extensively to understand why the later
experiments were done. In other papers, results are given, and then they are interpreted,
perhaps taken together with other findings not in the paper, so as to give the logical
basis for later experiments.
The fourth section is the Discussion. This section serves multiple purposes. First, the
data in the paper are interpreted; any limitations to the interpretations should be
acknowledged, and fact should clearly be separated from speculation. Second, the
findings of the paper are related to other findings in the field. This serves to show how
the findings contribute to knowledge, or correct the errors of previous work(s). As stated,
some of these logical arguments are often found in the Results when it is necessary to
clarify why later experiments were carried out.
Papers also contain several Figures and Tables. These contain data described in the
paper. The figures and tables also have legends, whose purpose is to give details of the
particular experiment or experiments shown there.
In most of the journals, the above format is followed. Occasionally, the Results and
Discussion are combined, in cases in which the data need extensive discussion to allow
the reader to follow the train of logic developed in the course of the research. In certain
older papers, the Summary was given at the end of the paper.
The formats for two widely-read scientific journals, Science and Nature, differ markedly
from the above outline. These journals reach a wide audience, and many
authors wish to publish in them; accordingly, the space limitations on the papers are
severe, and the prose is usually highly compressed. In both journals, there are no
discrete sections, except for a short abstract and a reference list. In Science, the
abstract is self-contained; in Nature, the abstract also serves as a brief introduction to
the paper. Experimental details are usually given either in endnotes (for Science) or
Figure and Table legends and a short Methods section (in Nature). Authors often try to
circumvent length limitations by putting as much material as possible in these places. In
addition, a common practice is to put a substantial fraction of the less-important material,
and much of the methodology, into Supplemental Data that can be accessed online at
any later time.
In response to the pressure to edit and make the paper short and concise, most
authors choose to condense or, more typically, omit the logical connections that would
make the flow of the paper easy. In addition, much of the background that would make
the paper accessible to a wider audience is condensed or omitted, so that the less-
informed reader has to consult a review article or previous papers to make sense of
what the issues are and why they are important.
Though it is highly tempting to read the paper straight through as one would with
most texts, it is more efficient to organize the way one reads. Generally, one first
reads the Abstract in order to understand the major points of the work. The extent of
background assumed by different authors, and allowed by the journal, also varies as just
indicated above.
One extremely useful habit in reading a paper is to read the Title and the
Abstract and, before going on, review in one’s mind what one knows about the topic.
This serves several useful purposes. First, it clarifies whether one in fact knows enough
background to appreciate the paper. If not, one might choose to read the background in
a review or textbook, as appropriate.
Second, it refreshes one’s memory about the topic. Third, and possibly most importantly,
it helps one integrate the new information into one’s previous knowledge
about the topic. That is, it is used as a part of the self-education process that any
professional must continue throughout his/her career.
If one is very familiar with the field, the Introduction can be skimmed or skipped. As
stated above, the logical flow of most papers goes straight from the Introduction to
Results; accordingly, the paper should be read in that way as well, skipping Materials
and Methods or Research Methodology and referring back to this section as needed to
clarify what was actually done. A reader familiar with the field who is interested in a
particular point given in the Abstract often skips directly to the relevant section of the
Results, and from there to the Discussion for interpretation of the findings. This is easy
to do if the paper is properly organized.
Many papers contain code phrases such as ‘data not shown’, ‘unpublished data’ and
‘preliminary data’, which carry connotations that are generally not explicit. In
many papers, not all the experimental data are shown, but referred to by "(data not
shown)". This is often for reasons of space; the practice is accepted when the authors
have documented their competence to do the experiments properly (usually in previous
papers). The other two phrases are "unpublished data" and "preliminary data". The
former can either mean that the data are not of publishable quality or that the work is
part of a larger story that will one day be published. The latter means different things to
different people, but one connotation is that the experiment was done only once.
Several difficulties confront the reader, particularly one who is not familiar with the field.
As discussed above, it may be necessary to bring oneself up to speed before beginning a
paper, no matter how well written it is. Although some problems may lie in the reader,
many are the fault of the writer.
One major problem is that many papers are poorly written. Some scientists
are poor writers. Many others do not enjoy writing, and do not take the time or effort to
ensure that the prose is clear and logical. Also, the author is typically so familiar with the
material that it is difficult to step back and see it from the point of view of a reader not
familiar with the topic and for whom the paper is just another of a large stack of papers
that need to be read.
Bad writing has several consequences for the reader. First, the logical connections are
often left out. Instead of saying why an experiment was done, or what ideas were being
tested, the experiment is simply described. Second, papers are often cluttered with a
great deal of jargon. Third, the authors often do not provide a clear road-map through
the paper; side issues and fine points are given equal air time with the main logical
thread, and the reader loses this thread. In better writing, these side issues are relegated
to Figure legends, Materials and Methods, or online Supplemental Material, or else
clearly identified as side issues, so as not to distract the reader.
Another major difficulty arises when the reader seeks to understand just what the
experiment was. All too often, authors refer back to previous papers; these refer in turn
to previous papers in a long chain. Often that chain ends in a paper that describes
several methods, and it is unclear which was used. Or the chain ends in a journal with
severe space limitations, and the description is so compressed as to be unclear. More
often, the descriptions are simply not well-written, so that it is ambiguous what was
done.
Other typical difficulties arise when the authors are uncritical about their experiments; if
they firmly believe a particular model, they may not be open-minded about other
possibilities. These may not be tested experimentally, and may even go unmentioned in
the Discussion. Still another related problem is that many authors do not clearly
distinguish between fact and speculation, especially in the Discussion. This makes it
difficult for the reader to know how well-established the “facts” under discussion are.
One final problem arises from the sociology of science. Many authors are overambitious
and wish to publish in trendy journals. As a consequence, they overstate the
importance of their findings, or put a speculation into the title in a way that makes it
sound like a well-established finding. Another example of this approach is the "Assertive
Sentence Title", which presents a major conclusion of the paper as a declarative
sentence. This trend has become prevalent in recent times. It is not so bad when the
assertive sentence is well-documented; but quite often such an assertive sentence is
nothing more than a speculation, and the hasty reader may well conclude that the issue
is settled when it is not. This practice should be avoided as far as possible.
These last factors represent the public relations side of a competitive field. This behavior
is understandable, if not praiseworthy. But when the authors mislead the reader as to
what is firmly established and what is speculation, it is hard, especially for the novice, to
know what is settled and what is not.
Conclusion
The aim of reading a research paper varies with the needs of the
reader. But it has to be borne in mind, whatever the type of reader, that one has to
be familiar with the standard format of a research paper and has to keep a correct
perspective while reading it. At the end of the day, the author serves the community
provided the reader gets what he/she wants.
“If one cannot enjoy reading a book over and over again, there is no use in reading it at all.”
<< Oscar Wilde
Chapter XXX
Evaluating a
Research Paper
A2Z
PhD
Thesis
Reflections on Academic Research
EVALUATING A RESEARCH PAPER
Introduction
Good research reflects a sincere desire to determine what is overall true, based on all
available information; as opposed to bad research that starts with a conclusion and
identifies supporting factoids (individual facts taken out of context). A good research
document empowers readers to reach their own conclusions by including:
• A well-defined question.
• Description of the context and existing information about an issue.
• Consideration of various perspectives.
• Presentation of evidence, with data and analysis in a format that can be replicated by others.
• Discussion of critical assumptions, contrary findings, and alternative interpretations.
• Cautious conclusions and discussion of their implications.
• Adequate references, including original sources, alternative perspectives, and criticism.
Before asking what questions a paper addresses, one needs to be aware that research in Consumer
Behaviour and Customer Relationship Management can be of several different types:
Descriptive research often takes place in the early stages of our understanding of a
system. We can't formulate hypotheses about how a system works, or what its
interconnections are, until we know what is there. Typical descriptive approaches in
Consumer Behaviour are studies of the behavioural pattern of a consumer and the reasoning behind it.
In Customer Relationship Management, one could regard documenting how the
relationship with a customer is sustained and managed as a descriptive endeavor.
Comparative research often takes place when we are asking how general a finding is.
Is it specific to one particular country, or is it broadly applicable? A typical comparative
approach would be comparing the behavioural pattern of a consumer in respect of a
particular product from one country with that from the other countries in which that
product is found. One example of this is the observation that the responses of a
European consumer and an African consumer to a particular product are similar in
some respects and different in others.
Analytical research generally takes place when we know enough to begin formulating
hypotheses about how a system works, about how the parts are interconnected, and
what the causal connections are. A typical analytical approach would be to devise two
(or more) alternative hypotheses about how a system operates. These hypotheses
would all be consistent with current knowledge about the system. Ideally, the approach
would devise a set of experiments to distinguish among these hypotheses.
Being aware that not all papers have the same approach can orient a reader towards
recognizing the major questions that a paper addresses.
The main conclusions of a paper can often be identified in a preliminary way by studying the abstract of the
paper. Here the authors highlight what they think are the key points. This is not
enough, because abstracts often have severe space constraints, but it can serve as a
starting point. One still needs to read the full paper to pin the conclusions down.
[C] What evidence supports those conclusions?
Generally, one can get a reasonably good idea about this from the Results section. The
description of the findings points to the relevant tables and figures. This is easiest when
there is one primary experiment to support a point. However, it is often the case that
several different experiments or approaches combine to support a particular conclusion.
For example, the first experiment might have several possible interpretations, and the
later ones are designed to distinguish among these.
In the ideal case, the Discussion begins with a section of the form "Three lines of
evidence provide support for the conclusion that... First, ...Second,... etc." However,
difficulties can arise when the paper is poorly written. The author(s) often do not present
a concise summary of this type, leaving the reader to make it himself/herself. It is always
possible to argue that in such cases the logical structure of the argument is weak and is
deliberately omitted. In any case, one needs to be sure that one does understand the
relationship between the data and the conclusions.
One major advantage of doing this is that it helps the reader to evaluate whether the
conclusions are sound. If it is assumed for the moment that the data are believable, it
still might be the case that the data do not actually support the conclusion the authors
wish to reach. There are at least two different ways this can happen:
a) The logical connection between the data and the interpretation is not sound.
b) There may be other interpretations that are consistent with the data.
One important aspect to look for is whether the authors take multiple approaches to
answering a question. Do they have multiple lines of evidence, from different directions,
supporting their conclusions? If there is only one line of evidence, it is more likely that it
could be interpreted in a different way; multiple approaches make the argument more
persuasive.
Another thing to look for is implicit or hidden assumptions used by the authors in
interpreting their data. This can be hard to do unless one understands the field
thoroughly; often only a specialist in the particular field is able to judge.
Judging the quality of the evidence is the hardest nut to crack, for novices and experts alike. At the same time, it is
one of the most important skills to learn as a young
research scholar. It involves a major reorientation from being a relatively passive
consumer of information and ideas to an active producer and critical evaluator of them.
This is not easy and takes years to master. Beginning scientists often wonder, "Who am
I to question these authorities? After all, the paper was published in a top journal, so the
authors must have a high standing, and the work must have received a critical review by
experts." Unfortunately, that's not always the case. In any case, developing one’s ability
to evaluate evidence is one of the hardest and most important aspects of learning.
Here are some steps by which one can evaluate the evidence:
First Step
One has to understand thoroughly the methods used in the experiments. Often these
are described poorly or not at all. The details are often missing, but more importantly the
authors usually assume that the reader has a general knowledge of common methods in
the field. If one lacks this knowledge, one has to make the extra effort to inform
oneself about the basic methodology before one can evaluate the data. Sometimes one
has to trace back the details of the methods if they are important. The increasing
availability of journals on the Web has made this easier by obviating the need to find a
hard-copy issue, e.g. in the library.
Second Step
One needs a reasonable knowledge of the limitations of the methodology.
Every method has limitations, and if the experiments are not done correctly they can't be
interpreted.
Third Step
Here one has to distinguish between what the data show and what the authors say they
show. The latter is really an interpretation on the authors' part, though it is generally not
stated to be an interpretation. Papers usually state something like "the data in Fig. x
show that ...". This is the authors' interpretation of the data. One need not interpret it the
same way. One has to look carefully at the data to ensure that they really do show what
the authors say they do. One can only do this effectively if one understands the methods
and their limitations.
Fourth Step
It is always helpful to look at the original journal, or its electronic counterpart, instead of a
photocopy. Particularly for half-tone figures, the contrast is distorted, usually increased,
by photocopying, so that the data are misrepresented.
Fifth Step
One should ask whether the proper controls are present. Controls tell the reader
that nature is behaving the way we expect it to under the conditions of the experiment. If
the controls are missing, it is harder to be confident that the results really show what is
happening in the experiment. One should try to develop the habit of asking "where are
the controls?" and looking for them.
Do the conclusions make a significant advance in our knowledge? Do they lead to new
insights, or even new research directions? Again, answering these questions requires
a thorough understanding of the field and the critical ability to analyze the intricate issues
presented in the paper.
Some Guidelines
These fields have professional guidance to help maintain quality research. This has
become increasingly important as the Internet makes unfiltered information more easily
available to a general audience. Good research avoids the following warning signs of bad research:
1. Issues are defined in ideological terms. “Straw men” reflecting exaggerated or extreme
perspectives are used to characterize a debate.
2. Research questions are designed to reach a particular conclusion.
3. Alternative perspectives or contrary findings are ignored or suppressed.
4. Data and analysis methods are biased.
5. Conclusions are based on faulty logic.
6. Limitations of analysis are ignored and the implications of results are exaggerated.
7. Key data and analysis details are unavailable for review by others.
8. Researchers are unqualified and unfamiliar with specialized issues.
9. People with differing perspectives are insulted and ridiculed.
10. Citations are primarily from special interest groups or popular media, rather than from
peer reviewed professional and academic organizations.
Conclusion
While research papers contribute to the community in general, a well-judged and
well-balanced evaluation ensures the quality of the paper and enriches its value and
utility.
“The greatest sin is judgment without knowledge”
“One test of the correctness of educational procedure is the happiness of the child.”
<< Maria Montessori
Chapter XXXI
Journal Impact Factor
JOURNAL IMPACT FACTORS
Introduction
It has become mandatory for academic research scholars to publish a minimum number
of research papers in reputed, peer-reviewed national or international journals having a
reasonable Impact Factor [IF], also known as Journal Impact Factor [JIF]. The Journal
Impact Factor comes from the Journal Citation Reports (JCR), a product of Thomson ISI (Institute
for Scientific Information). JCR provides quantitative tools for evaluating journals. The
impact factor is one of these; it is a measure of the frequency with which the "average
article" in a journal has been cited in a given period of time.
The concept behind citation indexing is fundamentally simple. If the
value of information is determined by those who use it, what better way to measure the
quality of a work than by measuring the impact it makes on the community at large?
The widest possible population within the scholarly community (i.e. anyone who uses or
cites the source material) determines the influence or impact of the idea and its
originator on our body of knowledge. Because of its simplicity, one tends to forget that
citation indexing is actually a fairly recent form of information management and retrieval.
There were three factors that led to the development of citation indexing back in the
1950s. With the huge influx of government dollars into research and development
following World War II, the research community naturally began to publicly document its
findings through the accepted channel of published scientific journal literature. The
subsequent burgeoning of the literature created a need for a method of indexing and
retrieval that would be more cost effective and efficient than the then-current model of
human indexing of materials for subject specific indices. While the subtle judgements
made by subject specialists were valuable in giving depth to a subject index, manual
indexing was both more time-consuming and more labor-intensive. Its costs
increased in proportion to the growth of material to be indexed. So the need for a better
way of managing information was the first factor.
The second factor was the growing dissatisfaction with the capacity of subject indexing
to meet the needs of the active researcher. At that time, a subject index could
take an excessively long time to add new material; months could
pass before researchers in one field would learn of published findings in some other field
that had relevance to their own study. Furthermore, there were limitations to the subject
indexing in terms of retrieval. Terminology appropriate to one specific discipline would
not necessarily have meaning to researchers in another, perhaps overlapping, discipline.
At the same time, scientists were recognizing that they had to be aware of, if not
completely familiar with, work in a number of different subject disciplines in order to be
confident that they had properly grounded the research through an appropriate review of
the literature.
Along with this need was the hope that automation might hold the answers, the third and
final factor in the development of citation indexing. Computerization in the 1950s was far
removed from the desktop environment of today, but there was tremendous excitement
over potential benefits to be derived from the application of machines to the generation
and compilation of data. The U.S. government hoped that automation could mitigate or
even eliminate completely the difficulties of manual indexing. A number of projects were
launched by the United States with the intention of investigating these possibilities.
Dr. Eugene Garfield, founder and now Chairman Emeritus of ISI® (now Thomson
Reuters), was deeply involved in the research relating to machine generated indexes in
the mid-1950s and early 1960s. One of his earliest points of involvement was a project
sponsored by the Armed Forces Medical Library (predecessor to the current National
Library of Medicine). The Welch Medical Library Indexing project, as it was called, was
to investigate the role of automation in the organization and retrieval of medical
literature. The hope was that the problems associated with subjective human judgement
in selection of descriptors and indexing terms could be eliminated. By removing the
human element, one might thereby increase the speed with which information was
incorporated into the indexes. It might also increase the cost-effectiveness of the
indexes. Garfield grasped early on that review articles in the journal literature were
heavily reliant on the bibliographic citations that referred the reader to the original
published source for the notable idea or concept. By capturing those citations, Garfield
believed, the researcher could immediately get a view of the approach taken by another
scientist to support an idea or methodology based on the sources that the published
writer had consulted and cited as pertinent in the bibliography. As retrieval terms,
citations could function as well as keywords and descriptors that were thoughtfully
assigned by a professional indexer.
In the early 1960s, Eugene Garfield and Associates developed two pilot projects that
would test the viability and efficiency of citation indexing. The first project involved the
creation of a database that would index the citations of 5,000 chemical patents held by
two private pharmaceutical companies. The referenced citations in this instance were to
prior patents, the documentation sources that the government patent examiners were
using to support a decision to grant or deny a patent. The connections that the patent
citation index made were then analyzed with two comparable classifications and
indexing systems that were currently being used by the participants. Based on this
investigation and analysis, the project sponsors determined that citation indexing
permitted the retrieval of relevant literature across arbitrary classifications in a way that
subject-oriented indexing could not.
A second pilot project in 1962 involved Garfield's recently incorporated enterprise, the
Institute for Scientific Information (now Thomson Reuters), with the United States
National Institutes of Health in building an index to the published literature on genetics.
This project was far more complex in nature than the patents index. Three databases
were built to cover the literature over 1 year, 5 years and 14 years with a varying number
of source publications indexed in each. While this project was to test the feasibility and
utility of a narrow, discipline-oriented citation index, at completion, it was concluded that
the database with the most broadly based set of source publications formed the most
comprehensive and useful guide to the published literature in the field of genetics. The
database for the single-year term had drawn not just on journals that were primarily
devoted to the field of genetics research but had drawn as well from a large pool of
journals that published genetics papers on a more peripheral or occasional basis.
Additionally, while the automated system required a certain level of effort in
standardizing the entries from a wide variety of published materials, the project
demonstrated the cost-effectiveness of citation indexing as opposed to the expense of
traditional subject indexing processes.
While, at the time of the project's completion, the government sponsors chose not to
subsidize the development of a national citation database, Eugene Garfield was
encouraged to move ahead with the private publication of his multidisciplinary citation
index as the first edition of the Science Citation Index® (SCI®). Available for purchase
since 1963, the SCI then and now represents the most comprehensive citation index to
the scientific journal literature. Today, the Web-based version of that index covers 5,600
journals across more than 150 scientific disciplines.
Garfield's achievement lay in establishing the utility and objectivity of a citation index in
pulling up related papers in published literature that at first glance might not have
seemed pertinent to the researcher's inquiry. Today, it is considered to be one of the
most reliable of resources in tracing the development of an idea across the multitude of
disciplines that are part of our body of scientific knowledge.
The JCR provides quantitative tools for ranking, evaluating, categorizing, and comparing
journals. The impact factor is one of these; it is a measure of the frequency with which
the "average article" in a journal has been cited in a particular year or period. The
annual impact factor is a ratio between citations and recent citable items published.
Thus, the impact factor of a journal is calculated by dividing the number of current-year
citations to the source items published in that journal during the previous two years by
the number of source items published in those two years. In other words, the
journal Impact Factor is the average number of times articles from the journal published
in the past two years have been cited in the JCR year. An Impact Factor of 1.0 means that, on average,
the articles published one or two years ago have been cited once. An Impact Factor of
2.5 means that, on average, the articles published one or two years ago have been cited
two and a half times. Citing articles may be from the same journal; most citing articles
are from different journals. The impact factor is useful in clarifying the significance of
absolute (or total) citation frequencies. It eliminates some of the bias of such counts
which favor large journals over small ones, or frequently issued journals over less
frequently issued ones, and older journals over newer ones. Particularly in the latter
case, older journals have a larger citable body of literature than smaller or younger
journals. All things being equal, the larger the number of previously published articles,
the more often a journal will be cited.
Example [A]:
Impact Factor [IF] is calculated as follows:
No. of times articles published in 2010 & 2011 were cited in indexed Journals during the
year 2012 = A
No. of articles, reviews, or notes published in 2010 & 2011 = B
Impact Factor for 2012 = A / B
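The two-year calculation can be sketched in a few lines of Python. This is only an illustration: the function name impact_factor and the sample figures (450 citations, 300 citable items) are assumptions made for the example, not values taken from the JCR.

```python
def impact_factor(citations: int, citable_items: int) -> float:
    """Return A / B for one JCR year.

    citations     -- A: citations received in the JCR year (e.g. 2012) by
                     items published in the two previous years (2010, 2011)
    citable_items -- B: articles, reviews, or notes published in those two years
    """
    if citable_items <= 0:
        raise ValueError("the journal must have published at least one citable item")
    return citations / citable_items


# Hypothetical journal: 450 citations in 2012 to 300 items from 2010 and 2011.
print(impact_factor(450, 300))  # 1.5
```

An Impact Factor of 1.5 here would mean that the items published in 2010 and 2011 were cited one and a half times each, on average, during 2012.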
Example [B]:
No. of times articles published in 2007, 2008, 2009, 2010 & 2011 [i.e., the last 5 years] were
cited in indexed Journals during the year 2012 = A
No. of articles, reviews, or notes published in those 5 years = B
5-year Impact Factor for 2012 = A / B
[Note: The 5-year Impact Factor is available only in JCR 2007 and subsequent years.]
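The five-year figure differs from the two-year one only in the width of the publication window. A minimal Python sketch, assuming the citation and item counts are held per publication year; the function name windowed_impact_factor and all the numbers are hypothetical:

```python
def windowed_impact_factor(cites_by_pub_year, items_by_pub_year, jcr_year, window=5):
    """Citations in jcr_year to items from the preceding `window` years,
    divided by the number of items published in those years.

    cites_by_pub_year -- {publication year: citations received in jcr_year}
    items_by_pub_year -- {publication year: citable items published}
    """
    years = range(jcr_year - window, jcr_year)  # e.g. 2007..2011 for 2012
    cites = sum(cites_by_pub_year.get(y, 0) for y in years)
    items = sum(items_by_pub_year.get(y, 0) for y in years)
    if items == 0:
        raise ValueError("no citable items in the chosen window")
    return cites / items


# Hypothetical counts for one journal, JCR year 2012.
cites = {2007: 95, 2008: 105, 2009: 120, 2010: 130, 2011: 75}
items = {2007: 60, 2008: 70, 2009: 70, 2010: 80, 2011: 70}

print(windowed_impact_factor(cites, items, 2012, window=5))            # 1.5
print(round(windowed_impact_factor(cites, items, 2012, window=2), 2))  # 1.37
```

With window=2 the same function gives the ordinary two-year Impact Factor, so the two measures can be compared for the same journal.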
The aggregate Impact Factor is a fine-tuned variant of the Impact Factor, meant for a
subject category. It is calculated in the same way as the Impact Factor of a
Journal. But here, the number of citations to all journals in a particular category [for
example, market research] and the number of articles from all these journals in that
category are taken into account. An aggregate Impact Factor of 2.0 means that, on
average, the articles in a particular subject category [i.e., market research] published one
or two years ago have been cited twice.
Example [C]:
No. of times articles published in 2010 & 2011 were cited in indexed Journals in a
particular category [market research] during the year 2012 = A
No. of articles, reviews, or notes published in these Journals for 2 years as above = B
Aggregate Impact Factor for 2012 = A / B
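The category-level calculation can be sketched as follows. The journal names and counts are invented for illustration, and the function name aggregate_impact_factor is an assumption; the point is only that the citations and the items are each summed over the whole category before dividing.

```python
def aggregate_impact_factor(journals):
    """journals maps a journal name to (citations, citable_items), both
    counted over the same two-year window within one subject category."""
    total_cites = sum(cites for cites, _ in journals.values())
    total_items = sum(items for _, items in journals.values())
    if total_items == 0:
        raise ValueError("the category contains no citable items")
    return total_cites / total_items


# Hypothetical 'market research' category with three journals.
category = {
    "Journal X": (300, 120),
    "Journal Y": (150, 100),
    "Journal Z": (50, 30),
}

print(aggregate_impact_factor(category))  # 2.0
```

Note that this is not the mean of the individual Impact Factors (2.5, 1.5 and about 1.67 in this sketch); larger journals weigh more heavily because the totals are pooled.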
Some of the journals listed in the JCR are not citing journals, but are cited-only journals.
This is significant when comparing journals by impact factor because the self-citations
from a cited-only journal are not included in its impact factor calculation. Self-citations
often represent about 13% of the citations that a journal receives. Users can identify
cited-only journals by checking the JCR Citing Journal Listing. Cited-only journals may
be ceased journals, suspended journals, or superseded titles. Any journal that appears
elsewhere in JCR, but not in the Citing Journal Listing, is a cited-only journal.
Furthermore, users can establish analogous impact factors (excluding self-citations) for
the journals they are evaluating using the data given in the Citing Journal Listing.
Example [D]:
Calculation of Impact Factors without self-citations.
Name of the Journal | JCR 2012 Impact Factor [A / D] | Cites in 2012 to 2010 & 2011 Articles (A) | Self-cites in 2012 to 2010 & 2011 Articles (B) | Cites Minus Self-Cites, C = (A - B) | Articles Published 2010 & 2011 (D) | Revised Impact Factor, E = C / D
Journal I | 1.40 | 525 | - | 525 | 375 | 1.40
A comparison of the JCR Impact Factors and the Revised Impact Factors of these six sample
Journals highlights the difference that self-citations can make and shows how the revised IF
fine-tunes the measure. The revised values alone are to be considered when self-citations are excluded.
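The revision in Example [D] is simple arithmetic and can be sketched in Python. The Journal I figures come from the table; treating its missing self-cite entry as zero is an assumption, and "Journal H" is a purely hypothetical journal added to show a case where the revision changes the result.

```python
def revised_impact_factor(cites, self_cites, articles):
    """E = (A - B) / D: the Impact Factor with self-citations removed."""
    if articles <= 0:
        raise ValueError("articles must be positive")
    return (cites - self_cites) / articles


# Journal I from the table: A = 525, B = '-' (assumed to mean 0), D = 375.
print(round(revised_impact_factor(525, 0, 375), 2))   # 1.4

# Hypothetical Journal H: 400 cites, 52 of them self-cites (13%), 200 articles.
print(round(400 / 200, 2))                            # 2.0  (JCR Impact Factor, A / D)
print(round(revised_impact_factor(400, 52, 200), 2))  # 1.74 (Revised Impact Factor)
```

For a journal with no recorded self-citations the two figures coincide, as for Journal I; otherwise the revised value is always the lower of the two.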
Title Change
A user's knowledge of the content and history of the journal studied is very important for
appropriate interpretation of impact factors. Situations such as those mentioned above
and others such as title change are very important, and often misunderstood.
A title change affects the impact factor for two years after the change is made. The old
and new titles are not unified unless the titles are in the same position alphabetically. In
the first year after the title change, the impact factor is not available for the new title unless the
data for old and new can be unified. In the second year, the impact factor is split. The
new title may rank lower than expected and the old title may rank higher than expected
because only one year of source data is included in its calculation. Title changes for the
current year and the previous year are listed in the JCR guide.
R = Unified Impact Factor = P / Q
R1 = P1 / Q1 = JCR Factor for the New Title
R2 = P2 / Q2 = JCR Factor for the Superseded Title
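The symbols P and Q are not defined in the text as it stands. The sketch below therefore rests on an assumption: P1 and Q1 are the citations and articles counted under the new title, P2 and Q2 those counted under the superseded title, and the unified figures are the sums P = P1 + P2 and Q = Q1 + Q2. All numbers are hypothetical.

```python
def title_change_factors(p1, q1, p2, q2):
    """Return (R, R1, R2) for a journal that changed its title.

    ASSUMPTION: p1, q1 = citations and articles under the new title;
    p2, q2 = citations and articles under the superseded title;
    the unified factor pools both: R = (p1 + p2) / (q1 + q2).
    """
    if q1 <= 0 or q2 <= 0:
        raise ValueError("each title needs at least one article")
    unified = (p1 + p2) / (q1 + q2)
    return unified, p1 / q1, p2 / q2


# Hypothetical split: 60 cites to 50 new-title articles,
# 180 cites to 100 old-title articles.
r, r1, r2 = title_change_factors(60, 50, 180, 100)
print(round(r, 2), round(r1, 2), round(r2, 2))  # 1.6 1.2 1.8
```

In this sketch the new title ranks lower (1.2) and the old title higher (1.8) than the unified figure (1.6), which is the pattern the preceding paragraph describes for the second year after a change.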
There have been many innovative applications of journal impact factors. The most
common involve market research for publishers and others. But, primarily, JCR provides
librarians, academic research scholars as well as professional researchers with a tool for
the management of library journal collections. In market research, the impact factor
provides quantitative evidence for editors and publishers for positioning their journals in
relation to the competition—especially others in the same subject category, in a vertical
rather than a horizontal or intradisciplinary comparison. JCR data may also serve
advertisers interested in evaluating the potential of a specific journal.
Perhaps the most important and recent use of impact is in the process of academic
evaluation. The impact factor can be used to provide a gross approximation of the
prestige of journals in which individuals have been published. This is best done in
conjunction with other considerations such as peer review, productivity, and subject
specialty citation rates. As a tool for management of library journal collections, the
impact factor supplies the library administrator with information about journals already in
the collection and journals under consideration for acquisition. These data must also be
combined with cost and circulation data to make rational decisions about purchases of
journals.
The impact factor can be useful in all of these applications, provided the data are used
sensibly. It is important to note that subjective methods can be used in evaluating
journals as, for example, by interviews or questionnaires. In general, there is good
agreement on the relative value of journals in the appropriate categories. However,
the JCR makes possible the realization that many journals do not fit easily into
established categories. Often, the only differentiation possible between two or three
small journals of average impact is price or subjective judgments such as peer review.
Conclusion
Though the impact factor is found to be a very useful tool by the academic fraternity for
evaluation of journals, it must be used very cautiously and discreetly. Considerations
include the amount of review or other types of material published in a journal, variations
between disciplines, and item-by-item impact. For junior research scholars, however, the
Impact Factor of a Journal may be very useful in their research study.
http://www.sciencegateway.org/impact/
http://admin-apps.webofknowledge.com/JCR/help/h_impfact.htm
http://thomsonreuters.com/products_services/science/free/essays/impact_factor/
Faith is the first factor in a life devoted to service. Without it, nothing is possible.
With it, nothing is impossible. - Martin Luther
Chapter XXXII
Publishing a Research Paper
PUBLISHING A RESEARCH PAPER
Peer-reviewed journals are an important medium for reporting research output from
universities. Peer review can be defined as:
The process by which a learned journal passes a paper received for publication to outside
experts for their comments on its suitability and worth.
Looking at the impact factor of a journal is a further way of measuring its quality.
The journal impact factor is the average number of times that articles published in a
specific journal in the two previous years (e.g. 2010-2011) were cited in a particular year
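(i.e. 2012). The calculation is as follows:
Impact factor (2012) = citations received in 2012 by articles published in 2010-2011,
divided by the number of articles published in 2010-2011.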
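The two-year calculation can be sketched in a few lines of Python. This is only an illustration: the function name and the figures (450 citations, 300 articles) are hypothetical and are not taken from any real journal.

```python
def impact_factor(citations_in_year: int, items_prev_two_years: int) -> float:
    """Two-year impact factor.

    citations_in_year: citations received in the target year (e.g. 2012)
        by articles published in the two previous years (e.g. 2010-2011).
    items_prev_two_years: number of articles published in those two years.
    """
    if items_prev_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_in_year / items_prev_two_years


# Hypothetical journal: 450 citations in 2012 to 300 articles from 2010-2011.
if_2012 = impact_factor(citations_in_year=450, items_prev_two_years=300)
print(f"Impact factor for 2012: {if_2012:.2f}")
```

With these assumed figures, the average article from 2010-2011 was cited one and a half times in 2012.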
There are variations between disciplines. One should view journals in the context of their
specific field. Some disciplines work on a five-year impact factor.
Journal impact factors should not be used solely to evaluate journals. Other criteria
should also be considered, such as peer review and scope.
Obtaining Journal Impact Factors
Scholars can consult ISI Journal Citation Reports on the Web (JCR Web) to find
the Impact Factor for a single journal title or a range of titles in a subject category. JCR
Web draws citation data from over 7,000 scholarly journals worldwide in the sciences
and social sciences.
JCR Web also depicts the Impact Factors of a journal over the last five years in
the Impact Factor Trend Graph. Another feature is the Immediacy Index which measures
how quickly the average article from a journal is cited within the year of publication. This
number is useful for evaluating journals that publish cutting-edge research.
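The Immediacy Index follows the same pattern as the impact factor but looks only at the year of publication. A minimal sketch, assuming the index is the number of citations received in a year by articles published in that same year, divided by the number of articles published that year; the function name and figures are hypothetical.

```python
def immediacy_index(citations_same_year: int, items_published: int) -> float:
    """Citations received in a year by articles published in that same year,
    divided by the number of articles the journal published that year."""
    if items_published <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_same_year / items_published


# Hypothetical journal: 240 articles published in 2012, cited 60 times in 2012.
index_2012 = immediacy_index(citations_same_year=60, items_published=240)
print(f"Immediacy Index for 2012: {index_2012:.2f}")
```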
JCR Web is available via Web of Knowledge and is linked to Web of Science searches.
Once likely journals have been identified, the aims and scope of each should be
checked to determine whether the paper at hand is suitable for that journal.
If your work meets the aims and scope of the journal you have selected, submit your
manuscript in the appropriate format for the journal. This format is usually indicated
under "instructions to authors".
Instructions to authors
Instructions to authors (also called advice to authors or authors' guide) detail what is and
isn't acceptable to a particular publisher. Generally these guidelines include layout,
referencing style, how to submit, submitting tables and figures in text, the audience,
review process and publication.
by writing to the editor or publisher of the journal. Email addresses and websites can be
found in Ulrich's Periodicals Directory.
Experienced journal editors and authors are willing to pass on their secrets of success.
Here is their best advice.
Write clearly
"There is no substitute for a good idea, for excellent research or for good, clean, clear
writing," says Nora S. Newcombe, PhD, of Temple University, former editor of
APA's Journal of Experimental Psychology: General. Newcombe endorses the advice of
Cornell University's Daryl J. Bem, PhD, who in Psychological Bulletin (Vol. 118, No. 2)
wrote that a review article should tell "a straightforward tale of a circumscribed question
in want of an answer. It is not a novel with subplots and flashbacks, but a short story with
a single, linear narrative line. Let this line stand out in bold relief." Newcombe also
admits that neatness counts. Though she tries not to get in a "bad mood" about grammar
mistakes or gross violations of APA style, she says, such mistakes do "give the
impression that you're not so careful."
Get a pre-review
Don't send the manuscript to an editor until you have it reviewed with a fresh eye, warns
Newcombe. Recruit two objective colleagues: one who is familiar with the research area,
another who knows little or nothing about it. The former can provide technical advice,
while the latter can determine whether your ideas are being communicated clearly.
a small experiment that I know would never get published in that journal, but I would like
to get some feedback." Not a good idea, Newcombe says, because it wastes editors'
and reviewers' time, and those who reject it from the journal may also be the ones who
have to review the paper when it's submitted to a different journal. "It's a small
community out there. Don't use up your reviewers," she says.
Be calm
The overwhelming majority of journal manuscripts are rejected at first. "Remember,
to get a lot of publications, you also will need to get lots of rejections," says Edward
Diener, PhD, editor of APA's Journal of Personality and Social Psychology: Personality
Processes and Individual Differences. Only a small proportion--5 to 10 percent--are
accepted the first time they are submitted, and usually they are only accepted subject to
revision. Since most papers are rejected from the start, says Newcombe, the key is
whether the journal editors invite you to revise it.
Some reviewers may recommend submitting your work to a different journal. "They're
not saying the article is hopeless," says Neal-Barnett, "they're just saying that it may not
be right for that journal." If revision isn't invited following the initial rejection, many new
authors may toss the manuscript and vow never to write again or to change programs.
Newcombe's advice, though, is to read the reviews carefully and determine why that
decision was made. If the research needs more studies or if the methodology needs to
be changed somehow, "if you have a sincere interest in the area, do these things," says
Newcombe. You can resubmit it as a new paper, noting the differences in the cover
letter. Also keep in mind that "quite often, unfortunately, a journal will reject an article
because it's novel or new for its time," says Newcombe. "If you feel that it is valid and
good, then by all means, send it off to another journal."
Do the revisions
If you are invited to revise, "Do it, do it fast and don't procrastinate," says Newcombe.
Also, she warns that because reviewers can at times ask for too much, authors should
take each suggestion into consideration, but decide themselves which to implement.
Be diplomatic
What if reviewers disagree? "There is a wrong and a right way" to address dissension
among reviewers, says Newcombe. She quotes from Daryl Bem's Psychological
Bulletin article:
Wrong: "I have left the section on the animal studies unchanged. If reviewers A and C
can't even agree on what the animals have developed, I must be doing something right."
Right: "You will recall that reviewer A thought the animal studies should be described
more fully whereas reviewer C thought they should be omitted. Other psychologists in
my department agree with reviewer C that the animals cannot be a valid analogue to the
human studies. So, I have dropped them from the text and have attached it as a footnote
on page six."
Ultimately, it's good to keep in mind that the road to being published isn't a lonely one:
"All authors get lots of rejections--including senior authors such as me," says Diener.
"The challenge," he says, "is to persevere, and improve one's papers over time."
1. The research paper topic should be unique and there should be a logical reason to study it.
2. Do your homework. Make sure you know what investigators in your field and other fields have
published about your topic (or similar topics). There is no substitute for a good literature review
before jumping into a new project.
3. Take the time to plan your experimental design. As a general rule, more time should be devoted to
planning your study than to actually performing the experiments (though there are some exceptions,
such as time-course studies with lengthy time points). Rushing into the hands-on work without
properly designing the study is a common mistake made by young researchers.
4. When designing your experiment, choose your materials wisely. Look to the literature to see what
others have used. Similar products from different companies do not all work the same way. In fact,
some do not work at all.
5. Get help. If you are performing research techniques for the first time, be sure to consult an
experienced friend or colleague. Rookie mistakes are commonplace in academic research and lead
to wasted time and resources.
6. Know what you want to study, WHY you want to study it, and how your results will contribute to the
current pool of knowledge for the subject.
7. Be able to clearly state a hypothesis before starting your work. Focus your efforts on researching
this hypothesis. All too often people start a project and are taken adrift by new ideas that come along
the way. While ideas are good to note, be sure to keep your focus.
8. Along with keeping focus, know your experimental endpoints. Sometimes data collection goes
smoothly and you want to dig deeper and deeper into the subject. If you want to keep digging
deeper, do it with a follow-up study.
9. Keep in mind where you might like to publish your study. If you are aiming for a high-impact journal,
you may need to do extensive research and data collection. If your goal is to publish in a lower-tier
journal, your research plan may be very different.
10. If your study requires approval by a review board or ethics committee, be sure to get the
documentation as needed. Journals will often require that you provide such information.
11. If your study involves patients or patient samples, explicit permissions are generally required from
the participant or donor, respectively. Journals may ask for copies of the corresponding
documentation.
General
12. Read and follow ALL of the guidelines for manuscript preparation listed for an individual journal.
Most journals have very specific formatting and style guidelines for the text body, abstract, images,
tables, and references.
13. HYPOTHESIS: be sure to have one and state it clearly. This is, after all, why you are doing the
research.
14. Write as though your work is meaningful and important. If you don’t, people will not perceive it as
meaningful and important.
15. Use an external peer review service (available through JournalPrep.com) to get your manuscript
reviewed prior to submission. Rapid and expert peer reviews, before you submit, may significantly
increase your odds of getting your manuscript accepted for publication.
16. Critique your own work. Look for areas that reviewers might spot as weaknesses and either correct
these areas or comment on them in your manuscript, leaving reviewers with fewer options for
negative criticisms.
17. Always present the study as a finished piece of work (although you may suggest future directions).
Otherwise, you can be sure reviewers will suggest additional research.
18. Be painstaking. Be thorough and patient with several rounds of editing of your work while
considering all the tiny details of the specifications requested by the journal. It will pay off in the end.
19. Focus. If you have a hypothesis to develop, be consistent to the end. Have substantial and
convincing evidence to prove your theories. Brainstorm your ideas and have a definite direction
mapped out before beginning to write an article.
20. Write in a precise and accurate way. Avoid long sentences; the reader may find them difficult to
follow.
21. Team-like spirit is an important attribute that contributes to successful publishing. Welcome advice
from those around you with potentially valuable input. No matter how competent you feel, having
your work seen through a different lens may help to spot flaws that you were unable to identify.
22. As a final step, after completing your research paper, edit, edit, edit. You need to identify and correct
any and all mistakes that you may have made.
23. Short papers are more likely to be read than long ones.
24. Select a descriptive title. Flash and puns are rarely as appealing as they may seem at first. You are
better off going simple and descriptive. This will also help you get cited.
25. Focus on the information the readers require when following your experiment, modeling description,
or data analysis instead of overloading them with details that might have been important during the
study but are irrelevant for them.
26. Your paper should advance a particular line of research. It does not need to answer every remaining
question about the topic.
27. If you present your work at an academic conference prior to submitting it for publication, get
constructive criticisms from as many potential reviewers as possible.
28. Make sure your paper reads well. A bunch of choppy, simple sentences, while grammatically correct,
is unpleasant to read.
30. Non-native English speakers should ALWAYS try to arrange for a review by a native speaker. If you
know someone with excellent proofreading skills and a general knowledge about your research
discipline (e.g. Biological Sciences), ask them to help you out. If you don’t know someone who meets
these criteria, use a professional editing service such as that offered at JournalPrep.com. You will
save yourself from a great deal of frustration and lost time.
31. Show friends and colleagues your work, including those in different fields of research. Get as much
feedback as you can before you submit.
32. The body of the paper supports the central idea and must show a thoughtful, comprehensive study
of the research topic; it should be clearly written and easy to follow. It generally includes three main
parts: 1) Methodology, 2) Results & Data Analysis, and 3) Discussion.
33. When referencing other papers, do not simply reference work in the same way other papers have. If
paper X says that paper Y showed a specific result, check for yourself to ensure that this is true
before saying the same thing in your own manuscript. The number of reputable authors who
misunderstand their colleagues’ findings is shocking.
34. If you are in the process of running a follow-up experiment, write your manuscript in such a way that
it begs for that experiment. When reviewers respond and request it you will already have it
completed.
Introduction
35. Start your article with a comprehensive yet concise literature review of your exact subject and
highlight in which way your paper will make a new contribution to the field.
36. Throughout your introduction use the past tense. One exception to this is when you are speaking
about generally accepted facts and figures (e.g. Heart disease is the leading cause of death…).
37. Avoid using new acronyms. They will simply confuse the readers.
38. The introduction of a research paper is extremely important. It generally presents a brief literature
review, the problem and the purpose of your research work. It should be powerful, simple, realistic,
and logical to entice the reader to read the full paper.
39. Avoid unnecessarily long paragraphs. Break up your paragraphs into smaller, useful units.
41. Do not over-explain common scientific procedures. For example, you do not need to explain how
PCR or Western Blotting work, just that you used the techniques. If you are using a novel technique,
then you need to explain the steps involved.
42. Use the third-person passive voice. For example, “RNA was extracted from the cells.” Compare this
with, “We extracted RNA from the cells.”
43. Be sure to mention from which companies you purchased any significant reagents for your
experiments.
44. When in doubt about how to report your materials and methods, look to papers published in
recognized journals that use similar methods and/or materials.
45. Do not mention sources of typical labware (beakers, stripettes, pipet tips, cell culture flasks, etc).
Results
46. Make sure your graphs and tables can speak for themselves. A lot of people skim over academic
papers.
48. Do not repeat in words everything that your tables and graphs convey. You can, however, point out
key findings and offer some text that complements the findings.
49. Be sure to number your figures and tables according to journal guidelines and refer to them in the
text in the manner specified by the journal.
50. Easy-to-read graphs are essential. Do not overload graphs with data. Make sure axis descriptions
are not too small.
Discussion
51. Your discussion section should answer WHY you obtained the observed results. Do not simply
restate the results. Also address WHY your results are important (i.e. how do they advance the
understanding of the topic).
52. If multiple explanations for your results exist, be sure to address each one. You can favor one
explanation but be sure to mention alternative explanations, if some exist. If you don’t, your
reviewers will.
53. If your research findings are suggestive or supportive rather than decisive then make sure to indicate
so. NEVER overstate the importance of your research findings. Rather, clearly point to their true
significance.
54. Understand the message of your paper. You may discover what the message is only after a
literature search, as is occasionally the case for some manuscript types such as case reports.
55. Highlight how your research contributes to the current knowledge in the field and mention the next
steps or what remains. Feel free to explain why your results falsify current theories if that is the case.
56. Make sure that your discussion is concise and informative. If you ramble and include a great deal of
unnecessary information, your paper will likely get rejected or at least be looked upon less favorably.
57. The importance of the conclusions section should not be overlooked. It includes a brief restatement
of the other parts of the research paper, such as the methodology, data analysis and results, and
concludes the overall discussion. It should be brief, concise, and worth remembering.
58. Reference page: All references used as sources of information in your research paper should be
mentioned to strengthen your paper and also to avoid your work being considered plagiarized.
59. Failure to include every obscure reference to a topic will NOT prevent publication. What WILL prevent
publication is procrastination by insisting on including such references.
60. Use bibliographic software such as EndNote or RefWorks. This will help you format your references
section readily when you make changes throughout your paper after getting suggestions from friends,
colleagues or reviewers.
Abstract
61. In your abstract, limit the amount of background information you provide. Try to give only what is
necessary in a couple of sentences or less.
63. When writing an abstract, always use the past tense since you are giving a summary of what was
done. One exception is if you mention future directions in your concluding statement.
64. Write a clear and concise abstract. The reader has to understand the study rationale, the methods
used, and the study findings. Many researchers will only ever read the abstract of your paper so it
must contain the most pertinent information.
65. Be sure to check journal guidelines for abstract length. Many journals will not accept abstracts longer
than 200-250 words.
66. Feel free to hook readers with a “big picture” statement to open the abstract. Remember, many
action editors will know very little about your topic area and, in some cases, your abstract will be the
only thing that dictates whether or not you get through triage.
Journal Selection
67. The most common mistake is not knowing the body of research in which an article fits.
Choosing the wrong journal invites outright rejection. Even an article that presents
sound and rigorous scholarly work will not survive review at a journal it does not fit.
68. Look at journals that have published articles on your topic previously. This is an encouraging sign
that your work may appeal to the journal editors.
69. Look at journal impact factors. This will give you an idea of the quality of the journal and how difficult
it will be to get your paper accepted.
70. Look at journal acceptance/rejection rates. These are sometimes, but not always, inversely
correlated with impact factor values.
71. Look at average time to publication as well as average time to acceptance/rejection notification. If
you want your work published fast then make sure you choose a journal that offers rapid processing.
Some journals will highlight their rapid processing times as an impetus for authors to submit their
work to those particular journals.
72. Some journals charge fees for manuscript processing or color figure reproduction for accepted
manuscripts. Make sure you are familiar with the costs associated with publication before you submit
your work.
Manuscript Submission
73. Look at papers recently published in your journal of interest. Ask yourself if your paper is of equal or
higher caliber. If not, submit your work to a different journal.
74. Identify the journals related to your field of study and their individual focuses, and then select a
journal with a focus similar to the content of your manuscript. Many journals will clearly describe their
focus and scope on their website.
75. Consider your field of study. Every field of study has several different journals publishing information
pertaining to that field. Knowing the names of those journals narrows your prospective playing field.
76. Select two or three journals with a focus similar to the content of your manuscript. While you are only
going to be published in one, preparing multiple choices keeps you from having to duplicate the
selection process immediately following your possible rejection.
77. Locate the contact information for each journal and any information pertaining to submissions. Make
sure you get the most recent information, as the names of editors and submission policies can
change over time and without warning.
78. Go over your manuscript to ensure it is formatted according to the submission guidelines, paying
special attention to the references/bibliography, text formatting, and citation style.
79. Create your cover letter. This should include the name of the editor to whom you are sending your
work, if available. While you want to be personable, you should avoid being too personal. This is a
business communication, not a letter to your friend. Be sure to keep it professional. Include contact
information for the editor in case he or she should wish to speak with you about your work.
80. Get your cover letter professionally edited. Cover letters are often the first thing that a journal editor
will read. Your letter needs to be strong and impressive, as it can set the tone for the subsequent
review process.
81. Submit your work. This could be done physically or electronically, depending on the submission
guidelines of your selected journal. In the case of electronic submissions, some journals will accept
attachments; others will not. Be sure to send your work in the correct format. If you are sending it
physically, include a self-addressed, stamped envelope, either large enough to return your work in or
just large enough for them to send you a letter.
82. Aim high but not too high. Aiming for top tier journals with research findings that are not
groundbreaking will leave you with a lot of rejections and lost time.
83. Do NOT submit your article to more than one journal at a time. This is unethical and you will
eventually get caught.
84. When uploading text, table and image files electronically, many submission systems will dynamically
assemble your files into a single PDF document for easier handling. Be sure to review your PDF
after it is generated to ensure that it looks correct and that all information has been included.
85. Respect word length. Many journals have specific requirements for word length for different
document types (original articles, short reports, case reports, review papers, etc). If the journal says
the word limit is 6000 then do not send a paper with 6100 words.
86. If a journal allows you to suggest reviewers for your manuscript, do so. This can work to your
advantage. Suggest reviewers who know your field well and who might be interested in the results
presented in your paper.
87. If a journal allows you to suggest reviewers who you do not want to review your paper, take
advantage of this to make sure your work is not sent to someone in your field who may not see eye
to eye with you, your supervisor, your lab, or your research in general.
88. If you definitely do not want your paper reviewed by specific individuals in your field, do not submit a
paper to a journal where these individuals have published recently. Editors often look to people who
have recently published on a similar topic in their journal to serve as reviewers.
89. If you think specific reviewers may look favorably upon your work, look to journals where they have
recently published and submit your work there, if it is within scope. In doing so, be sure to reference
these individuals in your manuscript whenever credit is due. There is nothing that angers peer
reviewers more than reviewing an article in which their own work should be cited and is not.
90. Read the mission statement for the journal to which you will submit your work. If your paper is highly
theoretical and the journal clearly states that it does not publish purely theoretical work, find a new
journal.
91. Email the editor to see if your manuscript topic is appropriate. Most will happily direct you elsewhere
if it is inappropriate for their journal.
92. Look for journals that have issued calls for papers. They are more likely to look upon any work
favorably.
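The word-limit check in tip 85 is easy to automate before submission. The following is a minimal sketch that counts whitespace-separated words; the function names are hypothetical, and a journal's own counting rules (for example, whether references or captions are included) may differ, so its guidelines take precedence.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in the text."""
    return len(text.split())


def within_limit(text: str, limit: int) -> bool:
    """Return True if the text does not exceed the journal's word limit."""
    return word_count(text) <= limit


# Hypothetical usage: check a draft against a 6000-word limit.
draft = "RNA was extracted from the cells and analysed."
print(word_count(draft), within_limit(draft, 6000))
```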
Post-submission
93. When you get initial peer reviews, consider them carefully. In your resubmission cover letter,
respond to each point made by each reviewer. Highlight the points you followed and the ones you
did not (and indicate why).
94. When you are asked to perform additional studies, do them quickly and resubmit your manuscript as
soon as possible.
95. If reviewers suggest changes/additional studies before the article can be published, respond to the
editor indicating that you will address these suggestions so that they know your intentions.
96. Do not respond to reviewer comments in an argumentative tone. Be polite but straightforward. Feel
free to disagree but be sure to have hard evidence to support your claims.
97. If accepted, be sure to carefully check page proofs and do so quickly. A 24-48 hour turnaround
request is typical.
98. In responding to reviewer comments, it is a good idea to copy and paste the reviewers’ comments
verbatim in one color (e.g. black) and add your responses in another color (e.g. blue). You should
also copy and paste any relevant sections from your revised manuscript into your cover letter.
Ideally, a reviewer should be able to tell how adequately you have addressed their comments
without having to read your revised manuscript.
99. Well-organized, well-written response letters can help a manuscript circumvent re-review. The editor
will see the changes that you have made and may accept it outright.
100. Remember to select as many “Key Words” as possible. Many people do key word searches when
performing literature reviews. This will increase the likelihood of your manuscript being read.
Chapter XXXIII
Plagiarism
PLAGIARISM
What is plagiarism?
Plagiarism is the act of taking another person's writing, conversation, song, or even
idea and passing it off as one’s own. This includes information from web pages, books,
songs, television shows, email messages, interviews, articles, artworks or any other
medium. Whenever you paraphrase, summarize, or take words, phrases, or sentences
from another person's work, it is necessary to indicate the source of the
information within your paper using an internal citation. It is not enough to just list the
source in a bibliography at the end of your paper. Failing to properly quote, cite or
acknowledge someone else's words or ideas with an internal citation is plagiarism.
Plagiarism is deception in a literal sense. Plagiarism is copying the work of others and
claiming it as your own. Whether you copy from a published essay, an encyclopedia
article, or a paper from a fraternity's files, you are plagiarizing. If you do so, you run a
terrible risk. You could be punished, suspended, or even expelled. There is also another
kind of plagiarism, known as accidental plagiarism. This happens when a scholar does
not intend to plagiarize, but fails to cite the sources completely and correctly. Careful
notetaking and a clear understanding of the rules for quoting, paraphrasing, and
summarizing sources can help prevent this.
Cite every piece of information that is not a) the result of your own research, or b)
common knowledge. This includes opinions, arguments, and speculations as well
as facts, details, figures, and statistics.
Use quotation marks every time you use the author's words. (For longer quotes,
indenting the whole quotation has the same effect as quotation marks.)
At the beginning of the first sentence in which you quote, paraphrase, or
summarize, make it clear that what comes next is someone else's idea:
o According to Smith...
o Jones says...
o In his 1987 study, Robinson proved...
At the end of the last sentence containing quoted, paraphrased, or summarized
material, insert a parenthetical citation to show where the material came from:
The St. Martin's Handbook defines plagiarism as "the use of someone else's
words or ideas as [the writer's] own without crediting the other person" (Lunsford
and Connors 602).
(Notice the use of brackets to mark a change in the wording of the original.)
Because a paraphrase is supposed to contain all of the author's information and none of
your own commentary, a paraphrase with no citation is an example of plagiarism.
The St. Martin's Handbook defines an appropriate paraphrase as follows:
A paraphrase accurately states all the relevant information from a passage in your own words
and phrasing, without any additional comments or elaborations [it] always restates all the main
points of the passage in the same order and in about the same number of words. (Lunsford and
Connors 596)
Lunsford and Connors go on to give two examples of unacceptable paraphrases: one that uses
the author's words, and one that uses the author's sentence structures (597).
Lunsford and Connors also state that "even for acceptable paraphrases you must include a
citation in your essay identifying the source of the information" (597). This point is crucial: without
the information about the source, an appropriate paraphrase becomes plagiarism.
Even if you have avoided using the author's words, sentence structure, or style, an unattributed
paraphrase is plagiarism because it presents the same information in the same order.
2. Misplaced citations
If you use a paraphrase or direct quotation, it is important to place the reference at the
very end of all the material cited. Any quoted, paraphrased, or summarized material that
comes after the reference is plagiarized: it looks like it is supposed to be your own idea.
This is one reason why accurate notetaking is so important; it is possible to forget which
words are yours and which are the original writer's.
Original source:
Paraphrasing material helps you digest a passage, because chances are you can't restate the
passage in your own words unless you grasp its full meaning. When you incorporate an accurate
paraphrase into your essay, you show your readers that you understand that source. (Lunsford
and Connors 596)
Lunsford and Connors say that paraphrasing is useful because "[p]araphrasing material helps
you digest a passage, because chances are you can't restate the passage in your own words
unless you grasp its full meaning" (596). When you incorporate an accurate paraphrase into your
essay, you show your readers your understanding of that source.
The reader would logically assume that the sentence following the citation is your own
comment on the quotation, when it is actually part of the original quote.
Finally, a point about multiple citations from the same source: cite them all individually. It
is not adequate to give one citation at the end of the paragraph for a bunch of individual
points abstracted from a source.
Parenthetical citations are intended to make citing your sources easy to do; don't be shy
about using them.
Taken from Lunsford and Connors 597-98. Key words and phrases in the original are
in boldface. The changes in wording and sentence structure in the paraphrase are
underlined.
Original
But Frida's outlook was vastly different from that of the Surrealists. Her art was not
the product of a disillusioned European culture searching for an escape from the limits of
logic by plumbing the subconscious. Instead, her fantasy was a product of her
temperament, life, and place; it was a way of coming to terms with reality, not of passing
beyond reality into another realm.
Paraphrase
As Herrera explains, Frida's surrealistic vision was unlike that of the European Surrealists.
While their art grew out of their disenchantment with society and their desire to explore the
subconscious mind as a refuge from rational thinking, Frida's vision was an outgrowth of her own
personality and life experiences in Mexico. She used her surrealistic images to understand better
her actual life, not to create a dreamworld (258).
Here’s the ORIGINAL text, from page 1 of Lizzie Borden: A Case Book of Family and
Crime in the 1890s by Joyce Williams et al.:
The rise of industry, the growth of cities, and the expansion of the population were the three great
developments of late nineteenth century American history. As new, larger, steam-powered
factories became a feature of the American landscape in the East, they transformed farm hands
into industrial laborers, and provided jobs for a rising tide of immigrants. With industry came
urbanization, the growth of large cities (like Fall River, Massachusetts, where the Bordens lived)
which became the centers of production as well as of commerce and trade.
Here’s an UNACCEPTABLE paraphrase that is plagiarism:
The increase of industry, the growth of cities, and the explosion of the population were three large
factors of nineteenth century America. As steam-driven companies became more visible in the
eastern part of the country, they changed farm hands into factory workers and provided jobs for
the large wave of immigrants. With industry came the growth of large cities like Fall River where
the Bordens lived which turned into centers of commerce and trade as well as production.
The preceding passage is considered plagiarism for two reasons:
the writer has only changed around a few words and phrases, or changed the
order of the original’s sentences;
the writer has failed to cite a source for any of the ideas or facts.
NOTE: This paragraph is also problematic because it changes the sense of several sentences
(for example, "steam-driven companies" in sentence two misses the original’s emphasis on
factories).
Here’s an ACCEPTABLE paraphrase:
Fall River, where the Borden family lived, was typical of northeastern industrial cities of the
nineteenth century. Steam-powered production had shifted labor from agriculture to
manufacturing, and as immigrants arrived in the US, they found work in these new factories. As a
result, populations grew, and large urban areas arose. Fall River was one of these manufacturing
and commercial centers (Williams 1).
Here’s an example of quotation and paraphrase used together, which is also ACCEPTABLE:
Fall River, where the Borden family lived, was typical of northeastern industrial cities of the
nineteenth century. As steam-powered production shifted labor from agriculture to manufacturing,
the demand for workers "transformed farm hands into industrial laborers," and created jobs for
immigrants. In turn, growing populations increased the size of urban areas. Fall River was one of
these hubs "which became the centers of production as well as of commerce and trade" (Williams
1).
Note that if the writer had used these phrases or sentences in her own paper without
putting quotation marks around them, she would be PLAGIARIZING. Using another
person’s phrases or sentences without putting quotation marks around them is
considered plagiarism even if the writer cites in her own text the source of the phrases or
sentences she has quoted.
1. Put in quotations everything that comes directly from the text, especially when taking
notes.
2. Paraphrase, but be sure you are not just rearranging or replacing a few words.
Instead, read over what you want to paraphrase carefully; cover up the text with your
hand, or close the text so you can’t see any of it (and so aren’t tempted to use the text
as a “guide”). Write out the idea in your own words without peeking.
3. Check your paraphrase against the original text to be sure you have not accidentally
used the same phrases or words, and that the information is accurate.
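The check in step 3 can be partly mechanized. The following is only a rough sketch, assuming that any run of three consecutive words shared with the original deserves a second look; the function name `shared_phrases` and the three-word threshold are illustrative choices, not part of the guidance above, and a clean result does not prove a paraphrase is acceptable.

```python
import re

def shared_phrases(original: str, paraphrase: str, n: int = 3) -> set[str]:
    """Return the word n-grams that occur in both texts (case-insensitive)."""
    def ngrams(text: str) -> set[str]:
        # Lowercase and keep only word-like tokens so punctuation is ignored.
        words = re.findall(r"[a-z0-9']+", text.lower())
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
    return ngrams(original) & ngrams(paraphrase)

# Phrases taken from the Lizzie Borden example earlier in this section.
original = "they transformed farm hands into industrial laborers"
paraphrase = "they changed farm hands into factory workers"
print(sorted(shared_phrases(original, paraphrase)))
```

Any phrase the sketch reports should either be reworded or placed in quotation marks with a citation.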
Common knowledge: facts that can be found in numerous places and are likely to be
known by a lot of people.
Example: John F. Kennedy was elected President of the United States in 1960.
This is generally known information. You do not need to document this fact.
However, you must document facts that are not generally known and ideas that interpret
facts.
Example: According to the American Family Leave Coalition’s new book, Family Issues
and Congress, President Bush’s relationship with Congress has hindered family leave
legislation (6).
The idea that “Bush’s relationship with Congress has hindered family leave legislation” is
not a fact but an interpretation; consequently, you need to cite your source.
Quotation: using someone’s words. When you quote, place the passage you are using
in quotation marks, and document the source according to a standard documentation
style.
Example: According to Peter S. Pritchard in USA Today, “Public schools need reform but
they’re irreplaceable in teaching all the nation’s young” (14).
Paraphrase: using someone’s ideas, but putting them in your own words. This is
probably the skill you will use most when incorporating sources into your writing.
Although you use your own words to paraphrase, you must still acknowledge the source
of the information.
The original text from Elaine Tyler May's "Myths and Realities of the American Family"
reads as follows:
Because women's wages often continue to reflect the fiction that men earn the family wage,
single mothers rarely earn enough to support themselves and their children adequately. And
because work is still organized around the assumption that mothers stay home with children,
even though few mothers can afford to do so, child-care facilities in the United States remain
woefully inadequate.
Here are some possible uses of this text. As you read through each version, try to
decide if it is a legitimate use of May's text or a plagiarism.
Version A:
Since women's wages often continue to reflect the mistaken notion that men are the main wage
earners in the family, single mothers rarely make enough to support themselves and their children
very well. Also, because work is still based on the assumption that mothers stay home with
children, facilities for child care remain woefully inadequate in the United States.
Plagiarism: In Version A there is too much direct borrowing of sentence structure and
wording. The writer changes some words, drops one phrase, and adds some new
language, but the overall text closely resembles May's. Even with a citation, the writer is
still plagiarizing because the lack of quotation marks indicates that Version A is a
paraphrase, and should thus be in the writer's own language.
Version B:
As Elaine Tyler May points out, "women's wages often continue to reflect the fiction that men earn
the family wage" (588). Thus many single mothers cannot support themselves and their children
adequately. Furthermore, since work is based on the assumption that mothers stay home with
children, facilities for day care in this country are still "woefully inadequate." (May 589).
Plagiarism: The writer now cites May, so we're closer to telling the truth about the
relationship of our text to the source, but this text continues to borrow too much
language.
Version C:
By and large, our economy still operates on the mistaken notion that men are the main
breadwinners in the family. Thus, women continue to earn lower wages than men. This means, in
effect, that many single mothers cannot earn a decent living. Furthermore, adequate day care is
not available in the United States because of the mistaken assumption that mothers remain at
home with their children.
Plagiarism: Version C shows good paraphrasing of wording and sentence structure, but
May's original ideas are not acknowledged. Some of May's points are common
knowledge (women earn less than men, many single mothers live in poverty), but May
uses this common knowledge to make a specific and original point and her original
conception of this idea is not acknowledged.
Version D:
Women today still earn less than men — so much less that many single mothers and their
children live near or below the poverty line. Elaine Tyler May argues that this situation stems in
part from "the fiction that men earn the family wage" (588). May further suggests that the
American workplace still operates on the assumption that mothers with children stay home to
care for them (589).
This assumption, in my opinion, does not have the force it once did. More and more
businesses offer in-house day-care facilities. . . .
No Plagiarism: The writer makes use of the common knowledge in May's work, but
acknowledges May's original conclusion and does not try to pass it off as his or her own.
The quotation is properly cited, as is a later paraphrase of another of May's ideas.
There are some tell-tale signs that a passage, paper, article, etc., has been plagiarized.
These include:
mixed citation styles
no references or quotations
missing references
bibliography entries that have not been cited
strange formatting
anachronisms
datedness
sharp shifts in style
Run papers, passages, etc., that display any of these features through one of the
plagiarism detection services listed here:
[a] http://www.plagiarismchecker.com/
[b] http://bedfordstmartins.com/technotes/techtiparchive/ttip102401.htm
[c] http://www.duplichecker.com/
[d] http://www.dustball.com/cs/plagiarism.checker/
[e] http://teaching.berkeley.edu/bgd/prevent.html
[f] http://www.virtualsalt.com/antiplag.htm
[g] http://www.plagiarism.com
[h] http://www.plagiarism.org
[i] http://www.plagiarisma.net
A2Z
PhD
Thesis
Reflections on Academic Research
Glossary
GLOSSARY
A priori contrasts
A special class of tests used in conjunction with the F test, specifically designed to test the
hypotheses of the experiment or study (in contrast to post hoc or unplanned tests).
ABAB design
Multiple intervention design in which the experimental manipulation occurs at least twice with an
intervening period in which to observe the effect of the withdrawal of the initial manipulation, also
called a reversal design.
Abstract
A brief summary of the research study. [#] Brief summary that appears at the beginning of most
social research reports; can be retrieved by an abstracting service.
Acceptance criterion
The maximum number of defective items that can be found in the sample and still allow
acceptance of the lot.
Acceptance sampling
A statistical procedure in which the number of defective items found in a sample is used to
determine whether a lot should be accepted or rejected.
Accuracy
The degree to which bias is absent from the sample: the underestimates and the overestimates are
balanced among members of the sample (i.e., no systematic variance).
Action research
A methodology with brainstorming followed by sequential trial and error to discover the most
effective solution to a problem; succeeding solutions are tried until the desired results are achieved;
used with complex problems about which little is known.
Active factors
Those independent variables (IV) the researcher can manipulate by causing the subject to
receive one treatment level or another.
Administrative question
A measurement question that identifies the participant, interviewer, interview location (nominal
data)
Aggregate
A group of persons that have certain traits or characteristics in common without necessarily
having any direct social connection with one another. For example, "all female physicians" is an
aggregate; Gross National Income is an aggregation of data about individual incomes.
Aggregate-level data
Based on grouped data using a spatial or temporal unit of analysis.
Alpha level
The significance level. Specifically, alpha is the probability of a Type I error, that is, the
probability of concluding that there is a treatment effect when in reality there is not.
Alpha problem
Difficulty of deciding whether to reject the null hypothesis when a few statistically significant
results are produced by many inferential statistical tests.
Alpha
In tests of statistical significance, the alpha level indicates the probability of committing a Type I
error; in estimates of internal consistency, a reliability coefficient, as in Cronbach's alpha. [#]
Probability of wrongly rejecting a null hypothesis; usually set by researcher before the study (by
consensus .05 unless otherwise indicated).
Analysis of covariance
Statistical procedure for adjusting posttest scores for pretest group differences.
Analysis
The process of synthesizing data to answer the research question
Anonymity
The assurance that no one, including the researchers, will be able to link data to a specific
individual. [#] A research condition in which no one, including the researcher, knows the
identities of research participants.
ANOVA table
A table used to summarize the analysis of variance computations and results. It contains columns
showing the source of variation, the sum of squares, the degrees of freedom, the mean square,
and the F values.
Applied research
Research that addresses existing problems or opportunities. [#] It is research done
for the express purpose of solving an identified problem. [#] Research
undertaken with the intention of applying the results to some specific
problem, such as studying the effects of different methods of law
enforcement on crime rates. One of the biggest differences between
applied and basic research is that in applied work the research questions
are most often determined, not by researchers, but by policy makers or
others who want help. Types of applied research include evaluation research and action
research.
Arbitrary scales
Universal practice of ad hoc scale development used by instrument designers to create scales
that are highly specific to the practice or object being studied.
Archives
Ongoing records kept by institutions of society.
Area chart
A graphical presentation that displays total frequency, group frequency, and time series data; also
called a surface chart.
Area sampling
A cluster sampling technique applied to a population with well-defined political or natural
boundaries: the population is divided into homogeneous clusters from which a single-stage or
multistage sample is drawn.
Argument
Statement that explains, interprets, defends, challenges, or explores meaning.
Artifact correlation
Where distinct subgroups in the data combine to give the impression of one.
Assent
Agreement by an individual not competent to give legally valid informed consent (e.g., a child or
cognitively impaired person) to participate in research.
Assignable cause
Variations in process outputs that are due to factors such as machine tools wearing out, incorrect
machine settings, poor-quality raw materials, operator error and so on. Corrective action should
be taken when assignable causes of output variation are detected.
Association
The process used to recognize and understand patterns in data and then used to understand and
exploit natural patterns.
Asymmetrical relationship
In which we postulate that change in one variable (IV) is responsible for change in another
variable (DV).
Atomistic Fallacy
The fallacy one commits when making inferences about groups or aggregates from individuals
(see Ecological Fallacy).
Attenuation
Effect of measurement error or unreliability in reducing the apparent magnitude of association of
two variables.
Attitude
A learned, stable predisposition to respond to oneself, other persons, objects, or issues in a
consistently favorable or unfavorable way.
Attribute
A specific value of a variable. For instance, the variable sex or gender has two attributes: male
and female.
Attrition
Loss of subjects from a study over time. [#] Loss of study participants during a study. Attrition
can be a threat to the internal validity of a study, and it can change the composition of the study
sample.
Audience
Characteristics and background of the people or groups for whom the secondary source was
created: one of the five factors used to evaluate the value of a secondary source.
Authority
The level of data and the credibility of a source as indicated by the credentials of the author and
publisher: one of five factors used to evaluate the value of a secondary source.
Authority figure
A projective technique (imagination exercise) where participants are asked to imagine that the
brand or product is an authority figure and to describe the attributes of the figure.
Autocorrelation
Correlation in the errors that arises when the error terms at successive points in time are related.
Autonomic system
Portion of human nervous system including two subsystems, the sympathetic and the
parasympathetic, the former of which controls certain bodily responses indicating emotion.
Autonomy
The personal capacity participants should possess in research conditions to consider alternatives,
make choices, and act without undue influence or interference of others.
Average linkage
Method evaluates the distance between two clusters by first finding the geometric center of each
cluster and then computing distance between the two centers.
Backward elimination
Sequentially removing the variable from a regression model that changes R² the least. See >>>
Forward selection and Stepwise selection.
Balanced rating
Has an equal number of categories above and below the midpoint or an equal number of
favorable/unfavorable response choices.
Bar chart
A graphical presentation technique that represents frequency data as horizontal or vertical bars:
vertical bars are most often used for time series and quantitative classifications (histograms,
stacked bar, and multiple-variable charts are specialized bar charts).
Bar code
Technology employing labels containing electronically read vertical bar data codes.
Bar graph
A graphical device for depicting data that have been summarized in a frequency distribution,
relative frequency distribution or percent frequency distribution.
Bayesian statistics
Uses subjective probability estimates based on general experience rather than on data collected
Bell curve
Smoothed histogram or bar graph describing the expected frequency for each value of a variable.
The name comes from the fact that such a distribution often has the shape of a bell.
Beneficence
An ethical principle that imposes an obligation to protect research participants from
harm. The principle of beneficence can be expressed in two general rules: (1) do not harm; and
(2) protect from harm by maximizing possible benefits and minimizing possible risks of harm.
Beta weights
Standardized regression coefficient where the size of the number reflects the level of influence X
exerts on Y.
Between–subjects design
Experimental design in which the contrast between differently treated groups measures the
treatment effect.
Bias
It is a loss of balance and accuracy in the use of research methods. It can creep into research via
the sampling frame, random sampling, or non-response. It can also occur at other stages in
research, such as while interviewing, in the design of questions, or in the way data are analyzed
and presented. Bias means that the research findings will not be representative of, or
generalisable to, a wider population. [#] That part of the deviation of the observed score from the
true value of the construct being measured that is unchanging or tends in one direction (as
distinct from randomly varying error which sums to zero over enough cases)
Biased Sample
A sample that is not representative of the population from which it was drawn
(>>> Representative Sample).
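Binomial experiment
A probability experiment having the following four properties: it consists of n identical trials, two
outcomes (success and failure) are possible on each trial, the probability of success does not
change from trial to trial, and the trials are independent.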
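As a minimal sketch of how probabilities are computed for such an experiment, the function below evaluates the standard binomial formula. The function name and the example values (n = 10 trials, success probability 0.3) are illustrative assumptions, not taken from this glossary.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent, identical trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: probability of exactly 3 successes in 10 trials.
print(binomial_pmf(3, 10, 0.3))
```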
Biographical Research
A narrative approach to research that is primarily qualitative and includes gathering and using data
in the form of diaries, stories and life histories.
Bivariate Analysis
Pertaining to two variables only.
Bivariate association
Association between two variables.
Blind
Technique of avoiding experimenter expectancy by concealing assignments of subject from
researcher or of avoiding demand characteristic by concealing assignment of subject from
subject. When both subject and experimenter are blind to the assignment, the study is called
“double blind”. [#] When participants do not know if they are being exposed to the experimental
treatment.
Blocking
The process of using the same or similar experimental units for all treatments. The purpose of
blocking is to remove a source of variation from the error term and hence provide a more
powerful test for a difference in population or treatment means. [#] Dividing subjects into groups
based on a measured independent variable.
Boolean operators
Connecting words such as "and" or "or" that can identify overlapping or non-overlapping sets of
information.
Box plot
A graphical summary of data. A box, drawn from the first to the third quartiles, shows the location
of the middle 50% of the data. Dashed lines, called whiskers, extending from the ends of the box
show the location of data values greater than the third quartile and data values less than the first
quartile.
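The quantities a box plot displays can be computed directly. This is a minimal sketch using Python's standard `statistics` module; the function name is illustrative, and drawing the whiskers all the way to the smallest and largest values is an assumption here, since other conventions stop the whiskers earlier and plot outliers separately.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    # quantiles(..., n=4) returns the three cut points dividing the data into quarters.
    q1, q2, q3 = quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data
low, q1, med, q3, high = five_number_summary(values)
print("box from", q1, "to", q3, "| median", med, "| whiskers to", low, "and", high)
```

Quartile definitions differ slightly between textbooks and software, so values computed by hand may not match exactly.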
Bayes’ theorem
A method used to compute posterior probabilities.
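A minimal sketch of the computation, assuming a set of mutually exclusive hypotheses with known prior probabilities and known likelihoods of the observed evidence under each. The function name and the two-supplier figures are hypothetical.

```python
def posterior_probabilities(priors, likelihoods):
    """Revise prior probabilities in light of evidence.

    priors[i]      -- P(hypothesis i) before the evidence is seen
    likelihoods[i] -- P(evidence | hypothesis i)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(evidence)
    return [j / total for j in joint]

# Hypothetical example: two suppliers provide 65% and 35% of parts,
# with defect rates of 2% and 5%. Given a defective part, who supplied it?
print(posterior_probabilities([0.65, 0.35], [0.02, 0.05]))
```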
Branch
Technique that skips irrelevant questions and directs the interviewee to the next appropriate item.
Branched question
A measurement question sequence determined by the participant’s previous answer(s): the
answer to one question assumes other questions have been asked or answered and directs the
participant to answer specific questions that follow and skip other questions: branched questions
determine question sequencing.
Brand mapping
A projective technique (type of semantic mapping) where participants are presented with different
brands and asked to talk about their perception, usually in relation to several criteria. They may
also be asked to spatially place each brand on one or more semantic maps.
Briefing
A short presentation to a small group, where statistics constitute much of the content.
Buffer question
A neutral measurement question designed chiefly to establish rapport with the participant (usually
nominal data).
Business research
A systematic inquiry that provides information to guide business decisions;
the process of determining, acquiring, analyzing and synthesizing, and
disseminating relevant business data, information, and insights to decision
makers in ways that mobilize the organization to take appropriate action
that, in turn, maximizes organizational performance.
Creativity session
Qualitative technique where an individual activity exercise is followed by a sharing/discussion
session, where participants build on one another’s creative ideas: often used with children: may
be conducted before or during IDIs or group interviews: usually consists of drawing, visual
compilation, or writing exercises.
Call number
Identifying code numbers and letters by which an item can be located in a library.
Callback
Procedure involving repeated attempts to make contact with a targeted participant to ensure that
the targeted participant is reached and motivated to participate in the study.
Canned experimenter
Standardization of experimental procedure by use of tape-recorded instructions.
Case
The entity or thing the hypothesis talks about. [#] Unit of analysis, usually individual subject for
whom measures are collected on each variable.
Categorization
For this scale type, participants put themselves or property indicants in groups or categories;
also, a process for grouping data for any variable into a limited number of categories.
Causal Hypothesis
A statement hypothesizing that the independent variable affects the dependent variable in some
way.
Causal Relationship
A relationship where an independent variable affects a dependent variable in some way. [#] A
cause-effect relationship. For example, when you evaluate whether your treatment or program
causes an outcome to occur, you are examining a causal relationship.
Causal study
Research that attempts to reveal a causal relationship between variables (A produces B or
causes B to occur).
Causal
Pertaining to a cause-effect relationship. [#] Pertaining to the generation of an effect.
Causation situation
Where one variable leads to a specified effect on the other variable.
Cause construct
Your abstract idea or theory of what the cause is in the cause-effect relationship you are
investigating.
Cell
In a cross-tabulation, subgroup of the data created by the value intersection of two (or more)
variables: each cell contains the count of cases as well as the percentage of the joint
classification.
Census
A count of all the elements in a population. [#] Survey of the entire population.
Central index
Database of publications searchable by the references or citations included in the articles.
Central tendency
A measure of location, most commonly the mean, median, and mode. [#] An estimate of the
center of a distribution of values. The most usual measures of central tendency are the mean,
median and mode. [#] In descriptive statistics, the value or score best representing a group of
scores (for example, the mean).
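The three measures named above can be computed with Python's standard `statistics` module. The scores are hypothetical and serve only as an illustration.

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 5, 7]  # hypothetical scores

print("mean:  ", mean(scores))    # arithmetic average
print("median:", median(scores))  # middle value of the sorted scores
print("mode:  ", mode(scores))    # most frequent value
```

Here the mean is pulled above the median by the largest score, which is one reason the three measures are often reported together.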
Centroid
A term used for the multivariate mean scores in MANOVA.
Chebyshev’s theorem
A theorem applying to any data set that can be used to make statements about the proportion of
items that must be within a specified number of standard deviations of the mean.
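As a sketch of what the theorem states: for any data set and any k greater than 1, at least 1 - 1/k² of the items lie within k standard deviations of the mean. The function names and the data below are illustrative assumptions.

```python
from statistics import mean, pstdev

def chebyshev_bound(k: float) -> float:
    """Minimum proportion of items within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the theorem is informative only for k > 1")
    return 1 - 1 / k**2

def proportion_within(data, k: float) -> float:
    """Observed proportion of items within k population standard deviations."""
    m, s = mean(data), pstdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
print("guaranteed at least:", chebyshev_bound(2))
print("actually observed:  ", proportion_within(data, 2))
```

The bound is deliberately conservative: it holds for every distribution, so the observed proportion is usually well above it.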
Checklist
A measurement question that poses numerous alternatives and encourages multiple unordered
responses: see multiple-choice, multiple-response scale.
Children
Persons who have not yet attained the legal age for consent to treatment or procedures involved
in the research, as determined under the applicable law of the jurisdiction in which the research
will be conducted.
Children’s panel
A series of focus group sessions where the same child may participate in up to three groups in
one year, with each experience several months apart.
Claim
A statement, similar to a hypothesis, which is made in response to the research question at
hand, and that is backed up with evidence based on research.
Class midpoint
The point in each class that is halfway between the lower and upper class limits.
Classical method
A method of assigning probabilities that assumes that experimental outcomes are equally likely.
Classification question
A measurement question that provides sociological-demographic variables for use in grouping
participants' answers (nominal, ordinal, interval, or ratio data).
Closed-ended Questions
Survey questions that can only be answered in predetermined ways (for example, a scale of one
to five measuring satisfaction with something).
Cluster analysis
Identifies homogeneous subgroups of study objects or participants and then studies the data by
these subgroups.
Cluster Sample
A probability sample that is determined by randomly selecting clusters of people from
a population and subsequently selecting every person in each cluster for inclusion in the sample.
Cluster
Sample unit consisting of a group of elements, for example, a college or city.
Clustering
A technique that assigns each data record to a group or segment automatically by clustering
algorithms that identify the similar characteristics in the data set and then partition them into
groups.
Cluster sampling
A probability sampling method in which the population is first divided into clusters and then one
or more clusters are selected for sampling. In single-stage cluster sampling, every element in
each selected cluster is sampled; in two-stage cluster sampling, a sample of the elements in each
selected cluster is collected.
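The difference between the single-stage and two-stage variants can be sketched as follows. The function name, the `per_cluster` parameter, the fixed seed, and the cluster contents are all hypothetical choices made for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=0):
    """Randomly select clusters, then take elements from each selected cluster.

    per_cluster=None -> single-stage: every element of each chosen cluster
    per_cluster=m    -> two-stage: a random sample of m elements per chosen cluster
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        elements = clusters[name]
        if per_cluster is None:
            sample.extend(elements)
        else:
            sample.extend(rng.sample(elements, min(per_cluster, len(elements))))
    return chosen, sample

# Hypothetical population: four colleges (clusters) of three students each.
clusters = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9], "D": [10, 11, 12]}
print(cluster_sample(clusters, 2))                 # single-stage
print(cluster_sample(clusters, 2, per_cluster=2))  # two-stage
```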
Collinearity
When two independent variables are highly correlated: causes estimated regression coefficients
to fluctuate widely, making interpretation difficult.
Code book
A written description of the data that describes each variable and indicates where and how it can
be accessed.
Code of ethics
An organization’s codified set of norms or standards of behavior that guide moral choices about
research behavior; effective codes are regulative, protect the public interest, are behavior-
specific, and are enforceable.
Codebook
The coding rules for assigning numbers or other symbols to each variable: a.k.a. coding scheme.
[#] Index that names the variables and specifies their location in the data set.
Coded Data
Refers to a way of recording material at data collection, either manually or on computer, for
analysis. The data are put into groups or categories, such as age groups, and each category is
given a code number. Data are usually coded for convenience, speed, computer storage space
and to permit statistical analysis.
Codes
Numbers given to indicate specific data items as part of the process of preparing quantitative data
for analysis. The Code Book sets out and labels all the codes in use in a particular piece of
research. While this may be a separate document, prepared as part of the process of getting
data ready for analysis, it may also be incorporated into the questionnaire itself or in the computer
analysis process.
Coding
Assigning numbers or other symbols to responses so that they can be tallied and grouped into a
limited number of categories. [#] The process of categorizing qualitative data.
Coefficient alpha
Reliability coefficient of length-adjusted, inter-item or within-test consistency appropriate for tests
with items with three or more answer options (the KR-20 statistic substitutes when items have two
answer options).
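A minimal sketch of the usual computation of coefficient alpha (Cronbach's alpha), assuming the data are arranged with one row per respondent and one column per item. The function name and the scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: list of rows, one per respondent, each holding that person's item scores."""
    k = len(scores[0])                       # number of items
    items = list(zip(*scores))               # regroup the scores by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical two-item test answered by three respondents.
scores = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(scores))
```

Values closer to 1 indicate that the items vary together, which is what the glossary entry means by inter-item consistency.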
Coefficient of determination
A measure of the goodness of fit of the estimated regression equation. It can be interpreted as
the proportion of the variation in the dependent variable y that is explained by the estimated
regression equation. [#] Transformation of r by squaring (r²), which expresses a relation in PRE
terms, that is, the percentage of the variance explained.
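The "proportion of variation explained" reading can be sketched for simple linear regression by comparing the error of the fitted line with the error of using the mean of y alone. The function name and data points are hypothetical.

```python
from statistics import mean

def r_squared(x, y):
    """Coefficient of determination for the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Error left over after fitting the line (SSE) versus total variation around the mean (SST).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
print(r_squared(x, y))
```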
Coefficient of variation
A measure of relative variability for a data set, found by dividing the standard deviation by the
mean and multiplying by 100.
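A minimal sketch of this calculation. Using the sample standard deviation (rather than the population version) is an assumption here, since the definition does not say which is intended; the function name and data are illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Standard deviation expressed as a percentage of the mean."""
    return stdev(data) / mean(data) * 100

waiting_times = [10, 20, 30]  # hypothetical data
print(coefficient_of_variation(waiting_times), "%")
```

Because the result is unit-free, it allows the variability of data sets measured on different scales to be compared.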
Coefficient of determination (r²)
The amount of common variance in X and Y, two variables in regression: the ratio of the line of
best fit’s error over that incurred by using the mean value of Y.
Cognitively Impaired
Having either a psychiatric disorder (e.g., psychosis, neurosis, personality or behavior disorders,
or dementia) or a developmental disorder (e.g., mental retardation) that affects cognitive or
emotional functions to the extent that capacity for judgment and reasoning is significantly
diminished. Capacity for autonomy and voluntary participation is thus impaired. Others, including
people under the influence of or dependent on drugs or alcohol, those suffering from
degenerative diseases affecting the brain, terminally ill patients, and persons with severely
disabling physical handicaps, may also be compromised in their ability to make decisions in their
best interests.
Cohort Study
A specific kind of trend study involving the study of a cohort over time. [#] Type of trend survey
in which fresh samples are drawn and interviewed from the same subpopulation, known as a
cohort and usually defined by birth year, as it ages.
Cohort
A group of people born within a given time frame or experiencing a life event at approximately
the same time.
Command
In the context of an online catalog search, the part of the user’s instructions that tells the
computer the desired action, for example FIND.
Common causes
Normal or natural variations in process outputs that are due purely to chance. No corrective
action is necessary when output variations are due to common causes.
Communication approach
Involving questioning or surveying people (by personal interview, telephone, mail, computer, or
some combination of these) and recording their responses for analysis.
Communication study
The researcher questions the participants and collects their responses by personal means.
Comparative scale
A scale where the participant evaluates an object against a standard using a numerical, graphical,
or verbal scale.
Compensation
Payment or medical care provided to participants injured in research; does not refer to payment
for participation in research.
Compensatory contamination
Problem of control subjects acquiring the experimental treatment through rivalry or diffusion; has
the effect of reducing the difference between experimental and control conditions.
Compensatory program
A program given to only those who need it on the basis of some screening mechanism.
Compensatory rivalry
A social threat to internal validity that occurs when one group knows the program another group
is getting and, because of that, develops a competitive attitude with the other group. Often it is the
comparison group knowing that the program group is receiving a desirable program (e.g., new
computers) that generates the rivalry.
Competence
Used as a legal term to indicate a person’s capacity to act on his or her own behalf; a person’s ability
to understand information presented, to realize the consequences of acting (or not acting) on that
information, and to make a choice.
Complementary inference
This is when the results of two strands of a mixed methods study provide two different but non-
conflicting conclusions or interpretations.
Completion rate
Proportion of the sample that is successfully contacted and interviewed.
Component sorts
A projective technique where participants are presented with flash cards containing
component features and asked to create new combinations.
Compound item
Question that consists of two or more components.
Computer-administered telephone survey
A telephone survey via voice-synthesized computer questions; data are tallied continuously.
Concealment
A technique in an observation study where the observer is shielded from the participant to avoid
error caused by the observer’s presence; this is accomplished by one-way mirrors, hidden cameras,
hidden microphones, etc.
Concept
A bundle of meanings or characteristics associated with certain concrete, unambiguous events,
objects, conditions, or situations.
Concept maps
Two-dimensional graphs of a group’s ideas where ideas that are more similar are located closer
together and those judged less similar are more distant. Concept maps are often used by a
group to develop a conceptual framework for a research project.
Conceptual Framework
This is a consistent and comprehensive theoretical framework emerging from an inductive
integration of previous literature, theories, and other pertinent information. Conceptual framework
is usually the basis for reframing the research questions and for formulating hypotheses or
making informal tentative predictions about the possible outcome of the study.
Conceptual scheme
The interrelationship between concepts and constructs.
Conceptual utilization
Evaluation use in which the research provides background information or clarification but does
not actually guide the policy choices.
Conclusion validity
The degree to which conclusions you reach about relationships in your data are reasonable.
Concordant
When a participant who ranks higher on one ordinal variable also ranks higher on another
variable, the pairs of variables are concordant.
Concurrent Mixed Method Design
This is a multistrand design in which both QUAL and QUAN data are collected and analyzed to
answer a single type of research question (either QUAL or QUAN). The final inferences are
based on both data analysis results. The two types of data are collected independently at the
same time or with a time lag.
Concurrent validity
An operationalization’s ability to distinguish between groups that it should theoretically be able to
distinguish between.
Conditioning factor
Variable that affects the relationship between two other variables and may explain conflicts in
literature reviews.
Confidence coefficients
The confidence level expressed as a decimal value. For example, 0.95 is the confidence
coefficient for a 95% confidence level.
Confidence interval
Technically, 1-alpha. The confidence interval is the probability of correctly concluding that there is
no treatment effect. [#] Range of values around the sample estimate within which we can expect
the population value to fall at some probability level. [#] The confidence associated with an
interval estimate. For example, if an interval estimation procedure provides intervals such that
95% of the intervals formed using the procedure will include the population parameter, the
interval estimate is said to be constructed at the 95% confidence level.
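For the second sense above (a range of values around the sample estimate), a rough sketch of an interval for a population mean is given below. It assumes a normal approximation with a z critical value, which is reasonable only for larger samples; for small samples a t critical value with n-1 degrees of freedom would normally be used instead. The function name and the sample values are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds for the population mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for a 0.95 confidence coefficient.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return mean - z * standard_error, mean + z * standard_error


sample = [52, 48, 50, 49, 51, 53, 47, 50]  # hypothetical measurements
lower, upper = mean_confidence_interval(sample)
print(f"95% interval estimate: {lower:.2f} to {upper:.2f}")
```

Raising the confidence coefficient (for example to 0.99) widens the interval, because a larger share of such intervals must contain the population parameter.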
Confidentiality
A research condition in which no one except the researcher(s) knows the identities of the
research participants. The treatment of information that a participant has disclosed to the
researcher in a relationship of trust and with the expectation that it will not be revealed to others
in ways that violate the original agreement, unless permission is granted by the participant. [#] A
privacy guarantee to retain validity of the research as well as to protect participants. [#] An
assurance made to study participants that identifying information about them acquired through
the study will not be released to anyone outside of the study.
Confirmatory research
Data collection and analysis aimed at testing prior hypotheses.
Confounding Factor
Any factor that might serve as an alternative explanation for a study’s result; confounding factors
include non-randomized samples, selection bias, and any arbitrary differences between people
that are being compared. [#] In a case of spuriousness, the “third” variable, which actually causes
the two variables and makes them appear connected.
Conjoint analysis
Measures complex decision making that requires multiattribute judgments; uses input from
nonmetric independent variables to secure part-worths that represent the importance of each aspect
of the participant’s overall assessment; produces a scale value for each attribute or property.
Consensus scaling
Scale development by a panel of experts evaluating instrument items based on topical relevance
and lack of ambiguity.
Consistency
A property of a point estimator that is present whenever larger sample sizes tend to provide point
estimates closer to the population parameter.
Constant dollars
Monetary expression of costs or benefits adjusted for inflation.
Construct
Something that exists theoretically but is not directly observable. [#] A concept developed
(constructed) for describing relations among phenomena or for other research purposes. [#] A
theoretical definition in which concepts are defined in terms of other concepts. For example,
intelligence cannot be directly observed or measured; it is a construct. [#] A definition specifically
invented to represent an abstract phenomenon for a given research project.
Construct validity
Approach to measurement validity that assesses the extent to which the measure reflects the
intended construct with different methods focusing on the relations among observed variables or
on the fit of observed associations with theory. [#] The degree to which inferences can
legitimately be made from the operationalizations in your study to the theoretical constructs on
which those operationalizations are based.
Constructivism
The belief that you construct your view of the world based on your experiences and perceptions.
Consumer’s risk
The risk of accepting a poor-quality lot. This is a Type II error.
Contamination
Intrasession events that cause doubt that the experimental and control groups differ only on the
studied variable.
Content analysis
A flexible, widely applicable tool for measuring the semantic content of a communication,
including counts, categorizations, associations, interpretations, etc. (e.g., used to study the content of
speeches, newspaper and magazine editorials, focus group and IDI transcripts); contains four
types of items: syntactical, referential, propositional; initial process is done by computer. [#] The
systematic and quantitative study of some form of communication (e.g., speeches, TV programs,
newspaper articles, advertisements, etc.). [#] The analysis of text documents. The analysis can
be quantitative, qualitative, or both. Typically, the major purpose of content analysis is to identify
patterns in text.
Content validity
A check of the operationalization against the relevant content domain for the construct. [#]
Approach to measurement validity that judges the content of the test (for example, an
achievement test) for its adequacy in representing the domain being covered.
Contingency coefficient
A measure of association for nominal, nonparametric variables; used with any size chi-square
table; the upper limit varies with the table’s size; does not provide direction of the association or
reflect causation.
Contingency table
A table used to summarize observed and expected frequencies for a test of independence. [#] A
cross-tabulation table constructed for statistical testing, with the test determining whether the
classification variables are independent.
Contingency table
Cross-tabulation among two or more variables.
Continuous measure
Type of quantitative variable that can take on any value in its possible range, for example, a
person’s height, which can be measured in fractions of an inch or meter.
Contraindicated
Disadvantageous, perhaps dangerous; a treatment that should not be used in certain individuals
or conditions due to risks. For instance, a drug may be contraindicated for pregnant women and
people with high blood pressure. Such individuals should not be involved in the study.
Control
The ability to replicate a scenario and dictate particular outcomes; the ability to exclude, isolate, or
manipulate the influence of a variable in a study; a critical factor in inference from an experiment,
implies that all factors, with the exception of the independent variable (IV), must be held constant
and not confounded with another variable that is not part of the study.
Control dimension
In quota sampling a descriptor used to define the sample’s characteristics (e.g. education,
religion).
Control group
A group of participants that is not exposed to the independent variable being studied but still
generates a measure for the dependent variable. [#] This is a
feature of experimental research, and is there to provide a contrast
to the experimental group through the removal of the independent
variable. The use of a control group may be necessary in order to
measure the validity of a research finding. [#] In experimental
research, a group that, for the sake of comparison, does not
receive the treatment the experimenter is interested in. [#] The
group in an experimental design that receives either no treatment
or a different treatment from the experimental group. This group
can thus be compared to the experimental group. [#] Condition in
which the experimental treatment is withheld to provide a
comparison with the treated group.
Control variable
A variable introduced to help interpret the relationship between variables.
Controlled Variables
Researchers may control some variables in order to allow the research to focus on specific
variables without being distorted by the impact of the excluded variables. A common way to
control a variable is to be selective; e.g., gender is controlled by selecting as respondents only men
or only women; age can be partially controlled by restricting a sample to one age range, rather
than any age. See also Control group.
Controlled vocabulary
Carefully defined subject hierarchies used to search some bibliographic databases. [#]
Set of terms officially designed and recognized by a catalog or file system.
Convenience Sample
A non-probability sample that is determined by selecting participants that are readily accessible
(convenient) to the researcher. [#] Non-probability sample where element selection is based on
ease of accessibility. [#] A non-probabilistic method of sampling whereby elements are selected
on the basis of convenience. [#] Non-probability sampling where researchers use any readily
available individuals as participants.
Convergent Inference
This is when the conclusions or interpretations of two strands of a mixed methods study are
consistent with each other (i.e., agree with each other).
Convergent interviewing
An IDI technique for interviewing a limited number of experts as participants in a sequential
series of IDIs; after each successive interview, the researcher refines the questions, hoping to
converge on the central issues in a topic area; sometimes called convergent and divergent
interviewing.
Convergent validity
The degree to which the operationalization is similar to (converges on) other operationalizations
to which it should be theoretically similar.
Cook's D
A measure of the influence of an observation based on the residual and leverage.
Correlation
The extent to which two or more things are related ("co-related") to one another. This is usually
expressed as a correlation coefficient.
Correlation coefficient
A numerical measure of the strength of the linear association between two variables that takes
values between -1 and +1. Values near +1 indicate a strong positive linear relationship, while
values near -1 indicate a strong negative linear relationship. Values near zero indicate lack of a
linear relationship. [#] A statistical measurement of the degree of correlational
relationship between two variables. Values of correlation coefficients range from –1.00 to +1.00.
A correlation coefficient of 0.00 indicates no relationship between the variables. Correlations
approaching –1.00 or +1.00 indicate strong relationships between the variables.
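The behaviour at the two extremes can be illustrated with a short sketch. It assumes the Pearson product-moment coefficient is the one meant here, and the function name and data are invented for the example.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]      # hypothetical study hours
marks = [2, 4, 6, 8, 10]     # rises in step with hours
errors = [10, 8, 6, 4, 2]    # falls in step with hours

print(pearson_r(hours, marks))   # perfect positive linear relationship
print(pearson_r(hours, errors))  # perfect negative linear relationship
```

Real data rarely reach either extreme; values closer to zero indicate a weaker linear relationship.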
Correlation hypothesis
A statement indicating that variables occur together in some specified manner without implying
that one causes the other.
Correlation matrix
A table of correlations showing all possible relationships among a set of variables. The diagonal of
a correlation matrix (the numbers that go from the upper-left corner to the lower right) always
consists of ones because these are the correlations between each variable and itself (and a
variable is always perfectly correlated with itself). Off-diagonal elements are the correlations of the
variables represented by that row and column in the matrix.
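A small sketch can make the diagonal property concrete. It assumes Pearson correlations and builds the matrix as a nested dictionary; the variable names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return {row_name: {column_name: r}} for every pair of variables."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


data = {
    "hours": [1, 2, 3, 4, 5],
    "score": [2, 4, 5, 4, 5],
    "errors": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```

Each diagonal entry is the correlation of a variable with itself, and the matrix is symmetric because the correlation of A with B equals that of B with A.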
Correlation relationship
Two variables that perform in a synchronized manner. [#] Relationship by which two or more variables change
together, such that systematic changes in one accompany systematic changes in
the other.
Correlation
A single number that describes the degree of relationship between two variables.
Correlational design
Research approach in which the independent variable is measured rather than fixed by an
intervention.
Correlational Relationship
A relationship where two variables are associated (this can be measured in terms of strength and
direction using statistical tests) but not causally related. They vary together in some way, but the
variation of one does not itself cause the variation of the other.
Cost-effectiveness analysis
Approach to efficiency assessment that compares different programs producing the same type of
nonmonetary benefit in terms of their respective monetary costs.
Counterbalancing
Technique for studying all possible orders of multiple interventions in an incomplete within-subjects
design by using different subjects who, taken together, experience all possible orders.
Covariance
A numerical measure of linear association between two variables. Positive values indicate a
positive relationship, while negative values indicate a negative relationship.
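The sign behaviour can be seen in a brief sketch. It assumes the sample covariance (dividing by n-1) is intended, and the function name and data are illustrative.

```python
def sample_covariance(x, y):
    """Return the sample covariance of two equal-length lists (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


price = [1, 2, 3, 4, 5]    # hypothetical values
supply = [2, 4, 5, 4, 5]   # tends to rise with price
demand = [5, 4, 3, 2, 1]   # falls as price rises

print(sample_covariance(price, supply))  # positive value
print(sample_covariance(price, demand))  # negative value
```

Unlike the correlation coefficient, the size of the covariance depends on the units of measurement, so only its sign is directly interpretable.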
Covariates
Variables you adjust for in your study.
Covariation
A state that exists when two things - such as the price and the sales of a commodity - vary
together. Measures of association are designed to capture the degree of covariation.
Covenation association
Level of one variable predicts the level of another.
Cover story
False explanation for the experiment used to distract the subject from guessing the true nature of
the study.
Cramer’s V
A measure of association for nominal, nonparametric variables; used with larger than 2 x 2
chi-square tables; does not provide direction of the association or reflect causation; ranges from zero
to +1.0.
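A sketch of the computation follows, using the commonly given formula V = sqrt(chi-square / (n x (k - 1))), where k is the smaller of the number of rows and columns. The formula is not stated in this glossary, and the function names and tables are assumptions for illustration. The code assumes every row and column total is greater than zero.

```python
import math


def chi_square(table):
    """Return (chi-square statistic, total count) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    """Return Cramer's V, which ranges from 0 (no association) to 1."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


perfect = [[10, 0, 0], [0, 10, 0], [0, 0, 10]]   # hypothetical 3 x 3 table
unrelated = [[10, 10, 10], [20, 20, 20]]          # hypothetical 2 x 3 table
print(cramers_v(perfect))
print(cramers_v(unrelated))
```

The first table, where each row category maps to exactly one column category, gives the upper limit; the second, where the column proportions are identical in every row, gives zero.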
Criterion validity
Approach to measurement validity that correlates the measure to be validated with another called
a criterion, which is accepted as valid.
Criterion-related validity
The validation of a measure based on its relationship to another independent measure as
predicted by your theory of how the measures should behave.
Critical realism
The belief that there is an external reality independent of a person’s thinking (realism) but that we
can never know that reality with perfect accuracy.
Critical value
A value that is compared with the test statistic to determine whether H0 should be rejected. [#]
The dividing point(s) between the region of acceptance and the region of rejection: these values
can be computed in terms of the standardized random variable due to the normal distribution of
sample means. [#] Values derived from the probability distribution of an inferential statistic used
to determine the statistical significance of an observed value of the statistic at any given alpha
level.
Critiquing
Uses the same principles as literature reviews. In a critique of a paper or piece of research you
must analyse method as well as content, and say whether or not you agree with the arguments
and why. If you disagree you need to be able to say what evidence supports your objection. You
also need to identify gaps in the literature.
Cronbach’s Alpha
One specific method of estimating the reliability of a measure. Although not calculated in this
manner, Cronbach’s Alpha can be thought of as analogous to the average of all possible split-half
correlations.
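For orientation, the sketch below uses the usual variance-based formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The formula is not given in this glossary, and the function name and scores are hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """Return Cronbach's alpha.

    items: one list of scores per item, each holding one score per respondent.
    """
    k = len(items)
    sum_item_variances = sum(statistics.variance(item) for item in items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(totals))


# Hypothetical responses from four respondents to three items.
items = [
    [2, 3, 4, 5],
    [3, 3, 4, 5],
    [1, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Items that rise and fall together across respondents push alpha toward 1; identical items give exactly 1.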
Cross-tabulation
Tabular summary of an association in which each individual is assigned to one and only one cell,
representing a combination of the levels of the variables.
Cross-sectional correlation
Association of variables measured at the same time; also synchronous, static, or on-time
correlation.
Crossbreaks
Also called cross-tabulation ("crosstabs") and cross partitions. A way of arranging data about
categorical variables in a matrix so that relations can be more clearly seen. This is not to be
confused with a factorial table, in which two or more variables are related to a third. While not all
researchers make these distinctions in the terms, the concepts are quite distinct.
Cross-level inference
Drawing a causal conclusion at one level of analysis using data from another level.
Cross-sectional data
Data collected at the same or approximately the same point in time.
Cross-sectional
A study that takes place at a single point in time.
Cross-tabulating
A process of analysing data according to one or more key variables. A common example is to
analyse data by the gender of the research subject or respondent, so that you can compare
findings for men with findings for women. Also known as cross-referencing.
Cross-tabulation
A tabular summary of data for two variables. Classes for one variable are represented by the
rows, while classes for the other variable are represented by the columns.
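A minimal sketch of such a table, assuming each observation is recorded as a (row class, column class) pair, might look as follows. The function name and the survey responses are invented.

```python
from collections import Counter


def cross_tabulate(records):
    """Return {row_class: {column_class: count}} for (row, column) pairs."""
    counts = Counter(records)
    row_labels = sorted({row for row, _ in records})
    col_labels = sorted({col for _, col in records})
    # Every observation falls in exactly one cell of the table.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


# Hypothetical responses: (gender, answer to a yes/no question).
responses = [
    ("men", "yes"), ("men", "no"), ("women", "yes"),
    ("women", "yes"), ("men", "yes"), ("women", "no"),
]
table = cross_tabulate(responses)
for row_label, cells in table.items():
    print(row_label, cells)
```

Here the rows are the classes of one variable (gender) and the columns are the classes of the other (answer), so findings for men can be compared with findings for women.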
Cycles
Patterns in time series marked by recurring highs and lows.
Data
Information collected by a researcher. (Data is the plural term; datum the singular). Data are often
thought of as statistical or quantitative, but they may take many other forms as well--such as
transcripts of interviews or videotapes of social interactions. Non-quantitative data such as
transcripts or videotapes are often coded or translated into numbers to make them easier to
analyze.
Data
The facts and figures that are collected, analyzed, and interpreted.
Data Analysis
Subjecting data to a systematic analysis which can range from statistical to textual analysis,
either manually or electronically.
Data audit
The process of reviewing data collection procedures and data to make judgments about the
potential for bias or distortion.
Data base
A collection of data organized for rapid search and retrieval, usually by a computer; often a
consolidation of many records previously stored separately.
Data Consolidation
This means combining qualitative and quantitative data to create new or consolidated variables or
data sets.
Data Conversion/Transformation
Collected quantitative data types are converted into narratives that can be analyzed qualitatively
(i.e., qualitized), and/or qualitative data types are converted into numerical codes that can be
statistically analyzed (i.e., quantitized).
Data Entry
Data may be entered on to computer directly via a keyboard, or by transferring blocks of coded or
raw data into the analysis program. With a manual approach it is usual to set out a matrix on
squared paper, with a square for each data item.
Data Quality
This is the degree to which the collected data (results of measurement or observation) meet the
standards of quality to be considered valid (trustworthy) and reliable (dependable). This term
has been used by Punch (1998) to represent “quality control of data” (p. 257) “in terms of
procedures in the collection of the data, and … in terms of three technical aspects of quality of the
data: reliability, validity, and reactivity” (p. 257). (1) Data/Measurement validity: Do the results of
data collection truly represent the construct or phenomenon that they are expected to capture
(measure or represent)? “How well the data represent the phenomena for which they stand”
(Punch, 1998, p. 258). See also convergent validity and discriminant validity.
(2) Data/Measurement reliability: Do the obtained results of measurement or observation
accurately reflect the magnitude, intensity, or quality of the attribute or phenomenon that is being
measured or observed?
Data set
All the data collected in a particular study.
Data set
A collection of related data items, such as answers given by respondents to all questions on a
survey.
Data
In everyday conversation 'data' and 'information' are often used as meaning much the same
thing, but sometimes they are used differently. In research it is common to refer to the raw
material gathered as 'data', and it is then processed manually or on computer. It becomes
information when it acquires meaning through aggregation or interpretation by the researcher or
automated analysis. Some depict a hierarchy of: data -> information -> knowledge.
Debriefing
After running a study, explaining to a participant what happened and what the study is for,
explaining any deception used in the study, asking for any remaining comments or concerns, and
ensuring that the participant is left with no adverse consequences from the experience. This
sometimes involves providing contact information for groups that can provide support regarding a
difficult issue.
Debriefing
Researcher’s interview with the subject after the experiment to check the subject’s beliefs about
the study and to tell the subject about the purpose of the study.
Deception
The intentional withholding of information from participants, or deception about the study’s
purpose and exact nature, that is deemed necessary by the researcher in order to meet the
study’s goals. Deception should only be used when the researcher feels that participant
knowledge about the study would alter participants’ behavior or responses in the study.
Deception should not cause any adverse consequences to the participants, and participants
should be debriefed after running the study.
Deduction
Drawing of specific assertions from general principles.
Deductive
Top-down reasoning that works from the more general to the more specific.
Deductive inference (in research cycle)
This is a process in which hypotheses or predictions are formed on the basis of (1) a conceptual
framework that is constructed from the literature, (2) the inferences of a previous strand of a
mixed methods study, or (3) an existing theory. See also inference and inference quality.
Deductive logic
(Erzberger & Kelle, 2003) (1) This refers to the application of general rules to specific cases. For
example, from the general rule that all men are mortal, it can be deduced that if Socrates is a
man, then he will be mortal. (2) This refers to a type of reasoning usually applied if a link is drawn
from an already formulated theoretical statement to a statement about observable empirical facts,
a link that can be generalized in the following term: “If A (a theoretical statement) is true, then we
would expect the fact C to happen.”
Degrees of freedom
A parameter of the t distribution. When the t distribution is used in the computation of an interval
estimate of a population mean, the appropriate t distribution has n-1 degrees of freedom, where n
is the size of the simple random sample.
Degrees of freedom
In inferential statistics, amount of free information left in the data after calculating the inferential
statistic.
Delphi method
A qualitative forecasting method that obtains forecasts through group consensus.
Demand Characteristics
A bias that results when participants display characteristics because they are aware that they are
being observed.
Demand characteristics
Cues in the experimental situation that guide a subject’s view of the study.
Demographics
Information about the sample that includes areas such as age, sex, social class, presence of
children, etc.
Dependent variable
The variable that is being predicted or explained. It is denoted by y.
Dependent variable
The presumed effect in a study; so called because it "depends" on another variable. [#] The
variable whose values are predicted by the independent variable, whether
or not caused by it. For example, in a study to see if there is a relationship
between students' drinking of alcoholic beverages and their grade point
averages, the drinking behavior would be the presumed cause
(independent variable); the grade point average would be the effect
(dependent variable). [#] A variable that varies due (at least in part) to the
impact of the independent variable – that is, its value “depends” on the value of the independent
variable. In the variables “sex” and “academic major,” academic major is the dependent variable,
meaning that your major can’t determine whether you are male or female, but your sex might
indirectly lead you to favor one major over another (nationally, men tend to major in engineering,
women in education).
Descriptive association
Observing a relationship between variables without claiming causality.
Descriptive Research
Describes certain characteristics of populations, and identifies and explores relationships
between variables.
Descriptive statistics
Tabular, graphical and numerical methods used to summarize data. [#] Statistics that summarize
a data set, e.g., mean, median, mode, standard deviation. [#] Statistics used to describe the basic
features of the data in a study. [#] Statistics used to characterize a group of observations (for
example, central tendency or variability) or an association of two or more variables.
Descriptive Study
Any study that is not truly experimental (e.g., quasi-experimental studies, correlational studies,
record reviews, case histories, and observational studies).
Descriptive
Characterizing something or some relationship.
Deseasonalize
To remove the seasonal cycle from a time series by a statistical method.
Design
In research, the arrangement of subjects, experimental manipulation, and observation of results.
Desk Study
An umbrella name given to sedentary research, primarily reading and note taking, and thinking.
Detrend
To remove the trend from a time series by a statistical method.
Developmental Research
A variant of applied research in that the research has a problem solving function and leads to
further research on the basis of its own findings
Dialectical position
(Greene & Caracelli, 2003) To think dialectically is to invite the juxtaposition of opposed or
contradictory ideas, to interact with the tensions invoked by these contesting arguments, or to
engage in the play of ideas. The arguments and ideas that are engaged in this dialectic stance
emanate from the assumptions that constitute philosophical paradigms—assumptions about the
social world, social knowledge, and the purpose of science in society.
Diary
A record of events, ideas or feelings in the life of or affecting the diary author. More specifically a
research diary aims to be a record made by a respondent as close as possible to the time of
occurrence of the events/ ideas/feelings, and may be both structured (e.g. a particular format for
recording) and focused (e.g. recording specific items only).
Dichotomous question
A question with two possible responses.
Direct causation
Impact of one variable on another without involving a third variable.
Direct observation
The process of observing a phenomenon to gather information about it. This process is
distinguished from participant observation in that a direct observer doesn’t typically try to
become a participant in the context and does strive to be as unobtrusive as possible so as not to
bias the observations.
Discount rate
Interest rate used to adjust future monetary benefits for the rate of gain that could be expected from
some alternative use of the same funds.
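The adjustment is usually made with the standard present-value formula, PV = FV / (1 + r)^t, which this glossary does not spell out. The sketch below applies it with a hypothetical benefit, rate, and time horizon; the function name is illustrative.

```python
def present_value(future_amount, discount_rate, years):
    """Return the value today of an amount received after the given number of years."""
    return future_amount / (1 + discount_rate) ** years


# Hypothetical benefit of 110 received one year from now, discounted at 10%.
print(round(present_value(110, 0.10, 1), 2))

# The same benefit received five years from now is worth less today.
print(round(present_value(110, 0.10, 5), 2))
```

A higher discount rate or a longer delay reduces the present value of the same future benefit.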
Discrete measure
Type of quantitative variable that can take on only certain values between which there are gaps;
for example, counting the number of students enrolled in a class, for which fractions would be
nonsensical.
Discriminant validity
The degree to which concepts that should not be related theoretically are, in fact, not interrelated
in reality.
Dispersion
The spread of the values around the central tendency. The two common measures of dispersion
are the range and the standard deviation.
Dissemination
The mechanisms by which the results of research are communicated to stakeholders and other
interested parties
Distribution
The manner in which a variable takes different values in your data.
Distribution-free methods
Another name for non-parametric statistical methods that indicates the lack of assumptions about
the population probability distribution.
Divergent inference
(Erzberger & Kelle, 2003) This is when the inferences made on the basis of the two strands of a
mixed methods study are inconsistent or dissonant (Rossman & Wilson, 1985); that is, they do
not agree with each other. Inconsistencies between qualitative and quantitative findings might be
a consequence of the inadequacy of the applied theoretical concepts. It might, therefore, be
necessary to revise and modify the initial theoretical assumptions and to draw on further
theoretical concepts that have not yet been related to the domain in question.
Dot plot
A simple graphical summary of data with each observation represented by a dot placed above a
horizontal axis that shows the range of values for the observations.
Double entry
An automated method for checking data-entry accuracy in which you enter data once and then
enter it a second time, with the software automatically stopping each time a discrepancy is
detected until the data enterer resolves the discrepancy. This procedure assures extremely high
rates of data-entry accuracy, although it requires twice as long for data entry.
Double-Blind Design
An experiment in which neither the participants nor the research staff who interact with them
know the membership of the experimental or control groups. Also known as Double-Masked
Design.
Dummy variable
A variable used to model the effect of qualitative independent variables. A dummy variable may
take only the value zero or one. [#] A variable that uses discrete numbers, usually 0 and 1, to
represent different groups in your study in the equations of the GLM.
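A minimal sketch of dummy coding, assuming a hypothetical helper dummy_code and made-up group labels: each non-reference category becomes its own 0/1 variable, and the reference group is the one coded 0 on all of them.

```python
def dummy_code(values, reference):
    """Turn a qualitative variable into 0/1 indicator columns (reference group omitted)."""
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical group membership for four participants.
groups = ["control", "treatment", "control", "treatment"]
coded = dummy_code(groups, reference="control")
print(coded)
```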
Durbin-Watson test
A test to determine whether first-order autocorrelation is present.
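The statistic behind the test can be sketched as below: the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The function name and the residuals are illustrative assumptions; a full test also compares the statistic with tabulated critical bounds, which this sketch omits.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic computed from regression residuals in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that alternate in sign (negative autocorrelation).
alternating = [1, -1, 1, -1]
print(durbin_watson(alternating))
```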
Ecological fallacy
False interpretation of aggregate level data in individual-level terms.
Ecological fallacy
Faulty reasoning that results from making conclusions about individuals based only on analyses
of group data.
Ecological Fallacy
The fallacy one commits when making inferences about individuals from information about groups
or aggregates.
Ecological transferability
This refers to generalizability or applicability of inferences obtained in a study to other settings or
contexts. Subsumes the QUAN terms ecological validity and ecological external validity, and the
QUAL term transferability. See inference transferability.
Ecological validity
Extent to which a research situation represents the natural social environment.
Editing Data
The process of going over the data and ensuring that they are complete and acceptable for data
analysis.
Effect construct
Your abstract idea or theory of what the outcome is in a cause-effect relationship you are
investigating.
Effect size
This refers to the intensity, magnitude, or practical significance of an obtained result (e.g.,
relationship, difference) in the QUAL or QUAN strands of a mixed methods
study. Onwuegbuzie and Teddlie (2003) explicitly relate this historically QUAN term to QUAL
research, naming several new terms, including manifest effect size, frequency (manifest) effect
size, and intensity (manifest) effect size. [#] Numerical index of the magnitude of a relationship
found in a study; commonly used in meta-analysis.
Efficiency analysis
Stage of evaluative research that weighs the program’s outcomes against its costs.
Efficiency in Sampling
Attained when the sampling design chosen either results in a cost reduction to the research or
offers a greater degree of accuracy in terms of the sample size.
Electronic questionnaire
Online questionnaire administered when the microcomputer is hooked up to computer networks.
Element
A single member of the population. [#] Unit from whom survey information is collected, usually a
person.
Elements
The entities on which data are collected.
Emancipated Minor
A legal status given to those individuals who have not yet attained the age of
legal competency as defined by state law, but who are entitled to adult treatment because of
assuming adult responsibilities such as being self-supporting and not living at home, marriage, or
procreation.
Emancipatory Research
Emancipatory research is conducted on and with people from marginalised groups/communities.
It is led by a researcher or research team who is either an indigenous or external insider; is
interpreted within intellectual frameworks of that group; and is conducted largely for the purpose
of empowering members of that community and improving services for them. It also engages
members of the community as co-constructors or validators of knowledge.
Empirical
Based on direct observations and measurements of reality.
Empirical Research
Research conducted 'in the field', where data are gathered first hand. Case studies and surveys
are examples of empirical research.
Empirical rule
A rule that states the percentages of items that are within one, two and three standard deviations
from the mean for mound-shaped, or bell-shaped, distributions.
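For an exactly normal distribution, the percentages behind the rule are roughly 68, 95 and 99.7. The sketch below derives them from the error function in Python's standard library; the function name share_within is a hypothetical choice, and real mound-shaped data will only approximate these figures.

```python
import math

def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```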
Endogenous construct
In a theory, a construct that is caused by one or more other constructs, exogenous or
endogenous, within the theory.
Enterprise resource planning
Integrated system solution for standard business requirements for the enterprise, often supported
by a single application package.
Entry
First step in which the researcher gains access to the social setting to be studied.
Enumeration
List of all elements in the population; usually not available.
Epistemological assumptions
The assumptions that underlie the theory of methods.
Epistemology
Branch of philosophy dealing with the nature of knowledge and our ability to know. [#]
The philosophy of knowledge or of how you come to know about the world.
Equitable
Fair or just; used in the context of selection of participants to indicate that the benefits and
burdens of research are fairly distributed.
Error
The difference between an observed score and a predicted or estimated score. Symbolized as e
or E. [#] The deviation of observed scores from true scores, including both random errors and
such nonrandom sources as bias.
Error term
A term in a regression equation that captures the degree to which the line is in error (for example,
the residual) in describing each point.
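As an illustration, the residual for each point is the observed value minus the value the line predicts. The sketch below assumes a straight line with a given intercept and slope; the function name, the data and the coefficients are all hypothetical.

```python
def residuals(xs, ys, intercept, slope):
    """Observed y minus the y predicted by the line, for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data and a hypothetical fitted line y = 1 + 2x.
xs = [1, 2, 3]
ys = [3, 5, 8]
errors = residuals(xs, ys, intercept=1, slope=2)
print(errors)
```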
Ethical committees
Health sector research to be conducted with or about patients has to be approved by the
appropriate local ethical committee. The main functions of ethical committees are to protect
patients, their families and staff, and to promote and uphold good research practice and
standards.
Ethical Research
Research that follows widely held guidelines about what is ethical, moral and responsible in
research settings (e.g. not plagiarizing others’ work, not misreporting sources, not submitting
questionable data, not destroying or concealing sources, etc.) and that considers its role in the
broader community and the effect of its findings on the community.
Ethics
Branch of Philosophy that pertains to the study of right and wrong conduct. [#] Code of conduct
or expected societal norms of behavior.
Ethnocentrism
Perceptual bias because of one’s own cultural beliefs.
Ethnographic Research
Ethnography is the study of people and their cultures. Ethnographic research involves
observation of and interactions with the people or group being studied in the group’s own
environment, often for long periods of time.
Ethnography
It is a combination of ethnos = people or race, and graphy = to describe or write about. The
primary method used is observation, and the key features are a focus on description, multi-
dimensionality and noting processes. It is an all-embracing approach - 'ethnographers tend to go
looking, rather than go looking for something'.
Ethnomethodology
It is the study of common social knowledge, in particular as it concerns the understanding of
others and the varieties of circumstance in which it can take place.
Ethnography
Field research technique originating in anthropology that emphasizes the phenomenological
approach. [#] Study of a culture using qualitative field research.
Evaluation
A form of research used to assess the value or effectiveness of social care interventions or
programmes.
Evaluation apprehension
Subject’s anxiety generated by being tested.
Event
A collection of sample points.
Exaggerating contamination
Problem of control subjects moving in a direction opposite to that of the experimental subjects (for
example, by resentful demoralization); has the effect of increasing the difference between
experimental and control conditions.
Exempt
Very low-risk category of review in which the investigator seeks clearance by an appointed
administrator such as the department chair rather than the IRB.
Exception fallacy
A faulty conclusion reached as a result of basing a conclusion on exceptional or unique cases.
Exhaustive
The property of a variable that occurs when you include all possible answerable responses.
Exogenous construct
In a theory, a construct that causes other constructs but which itself has no cause specified within
the theory.
Exogenous variable
A variable that exerts an influence on the cause and effect relationship between two variables in
some way, and needs to be controlled.
Expected value
A measure of the mean, or central location, of a random variable.
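For a discrete random variable, the expected value is the sum of each possible value multiplied by its probability. A minimal sketch, using a fair six-sided die as a hypothetical example:

```python
def expected_value(distribution):
    """Sum of value * probability over a discrete probability distribution."""
    return sum(x * p for x, p in distribution.items())

# Hypothetical example: a fair die, each face with probability 1/6.
die = {face: 1 / 6 for face in range(1, 7)}
print(round(expected_value(die), 2))
```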
Expedited review
Category of IRB review involving low-risk research in which just one member of the IRB judges
the proposal in order to hasten its assessment. [#] Review of proposed research by the IRB chair
or a designated voting member or group of voting members rather than by the entire IRB. Federal
rules permit expedited review for certain kinds of research involving no more than minimal risk
and for minor changes in approved research.
Experiment
A study undertaken in which the researcher has control over some of the conditions in which the
study takes place and control over some aspects of the independent variables being studied.
Random assignment of the subjects to control and experimental groups is usually thought of as a
necessary criterion of a true experiment. For example, if you interviewed moviegoers as they
exited a theater to see if what they saw influenced their attitudes, this would not be experimental
research; you had no control over who the subjects were or what film they watched or the
conditions under which they watched it. On the other hand, if you chose a room, a film, and
subjects to assign randomly to control and experimental groups and interviewed these subjects
about the effects of the film on their attitudes, that would be an experiment.
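Random assignment, the criterion mentioned above, can be sketched as a shuffle followed by a split. The function name randomly_assign and the subject labels are hypothetical; the optional seed is there only so that the assignment can be reproduced.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into experimental and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Ten hypothetical subjects identified by number.
experimental, control = randomly_assign(range(10), seed=1)
print(experimental, control)
```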
Experimental design
The art of planning and executing experiments. The greatest strength of an experimental
research design, due largely to random assignment, is its internal validity: one can be more
certain than with any other design about attributing cause to the independent variables. The
greatest weakness of experimental designs may be external validity: it may be hard to generalize
results beyond the laboratory. [#] A study design in which the researcher might create an artificial
setting, control some variables, and manipulate the independent variable to establish a
cause-and-effect relationship. [#] A study design that calls for the control or manipulation of the
independent variable in some way. [#] A study design in which participants are randomly assigned
to experimental groups and receive treatment in the form of the independent variable. [#] Research
approach in which the independent variable is fixed by a manipulation or natural occurrence.
Experimental Group
A group receiving some treatment in an experiment. Data collected about people in the
experimental group are compared with data about people in a control group (who received no
treatment) and/or another experimental group (who received a different treatment). [#] In
experimental conditions it is common to test the validity of a cause/effect relationship by having
two groups of research subjects, an experimental group and a control group. In the former group
the causal (independent) variable is present: in the latter it is explicitly excluded. For example, in
a study to test the impact of counselling on carer stress, the experimental group of carers would
receive counselling, the control group would not. [#] The group exposed to a treatment in an
experimental design. [#] The group in an experimental design study that receives treatment in
the form, or in various forms, of the independent variable. This group can thus be compared to
the control group.
Experimental realism
Extent to which experimental procedures produce a high level of psychological involvement and,
presumably, natural behavior regardless of the degree of mundane realism.
Experimental Research
It is a style of research in which the researcher generates or manipulates a
causal factor and then seeks to observe or measure the effects which
follow. In a drug trial, for example, a group of patients with a particular
illness are given a drug which it is hoped will alleviate or cure the illness,
and the effect of the drug is monitored. A pure experimental approach
involves the random selection of research subjects and control of
extraneous variables, as well as manipulation of the independent variable.
See also Quasi-experimental.
Experimental units
The objects of interest in the experiment.
Experimental
Term used to denote a therapy (drug, device, procedure) that is unproven or not yet scientifically
validated in terms of safety and efficacy. A procedure may be considered “experimental” without
necessarily being part of a formal study to evaluate its usefulness.
Experimenter expectancy
Mechanism(s) by which the researcher biases the behavior of the subject to get the
hypothesized results; also called self-fulfilling prophecy and Pygmalion effect.
Expert sampling
A sample of people with known or demonstrable experience and expertise in some area.
Expert system
An inference engine that uses stored knowledge and rules of if-then relationships to solve
problems.
Explorative Research
Seeks to find out more about phenomena which are little known. Explorative studies approach a
topic broadly to identify the range of issues and opinions associated with it. They are often the
fore-runners of more specific research which studies the identified topics in greater depth. [#]
Data collection and analysis aimed at formulating hypotheses.
Exploratory study
A research study where very little knowledge or information is available on the subject under
investigation.
External consultants
Research experts outside the organization who are hired to study specific problems and find
solutions.
External validity
The degree to which the conclusions in your study would hold for other places and at other times.
[#] The extent of generalizability of the results of a causal study to other field settings. [#] The
extent to which the findings of a study are relevant to subjects and settings beyond those in the
study. [#] This is defined by Cook and Campbell (1979, p. 37) as follows: “the approximate validity
with which we can infer that the presumed causal relationship can be generalized to and across
alternate measures of the cause and effect and across different types of persons, settings, and
times” (p. 37).
Externalities
Costs or benefits that accrue to third parties not directly involved in the project as provider or
consumer; they may be accounted and paid for, or ‘internalized’, by the project.
Extraneous Variables
When an experiment is seeking to monitor the impact of one variable on another (like counselling
on stress level), attention has to be paid to other variables which could have an impact (that is,
other factors which could affect a person's stress level). These other variables are called
'extraneous'.
Face scale
A particular representation of the graphic scale, depicting faces with expressions that range from
smiling to sad.
Face Validity
An aspect of validity examining whether the item on the scale, on the face of it, reads as if it
indeed measures what it is supposed to measure. [#] A validity that checks that “on its face” the
operationalization seems like a good translation of the construct.
Face-to-Face interview
Information gathering when both the interviewer and interviewee meet in person.
Factor
(a) In analysis of variance, an independent variable, that is, a variable presumed to cause or
influence another variable; (b) in factor analysis, a cluster of related variables that are
distinguishable components of a larger set of variables; (c) a number by which another number is
multiplied, as in the statement: real estate values increased by a factor of three, meaning they
tripled.
Factor analysis
Statistical approach to measurement construction that measures the extent to which test items
agree with a common underlying dimension or factor.
Factorial design
Experimental design in which each subject is assigned to one or another combination of the
levels of two or more independent variables.
Factorial designs
Designs that focus on the program or treatment, its components, and its major dimensions and
enable you to determine whether the program has an effect, whether different subcomponents
are effective, and whether there are interactions in the effects caused by subcomponents.
Factorial experiment
An experimental design that allows statistical conclusions about two or more factors.
Factorial Validity
That which indicates, through the use of factor analytic techniques, whether a test is a pure
measure of some specific factor or dimension. The Policy on human participants in research
applies to all research involving human participants that is conducted, supported, or otherwise
subject to regulation by any federal department or agency.
Faithful subject
Method for avoiding deception by asking subjects to comply with the experimental procedure and
to suspend their suspicions.
Fallibilism
In epistemology, the posture of doubting our own inductions.
Falsifiability
Aspect of an assertion that makes it vulnerable to being proven false, an essential ingredient in
the process of science.
Feminist Research
Feminism is a conceptual tool for critiquing traditional sociological research with key concepts of
empowerment of women, the equality of the research relationship and understanding the social
constructions of gender. It can be understood in terms of values and epistemology more than
techniques and methodology. (See also Emancipatory research.)
Field experiment
An experiment done to detect cause and effect relationship in the natural environment in which
events normally occur.
Field research
A research method in which the researcher goes into the field to observe the phenomenon in its
natural state. [#] Behavioral, social, or anthropological research involving the study of people or
groups in their own environment and without manipulation for research purposes. Research
conducted in natural, real-life settings, outside the laboratory. This involves observation and, in
many cases, interactions with the people being studied.
Field research
Generally, any social research taking place in a natural setting; more narrowly, equivalent to
qualitative research.
Field study
A study conducted in the natural setting with a minimal amount of researcher interference with
the flow of events in the situation.
Filter
When only a section of the total sample is required to answer the question, e.g. if the question
asks why people are dissatisfied with a particular service, only those who are dissatisfied should
answer the question. Those who are satisfied will skip to the next question that is to be asked of
all respondents.
Fishing and the error rate problem
A problem that occurs as a result of conducting multiple analyses and treating each one as
independent.
Five-number summary
An exploratory data analysis technique that uses the following five numbers to summarize the
data set: smallest value, first quartile, median, third quartile and largest value.
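A minimal sketch of the five-number summary using Python's standard library. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of statistics.quantiles, and other conventions can give slightly different quartiles for the same data. The function name and the data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Smallest value, first quartile, median, third quartile, largest value."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical observations.
data = [7, 1, 9, 3, 5, 2, 8, 4, 6]
print(five_number_summary(data))
```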
Focus Groups
A group consisting of 8 to 10 members randomly chosen, who discuss a product or any given topic
for about 2 hours with a moderator present, so that their opinions can serve as the basis for further
research. [#] They are open-ended, discursive, and are used to gain a deeper understanding of
respondents' attitudes and opinions. Focus groups typically involve between 6 and 10 people, and
last for 1 to 2 hours. A key feature is that participants are able to interact with, and react to,
each other. In order to facilitate this group dynamic it is important to ensure that participants do
not know each other beforehand and that they are broadly 'compatible'.
Forced Choice
Elicits the ranking of objects relative to one another.
Formative evaluation
Evaluations that strengthen or improve the object being evaluated. Formative evaluations are
used to improve programs while they are still under development.
Frame
A list of the sampling units for a study. The sample is drawn by selecting units from the frame.
Frequencies
The number of times various subcategories of a phenomenon occur, from which the percentage
and cumulative percentage of any occurrence can be calculated.
Frequency distribution
A tabular summary of data showing the number (or frequency) of items in each of several non-
overlapping classes. [#] A summary of the frequency of individual values or ranges of values for a
variable. [#] Visual summary of a group of observations in which the number of occurrences of
each score (frequency) is indicated on the vertical axis and the value of the score on the
horizontal axis.
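A tabular frequency distribution, with the percentage and cumulative percentage for each value, can be sketched as follows. The function name frequency_table and the ratings are hypothetical.

```python
from collections import Counter

def frequency_table(values):
    """Rows of (value, frequency, percentage, cumulative percentage)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append(
            (value, counts[value], 100 * counts[value] / total, 100 * running / total)
        )
    return rows

# Hypothetical ratings on a 1-to-5 scale from ten respondents.
ratings = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
for row in frequency_table(ratings):
    print(row)
```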
Frequency polygon
Type of frequency distribution in which a line joins points representing the frequencies of the
scores.
Full review
Category of IRB review in which the entire committee analyzes the research proposal.
Funneling Technique
The questioning technique that consists of initially asking general and broad questions, and
gradually narrowing the focus thereafter on more specific themes.
Gamma
PRE (proportional reduction in error) measure of association for ordinal variables.
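Assuming the entry refers to Goodman and Kruskal's gamma, the usual gamma for ordinal variables, it can be sketched as the difference between concordant and discordant pairs divided by their sum, with tied pairs ignored. The function name and the rankings are hypothetical.

```python
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """(concordant - discordant) / (concordant + discordant); tied pairs are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical ordinal ratings of four cases on two variables.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(round(goodman_kruskal_gamma(x, y), 3))
```

The sketch raises a ZeroDivisionError when every pair is tied, a case in which gamma is undefined.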
Generalisable
It has both a standard and technical use in research methods. As in normal conversation, it can
describe the extent to which the findings from a study of a sample can be generalised into
conclusions about the total research population. However, it should be used in a more technical
way, in terms of meaning how results from a sample can be generalised to a greater or lesser
extent according to the outcome of statistical tests of significance.
Generality
Theory attribute of being widely applicable, that is, being able to account for many different
observations.
Generalizability
The applicability of research findings in one setting to others. [#] The extent to which you can
come to conclusions about one thing (often a population) based on information about another
(often a sample). [#] The ability to apply the results of a specific study to groups or situations
beyond those actually studied.
Generative theory
The cause produces the effect, a view of causation that requires strong tests.
Gestalt principle
This refers to the whole or the totality. Gestalt psychology is known for the principle (among many
others) stating that the whole is bigger than the sum of its parts. The Gestalt principle is applied to
mixed methods ... to demonstrate that global inferences made at the end of mixed methods
studies are more than the simple sum of the inferences gleaned from QUAL and QUAN strands.
Going native
Role shift in which the researcher gives up the neutral scientific perspective and becomes a
committed member or proponent of the group under study.
Goodness of Measures
Attests to the reliability and validity of measures.
Gradient of similarity
The dimension along which your study context can be related to other potential contexts to which
you might wish to generalize. Contexts that are closer to yours along the gradient of similarity of
place, time, people, and so on can be generalized to with more confidence than ones that are
further away.
Grey Documents
They are an umbrella heading for the paperwork which circulates around governmental and
private organisations, such as committee minutes, internal discussion documents, planning
papers and so forth. It is literature which is not 'published' in the conventional sense, but is
usually available on request.
Grounded Theory
Usually relates to qualitative research. The researcher starts by collecting evidence on a topic (or
phenomenon), and then sees what theoretical propositions the evidence will support. This is
described as an inductive process, or one in which the theory that arises is 'grounded' in the
evidence.
Grounded theory
A theory rooted in observation about phenomena of interest. Also, a method for achieving such a
theory.
Group threats
Internal validity threats to between-subjects designs; protection against such threats is provided by
random assignment to groups.
Group Videoconferencing
Video transmittal technology that enables remote groups of people to participate in a conference
using video cameras and monitors.
Grouped data
Data available in class intervals as summarized by a frequency distribution. Individual values of
the original data are not available.
Groupware
Software that enables teams on a network to work on joint projects and access data
simultaneously.
Guardian
An individual who is authorized under applicable state or local law to give permission on behalf of
a child to general medical care.
Hard Data
Precise data, like dates of birth or income levels, which can reasonably be subjected to precise
forms of analysis, such as statistical testing.
Hawthorne effect
A type of demand characteristic in which the researcher’s attention was supposed to increase
subjects’ effort; not confirmed by recent research.
Heterogeneity sampling
Sampling for diversity or variety.
Heterogeneous irrelevancies
Variations across studies in method, population, place, and time that are not expected to affect
the outcome of replications.
Hierarchical modeling
The incorporation of multiple units of analysis within a single analytical model. For instance, in an
educational study, you might want to compare student performance with teacher expectations. To
examine this relationship would require
averaging student performance for each class because each teacher has multiple students and
you are collecting data at both the teacher and student level.
Histogram
A graphical presentation of a frequency distribution, relative frequency distribution, or percent
frequency distribution of quantitative data. It is constructed by placing the class intervals on the
horizontal axis and the frequencies on the vertical axis.
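The counting step behind a histogram can be sketched as below: each observation is placed in a class interval of equal width, and the counts per interval become the bar heights. The function name bin_counts, the ages and the interval settings are hypothetical, and the text rendering draws the bars sideways rather than on a vertical axis.

```python
def bin_counts(values, start, width, n_classes):
    """Count how many values fall in each equal-width class interval."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts

# Hypothetical ages grouped into the classes 20-29, 30-39 and 40-49.
ages = [21, 25, 29, 30, 34, 38, 41, 47]
counts = bin_counts(ages, start=20, width=10, n_classes=3)
for i, count in enumerate(counts):
    print(f"{20 + 10 * i}-{29 + 10 * i}: {'#' * count}")
```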
Histogram
Type of frequency distribution in which vertical bars represent the frequencies of the scores.
Historical Controls
Control participants (followed at some time in the past or whose data are available through
records) who are used for comparison with participants being treated concurrently. The study is
considered historically controlled when the present condition of participants is compared with their
own condition on a prior regimen or treatment.
History Effects
A threat to the internal validity of the experimental results, when events unexpectedly occur while
the experiment is in progress and contaminate the cause-and-effect relationship.
History threat
A threat to internal validity that occurs when some historical event affects your study outcome.
History
Time threat to internal validity in which some event unrelated to the experimental intervention
causes the observed change.
Hypothesis
A statement which research sets out to prove or disprove. There are two types of hypothesis:
'experimental' where the hypothesis is a positive statement, such as 'carers who attend a support
group have better coping skills' or 'null' where the statement contains a negative, for example,
'carers who attend a support group do not have better coping skills'. [#] An educated conjecture
about the logically developed relationship between two or more variables, expressed in the form of
testable statements. [#] A testable statement of how two or more variables are expected to be
related to one another.
Hypothesis test
Procedure by which a hypothesis is checked for its fit or agreement with observations.
Hypothesis Testing
A means of testing if the if-then statements generated from the theoretical framework hold true
when subjected to rigorous examination.
Hypothesis
Prediction about operational variables, usually drawn by deduction from a theory.
Hypothetical-deductive model
A model in which two hypotheses are tested such that all possible outcomes are covered and, if
one hypothesis is accepted, the second must therefore be rejected.
Idiographic
Laws or rules that relate to individuals.
Idiographic
Research approach that tries to understand persons or situations for their unique characteristics
without trying to generalize (as opposed to nomothetic approach).
Incapacity
Refers to a person’s mental status and means the inability to understand information presented,
to appreciate the consequences of acting (or not acting) on that information, and to make a
choice.
Incidence
Number of new cases of a disorder appearing in a given time period.
Incompetence
Used as a legal term to indicate the inability to manage one’s own affairs.
Independent samples
Samples selected from two (or more) populations in such a way that the elements making up one
sample are chosen independently of the elements making up the other sample(s).
Independent variable
The presumed cause in a study. Also a variable that can be used to predict the values of another
variable. Compare dependent variable. Some authors use the term "independent variable" for
experimental research only; for non-experimental research, they use predictor variable. [#] The
variable that is doing the predicting or explaining. It is denoted by x. [#] A variable that
influences the dependent or criterion variable and accounts for (or explains) its variance. [#] The
conditions of an experiment that are systematically manipulated by the investigator. [#] A variable
that is not impacted by the dependent variable, and that itself impacts the dependent variable.
[#] The variable that you manipulate. For instance, program or treatment is typically an
independent variable. [#] In a research project which seeks to establish cause and effect between
variables, the potential causal variable is known as the independent variable, and the variable(s)
where effects are under scrutiny is dependent. [#] In hypothesis tests, a variable that is supposed
to cause one or more other variables and is not caused by them, that is, it is independent of them.
In-Depth Interview
A method of data collection in which a participant is interviewed in detail about a certain research
topic. In this format, the interviewer leads the discussion flexibly along some pre-structured
topics, but also allows the participant to expand upon topics in-depth and to explore new avenues
of discussion.
Index
In the context of an online catalog search, the part of the instructions that tells the computer what
type of file to search- author, title, or subject.
Indirect causation
A set of two or more causal connections by which one construct or variable causes a second
indirectly via one or more intervening constructs or variables.
Indirect measure
An unobtrusive measure that occurs naturally in a research context.
Individual Review
A review of a single paper or article. In a review of a paper or piece of research you must analyse
method as well as content, and say whether or not you agree with the arguments and why. If you
disagree you need to be able to say what evidence supports your objection. You also need to
identify gaps in the literature. Also known as critiquing.
Induction
Creation of general principles from specific observations. [#] The process by which general
propositions based on observed facts are established.
Inductive
Bottom-up reasoning that begins with specific observations and measures and ends up as
a general conclusion or theory.
Inference quality
This is proposed as a mixed methods term to incorporate the QUAN term internal validity and the
QUAL terms trustworthiness and credibility of interpretations (Tashakkori & Teddlie, 1998, 2003).
The definition of the term is as follows: the degree to which the interpretations and conclusions
made on the basis of the results meet the professional standards of rigor, trustworthiness, and
acceptability as well as the degree to which alternative plausible explanations for the obtained
results can be ruled out. Inference quality consists of Design Quality (within-design consistency)
and Interpretive Rigor [conceptual (or inferential) consistency, interpretive agreement (or
interpretive consistency), and interpretive distinctiveness].
Inference transferability
This refers to generalizability or applicability of inferences obtained in a study to other individuals
or entities (see population transferability), other settings or situations (see ecological
transferability), other time periods (see temporal transferability), or other methods of
observation/measurement (see operational transferability). It subsumes the QUAN terms external
validity and generalizability as well as the QUAL term transferability.
Inference
This is an umbrella term referring to a final outcome of a study. The outcome may consist of a
conclusion about, an understanding of, or an explanation for an event, a behavior, a relationship,
or a case. [#] This is “a conclusion reached” where there is either (a) a “deduction from premises
that are accepted as true” or (b) an induction by “deriving a conclusion from factual statements
taken as evidence for the conclusion” (Angeles, 1981, p. 133).
Inferential statistics
Statistical analyses used to reach conclusions that extend beyond the immediate data alone. [#]
Statistics that help to establish relationships among variables and draw conclusions therefrom. [#]
Statistics with a known probability distribution that can be computed to determine whether an effect
observed in a sample or samples is due to chance.
Influential observation
An observation that has a strong influence or effect on the regression results.
Informant
In field research, a person who is “native” to the social situation being studied, who assists the
researcher by providing insider information and serving as a go-between.
Information System
The system that acquires, stores, and retrieves all relevant information for a specific group of
functions (e.g., manufacturing information system).
Information
In everyday conversation we often use 'information' and 'data' as meaning much the same thing,
but it is necessary to differentiate them. In a computerised database or information system it is
common to refer to the raw material entered into the computer as 'data', and it is then processed
by the computer to become 'information'. Knowledge can be seen as a higher level interpretation
of the information.
Informed Consent
An agreement to take part in research which is based on a full explanation and understanding of
why the research is being undertaken and any impact/effects it
might have on participants. How you obtain informed consent is a
major ethical consideration in research, especially with people who
are mentally confused or who have a learning disability. [#] A
policy of informing study participants about the procedures and
risks involved in research that ensures that all participants must
give their consent to participate. [#] The principle that
potential participants are given adequate and accurate information
about a study before they are asked to agree to participate, and
that they do in fact agree (consent) to participate. In giving
informed consent, participants may not waive or appear to waive
any of their legal rights, or release or appear to release the
investigator, the sponsor, the institution or agents thereof from liability for negligence. [#] Key
requirement for IRB approval in which the subject must give voluntary written consent before
participating based on adequate information and ability.
Inkblot Tests
A motivational research technique that uses colored patterns of inkblots to be interpreted by the
subjects.
Institutionalized
Confined, either voluntarily or involuntarily (e.g., in a hospital, prison, or nursing home).
Instrumental utilization
Evaluation use that actually affects decision making.
Instrumentation Effects
The threat to internal validity in experimental designs caused by changes in the measuring
instrument between the pre test and the post test.
Instrumentation threat
A threat to internal validity that arises when the instruments (or observers) used on the posttest
and the pretest differ.
Integrity
In science, utter honesty in conducting research including seeking and reporting data contrary to
one’s own belief.
Interaction
The effect produced when the levels of one factor interact with the levels of another factor
influencing the response variable.
Interaction effect
An effect that occurs when differences on one factor depend on which level you are on another
factor.
Interaction
Effect of one independent variable depends on the level of another independent variable.
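As a small illustration of the interaction entries above, the sketch below uses hypothetical cell means from a 2 x 2 design (the factor names and the numbers are invented for the example). It computes the effect of one factor at each level of the other; an interaction shows up when those simple effects differ.

```python
# Hypothetical cell means for a 2 x 2 design: (teaching method, class size) -> mean score.
cell_means = {
    ("lecture", "small"): 10.0,
    ("lecture", "large"): 10.0,
    ("seminar", "small"): 20.0,
    ("seminar", "large"): 12.0,
}


def simple_effect(means, size):
    """Effect of teaching method (seminar minus lecture) at one level of class size."""
    return means[("seminar", size)] - means[("lecture", size)]


effect_small = simple_effect(cell_means, "small")
effect_large = simple_effect(cell_means, "large")

# The interaction contrast is the difference between the two simple effects.
# A value of zero would mean the method effect is the same at both class sizes.
interaction_contrast = effect_small - effect_large
print(effect_small, effect_large, interaction_contrast)
```

Here the method effect is larger in small classes than in large ones, so the effect of one independent variable depends on the level of the other.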
Interactive causation
Direct causation of one variable by another that varies with the level of another variable.
Interactive model
(Maxwell & Loomis, 2003) Applied to mixed methods research, this model indicates that “the
different components of actual mixed methods studies are … connected in a network or web
rather than a linear or cyclic sequence.”
Intercept
In regression analysis, the point on the vertical axis where it meets the regression line, that is the
estimated value of the outcome variable when the predictor variable has the value of zero; usually
symbolized by the letter a.
Internal Consistency
Homogeneity of the items in the measure that tap a construct.
Internal validity
The extent to which the results of a study (usually an experiment) can be attributed to the
treatments rather than a flaw in the research design; in other words, the degree to which one can
draw valid conclusions about the causal effects of one variable on another. [#] The approximate
truth about inferences regarding cause-effect relationships. [#] Truthfulness of the assertion that
the observed effect is due to the independent variable(s) in the study.
Internet
A vast network of computers connecting people and information worldwide.
Interpretive distinctiveness
This is the degree to which the inferences are distinctively different from (and superior to) other
possible interpretations of the results and the rival explanations are ruled out (eliminated).
Interrater Reliability
The consistency of the judgment of several raters on how they see a phenomenon or interpret
the activities in a situation.
Interval estimate
An estimate of a population parameter that provides an interval believed to contain the value of
the parameter.
Interval level
Type of measurement that assigns scores on a scale with equal intervals.
Interval scale
A scale of measurement for a variable that has the properties of ordinal data and the interval
between observations is expressed in terms of a fixed unit of measure. Interval data are always
numeric. [#] A multipoint scale that taps the differences, the order, and the equality of the
magnitude of the differences in the responses.
Intervening Variable
A variable that surfaces as a function of the independent variable, and helps in conceptualizing
and explaining the influence of the independent variable on the dependent variable. [#]
Measured variable in a hypothesis test or a theoretical variable in a theory that is the effect of
one variable and a cause of another.
Interview guide
Checklist of topics that the qualitative interviewer wants to cover.
Interview: Group
Interviewing people in a group, rather than as separate individuals, is a way of finding out if a
consensus view exists in a homogeneous group on any given subject, or to identify the range of
views that might be held. It also allows the opportunity to observe and record group dynamics,
and to gather data which arises from individuals in the group stimulating each other. The focus
group is a well known example of a group interview.
Interview: Individual
The individual interview allows you to obtain personalised data about each respondent, and, if
using a structured format, develop a data set for quantitative analysis. Individual interviews can
take place face to face, by telephone or through the internet, via email.
Interview: Semi-structured
Contains a mix of structured questions, often to get factual data, and more general open-ended
questions which allow the respondent to elaborate on particular issues.
Interview: Structured
Uses questions, commonly in a questionnaire, which are precisely worded and where possible
responses are 'pre-coded'.
Interview: Unstructured
Also sometimes called 'in depth' or 'free story' interviews. Unstructured interviews can be thought
of as conversations with a purpose, but don't be tempted to regard them as vague chats.
Interviewing
A data collection method in which the researcher asks for information verbally from the
respondents.
Intranet
A network that connects people and resources within the organization.
Intrasession history
Events internal to the experimental procedure.
Investigator
In clinical trials, the individual who actually conducts the investigation.
Judgment Sampling
A purposive, non probability sampling design in which the sample subject is chosen on the basis
of the individual’s ability to provide the type of special information needed by the researcher. [#]
A non-probabilistic method of sampling whereby element selection is based on the judgment of
the person doing the study.
Justice
An ethical principle that requires fairness in the distribution of burdens and benefits; often
expressed in terms of treating persons of similar circumstances or characteristics similarly.
Kappa
Measure of interrater agreement adjusted for chance agreement.
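One common statistic of this kind is Cohen's kappa for two raters. The sketch below is a minimal illustration, assuming two raters who each assign one label to the same items; the function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    # Note: undefined (division by zero) if both raters always use one identical label.
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes given to ten interview excerpts by two coders.
coder_1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
coder_2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

The coders agree on 8 of 10 items, but about half of that agreement would be expected by chance, so kappa is noticeably lower than the raw agreement rate.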
Key Informants
Are people who are known to have knowledge, experience, expertise and/or opinions specific to
the subject of the research, and who are selected as data sources for this reason. [#] Member of
social setting who serves as a major source of information about the setting for a qualitative
researcher.
Key Words
[or sometimes key concepts] Are those words or short phrases you identify as best describing
important aspects of your subject area. In practice there are two places
where key words are most frequently encountered - in the index of a
book, and as labels for carrying out a bibliographic search, usually by
computer. Key words are sometimes also known as 'descriptors'. [#] In
the context of an online catalog search, the part of the instructions that
tells the computer the specific term to search for – for example, the
author’s name or the book’s title.
Kinesics
Pertaining to bodily movements, especially the study of the communicative aspects of such
movements.
Kruskal-Wallis test
A non-parametric test for identifying differences among three or more populations.
Kurtosis
Degree to which the frequency distribution is flat or peaked.
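A rough way to see this numerically is the moment-based excess kurtosis, sketched below. This is one convention among several (statistical packages often apply small-sample corrections), and the scores are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: fourth central moment / variance squared, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
kurt = excess_kurtosis(scores)
print(kurt)
```

Under this convention a normal distribution scores about zero, negative values indicate a flatter distribution, and positive values a more peaked one.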
Lab Experiment
An experimental design set up in an artificially contrived setting where controls and manipulations
are introduced to establish cause-and-effect relationships among variables of interest to the
researcher.
Lambda
PRE (proportional reduction in error) measure of association for nominal variables.
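The sketch below illustrates the Goodman-Kruskal form of lambda, which compares prediction errors made with and without knowledge of the predictor category. The function name and the cross-tabulation are hypothetical.

```python
def goodman_kruskal_lambda(table):
    """table[i][j] = number of cases in predictor category i and outcome category j."""
    total = sum(sum(row) for row in table)
    column_totals = [sum(col) for col in zip(*table)]
    # Errors when always guessing the overall modal outcome category.
    errors_without = total - max(column_totals)
    # Errors when guessing the modal outcome within each predictor category.
    errors_with = total - sum(max(row) for row in table)
    # Note: undefined if every case falls in a single outcome category.
    return (errors_without - errors_with) / errors_without


# Hypothetical cross-tabulation: rows = predictor categories, columns = outcome categories.
crosstab = [
    [30, 10],
    [10, 50],
]
lam = goodman_kruskal_lambda(crosstab)
print(lam)
```

In this example, knowing the predictor category cuts prediction errors from 40 to 20, so lambda is 0.5.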
Laspeyres index
A weighted aggregate price index in which the weight for each item is its base-period quantity.
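The definition above can be sketched as follows, with the index expressed relative to a base value of 100. The function name, prices, and quantities are hypothetical.

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """Weighted aggregate price index using base-period quantities as weights."""
    numerator = sum(p_t * q_0 for p_t, q_0 in zip(current_prices, base_quantities))
    denominator = sum(p_0 * q_0 for p_0, q_0 in zip(base_prices, base_quantities))
    return 100 * numerator / denominator


# Hypothetical two-item basket.
base_prices = [2.0, 5.0]
current_prices = [3.0, 6.0]
base_quantities = [10, 4]
index = laspeyres_index(base_prices, current_prices, base_quantities)
print(index)
```

The base-period basket cost 40 and now costs 54, giving an index of 135.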
Latent variable
Unmeasured variable constructed statistically from two or more measured variables.
Leading Questions
Questions phrased in such a manner as to lead the respondent to give the answers that the
researcher would like to obtain.
Least squares
The criterion for fitting a regression line so that you minimize the sum of the squares of the
residuals from the regression line.
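For a single predictor, the least squares line has a closed-form solution. The sketch below is a minimal illustration with hypothetical data; it also returns the intercept, symbolized a in the Intercept entry.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared vertical distances between the data and a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
best = sum_squared_residuals(x, y, a, b)
# Any other line, such as one with a slightly different slope, fits worse by this criterion.
other = sum_squared_residuals(x, y, a, b + 0.1)
print(a, b, best, other)
```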
Level of significance
The maximum probability of a Type I error.
Level
A subdivision of a factor into components or features.
Levels of measurement
Categorization of measurement into types based on the amount of information in the measure:
nominal, ordinal, interval, ratio.
Leverage
A measure of how far the values of the independent variables are from their mean values.
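For regression with a single independent variable, leverage has a simple formula: 1/n plus the squared distance of the observation from the mean of x, divided by the total squared distance. The sketch below assumes that single-predictor case, and the data are hypothetical.

```python
def leverages(x):
    """Leverage of each observation in a regression with one independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


x_values = [1, 2, 3, 4, 10]  # hypothetical; the last value lies far from the mean
h = leverages(x_values)
print([round(value, 2) for value in h])
```

The observation at x = 10 has by far the highest leverage, which is why it could also be an influential observation.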
Life History
It is a record of an event/events in a respondent's life told (written down, but increasingly audio or
video recorded) by the respondent from his/her own perspective in his/her own words. A life
history is different from a 'research story' in that it covers a longer time span, perhaps a complete
life, or a significant period in a life.
Likert Scale
An interval scale that specifically uses the five anchors of Strongly Disagree, Disagree, Neither
Disagree nor Agree, Agree, and Strongly Agree. [#] A method of scaling in which the items are
assigned interval-level scale values and the responses are gathered using an interval level
response format.
Linear assumption
In regression analysis, the assumption that the relationship between the studied variables is best
described by a straight line.
Linear model
Any statistical model that uses equations to estimate lines.
Literature Review
It brings together individual reviews of papers, etc. It should weave
together the individual reviews into an overview of the area. The aim is
to convey an awareness of the current state of knowledge in the subject.
It is commonly used to set the scene for introducing new research or a
new perspective on the research. [#] Analysis of all research on a topic
that tries to identify consensus findings or to resolve conflicts in the
work.
Literature Search
A manual and/or electronic search of the literature to find out what research has been carried out
in your area of enquiry.
Literature
All of the research reports on a single question.
Loaded Questions
Questions that would elicit highly biased emotional responses from subjects.
Longitudinal
A study that takes place over time.
Longitudinal correlation
Association of variables measured at different times. One variable is said to “lead” (come before)
or “lag” (come after) the other.
Longitudinal Research
May use any method of data gathering (observation, survey, experiment, etc.), but its particular
characteristic is that the process is repeated on several occasions over a period of time, as far as
possible replicating the chosen methodology each time. It follows that a key aim of such research
is to monitor changes over time.
Longitudinal Study
A research study for which data are gathered at several points in time to answer a research
question. [#] A study in which data are collected from the same sample at least two different
times. A study designed to follow participants through time.
Longitudinal survey
Survey conducted at two or more times (for example, panel, trend, and cohort longitudinal survey
designs).
Lot
A group of items such as incoming shipments of raw materials or purchased parts as well as
finished goods from final assembly.
Main effect
An outcome that shows consistent difference between all levels of a factor.
Manipulation Checks
Measures used to assess the effectiveness of the manipulation.
Manipulation
How the researcher exposes the subjects to the independent variable to determine cause-and-
effect relationships in experimental designs.
Margin of error
The ± value added to and subtracted from a point estimate in order to develop a confidence
interval.
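As an illustration, the sketch below computes the margin of error for a sample mean, assuming a known population standard deviation and a normal sampling distribution. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error for a mean when the population standard deviation is known."""
    # Two-sided critical value, e.g. about 1.96 for 95 percent confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


sample_mean = 50.0  # hypothetical point estimate
moe = margin_of_error(sigma=10.0, n=100)
interval = (sample_mean - moe, sample_mean + moe)
print(round(moe, 2), interval)
```

The resulting interval is the point estimate plus and minus the margin of error, which is an interval estimate of the population mean.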
Matched samples
Samples in which each data value of one sample is matched with a corresponding data value of
the other sample.
Matching
A method of controlling known contaminating factors in experimental studies, by deliberately
spreading them equally across the experimental and control groups, so as not to confound the
cause-and-effect relationship. [#] Assigning subjects to experimental and control conditions to
equalize the groups on selected characteristics; can be combined with random assignment but
when used alone cannot guarantee group equivalence on variables not used in the matching.
Math anxiety
Fear of math and statistics, which can result in avoidance of math-based courses or careers.
Math phobia
The fear and consequent avoidance of math-related material.
Matrix
A table of numbers such as correlations.
Maturation Effects
A threat to internal validity that is a function of the biological, psychological, and other processes
taking place in the respondents as a result of the passage of time.
Maturation threat
A threat to validity that occurs as a result of natural maturation that occurs between pre- and post-
measurement.
Maturation
Time threat to internal validity in which internal developmental processes cause the observed
change.
Mature Minor
Someone who has not reached adulthood (as defined by state law) but who may be treated as an
adult for certain purposes (e.g. consenting to medical care). A mature minor is not necessarily
an emancipated minor (See >>> Emancipated Minor).
Mean (M)
Measure of central tendency consisting of the sum divided by the number of observations,
symbolized by M or X̄.
Mean
The average of a set of figures. [#] A description of the central tendency in which you add up all
the values and divide by the number of values. [#] A measure of central location for a data set. It
is computed by summing all the data values and dividing by the number of items. [#] The
arithmetic average of a set of data in which the values of all observations are added together and
divided by the number of observations.
Measure of Dispersion
The variability in a set of observations, represented by the range, variance, standard deviation,
and the interquartile range.
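The four measures named above can be computed with Python's standard `statistics` module, as sketched below with hypothetical scores. Quartile conventions differ between textbooks and packages; this sketch uses the module's default method.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

value_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                           # interquartile range

print(value_range, round(variance, 3), round(std_dev, 3), iqr)
```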
Measurement association
The correlation between observed variable that derives from their serving as measures of a latent
variable.
Measurement decay
Time threat to internal validity in which changes in the measurement process cause the
observed change, also called instrumentation.
Measurement error
Any influence on an observed score not related to what you are attempting to measure.
Median
The central item in a group of observations arranged in an
ascending or descending order. [#] A measure of central
location. It is the value that splits the data into two equal
groups, one with values greater than or equal to the median
and one with values less than or equal to the median. [#] The
middle number in a series of numbers. For example in Thurstone
scaling the median is the value above and below which 50
percent of the ratings fall. [#] The outcome that divides an
ordered distribution exactly into halves. [#] The score found at
the exact middle or fiftieth percentile of the set of values. One
way to compute the median is to list all scores in numerical
order and then locate the score in the center of the sample.
Memoing
A process for recording your thoughts and ideas as they evolve throughout the study.
Meta-analytic review
Literature review approach that reduces each study to a few summary effect sizes, which can
then be analyzed by statistics.
Meta-interpretation
Forthofer (2003) describes this term as follows: “Mixed methods designs are inherently more
complex, and those that attempt any integration or synthesis of results across methodologies
require an additional phase of ‘meta-interpretation.’”
Method effects
Source of construct invalidity in which measures of different constructs using the same
procedure fail to diverge.
Method/Methodology
While 'method' describes what you as a researcher have done, methodology is about your
reasons for doing it.
Methodology
The methods you use to try to understand the world better.
Minimal Risk
A risk is minimal where the probability and magnitude of harm or discomfort anticipated in the
proposed research are not greater, in and of themselves, than those ordinarily encountered in
daily life or during the performance of routine physical or psychological examinations or tests. The
definition of minimal risk for research involving prisoners differs somewhat from that given for
non-institutionalized adults.
Mixed design
Experimental design that includes both within-subjects and between-subjects features.
Mode
A measure of location, defined as the most frequently occurring data value. [#] The most
frequently occurring value in the set of scores. [#] Measure of central tendency consisting of the
most frequently occurring scores; if the distribution has two modes, the distribution is called
bimodal.
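The three measures of central tendency defined in this glossary (mean, median, and mode) can be compared on one hypothetical set of scores using Python's standard `statistics` module:

```python
import statistics

scores = [3, 5, 5, 6, 8, 9, 13]  # hypothetical scores, already in numerical order

mean = statistics.mean(scores)      # sum of the values divided by their number
median = statistics.median(scores)  # middle value of the ordered scores
mode = statistics.mode(scores)      # most frequently occurring value

# A distribution with two equally frequent values is bimodal.
two_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, mode, two_modes)
```

The single high score of 13 pulls the mean above the median, which is one reason the median is often preferred for skewed data.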
Model specification
The process of stating the equation that you believe best summarizes the data for a study.
Model
One possible set of causal paths that we can compare with observed data.
Modem
A device for linking computers by telephone line; an abbreviation for “modulator-demodulator”.
Moderating Variable
A variable on which the relationship between two other variables is contingent. That is, if the
moderating variable is present, the theorized relationship between the two variables will hold
good, not otherwise. [#] In interactive causation, the variable that determines the effect of one
variable on another.
Monitoring
The collection and analysis of data as the project progresses to assure the appropriateness of the
research, its design and participant protections.
Mono-method bias
A threat to construct validity that occurs when you rely on only a single
implementation of your independent variable, cause, program, or treatment in your study.
Monostrand design
These designs use a single research method or data collection technique (QUAL or QUAN) and
corresponding data analysis procedures to answer research questions. They are also known as
single-phase designs.
Mortality
The loss of research subjects during the course of the experiment, which confounds the cause-
and-effect relationship.
Mortality threat
A threat to validity that occurs because a significant number of participants drop out.
Mortality
Subject attrition from pretest to posttest, which casts doubt on the validity of the study; here
conceptualized as a threat to measurement construct validity. Protection against this threat is not
provided by a control group or random assignment but rather by care in defining the subjects to
be measured in evaluating experimental impact.
Motivational Research
A particular data gathering technique directed toward surfacing information, ideas, and thoughts
that either are not easily verbalized or remain at the unconscious level in the respondents.
Moving averages
A method of forecasting or smoothing a time series by averaging each successive group of data
points.
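A simple (unweighted) moving average can be sketched as follows. The function name, the window length, and the series are hypothetical.

```python
def moving_average(series, window):
    """Average each successive group of `window` consecutive data points."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the length of the series")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


monthly_sales = [10, 12, 14, 16, 18, 20]  # hypothetical time series
smoothed = moving_average(monthly_sales, window=3)
print(smoothed)
```

The smoothed series is shorter than the original by the window length minus one, because each average needs a full group of points.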
Multicollinearity
The term used to describe the correlation among the independent variables.
Multilevel mixed model design
This is a design in which QUAL data are collected at one level (e.g., child) and QUAN data are
collected at another level (e.g., family) in a concurrent or sequential manner to answer
interrelated research questions with multiple approaches (QUAL and QUAN). Both types of data
are analyzed accordingly, and the results are used to make multiple types of inferences (QUAL
and QUAN) that are pulled together at the end of the study in the form of “global inferences.” See
>>> multilevel mixed method design.
Multimethods design
This refers to designs in which the research questions are answered by using two data collection
procedures or two research methods, both with either the QUAL or QUAN approach. See also
multimethods QUAL study and multimethods QUAN study.
Multinomial population
A population in which each element is assigned to one and only one of several categories. The
multinomial probability distribution extends the binomial probability distribution from two to three
or more categories.
Multioption variable
A question format in which the respondent can pick multiple variables from a list.
Multiple methods design
This refers to designs in which more than one research method or data collection and analysis
technique is used to answer research questions. They include mixed methods designs (QUAL +
QUAN) and multimethods designs (QUAN + QUAN or QUAL + QUAL).
Multiple regression
One statistical procedure for conducting multivariate research.
Multiplication law
A probability law used to compute the probability of an intersection of two events, denoted by A
and B. It is $P(A \cap B) = P(A)\,P(B \mid A)$ or $P(A \cap B) = P(B)\,P(A \mid B)$. For independent events, it
reduces to $P(A \cap B) = P(A)\,P(B)$.
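A short worked example may help. The sketch below applies the multiplication law to one pair of dependent events and one pair of independent events; the card and coin scenarios are chosen only for illustration.

```python
from fractions import Fraction

# Dependent events: drawing two aces in a row from a 52-card deck without replacement.
p_first_ace = Fraction(4, 52)               # P(A)
p_second_ace_given_first = Fraction(3, 51)  # P(B given A): one ace is already gone
p_two_aces = p_first_ace * p_second_ace_given_first

# Independent events: two fair coin flips both landing heads.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads             # P(A) times P(B)

print(p_two_aces, p_two_heads)
```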
Multistage Sample
A probability sample that involves several stages (and frequently a cluster sampling stage), such
as randomly selecting clusters from a population, then randomly selecting people from each of
the clusters.
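The two stages described above can be sketched with Python's standard `random` module. The function name, the school and student identifiers, and the sample sizes are all hypothetical.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly select clusters, then randomly select members within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: clusters
    return {
        name: rng.sample(clusters[name], n_per_cluster)  # stage 2: people
        for name in chosen
    }


# Hypothetical population: 10 schools with 30 students each.
schools = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(30)]
    for i in range(10)
}
sample = two_stage_sample(schools, n_clusters=3, n_per_cluster=5, seed=1)
for school, students in sample.items():
    print(school, students)
```

Fixing the seed only makes the illustration repeatable; in a real study the selection procedure would follow the sampling plan.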
Multi-stage sampling
The combining of several sampling techniques to create a more efficient or effective sample than
the use of any one sampling type can achieve on its own.
Multistrands design
This refers to designs that use more than one research method or data
collection procedure. See also multimethods design.
Multitrait-multimethod matrix
This is a matrix of correlations between multiple methods of measuring each of a set of attributes.
The diagonal values indicate the reliability of each measure/method. The off-diagonal values
indicate the convergent validity and discriminant validity of each procedure/instrument. This
method was introduced by Campbell and Fiske (1959) to evaluate the quality of data obtained
from measurement instruments.
Multivariate Analysis
Any of several methods for examining multiple variables at the same time. Usage varies. (a)
Stricter usage reserves the term for designs with two or more independent variables and two or
more dependent variables. (b) More loosely, multivariate analysis applies to designs with more
than one independent variable or more than one dependent variable or both. Whichever usage
you prefer, either allows researchers to examine the relation between two variables while
simultaneously controlling for the influence of other variables. Examples include path analysis,
factor analysis, multiple regression analysis, MANOVA, LISREL, canonical correlations, and
discriminant analysis.
Mundane realism
Extent to which a research setting resembles in physical detail a real social setting.
Mutually exclusive
Said of two events, conditions, or variables which cannot occur at the same time. For example,
one cannot be both male and female, or both Protestant and Catholic. Thus, the categories male
and female, or Catholic and Protestant are said to be mutually exclusive.
Mutually exclusive
The property of a variable that ensures that the respondent is not able to assign two attributes
simultaneously. For example, gender is a variable with mutually exclusive options if it is
impossible for the respondents to simultaneously claim to be both male and female.
Narrative
[or biographical] approaches to research are primarily qualitative, and include gathering/ using
data in the form of diaries, stories and life histories.
Near Market
It is a descriptor given to research which is placed in an economic context of being commercially
exploitable. It is commonly used to describe applied research (see entry) set in the market sector,
where there is or should be a market interest in supporting and funding it.
Negative relationship
A relationship between variables in which high values for one variable are associated with low
values on another variable.
Nonsampling error
All types of errors other than sampling errors, such as measurement error, interviewer error and
processing error.
Nominal scale
A scale of measurement for a variable that uses a label or name to identify an attribute of an
element. Nominal data may be non-numeric or numeric. [#] A scale that categorizes individuals
or objects into mutually exclusive and collectively exhaustive groups, and offers basic, categorical
information on the variable of interest.
Nomological network
A network that includes the theoretical framework for what you are trying to measure, an
empirical framework for how you are going to measure it, and specification of the linkages among
and between these two frameworks.
Nomothetic
Refers to laws or rules that pertain to the general case.
Non-participant Observer
A researcher who collects observational data without becoming an integral part of the system.
Non-affiliated Member
Member of an Institutional Review Board who has no ties to the parent institution, its staff, or
faculty. This individual is usually from the local community (e.g., minister, business person,
attorney, teacher).
Nonparametric methods
Statistical methods that require few, if any, assumptions about the population probability
distributions and the level of measurement. These methods can be applied when nominal or
ordinal data are available.
Non-probability Sample
A subset of the population chosen in a way that does not give every member of the population a
known (nonzero) chance of being selected.
Non-probability sampling
Sampling that does not involve random selection.
Non-response Bias
The bias that results from differences between those who agree to participate in a survey and
those who don’t.
Non-therapeutic Research
Research that has no likelihood or intent of producing a diagnostic, preventive, or therapeutic
benefit to the current participants, although it may benefit participants with a similar condition in
the future.
np chart
A control chart used to monitor the output of a process in terms of the number of defective items.
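Control limits for an np chart are commonly set three standard deviations either side of the expected number of defectives, using the binomial distribution. The sketch below assumes that convention; the function name, sample size, and defect proportion are hypothetical.

```python
from math import sqrt


def np_chart_limits(sample_size, p_bar):
    """Return (lower limit, center line, upper limit) for the number of defective items."""
    center = sample_size * p_bar
    sigma = sqrt(sample_size * p_bar * (1 - p_bar))
    lower = max(0.0, center - 3 * sigma)  # a count of defectives cannot be negative
    upper = center + 3 * sigma
    return lower, center, upper


# Hypothetical process: samples of 100 items, 5 percent defective on average.
lcl, center_line, ucl = np_chart_limits(sample_size=100, p_bar=0.05)
print(lcl, center_line, round(ucl, 3))
```

A sample with more defective items than the upper limit would suggest that the process may be out of control.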
Nuisance Variable
A variable that contaminates the cause-and-effect relationship.
Null case
A situation in which the treatment has no effect.
Null hypothesis
The hypothesis that describes the possible outcomes other than the alternative hypothesis.
Usually the null hypothesis predicts there will be no effect of a program or
treatment you are studying. [#] The conjecture that postulates no
differences or no relationship between or among variables. [#] The
hypothesis tentatively assumed true in the hypothesis testing procedure.
[#] The proposition, to be tested statistically, that the experimental
intervention has “no effect,” meaning that the treatment and control
groups will not differ as a result of the intervention. Investigators usually
hope that the data will demonstrate some effect from the intervention, thus allowing the
investigator to reject the null hypothesis.
Numerical Scale
A scale with bipolar attributes with five points or seven points indicated on the scale.
O
Symbol used in design diagrams to represent one or more observations collected at some time
point.
Objectivity
Interpretation of findings on the basis of the results of data analysis, as opposed to subjective or
emotional interpretations.
Observation bias
Data collection bias that occurs in the interviewing or measurement stage (for example, the tendency
of respondents to give answers that are socially desirable).
Observation: Non-participant
Where the researcher attempts to remove or detach themselves as an actor from the research
situation.
Observation: Participant
Observing something as an insider, as someone who is involved in the processes being
observed.
Observational Survey
Collection of data by observing people or events in the work environment and recording the
information.
Observed Score
Also called fallible score, the value obtained by the measurement procedure and assumed to
contain some degree of error.
Ogive
A graph of a cumulative distribution.
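The points plotted on an ogive are cumulative frequencies paired with class upper limits. The sketch below computes them for a hypothetical frequency table; the plotting itself is left out.

```python
from itertools import accumulate

# Hypothetical frequency table: class upper limits and the count in each class.
class_upper_limits = [10, 20, 30, 40]
frequencies = [4, 8, 5, 3]

cumulative = list(accumulate(frequencies))  # running total of the frequencies
ogive_points = list(zip(class_upper_limits, cumulative))
print(ogive_points)
```

Joining these points with line segments gives the ogive; the last point always equals the total number of observations.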
One-tailed hypothesis
A hypothesis that specifies a direction, for example, when your hypothesis predicts that your
program will increase the outcome.
One-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in
one tail of the sampling distribution.
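As an illustration, the sketch below runs an upper-tail z test of a mean, assuming a known population standard deviation. The function name and all the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, null_mean, sigma, n):
    """Return (z, p-value) for a null hypothesis rejected only by large sample means."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Only the upper tail of the sampling distribution counts as evidence against the null.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


z, p_value = upper_tail_z_test(sample_mean=52.0, null_mean=50.0, sigma=10.0, n=100)
reject_null = p_value < 0.05  # level of significance of 0.05
print(z, round(p_value, 4), reject_null)
```

A two-tailed test on the same data would double the p-value, because values in both tails would count as evidence against the null hypothesis.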
Online
Connected to a computer for direct interaction with the electronic database.
Ontological assumptions
The assumptions you hold about reality. For instance, realism is an ontological assumption that
holds that there is an external reality apart from your experience of it.
Ontology
Branch of philosophy dealing with the ultimate nature of things.
Open coding
A phase of the grounded theory method where you consider the data in minute detail while
developing some initial categories.
Open Design
An experimental design in which both the investigator(s) and the participants know the treatment
group(s) to which participants are assigned.
Open-ended Questions
Leave the answer entirely to the respondent, either with a blank
space on the questionnaire for recording the reply, or by
phrasing a question in an interview in such a way as to elicit a
longer answer. This approach is used when there is no way of
knowing what answers the respondents are likely to give, or if
you want quotable responses. Often they are used in pilot
studies in order to develop a pre-coded version for the main
study. [#] Questions that the respondent can answer in a
free-flowing format without restricting the range of choices to a set of
specific alternatives suggested by the researcher. [#]
Survey questions that allow respondents to answer in their own
words.
Operational Definition
Definition of a construct in measurable terms by reducing it from its level of abstraction through
the delineation of its dimensions and elements. [#] Statements of the specific ways in which the
absence, presence, and/or the degree of presence of a phenomenon will be determined in a
specific research process. [#] Procedure that translates a construct into manifest or observable
form.
Operational transferability
This is the degree to which the inferences that are made on the basis of the results of the study
are generalizable to other methods of observing/measuring the entities or attributes that the
inference is about. Subsumes the QUAN terms external validity of operations and operational
external validity.
Operationalization
The act of translating a construct into its manifestation, for example translating the idea of your
treatment or program into the actual program, or translating the idea of what you want to measure
into the real measure. The result is also referred to as an operationalization, that is, you might
describe your actual program as an operationalized program. [#] Your translation of an idea or
construct into something real and concrete.
Operations Research
A quantitative approach taken to analyze and solve problems of complexity.
Opportunity cost
Value of the best alternative use of the project’s resources, that is, the value forgone by the
decision to invest in the project.
Ordinal level
Type of measurement that assigns observations to ordered categories.
Ordinal scale
A scale of measurement for a variable that has the properties of nominal data and can be used to
rank or order the data. Ordinal data may be non-numeric or numeric. [#] A scale that not only
categorizes the qualitative differences in the variable of interest, but also allows for the rank-
ordering of these categories in a meaningful way.
Outlier
A data point or observation that does not fit the pattern shown by the remaining data; an
unusually small or unusually large data value.
Oversample
Drawing a disproportionately large number of elements to assure an adequate number of
elements from small clusters or strata.
P
Symbol for probability that an observed inferential statistic occurred by chance (for example p <
.05).
P chart
A control chart used when the output of a process is measured in terms of the proportion
defective.
Paasche index
A weighted aggregate price index in which the weight for each item is its current-period quantity.
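The standard Paasche price index weights each item's price by its current-period quantity and compares the result with the base-period prices at the same quantities. The sketch below assumes that standard formula; the function name and the two-item basket are made up for illustration.

```python
def paasche_price_index(base_prices, current_prices, current_quantities):
    """Paasche price index: 100 * sum(p_t * q_t) / sum(p_0 * q_t)."""
    # Cost of the current-period basket at current prices.
    numerator = sum(p * q for p, q in zip(current_prices, current_quantities))
    # Cost of the same basket at base-period prices.
    denominator = sum(p * q for p, q in zip(base_prices, current_quantities))
    return 100.0 * numerator / denominator

# Hypothetical basket of two items.
index = paasche_price_index([2.0, 5.0], [3.0, 6.0], [10, 4])
```

With these figures the basket costs 54 at current prices and 40 at base prices, so the index is 135.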
Paired Comparisons
Respondents choose between two objects at a time, with the process repeated with a small
number of objects.
Panel Studies
Studies conducted over a period of time to determine the effects of certain changes made in a
situation, using a panel or group of subjects as the sample base.
Panel survey
Longitudinal survey design involving multiple interviews with the same subjects, who are known as a
panel.
Panel
Correlational design in which a group of subjects is surveyed or measured at more than one time
point; also the group itself is called a panel.
Paradigm
Shared framework involving common theory and data collection tools in which researchers
ordinarily approach scientific problems.
Paradigm shift
The revolution in assumptions about and perception of a research problem during which one
perception replaces another.
Paradigm
(Mertens, 2003) A conceptual model of a person’s worldview, complete with the assumptions
that are associated with that view. [#] (Caracelli and Green, 2003) Paradigms are social
constructions, historically and culturally embedded discourse practices, and therefore neither
inviolate nor unchanging.
Paradox
Apparent contradiction between two different theories, between two different observations, or
between a theory and observations.
Parallel mixed model design See >>> Concurrent Mixed Model Design.
Parallel-Form Reliability
That form of reliability which is established when responses to two comparable sets of measures
tapping the same construct are highly correlated.
Parameter
A numerical characteristic of a population, such as a population mean, a population standard
deviation, a population proportion, and so on.
Parametric Statistics
Statistics used to test hypotheses when the population from which the sample is drawn is
assumed to be normally distributed.
Parsimony
Efficient explanation of the variance in the dependent variable of interest through the use of a
smaller, rather than a larger number of independent variables.
Parsimony
Theory attribute of being simple or sparing of constructs and relationships.
Partial correlation
Measure of association between two variables after statistically controlling one or more other
variables. Order of correlation is the number of variables controlled (for example, zero-order is
simple correlation, first-order partial controls one variable, and so on).
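For the first-order case, the partial correlation can be computed directly from the three zero-order correlations. The sketch below uses the standard textbook formula; the function name and the correlation values are hypothetical.

```python
from math import sqrt

def first_order_partial(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for a single variable z."""
    # Remove the part of the x-y association that runs through z,
    # then rescale by the variance in x and y left unexplained by z.
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order correlations.
partial = first_order_partial(0.5, 0.6, 0.4)
```

When z is uncorrelated with both x and y, the partial correlation equals the simple (zero-order) correlation.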
Participant observation
A method of qualitative observation where the researcher becomes a participant in the culture or
context being observed. [#] Common qualitative research method in which the researcher enters
the social setting to be studied and actively joins the subjects in their normal activities.
Participant
Individuals whose physiological or behavioral characteristics and responses are the object of
study in a research project. Under federal regulations, human participants are defined as: living
individual(s) about whom an investigator conducting research obtains: (1) data through
intervention or interaction with the individual; or (2) identifiable private information.
Participant-Observer
A researcher who collects observational data by becoming a member of the system from which
data are collected.
Partitioning
The process of allocating the total sum of squares and degrees of freedom to the various
components.
Paternalism
Making decisions for others against or apart from their wishes with the intent of doing them good.
Path analysis
Diagram of a causal model that includes statistical estimates of relationships.
Path coefficient
Standardized regression coefficient from a multiple regression analysis that describes the
association between two variables in path analysis.
Pattern matching
The degree of correspondence between two data items. For instance, you might look at a pattern
match of a theoretical expectation pattern with an observed pattern to see if you are getting the
outcomes you expect.
Percent frequency distribution
A tabular summary of data showing the percentage of items in each of several non-overlapping
classes.
Percentile
A value such that at least p percent of the items are less than or equal to this value and at least
(100-p) percent of the items are greater than or equal to this value. The 50th percentile is the
median.
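One common position rule that satisfies this definition is sketched below: sort the data, compute the position p/100 times n, round up if it is not a whole number, and otherwise average the two neighbouring values. This rule is an assumption chosen for illustration; statistical packages often interpolate and may give slightly different answers.

```python
def percentile(values, p):
    """Return the p-th percentile (0 < p < 100) using a simple position rule."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    # Position i = p/100 * n, split into its whole part and remainder.
    whole, remainder = divmod(p * len(data), 100)
    whole = int(whole)
    if remainder:
        # Not a whole number: round up to the next position (1-based).
        return data[whole]
    # Whole number: average the values at positions i and i + 1.
    return (data[whole - 1] + data[whole]) / 2

median_odd = percentile([5, 1, 3, 2, 4], 50)
median_even = percentile([1, 2, 3, 4], 50)
```

In both calls the 50th percentile coincides with the median, as the definition states.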
Permission
The agreement of parent(s) or guardians to the participation of their child or ward in research.
Persuasive utilization
Evaluation use in which the research justifies decisions already made; also called symbolic
utilization.
Phenomenology
A philosophical perspective as well as an approach to qualitative methodology that focuses on
people’s subjective experiences and interpretations of the world. [#] Philosophical perspective
that emphasizes the discovery of meaning from the point of view of the studied group or
individual.
Pie chart
A graphical device for presenting data summaries based on sub-division of a circle into sectors
that correspond to the relative frequency for each class.
Pilot Study
A trial, both to examine the effectiveness of various aspects of the proposed research, such as
procedures for data gathering, and to aid the completion of detailed project plans. [#] Small scale
research with the experimental manipulation to determine its effectiveness before using it in the
main study.
Placebo
A chemically inert substance (e.g., sugar pills) given to control groups as if it were the medicine
or treatment for its psychologically suggestive effect; it is used in controlled clinical trials to
determine whether improvement and side effects may reflect imagination or anticipation rather
than actual power of a drug. [#] Intervention that simulates an authentic treatment but with no
active ingredient.
Plagiarism
Falsely claiming credit for work authored by another.
Point estimate
A single numerical value used as an estimate of a population parameter.
Point estimator
The sample statistic that provides the point estimate of the population parameter.
Pooled variance
An estimate of the variance of a population based on the combination of two (or more)
sample results. The pooled variance estimate is appropriate whenever the variances of two (or
more) populations are assumed equal.
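A short sketch of the usual pooled variance calculation follows. It weights each sample variance by its degrees of freedom, which is the standard formula when the population variances are assumed equal; the function name and the two samples are hypothetical.

```python
from statistics import variance

def pooled_variance(*samples):
    """Pooled variance: sum((n_i - 1) * s_i^2) / sum(n_i - 1)."""
    # Each sample variance is weighted by its degrees of freedom.
    numerator = sum((len(s) - 1) * variance(s) for s in samples)
    degrees_of_freedom = sum(len(s) - 1 for s in samples)
    return numerator / degrees_of_freedom

# Hypothetical samples with variances 4 and 2.5.
pooled = pooled_variance([2, 4, 6], [1, 2, 3, 4, 5])
```

The larger sample contributes more degrees of freedom, so the pooled estimate lies closer to its variance.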
Population Frame
A listing of all the elements in the population from which the sample is drawn.
Population parameter
A numerical value used as a summary measure for a population of data (e.g., the population
mean, the population variance, and the population standard deviation). [#] The mean or average
you would obtain if you were able to sample the entire population.
Population transferability
This refers to generalizability or applicability of inferences obtained in a study to other individuals
or entities. Subsumes the QUAN terms population validity and population external validity, and the
QUAL term transferability. See >>> Inference transferability.
Population validity (or population external validity) See >>> Inference transferability
Population
A group of persons that one wishes to describe or about which one wishes to generalize. To
generalize about a population, one often studies a sample that is
meant to be representative of the population. [#] The entire group
(or set or type) of people from which a researcher samples, and
to which she or he would ideally like to generalize. [#] The entire
group of people, events, or things that the researcher desires to
investigate. [#] The group you want to generalize to and the
group you sample from in a study. [#] The set of all elements of
interest in a particular study. [#] Collection of all elements to
whom survey results are to be generalized.
Positive relationship
A relationship between variables in which high values for one variable are associated with high
values on another variable and low values are associated with low values.
Positivism
An approach to knowledge based on the assumption of an objective reality that can be
discovered with observed data. [#] The philosophical position that the only meaningful inferences
are ones that can be verified through experience or direct measurement. Positivism is often
associated with the stereotype of the hard-headed, lab-coat scientist who refuses to believe in
something if it can’t be seen or measured directly.
Post-positivism
The rejection of positivism in favour of a position that one can make reasonable inferences about
phenomena based upon theoretical reasoning combined with experience-based evidence.
Posttest
A test given to subjects to measure the dependent variable after exposing them to a treatment.
Posttest-only non-experimental design
A research design in which only a posttest is given. It is referred to as nonexperimental because
no control group exists.
Power curve
A graph of the probability of rejecting Ho for all possible values of the population parameter not
satisfying the null hypothesis. The power curve provides the probability of correctly rejecting the
null hypothesis.
Power
The probability of rejecting Ho when it is false. [#] The probability of a statistical test correctly rejecting a
false null hypothesis (or 1- beta).
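To show how power (1 - beta) can be calculated in a simple case, the sketch below treats an upper-tailed z-test with a known population standard deviation. The test setup, function name, and numbers are assumptions made for the example.

```python
from statistics import NormalDist

def power_upper_tail_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test when the true mean is mu1."""
    std_normal = NormalDist()
    # Critical value that defines the rejection region under H0.
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # How far the true mean sits from mu0, in standard-error units.
    shift = (mu1 - mu0) / (sigma / n ** 0.5)
    # Probability that the test statistic falls in the rejection region.
    return 1 - std_normal.cdf(z_alpha - shift)

power = power_upper_tail_z(50, 52, 8, 64)
```

Evaluating this function over a range of `mu1` values traces out a power curve; when `mu1` equals `mu0` the result is simply alpha.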
Precision
The degree of closeness of the estimated sample characteristics to the population parameters,
determined by the extent of the variability of the sampling distribution of the sample mean.
Pre-coded Questions
These have a list of answers from which to choose in order to facilitate analysis, or to better
control the interview process. In a self-completion questionnaire the respondent chooses the
option or options. In an interview the options are either read out or shown to the respondent who
then chooses. In this type of question care must be taken that the options are exclusive and
exhaustive. The category 'Other' is often added in case the list is not complete, but keep in mind
that if there are possible answers which are not on your list, bias can ensue.
Precision statement
A probability statement about the sampling error.
Predictive Research
It is concerned with identifying indicators of future behaviour or demand in a population on the
basis of the current behaviour and demands of a sample. Predictive techniques use a number of
statistical approaches.
Predictive Study
A study that enables the prediction of the relationships among the variables in a particular
situation.
Predictive validity
A type of construct validity based on the idea that your measure is able to predict what it
theoretically should be able to predict. [#] The ability of the measure to differentiate among
individuals as to a criterion predicted for the future.
Pre-experiments
Class of experimental designs that are very vulnerable to threats to internal validity.
Pre-post nonequivalent groups quasi-experiment
A research design in which groups receive both a pre- and posttest and group assignment is not
randomized and therefore the groups may be nonequivalent, making it a quasi-experiment.
Pre-project Research
It is an activity undertaken in the planning stages of research to clarify issues such as the focus of
the research, its aims, access to sampling frame, likely response rate, most appropriate
methodology and means of analysis. Overlaps somewhat with a pilot study.
Pre-randomization
Class of experimental designs that are very vulnerable to threats to internal validity.
Present values
Value of future program benefits adjusted downward by some discount rate.
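The downward adjustment is usually done by dividing each future benefit by (1 + r) raised to the number of periods ahead, where r is the discount rate. The sketch below assumes benefits arrive at the end of each period; the function name and figures are hypothetical.

```python
def present_value(benefits, discount_rate):
    """Discount a stream of end-of-period benefits back to the present."""
    # A benefit received t periods from now is divided by (1 + r)^t.
    return sum(
        benefit / (1 + discount_rate) ** t
        for t, benefit in enumerate(benefits, start=1)
    )

# Hypothetical benefits of 110 after one year and 121 after two years.
pv = present_value([110.0, 121.0], 0.10)
```

At a 10 percent discount rate each of these two benefits is worth 100 today, so the present value is 200.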
Pretest sensitization
Production of changes in later interviews by the experience of a prior interview.
Pretest
A test given to subjects to measure the dependent variable before exposing them to a treatment.
Prevalence
Number of cases existing at some time.
Primary Data
Data collected firsthand for subsequent analysis to find solutions to the problem researched.
Primary Sources
A primary source is that which provides the initial basic data set under discussion, while a
secondary source is one which offers further analysis or commentary on the data. Generally it is
better, if you can, to make reference to primary sources.
Principal Investigator
The scientist or scholar with primary responsibility for the design and conduct of a research
project.
Prisoner
An individual involuntarily confined in a penal institution, including persons: (1) sentenced under
a criminal or civil statute; (2) detained pending arraignment, trial, or sentencing; and (3) detained
in other facilities.
Privacy
A person’s capacity to control the extent, timing, and circumstances of sharing oneself (physically,
behaviorally, or intellectually) with others.
Probabilistic
Based on probabilities.
Probabilistic equivalence
The notion that two groups, if measured infinitely, would on average perform identically. Note that
two groups that are probabilistically equivalent would seldom obtain the exact same average
score.
Probabilistic sampling
Any method of sampling for which the probability of each possible sample can be computed.
Probability distribution
A description of how the probabilities are distributed over the values the random variable can
assume.
Probability distribution
In inferential statistics, the likelihood of occurrence (or probability) of each level of the inferential
statistic for any number of degrees of freedom.
Probability function
A function, denoted by f(x), that provides the probability that x assumes a particular value for a
discrete random variable.
Probability Sample
A subset of the population chosen in such a way that every member of the population has a
known (nonzero) chance of being selected into the sample.
Probability sampling
Method of sampling that utilizes some form of random selection. [#] The sampling design in
which the elements of the population have some known chance or probability of being selected
as sample subjects.
Probability sampling
Sampling method in which all elements have equal probability of being drawn.
Problem Definition
A precise, succinct statement of the question or issue that is to be investigated.
Process analysis
Procedure for measuring selected grammatical or nonlexical forms in speech or text.
Producer’s risk
The risk of rejecting a good-quality lot. This is a Type I error.
Program audit
Nonroutine evaluation by an outsider of a program’s operation.
Program evaluation
Social research that judges a program’s success, usually in one or more of the following: program
impact, efficiency analysis, or utilization.
Program impact
Stage of evaluative research that determines whether the program has an effect.
Program monitoring
Stage of evaluative research that checks whether the program’s operation follows its plan.
Projective Methods
Ways of eliciting responses that are otherwise difficult to obtain, through such means as word
association, sentence completion, and thematic apperception tests.
Projective question
A hypothetically framed question. For example, you might ask the respondent how much money
people they know typically give in a year to charitable causes.
Projective tests
Measurement procedures by which subjects respond to ambiguous stimuli; presumed to reflect
significant personality characteristics.
Prompt
Blinking symbol on a computer screen showing readiness for the next step in the procedure.
Prospective Studies
Studies designed to observe outcomes or events that occur after the group of participants has
been identified. Prospective studies do not have to involve manipulation or intervention but may
be purely observational or involve only the collection of data instead.
Protocol
The formal design or plan of an experiment or research activity; specifically, the plan submitted to
an IRB for review and to an agency for research support. The protocol includes a description of
the research design or methodology to be employed, the eligibility requirements for prospective
participants and controls, the treatment regimen(s), and the proposed methods of analysis that
will be performed on the collected data.
Proxemics
Pertaining to interpersonal spacing, especially the study of communicative aspects, causes, and
effects of spacing.
Proxy-Pretest design
A post-only design in which, after the fact, a pretest measure is constructed from pre-existing
data. This is usually done to make up for the fact that the research did not include a true pretest.
Pseudo-effect
Apparent treatment effect caused by contrast with noncomparable control group.
Pseudoscience
Body of assertions that appears scientific because it involves observation, but is not for lack of
falsifiability, for example, astrology.
Psychometrics
Research devoted to evaluating and improving reliability and validity of social research measures.
Pure
Often used in the same way as 'Basic Research', though sometimes to imply the purity of
methodological approach (that is, an emphasis on what is methodologically correct with minimal
compromise with practical issues).
Purposive Sampling
A non-probability sampling design in which the required information is gathered from special or
specific targets or groups of people on some rational basis. [#] Non-probability sampling method
that involves choosing elements with certain characteristics.
Purposiveness in Research
The situation in which research is focused on solving a well-identified and defined problem, rather
than aimlessly looking for answers to vague questions.
p-value
The probability, when the null hypothesis is true, of obtaining a sample result that is at least as
unlikely as what is observed. It is often called the observed level of significance.
Qualitative data
Data in which the variables are not in a numerical form, but are in the form of text, photographs,
sound bites, and so on. [#] Data that are labels or names used to identify an attribute of each
element. Qualitative data may be non-numeric or numeric.
[#] Data that are not immediately quantifiable unless they are coded and categorized in some
way.
Qualitative measures
Data not recorded in numerical form.
Qualitative Research
(a) When referring to variables, "qualitative" is another term for categorical or nominal. (b) When
speaking of kinds of research, "qualitative" refers to
studies of subjects that are hard to quantify, such as art history. Qualitative research tends to be
a residual category for almost any kind of non-quantitative research. [#] The collection of non-
numerical data. Often multi-method in focus, qualitative research involves an interpretive,
meaning-driven approach to its subject matter. [#] Social research based on observations made
in the field and analyzed in nonstatistical ways.
Qualitative Study
Research involving analysis of data / information that are descriptive in nature and not readily
quantifiable.
Qualitative variable
A variable that is not in numerical form.[#] A variable with qualitative data.
Qualitative variables
Type of variable for which observations are assigned to levels rather than given precise
quantitative values, for example, religious preference.
Qualitative
These data will normally be presented discursively (though multi-media is increasingly used) and
will focus on depth and subtlety in a single or small number of settings rather than counting
characteristics over a larger number of settings or responses from more people. This method can
provide a rich and more in-depth data set. Researchers will often use qualitative methods to
complement quantitative methods and vice versa.
Qualitizing
This is the process by which quantitative data are transformed into data that can be analyzed
qualitatively.
Quality control
A series of inspections and measurements that determine whether quality standards are being
met.
Quantitative
These data can be stored in reasonably well-defined categories, and in sufficient volume (i.e.,
number of responses) to permit tabular and cross-tabular presentations, and possibly statistical
analysis. In other words it is about counting and offering findings as numbers or percentages. The
strength of this approach lies in the precision and clarity with which findings can be stated, and
the scope which exists (via appropriate statistical tests) for establishing general validity. In some
sectors statistical presentation is respected more than any other format.
Quantitative data
Data that appear in numerical form. [#] Data that indicate how much or how many of something.
Quantitative data are always numeric.
Quantitative measurement
Collecting and reporting observations numerically.
Quantitative Research
Said of variables or research that can be handled numerically. Usually (too sharply) contrasted
with qualitative variables and research. [#] The collection of numerical data in order to describe,
explain, predict and/or control phenomena of interest.
Quantitative variable
Data in the form of numbers. [#] A variable with quantitative data.
Quantitative Variables
Type of variable for which observations are assigned measured values for example, temperature.
Quantity index
An index that is designed to measure changes in quantities over time.
Quartiles
The 25th, 50th and 75th percentiles, referred to as the first quartile, the second quartile (median)
and third quartile, respectively. Quartiles can be used to divide the data set into four parts, each
part containing approximately 25% of the data.
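As a brief illustration, Python's standard library can return the three quartile cut points directly. The data below are invented, and the library's default interpolation rule is only one of several conventions, so other software may report slightly different quartiles for the same data.

```python
from statistics import quantiles

# Hypothetical data set of seven observations.
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = quantiles(data, n=4)

# Spread of the middle half of the data.
interquartile_range = q3 - q1
```

The second quartile `q2` is the median of the data set.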
Quasi-experiment
A type of research design for conducting studies in field or real-life situations where the
researcher may be able to manipulate some independent variables but cannot randomly assign
subjects to control and experimental groups. The procedures of quasi-experimentation were
developed mainly in the context of evaluation research projects. For example, you cannot cut off
someone's unemployment benefits to see how well he or she could get along without them or to
see whether an alternative job-training program would be more effective for some unemployed
persons. But you could try to find volunteers for the new program. You could compare the results
for the volunteer group (experimental group) with those of people in the regular program (control
group). The study is quasi-experimental because you were unable to assign subjects at random
to treatment and control groups. [#] An experimental design that is missing one or more aspects
of the (classic) controlled experiment.
Questionnaire
A group of written questions to which subjects respond. Some restrict the use of the term
"questionnaire" to written responses.
Quasi-experimental design
Experimental approach in which the researcher does not assign subjects randomly to treatment
and control conditions.
Quasi-experimental designs
Research designs that have several of the key features of randomized experimental designs,
such as pre-post measurement and treatment-control group comparisons, but lack random
assignment to treatment group. [#] Research designs that look like randomized or true
experiments (they have multiple groups and pre-post measurement) but use nonrandom
assignment to assign the groups.
Quasi-Experimental
A quasi-experiment is an experiment in which a potential cause (independent variable) has been
manipulated, but conditions do not permit the use of a random selection of research subjects
and/or the effective control of extraneous variables. Most field research which seeks to be an
experiment is likely to fall into the quasi-experimental category. See also Experimental research.
Questionnaire
A questionnaire comprises the questions to be asked of respondents. There are three main types:
questionnaires to be used in face to face or telephone interviews; self completion questionnaires,
which are read, completed and returned by respondents; and computer administered
questionnaires, which allow more complex question patterns than paper questionnaires. [#] A
data collection method in which participants read and answer questions in a written format. [#] A
preformulated written set of questions to which the respondent records the answers, usually
within rather closely delineated alternatives.
Quota Sampling
A form of purposive sampling in which a predetermined proportion of people from different
subgroups is sampled. [#] Any sampling method where you sample until you achieve a specific
number of sampled units for each subgroup of a population.
Quota sampling
Nonprobability sampling method that creates a sample matching a predetermined
demographic profile.
R chart
A control chart used when the output of a process is measured in terms of the range of a variable.
R
Symbol used in design diagrams to represent random assignment to groups.
Random assignment
Method of placing subjects in different conditions so that each subject has an equal chance of
being in any group, thus avoiding systematic subject differences between the experimental and
control groups. [#] Process of assigning your sample into two or more subgroups by chance.
Procedures for random assignment can vary from flipping a coin to using a table of random
numbers to using the random number capability built into a computer.
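Using a computer's random number capability for this purpose might look like the sketch below. The function name, group labels, and subject identifiers are hypothetical; the approach shuffles the subjects and then deals them out in turn so that group sizes stay as equal as possible.

```python
import random

def randomly_assign(subjects, groups=("treatment", "control"), seed=None):
    """Shuffle subjects and deal them into groups of near-equal size."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for position, subject in enumerate(shuffled):
        # Deal subjects to the groups in rotation after shuffling.
        assignment[groups[position % len(groups)]].append(subject)
    return assignment

# Ten hypothetical subjects identified by number.
assignment = randomly_assign(range(1, 11), seed=42)
```

Recording the seed alongside the study records allows the assignment to be reproduced and audited later.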
Random error
Random deviation, which tends to average to zero over numerous sample subjects or items.
Random Sample
A specific type of probability sample in which participants are selected from a population list using
a table of random numbers or a random number generator. (A random
sample requires a list of population members in which each member can
be assigned a discrete number.) The assignment of participants to
different treatments, interventions, or conditions according to chance,
rather than systematically. Random assignment of participants increases
the probability that differences observed between participant groups are
the result of the experimental intervention.
Random sampling
Drawing a representative group from a population by a method that gives every member of the
population an equal chance of being drawn.
Random selection
Process or procedure that assures that the different units in your population are selected by
chance.
Random variable
A numerical description of the outcome of an experiment.
Randomization
The process of controlling the nuisance variables by randomly assigning members among the
various experimental and control groups, so that the confounding variables are randomly
distributed across all groups.
Range
The highest value minus the lowest value. [#] A measure of variability, defined as the largest
value minus the smallest value. [#] The spread in a set of numbers indicated by the difference in
the two extreme values in the observations. [#] A measure of variability consisting of the span
between the lowest and highest scores.
Ranking Scale
Scale used to tap preferences between two or among more objects or items.
Rapport
Trusting relationship between interviewer and interviewee.
Rating scale
Scale with several response categories that evaluate an object on a scale.
Ratio level
Type of measurement that assigns scores on a scale with equal intervals and a true zero point.
Ratio scale
A scale of measurement for a variable that has all the properties of interval data and for which
the ratio of two values is meaningful. Ratio data are always numeric.
Ratio Scale
A scale that has an absolute zero origin, and hence indicates not only the magnitude, but also the
proportion of the differences.
Reactivity
Extent to which a measure causes a change in the behaviour of the subject.
Realism
In ontology, the view that the sources of our perceptions are real and not fictions.
Recall-dependent Question
Questions that elicit from the respondents information that involves recall of experiences from the
past that may be hazy in their memory.
Reciprocal causation
A two-way causal connection between two constructs or variables in which each causes the
other.
Record Sheets
The medium on which your recorded data will appear, whether on paper, through the computer or
via audio or video. The record may be anything from a blank page on which you write, through to
a structured recording schedule where you perhaps just have to fill in relevant ticks or ring
numbers, like the pre-coded answer block on a questionnaire. (See also Optical Mark Reader
System.)
References
In the context of a research report a reference is a formal system for drawing attention to a
literature source, usually published, both in the report itself (for instance, when you want to
identify the source of a quotation) and in the bibliography or reading list at the end of the report.
Regression analysis
A general statistical analysis that enables us to model relationships in data and test for treatment
effects. In regression analysis, we model relationships that can be depicted in graphic form with
lines that are called regression lines.
Regression equation
The equation that describes how the mean or expected value of the dependent variable is related
to the independent variable.
Regression line
A line that describes the relationship between two or more variables.
Regression model
The equation describing how y is related to x and an error term.
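For the simplest case of one independent variable, the model y = b0 + b1*x + error can be estimated by ordinary least squares. The sketch below assumes that estimation method; the function name and the data points are hypothetical.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x + error."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data lying exactly on the line y = 1 + 2x.
intercept, slope = fit_simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

The fitted intercept and slope define the estimated regression equation, and the straight line they describe is the regression line.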
Regression threat
A statistical phenomenon that causes a group’s average performance on one measure to regress
towards or appear closer to the mean of that measure than anticipated or predicted. Regression
occurs whenever you have a nonrandom sample from a population and two measures that are
imperfectly correlated. A regression threat will bias your estimate of the group’s posttest
performance and can lead to incorrect causal inferences.
Regression-Discontinuity (RD)
A pretest-posttest, program-comparison group quasi-experimental design in which a cutoff
criterion on the preprogram measure is the method of assignment to group.
Regularity theory
Causation is shown by a nonspurious association between two variables, a view of causation that
requires a weak test.
Rejection region
The range of values that will lead to the rejection of a null hypothesis.
Relationship
Refers to the correspondence between two variables.
Relative efficiency
Given two unbiased point estimators of the same population parameter, the point estimator with
the smaller standard deviation is more efficient.
Relevance
It is about the closeness with which the data being gathered feeds into the aims of the study.
Reliability
The degree to which a measure is consistent or dependable; the degree to which it would give
you the same result over again, assuming the underlying phenomenon is not changing. [#]
Attests to the consistency and stability of the measuring instruments. [#] The degree to which a
measure yields consistent results. [#] The consistency or stability of a measure or test from one
use to the next. When repeated measurements of the same thing give identical or very similar
results, the measure is said to be reliable.
Reliability coefficient
Estimate of the extent to which a measure is free of random error, usually arrived at by correlating
measures of the same type, for example, inter-item, interrater, parallel-form, or test-retest.
Reliability
Extent to which a measure reflects systematic or dependable sources of variation rather than
random error.
Remuneration
Payment for participation in research; this is different from compensation, which typically refers to
payment for research-related injuries.
Repeated measures
Two or more waves of measurement over time.
Replicability
The repeatability of similar results when identical research is conducted at different times or in
different organizational settings.
Replicability/Replication
As the name suggests, replication is the process of repeating a study undertaken by someone
else, in the sense of using the same methodology. Commonly the location and research subjects
will be different, though sometimes studies return to the same group of subjects after a period of
time has passed - e.g. with child development studies. A good research report always includes
enough information on the methods used to enable someone else to carry out a replication.
Replication
Repetition of a study to see if the same results are obtained. [#] The number of times each
experimental condition is repeated in an experiment.
Representative Sample
A sample in which the participants closely match the characteristics of the population, and thus,
all segments of the population are represented in the sample. A representative sample allows
results to be generalized from the sample to the population.
Research
An organized, systematic, critical, scientific inquiry or investigation into a specific problem,
undertaken with the objective of finding answers or solutions thereto.
Research Design
The science and art of planning procedures for conducting studies so as to get the most valid
findings. Called "design" for short. When designing a research study, one draws up a set of
instructions for gathering evidence and for interpreting it.
Research Mindedness
Includes the following essential elements: A faculty for critical reflection informed by knowledge
and research; An ability to use research to inform practice which counters unfair discrimination,
racism, poverty, disadvantage and injustice, consistent with core social work values; An
understanding of the process of research and the use of research to theorise from practice.
Research Plan
This is the researcher's guidebook for the project, and the yardstick against which the various
stages of progress can be judged. It states the outputs to be delivered and the timescale.
Research Population [or its derivatives such as 'survey population' and 'experimental
population']
Is the total number of potential subjects for your research. If this population is larger than you
need or can cope with, then you should use a rational and unbiased process for reducing the
number (sampling).
Research proposal
A document that sets out the purpose of the study and the research design details of the
investigation to be carried out by the researcher.
Research question
The central issue being addressed in the study, which is typically phrased in the language of
theory.
Research
A systematic investigation (i.e., the gathering and analysis of information) designed to develop or
contribute to generalizable knowledge.
Researcher Interference
The extent to which the person conducting the research interferes with the normal course of work
at the study site.
Resentful demoralization
A social threat to internal validity that occurs when the comparison group knows what the
program group is getting and instead of developing a rivalry, control group members become
discouraged or angry and give up.
Residual analysis
The analysis of the residuals used to determine whether the assumptions made about the
regression model appear to be valid. Residual analysis is also used to identify unusual and
influential observations.
Residual Error
In regression analysis, the distance or deviation between the estimated value from the regression
line and the actual value of the outcome variable for any value of the predictor variable; also
called residuals or deviations.
Residual path
In path analysis, the causal path representing the effect on the outcome variable of all variables
not specified in the study.
Residual plots
Graphical representations of the residuals that can be used to determine whether the
assumptions made about the regression model appear to be valid.
Residual
The difference between the observed value of the dependent variable and the value predicted
using the estimated regression equation. [#] The vertical distance from the regression line to
each point. The residual in regression analysis refers to the portion of the outcome or dependent
variable that you cannot predict with your regression equation.
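The regression entries above (regression equation, regression line, residual) can be illustrated with a small worked sketch. The data below are invented for illustration, and ordinary least squares is assumed as the fitting method, since the glossary does not name one.

```python
# Illustrative data (made up): x is the predictor, y is the outcome.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope b and intercept a of the line y_hat = a + b * x
# (assumption: ordinary least squares is the estimation method).
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x

# Predicted values from the estimated regression equation.
predicted = [a + b * xi for xi in x]

# Residual = observed value minus predicted value (vertical distance to the line).
residuals = [yi - pi for yi, pi in zip(y, predicted)]
```

With a least-squares fit that includes an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check before moving on to residual plots.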
Respondents
Research participants, who fill out a survey, are interviewed, participate in an experiment, are
observed in a naturalistic setting, or who are otherwise studied.
Response brackets
A question response type that includes groups of answers, such as between 30 and 40 years old,
or between $50,000 and $100,000 in annual income.
Response format
The format you use to collect the answer from the respondent.
Response Rate
The proportion of people asked to take part in research who actually become respondents. Non-
response occurs when you have selected a sample and some of them do not provide data - for all
sorts of reasons, but basically through your inability to make contact or their refusal. Usually face-
to-face surveys will have a non-response rate of around 25% and postal surveys nearer 50%.
Response scale
A sequential numerical response format, such as a 1-to-5 rating format.
Response set
Tendency of a person to answer items in a way designed to produce a preferred image (for
example, social desirability); can reduce construct validity.
Response style
Person’s habitual manner of responding to test items that is independent of item content (for
example, acquiescence); can reduce the construct validity of the measure.
Response
A specific measurement value that a sampling unit supplies.
Retrospective Studies
Research conducted by reviewing records from the past (e.g., birth and death certificates,
medical records, school records, or employment records) or by obtaining information about past
events elicited through interviews or surveys. Case studies are an example of this type of
research.
Reverse causation
Threat to internal validity for an observed association in which the causal direction is opposite to
that hypothesized.
Right to service
The ethical issue involved when participants do not receive a service that they would be eligible
for if they were not in your study. For example, the member of a control group might not receive a
drug because they are in a study.
Rigor
The theoretical and methodological precision adhered to in conducting research.
Risk
The probability of harm or injury (physical, psychological, social, or economic) occurring as a
result of participation in a research study. Both the probability and magnitude of possible harm
may vary from minimal to significant (See >>> Minimal Risk).
Robust
Relative immunity of an inferential statistic to violation of its assumptions.
Rules of integration
(Erzberger & Kelle, 2003) A set of rules can be formulated that may be helpful for drawing
inferences from the results of qualitative and quantitative studies in mixed methods designs.
These rules should be understood as general guidelines whose significance varies according to the
research question, the empirical domain under investigation, and the specific methods employed.
Erzberger and Kelle list eight rules of integration. >>> Inference
Sample bias
When numerous samples are on average unrepresentative of the population.
Sample error
Unavoidable, random deviations of different sample estimates from each other.
Sample point
An element of the sample space. A sample point represents an experimental outcome.
Sample population
The population from which the sample is taken.
Sample size
The actual number of subjects chosen as a sample to represent the population characteristics.
Sample space
The set of all experimental outcomes.
Sample stages
Steps at which elements or clusters are drawn as part of the sampling design.
Sample statistic
A numerical value used as a summary measure for a sample (e.g. the sample mean, the sample
variance and the sample standard deviation). The value of the sample statistic is used to estimate
the value of the population parameter.
Sample
A subset of a given population used for research purposes. [#] A subset of the population. [#] A
subset or subgroup of the population. [#] Subset of individuals selected from a larger population.
[#] The actual units you select to participate in your study.
Sampling distribution
The theoretical distribution of an infinite number of samples of the population of interest in your
study. [#] A probability distribution consisting of all possible values of the sample statistic.
Sampling error
Error in measurement associated with sampling. [#] The absolute value of the difference
between an unbiased point estimator and the corresponding population parameter. It is the error
that occurs because a sample, and not the entire population, is used to estimate a population
parameter.
Sampling Frame
Available list of elements from which sample can actually be
drawn, usually not a complete enumeration. [#] The list from
which you draw your sample. In some cases, there is no list; you
draw your sample based upon an explicit rule. For instance, when
doing quota sampling of passers-by at the local mall, you do not
have a list per se, and the sampling frame is the population of people
who pass by within the time frame of your study and the rule(s)
you use to decide whom to select. [#] The sampling frame is
what you have when you have checked out your research population (that is, all potential
respondents), and have left the names and contact details of all of those from the research
population who genuinely can become respondents, if they are willing.
Sampling model
A model for generalizing in which you identify your population, draw a fair sample and conduct
your research, and finally generalize your results to other population groups.
Sampling Unit
Either the element or grouping of elements selected at a sampling stage. [#] The units selected for
sampling. A sampling unit may include several elements.
Sampling
The process by which you reduce the total number of possible respondents for a research project
(the research population) to a number which is practically feasible and theoretically acceptable
(the sample). [#] The process of selecting items from the population so that the sample
characteristics can be generalized to the population. Sampling involves both design choice and
sample size decisions.
Sampling-Non random
'Not random' sampling means that the principle of randomness has not been maintained in the
selection of a sample. Often it involves structured sampling whereby the sample group is carefully
matched to the overall population on key variables. 'Non-random sampling' is often convenient, or
the only approach possible in the circumstances.
Sampling-Random
The aim of random sampling is to combine chance (that everyone in the frame has the same
chance of being chosen) with balance (that the chosen sample will be an accurate microcosm of
the research population as a whole).
Scalar structure
Hierarchical pattern in a set of items in which harder or less frequently chosen items are chosen
only if easier or more frequently chosen items are also chosen by most respondents.
Scale
A group of related measures of a variable. The items in a scale are arranged in some order of
intensity or importance. A scale differs from an index in that the items in an index need not be in a
particular order, and each item usually has the same weight or importance. [#] A tool or
mechanism by which individuals, events, or objects are distinguished on the variables of interest in
some meaningful way.
Scaling
The branch of measurement that involves the construction of an instrument that associates
qualitative constructs with quantitative metric units.
Scatter diagram
A graphical presentation of the relationship between two quantitative variables. The independent
variable is shown on the horizontal axis and the dependent variable is shown on the vertical axis.
Scattergram
Graphic presentation of an association in which each point indicates the two scores on an
individual.
Scenario writing
A qualitative forecasting method that consists of developing a conceptual scenario of the future
based on a well-defined set of assumptions.
Scientific Investigation
A step-by-step, logical, organized and rigorous effort to solve problems.
Search Engine
Software program designed to search and locate information through “keywords”, typically in
documents on the world wide web.
Secondary analysis
Analysis that makes use of already existing data sources.
Secondary Data
Data that have already been gathered by researchers, data published in statistical and other journals,
and information available from any published or unpublished source available either within or
outside the organization, all of which might be useful to the researcher.
Secondary source
A primary source is that which provides the initial basic data set under discussion, while a
secondary source is one which offers further analysis or commentary on the data. For example,
the primary source for demographic data in the UK is likely to be the publications of the Office of
Population and Census Studies - OPCS - while there are many secondary sources which make
use of OPCS output. Generally it is better, if you can, to make reference to primary sources.
Selection Bias
A bias in the way the experimental and/or comparison groups are selected, resulting in
pre-existing differences between the groups that may serve as confounding factors.
Selection Effects
The threat to internal validity that is a function of improper or unmatched selection of subjects for
the experimental and control groups.
Selection
Group threat to internal validity in which differences observed between groups at the end of the
study existed prior to the intervention because of the way members were sorted into groups.
Selection-by-time interaction
Group internal validity threat in which subjects with different likelihoods of experiencing
time-related changes (for example, maturation or history) are placed into different groups.
Selection-history threat
A threat to internal validity that results from any other event that occurs between pretest
and posttest that the groups experience differently.
Selection-instrumentation
A threat to internal validity that results from differential changes in the test used for each group
from pretest to posttest.
Selection-maturation threat
A threat to internal validity that arises from any differential rate of normal growth between pretest
and posttest for the groups.
Selection-mortality
A threat to internal validity that arises when there is differential nonrandom dropout between
pretest and posttest.
Selection-regression
A threat to internal validity that occurs when there are different rates of regression to the mean in
the groups.
Selection-testing threat
A threat to internal validity that occurs when a differential effect of taking the pretest exists
between groups on the posttest.
Selective coding
Systematic coding with respect to a previously developed core concept.
Semantic differential
A scaling method in which an object is assessed by the respondent on a set of bipolar adjective
pairs.
Sequential explanatory design
This design “is characterized by the collection and analysis of quantitative data followed by the
collection and analysis of qualitative data. Priority is typically given to the quantitative data, and
the two methods are integrated during the interpretation phase of the study.”
Sequential exploratory design
This design “is characterized by an initial phase of qualitative data collection and analysis,
followed by a phase of quantitative data collection and analysis. Therefore, the priority is given
to the qualitative aspects of the study.”
Serial correlation
Assumption made in the regression analysis of a time series that the residuals or differences
between estimated and actual values will be uncorrelated; such correlation is called serial
correlation of the errors and requires special care in the analysis. [#] Same as auto-correlation
Setting
Experimental arrangement of the study or, more broadly, the larger social context in which the
study takes place.
Shadow prices
Estimated monetary value of program resources that are otherwise unpriced (for example,
volunteer time), based on estimated value in the marketplace.
Sign test
A non-parametric statistical test for identifying differences between two populations based on the
analysis of nominal data.
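As a rough sketch of how a sign test can be computed: count how many paired comparisons favour one option ("plus") and how many favour the other ("minus"), then ask how surprising that split would be if both were equally likely. The function name `sign_test`, the two-sided form, and the convention of dropping ties beforehand are assumptions made for this illustration.

```python
from math import comb

def sign_test(plus, minus):
    """Two-sided sign test p-value under the null hypothesis P(plus) = 0.5.

    Assumption: ties have already been dropped, so n = plus + minus.
    """
    n = plus + minus
    k = min(plus, minus)
    # Probability of a split at least as lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)

# Example: 8 respondents prefer brand A, 2 prefer brand B.
p_value = sign_test(8, 2)
```

A small p-value suggests the two populations differ; an even split such as 5 versus 5 gives a p-value of 1.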
Simulation
A model-building technique for assessing the possible effects of changes that might be
introduced in a system.
Single-Blind Design
Typically, a study design in which the investigator, but not the participant, knows the identity of
the treatment assignment. Occasionally the participant, but not the investigator, knows the
assignment.
Single-factor experiment
An experiment involving only one factor with k populations or treatments.
Single-group threat
A threat to internal validity that occurs in a study that uses only a single program or treatment
group and no comparison or control.
Single-option variable
A question response list from which the respondent can check only one response.
Single-subject design
Also called n = 1 design; usually a multiple-intervention design applied to a single subject.
Site Visit
A visit by agency officials, representatives, or consultants to the location of a research activity to
assess the adequacy of IRB protection of human participants or the capability of personnel to
conduct the research.
Skepticism
Attitude of doubting and challenging assertions.
Skewness
Degree of departure from symmetry (that is, one side or tail of the distribution is longer than the
other). If the longer tail is to the right, it is called positive skew; to the left, negative skew.
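One common way to put a number on skewness is the moment coefficient, the third central moment divided by the 1.5 power of the second. The sketch below assumes that particular formula (other definitions exist) and uses made-up data.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = skewness([1, 2, 3, 4, 5])      # no skew: tails are balanced
right_tail = skewness([1, 2, 2, 3, 10])    # longer tail to the right: positive
left_tail = skewness([-10, -3, -2, -2, -1])  # longer tail to the left: negative
```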
Slope
In regression analysis, the angle of the best-fitting line, that is, how many units on the vertical
axis the line rises or falls for each unit on the horizontal axis; commonly symbolized by the letter
b. [#] The change in y for a change in x of one unit.
Smoothing constant
A parameter of the exponential smoothing model that provides the weight given to the most
recent time series value in the calculation of the forecast value.
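The role of the smoothing constant is easiest to see in the usual update rule: new forecast = alpha × latest value + (1 − alpha) × previous forecast. The sketch below assumes this simple (single) exponential smoothing form and, as a further assumption, starts the forecast at the first observation; the function name is hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed values; entry i is the forecast for period i + 1.

    alpha is the smoothing constant (0 <= alpha <= 1): the weight given to
    the most recent observation.
    """
    forecast = series[0]  # assumption: initialise with the first observation
    forecasts = [forecast]
    for value in series[1:]:
        forecast = alpha * value + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts

smoothed = exponential_smoothing([10, 20, 30], alpha=0.5)
```

A larger alpha makes the forecast follow recent values more closely; with alpha = 1 the forecast is simply the latest observation.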
Snowball Sample
A non-probability sample that is created by using members of the group of interest to identify
other members of the group (for example, asking a participant at the end of an interview for
suggestions about who else to interview).
Snowball sampling
A sampling method in which you sample participants based upon referral from prior participants.
Social Desirability
The respondents’ need to give socially or culturally acceptable responses to the questions posed
by the researcher even if they are not true.
Social discount rate
Discount rate used in adjusting future returns to social programs, usually lower than the prevailing
private investment rate of return.
Social Experimentation
Systematic manipulation of, or experimentation in, social or economic systems; used in planning
public policy.
Social indicators
Regular reports on the psychological and social well-being of the population similar to what
economic indicators such as unemployment rates do for economic well-being.
Social significance
In contrast to statistical significance, the societal value or importance placed on a study or its
outcome.
Sociometry
Measurement approach that describes a person’s social relationships from the number of
“choices” of that person made by others.
Soft data
Data such as people's ideas and opinions. A characteristic of qualitative research.
Software
Technology that is capable of designing programs to meet the different computing needs of
individuals and companies.
Specification error
In regression analysis, omission of an important causal variable; can lead to mis-estimation of the
relationships among variables included in the analysis.
Split-Half Reliability
Coefficient between one half of the items measuring a concept and the other half.
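A minimal sketch of a split-half calculation follows. The response data are invented, the odd/even split of items is one arbitrary choice among many, and the coefficient is assumed to be the Pearson correlation between the two half-scores.

```python
from math import sqrt

def pearson_r(u, v):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(ss_u * ss_v)

# Each row is one respondent's answers to a four-item scale (made-up data).
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

half_a = [row[0] + row[2] for row in responses]  # items 1 and 3
half_b = [row[1] + row[3] for row in responses]  # items 2 and 4
split_half_r = pearson_r(half_a, half_b)
```

A coefficient close to 1 indicates that the two halves rank respondents in much the same way.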
Spontaneous remission
Apparent natural improvement of control subjects that may be due in part to compensatory
contamination, that is, their acquisition of unreported treatment.
Spuriousness
Two variables associated because both are caused by another variable.
Stability of a Measure
The ability of the measure to repeat the same results over time with low vulnerability to changes in the
situation.
Stakeholders
People with an interest (stake) in the research. Examples are subjects of the research, including
service users and carers, researchers, agency staff, funding bodies and commissioners of
research. Politicians and policy makers are also increasingly research stakeholders.
Standard error
The spread of the averages around the average of averages in a sampling distribution. [#] The
standard deviation of a point estimator.
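For the sample mean, the standard error is usually estimated as the sample standard deviation divided by the square root of the sample size. The short sketch below applies that formula to made-up data; it covers only the mean, not other point estimators.

```python
from math import sqrt
from statistics import stdev

sample = [4, 8, 6, 5, 7]  # illustrative observations

# Estimated standard error of the mean: s / sqrt(n),
# where s is the sample standard deviation (n - 1 in the denominator).
standard_error = stdev(sample) / sqrt(len(sample))
```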
Standard variable
In social science research, especially in survey analysis, there are a range of variables which are
usually considered 'standard' or 'key', in the sense that some analysis is undertaken in relation to
each of them. The list will change according to the specific research project, but may well include
such items as age, gender, socio-economic group, ethnicity, employment, family background,
housing.
Standardization
Arrangement of measurement procedures so that they will be identical (or nearly so) when
applied to different subjects, at different times, or by different raters; also, in experimentation, the
reduction of human experimenter variability in treatment of subjects by use of a fixed script.
Standardized residual
The value obtained by dividing a residual by its standard deviation.
Stapel Scale
A scale that measures both the direction and intensity of the attributes of a concept.
Static Panel
A panel that consists of the same group of people serving as subjects over an extended period of
time for a research study.
Statistical Regression
The threat to internal validity that results when various groups in the study have been selected on
the basis of extreme (very high or very low) scores on some important variables.
Statistic
A value that is estimated from data.
Statistical inference
The process of using data obtained from a sample to make estimates or test hypotheses about the
characteristics of a population.
Statistical power
The probability of correctly concluding that there is a treatment or program effect in your data.
Statistical probability
Being able to draw a conclusion from a sample and generalising it to a wider population.
Statistical significance
Tests of statistical significance, of which the best known is probably the Chi-square test, assess
the probability that a finding arose by chance. Where a research sample has been used, it is
important to know whether the findings are valid or came about by chance. [#] In inferential
statistics, the judgment that a finding was not due to chance. [#] The judgment that the
difference between the outcomes of the control and experimental groups is great enough that it is
unlikely to be due solely to chance, so that the null hypothesis can be rejected at a
predetermined significance level (0.05 or 0.01).
Statistical Tests
Researchers use statistical tests to make quantitative decisions about whether a study’s data
indicate a significant effect from the intervention and allow the researcher to reject the null
hypothesis. That is, statistical tests show whether the differences between the outcomes of
the control and experimental groups are great enough to be statistically significant. If differences
are found to be statistically significant, it means that the probability (or likelihood) that these
differences occurred solely due to chance is relatively low. Most researchers agree that a
significance value of .05 or less (differences this large would arise by chance alone no more than 5% of the time if there were no real effect)
sufficiently determines significance.
Statistics
Numerical summaries of observations, either descriptive or inferential. [#] The art and science of
collecting, analyzing, presenting and interpreting data.
Stem-and-leaf display
An exploratory data analysis technique that simultaneously ranks quantitative data and provides
insights into the shape of the distribution.
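A small sketch shows how such a display can be built. It assumes non-negative whole numbers, with the tens as the stem and the units as the leaf; the function name is hypothetical.

```python
def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf display (stem = tens, leaf = units).

    Assumption: values are non-negative integers.
    """
    rows = {}
    for v in sorted(values):          # sorting ranks the data
        stem, leaf = divmod(v, 10)
        rows.setdefault(stem, []).append(str(leaf))
    return ["%d | %s" % (stem, " ".join(leaves))
            for stem, leaves in sorted(rows.items())]

display = stem_and_leaf([38, 23, 51, 42, 25, 31, 47, 38])
for row in display:
    print(row)
# prints:
# 2 | 3 5
# 3 | 1 8 8
# 4 | 2 7
# 5 | 1
```

The length of each row shows how many observations fall in that range, so the display doubles as a sideways histogram.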
Stories
From a research viewpoint a story is a record of an event in a respondent's life told by the
respondent from his/her own perspective in his/her own words.
Stratification
Categorization of elements having some common characteristic. The group of all elements having
such a common characteristic is called a stratum, for example males or females.
Stratified Sample
A probability sample that is determined by dividing the population into groups or strata defined by
the presence of certain characteristics and then random sampling from each of the strata. This is
a good way to make sure that a student sample is racially diverse (for instance). [#] A method of
selecting a sample in which the population is first divided into strata and a simple random sample
is then taken from each stratum.
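The two steps in this definition, dividing the population into strata and then sampling at random within each, can be sketched as follows. The function name, the equal allocation of `per_stratum` units to every stratum, and the student list are all assumptions made for illustration; proportional allocation is an equally valid choice.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Draw a simple random sample of `per_stratum` units from each stratum.

    stratum_of: function mapping a unit to the name of its stratum.
    Raises ValueError (from random.sample) if a stratum is too small.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample

# Made-up population: 30 students, 10 rural and 20 urban.
students = [("s%02d" % i, "urban" if i % 3 else "rural") for i in range(30)]
chosen = stratified_sample(students, lambda s: s[1], per_stratum=4)
```

Sampling within each stratum guarantees that the smaller group is represented, which simple random sampling from the whole list would not.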
Stress index
An index used in multidimensional scaling that ranges from 1 (worst fit) to 0 (perfect fit).
Structural Variables
Factors related to the form and design of the organization such as the roles and positions,
communication channels, control systems, reward systems, and span of control.
Structured interview
An IDI that often uses a detailed interview guide similar to a questionnaire to guide the question
order; questions generally use an open-ended response strategy. [#] A data collection method in
which an interviewer reads a standardized interview schedule to the respondent and records the
answers. (Not to be confused with an in-depth interview.)
Structured Interviews
Interviews conducted by the researcher with a predetermined list of questions to be asked of the
interviewee.
Structured response
Participant’s response is limited to specific alternatives provided; a.k.a closed response.
Sum of squares
Sum of the squared deviations from the mean in calculations of variance or from the regression
line in assessing its fit.
Subject
A single member of the sample. [#] An individual who is studied.
Subjective method
A method of assigning probabilities on the basis of judgment.
Subjectivist
The belief that there is no external reality and that the world as you see it is solely a creation of
your own mind.
Suggestion
In social research, the effect on subjects of their beliefs about their situation.
Summative evaluations
Evaluations that examine the effects or outcomes of some program or treatment.
Supergroup
A group interview involving up to 20 people.
Survey
A measurement process using a highly structured interview;
employs a measurement tool called a questionnaire,
measurement instrument, or interview schedule. [#] A research
design in which a sample of subjects is drawn from a
population and studied (usually interviewed) to make
inferences about the population. This design is often
contrasted with the true experiment in which subjects are
randomly assigned to conditions or treatments. [#] A study in which the same data are collected
from all members of the sample using a highly structured questionnaire and analyzed
using statistical tests.
Systematic Sample
A probability sample that is determined by selecting every ‘nth’ (5th, 10th, 50th, etc.) person from a
list of the entire population, after the first person has been randomly selected.
Switching-Replication design
A two-group design in two phases defined by three waves of measurement. The implementation
of the treatment is repeated in both phases. In repetition of the treatment, the two groups switch
roles; the original control group in phase 1 becomes the treatment group in phase 2; whereas the
original treatment acts as the control. By the end of the study, all participants have received the
treatment.
Symbolic interactionism
Theoretical perspective concerned with the meanings that things and events have for human
beings and the production of these meanings in human interchange.
Symmetric matrix
A square table where the value for each numbered row and column is identical to the value for
the same numbered column and row.
Symmetrical relationship
When two variables vary together but without causation.
Symmetry
In a frequency distribution, the degree of similarity of shape of the left and right sides of the
distribution.
Synergy
The process at the foundation of group interviewing that encourages members to react to and
build on the contributions of others in the group.
Synopsis
A brief summary of the research study.
Systematic error
Error that results from bias; see also systematic variance.
Systematic observation
Data collection through observation that employs standardized procedures, trained observers,
schedules for recording, and other devices for the observer that mirror the scientific procedures of
other primary data methods.
sampling frame and then follow a rule to select every xth element in the sampling frame list
(where the ordering of the list is assumed to be random).
Systematic review
A summary of relevant literature that uses explicit methods to perform a thorough literature
search and critical appraisal of individual studies and that uses appropriate statistical techniques
to combine these valid studies.
Systematic sampling
A method of choosing a sample by randomly selecting the first element and then selecting every
kth element thereafter. [#] A probability sample drawn by applying a calculated skip interval to a
sample frame; population (N) is divided by the desired sample (n) to obtain a skip interval (k).
Using a random start between 1 and k, each kth element is chosen from the sample frame; usually
treated as a simple random sample but statistically more efficient. [#] A probability sampling
design that involves choosing every nth element in the population for the sample.
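The skip-interval procedure described in this entry can be sketched in a few lines. The function name and the numbered frame are illustrative assumptions, and the skip interval is taken as the integer part of N / n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n elements from the sampling frame using a skip interval.

    k = N // n is the skip interval; the first element is chosen at random
    from positions 1..k and every kth element is taken thereafter.
    """
    if n <= 0 or n > len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    k = len(frame) // n
    rng = random.Random(seed)
    start = rng.randint(1, k)        # random start between 1 and k, inclusive
    return frame[start - 1::k][:n]   # every kth element, truncated to n

frame = list(range(1, 101))          # a frame of 100 numbered elements
sample = systematic_sample(frame, n=10, seed=1)
```

With N = 100 and n = 10 the skip interval is 10, so the sample consists of one element from each consecutive block of ten.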
Systematic sampling
Probability sampling technique in which every nth element is sampled from an existing list of
available elements.
Systematic variance
The variation that causes measurements to skew in one direction or other.
t distribution
A family of probability distributions that can be used to develop interval estimates of a population
mean whenever the population standard deviation is unknown and the population has a normal or
near-normal probability distribution.
Tabular review
Traditional approach to literature review that summarizes each study by a line in one or more
tables.
Tabulations
A set of data which provide a count of the number of occasions on which a particular
answer/response has been given across all of those respondents who tackled the question.
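A tabulation of this kind can be sketched in a few lines of Python. The responses below are hypothetical, and the percentages are based only on respondents who answered the question, which is one reading of the definition above.

```python
from collections import Counter

# Hypothetical responses to a single question (None = question not answered).
responses = ["Favor", "Oppose", "Favor", None, "Favor", "Oppose", "Favor", None]

# Keep only the respondents who tackled the question.
answered = [r for r in responses if r is not None]
counts = Counter(answered)

# For each answer: (count, percent of those who answered).
table = {answer: (count, round(100 * count / len(answered), 1))
         for answer, count in counts.most_common()}

for answer, (count, percent) in table.items():
    print(f"{answer:<8}{count:>4}{percent:>7.1f}%")
```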
Tactics
Specific, timed activities that execute a strategy.
Target population
The population about which inferences are made.
Target question
Measurement question that addresses the core investigative questions of a specific study; these
can be structured or unstructured questions.
Target question, structured
A measurement question that presents the participant with a fixed set of categories per variable.
Tau (τ)
A measure of association that uses table marginals to reduce prediction errors, with measures from
0 to 1.0 reflecting the percentage of error estimates for prediction of one variable based on another
variable.
Tau b (τb)
A refinement of gamma for ordinal data that considers “tied” pairs, not only discordant and
concordant pairs (values from -1.0 to +1.0); used best on square tables (one of the most widely used
measures for ordinal data).
Tau c (τc)
A refinement of gamma for ordinal data that considers “tied” pairs, not only discordant and
concordant pairs (values from -1.0 to +1.0); used for any size table (one of the most widely used
measures for ordinal data).
t-distribution
A normal distribution with more tail area than that in a Z normal distribution.
Technical report
A report written for an audience of researchers.
Technology
Any mechanism that transforms inputs to outputs.
Telephone interview
A study conducted wholly by telephone contact between participant and interviewer.
Telephone Interview
The information-gathering method by which the interviewer asks the interviewee over the
telephone, rather than face to face, for information needed for the research.
Temporal precedence
One criterion for establishing a causal relationship, which holds that the cause must occur before
the effect.
Tertiary sources
Aids to discover primary or secondary sources, such as indexes, bibliographies, and internet
search engines; also may be an interpretation of a secondary source.
Test market
A controlled experiment conducted in a carefully chosen marketplace (e.g., Web site, store, town,
or other geographic location) to measure and predict sales or profitability of a product.
Test reactivity
A time threat to internal validity that occurs when the observed change over time stems from the
pretest rather than the experimental variable.
Test statistic
A value computed from sample data that is used to decide whether to reject the null
hypothesis.
Test unit
An alternative term for a subject within an experiment (a person, an animal, a machine, a
geographic entity, an object, etc.).
Testability
The ability to subject the data collected to appropriate statistical tests in order to substantiate or
reject the hypotheses developed for the research study.
Testing Effects
The distorting effects on the experimental results (the post test scores) caused by the prior
sensitization of the respondents to the instruments through the pre test.
Testing threat
A threat to internal validity that occurs when taking the pretest affects how participants do on the
posttest.
Test-retest Reliability
A way of establishing the stability of a measuring instrument by correlating the scores obtained
through its administration to the same set of respondents at two different points in time.
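The correlation step in test-retest reliability can be illustrated with a short Python sketch. The Pearson correlation coefficient is assumed here as the measure of correlation (the definition does not name one), and the two sets of scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of the same five respondents at two points in time.
time1 = [12, 15, 9, 20, 17]
time2 = [13, 14, 10, 19, 18]

reliability = pearson_r(time1, time2)
print(round(reliability, 3))
```

A coefficient close to 1 suggests that the instrument gives stable scores across the two administrations.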
Textual analysis
Used in analysis of secondary source data and also in qualitative research. It involves working on
a text in depth, looking for keywords and concepts and making links between them. The term also
extends to literature reviewing. Increasingly, much textual analysis is done using computer
programs.
Theoretical
Pertaining to theory. Social research is theoretical, meaning that much of it is concerned with
developing, exploring, or testing the theories or ideas that social researchers have about how the
world operates.
Theoretical Framework
A logically developed, described, and explained network of associations among variables of
interest to the research study.
Theoretical sampling
A nonprobability sampling process where conceptual or theoretical categories of participants
develop during the interviewing process; additional participants are sought who will challenge
emerging patterns.
Theoretical variable
Concept or construct as distinct from a variable that is measured.
Theory
A set of systematically interrelated concepts, definitions, and propositions that are advanced to
explain or predict phenomena (facts); the generalizations we make about variables and the
relationships among variables.
Theory
A general explanation about a specific behavior or set of events that is based on known
principles and serves to organize related events in a meaningful way. A theory is not as specific
as a hypothesis. [#] Tentative or preliminary explanations of causal relationships.
Third-variable problem
An unobserved variable that accounts for a correlation between two variables.
Threats to validity
Reasons your conclusion or inference might be wrong.
Time sampling
The process of selecting certain time points or time intervals to observe and record elements,
acts, or conditions from a population of observable behaviors or conditions to represent the
population as a whole; three types include time point sample, time-interval samples, and
continuous real time samples.
Time series
Data for a variable at numerous time points, for example, the unemployment rate for each month
for several years. [#] Many waves of measurements over time.
Time threats
Internal validity threats to within-subjects designs; protected against by control groups. (See table
9-1 regarding the four time threats of history, maturation, instrumentation (or measurement
decay), and test reactivity.)
Topic outline
Report planning format; uses key words or phrases rather than complete sentences to draft each
report section.
Two-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in
either tail of the sampling distribution.
Traces
Physical records based either on wear or erosion or on leavings or accretions.
Trait
Personality characteristic or behavioral style.
Translation validity
A type of construct validity related to how well you translated the idea of your measure into its
operationalization.
Treatment
In experiments, a treatment is what researchers do to the subjects in the experimental group, but
not to those in the control group. A treatment is thus an independent variable. [#] The
manipulation of the independent variable in experimental designs so as to determine its effects on
a dependent variable of interest to the researcher.
Treatment levels
The arbitrary or natural grouping within the independent variable of an experiment.
Treatment
The experimental factors to which participants are exposed. [#] Different levels of a factor.
Tree diagram
A graphical representation helpful in identifying the sample points of an experiment involving
multiple steps.
Trend survey
Longitudinal survey design involving a series of cross-sectional surveys, each based on a
different sample.
Trends
Patterns in time series marked by long-term increases or decreases.
Triad
A group interview involving three people.
Trials
Repeated measures taken from the same subject or participant.
Triangulation
Research design that combines several qualitative with quantitative methods; most common are
simultaneous QUAL/QUANT in single or QUANT-QUAL, sequential QUAL-QUANT-QUAL. [#] A
multi-method or pluralistic approach, using different methods in order to focus on the research
topic from different viewpoints and to produce a multi-faceted set of data. Also used to check the
validity of findings from any one method. [#] Method of comparing observations from different
times and sources to arrive at a correct analysis.
True experiment
Experimental design in which subjects are randomly assigned to two or more differently treated
conditions.
True score
That part of the observed score that reflects the construct of interest.
Truncation
A search protocol that allows a symbol (usually “?” or “*”) to replace one or more characters or
letters in a word or at the end of a word root.
t-test
A parametric test to determine the statistical significance between a sample distribution mean and
population parameter; used when the population standard deviation is unknown and sample
standard deviation is used as a proxy. [#] A statistical test that establishes a significant mean
difference in a variable between two groups. [#] A statistical test of the difference between the
means of two groups, often a program and comparison group. The t-test is the simplest variation of
the one-way Analysis of Variance (ANOVA).
t-value
The estimate of the difference between the groups relative to the variability of the scores in the
groups.
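To make the t-value concrete, the sketch below computes it for two independent groups. It assumes the classic pooled-variance (equal variances) form of the test, which is only one of several variants; the function name `t_value` and the two sets of scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_value(group_a, group_b):
    """Independent-samples t with a pooled variance (assumes equal variances)."""
    na, nb = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    # Standard error of the difference between the two means.
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    # Difference between the groups relative to the variability of the scores.
    return (mean(group_a) - mean(group_b)) / standard_error

program = [24, 27, 30, 26, 28]       # hypothetical program group scores
comparison = [20, 23, 22, 25, 20]    # hypothetical comparison group scores

t = t_value(program, comparison)
print(round(t, 3))
```

The resulting value is then compared with the t distribution for na + nb - 2 degrees of freedom to judge statistical significance.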
Two-independent-sample tests
Parametric and nonparametric tests used when the measurements are taken from two samples
that are unrelated (Z test, t-test, chi-square, etc.).
Two-stage design
A design in which exploration as a distinct stage precedes a descriptive or causal design.
Two-tailed hypothesis
A hypothesis that does not specify a direction, for example in a study on self-esteem, you do not
predict whether your assisted-living program will have a positive or negative effect on the self-
esteem of the respondents in your study.
Two-tailed test
A nondirectional test to reject the hypothesis that the sample statistic is either greater than or less
than the population parameter.
Type II error
The error of accepting H0 when it is false.
Type I error
Error when one rejects a true null hypothesis (there is no difference); the alpha (α) value,
called the level of significance, is the probability of rejecting the true null hypothesis. [#]
When a test wrongly shows an effect or condition to be present (e.g. that a woman is pregnant
when, in fact, she is not). [#] When a researcher falsely rejects the null hypothesis. [#] Error of
rejecting the null hypothesis when it is true. [#] The error of rejecting H0 when it is true.
Type II Error
When a test wrongly shows an effect or condition to be absent (e.g. that a woman is not pregnant
when, in fact, she is). [#] When a researcher wrongly fails to reject the null hypothesis. [#] Error of
not rejecting the null hypothesis when it is false. [#] When one fails to reject a false null
hypothesis; the beta (β) value is the probability of failing to reject the false null hypothesis; the
power of the test, 1 - β, is the probability that the test will correctly reject the false null hypothesis.
Unbiased Questions
Questions posed in accordance with the principles of wording and measurement, and the right
questioning technique, so as to elicit the least biased responses.
Unbiasedness
A property of a point estimator when the expected value of the point estimator is equal to the
population parameter it estimates.
Unidimensional scale
Instrument scale that seeks to measure only one attribute of the participant or object.
Unit effect
Pattern of preexisting differences across spatial units that accounts for discrepancy between
analysis over time and analysis over space.
Unit of analysis
The entity that you are analyzing in your analysis: for example individuals, groups, or social
interactions.
Unit of analysis
The level of aggregation of the data collected during data analysis.
Univariate Analysis
Studying the distribution of cases of one variable only--for example, studying the ages of welfare
recipients but not considering their gender, ethnicity, and so on.
Univariate statistics
Descriptive statistics for one variable.
Unobtrusive measures
A set of observational approaches that encourage creative and imaginative forms of indirect
observation, archival searches and variations on simple and contrived observation, including
physical traces observation (erosion and accretion). [#] Measurement of the data gathered from
sources other than people, such as examination of birth and death records or count of the number
of cigarette burns in the ashtray. [#] Methods used to collect data without interfering in the lives of
the respondents.
Unobtrusive
With respect to qualitative research, unobtrusive observation involves disguised entry and
participation without the knowledge of the subjects that they are under scientific scrutiny.
Unsolicited proposal
A suggestion by a contract researcher for research that might be done.
Unstructured interview
A customized IDI with no specific questions or order of topics to be discussed; usually starts with
a participant narrative.
Unstructured interviewing
An interviewing method that uses no predetermined interview protocol or survey and where the
interview questions emerge and evolve as the interview proceeds.
Unstructured Interviews
Interviews conducted with the primary purpose of identifying some important issues relevant to the
problem situation, without prior preparation of a planned or predetermined sequence of
questions.
Unstructured response
Where participant’s response is limited only by space, layout, instructions, or time; usually free
response or fill-in response strategies.
User-involvement in research
The politicization of service-users as active participants in service design and delivery, rather than
passive recipients, has extended to research activities. User involvement in research situates
users as active stakeholders in the research process and involvement may be on a range from
responsibility for the design, delivery and reporting of research to inclusion as a key stakeholder
group. User-involvement impacts on methodologies, which questions are asked, the way they are
asked, who asks them, and what happens to the answers. (>>> Emancipatory Research)
Utilitarianism
Ethical approach that seeks a rational balancing of costs and benefits of behaviors.
Utility score
A score in conjoint analysis used to represent each aspect of a product or service in a participant’s
overall preference ratings.
Utilization
Stage of evaluative research that gauges the extent to which the research report is used and
provides guidance for better dissemination of evaluation results.
Validity
A term to describe a measurement instrument or test that measures what it is supposed to
measure; the extent to which a measure is free of systematic error.
For example, a bathroom scale that provides a reliable measure of weight cannot
give a valid measure of height. [#] A characteristic of measurement
concerned with the extent that a test measures what the researcher
actually wishes to measure; and that differences found with a
measurement tool reflect true differences among participants drawn
from a population. [#] Concerns the extent to which your research
findings can be said to be accurate and reliable, and the extent to
which the conclusions are warranted. [#] Evidence that the instrument, technique, or process used
to measure a concept does indeed measure the intended concept. [#] Extent to which a measure
reflects the intended phenomenon (for example, construct, criterion, or content domain); more
generally the truth value of an assertion. [#] The best available approximation of the truth of a
given proposition, inference, or conclusion.
Validity [Face]
At face value, does the measure seem valid?
Validity [Internal]
Does a study’s conclusions about causal relationships agree with what is actually true?
Validity coefficient
Estimate of the agreement of the measure being validated with a criterion (criterion validity) or a
measure thought to reflect the target construct (convergent type construct validity).
Validity [Construct]
Does the measure of a given concept relate to a measure of another theoretically associated
concept?
Validity [Criterion]
Is the measure associated with expected behaviors?
Validity [Content]
Does the measure cover diverse meanings of the concept?
Validity, construct
The degree to which a research instrument is able to provide evidence based on theory.
Validity, content
The extent to which measurement scales provide adequate coverage of the investigative
questions.
Validity, criterion-related
The success of a measurement scale for prediction or estimation; types are predictive and
concurrent.
Variability
Term for measures of spread or dispersion within a data set. [#] In descriptive statistics, the
dispersion of individual scores from each other (for example, standard deviation).
Variable
Any characteristic or trait that can vary from one person to another (race, sex, academic major) or
for one person over time (age, political beliefs). [#] Any entity that can take on different values.
For instance, age can be considered a variable because age can take different values for
different people at different times. [#] Any factor which may be relevant to a research study. In a
survey, for example, you may choose to analyse data by the age and gender of respondents.
'Age' and 'gender' are variables. [#] Anything that can take on differing or varying values. [#]
Measure or indicator thought to represent an underlying construct or concept and produced by an
operational definition of the construct or concept.
Variance
A measure of score dispersion about the mean; calculated from the squared deviations of scores from
the data distribution's mean; the greater the dispersion of scores, the greater the variance in the data
set. [#] A measure of the spread of scores in a distribution of
scores, that is, a measure of dispersion. The larger the variance,
the further the individual cases are from the mean. The smaller the
variance, the closer the individual scores are to the mean. [#] A
measure of variability of a data set, based on the squared
deviations of the data values about the mean. It is also a measure
of the variability, or dispersion, of a random variable. [#] Indicates
the dispersion of a variable in the data set, and is obtained by
subtracting the mean from each of the observations, squaring the
results, summing them, and dividing the total by the number of
observations. [#] The spread of the scores around the mean of a distribution. Specifically, the
variance is the sum of the squared deviations from the mean divided by the number of
observations minus 1.
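The definitions of variance above divide the sum of squared deviations either by the number of observations or by the number of observations minus 1. The first is usually called the population variance and the second the sample variance. The Python sketch below computes both by hand and checks them against the standard library; the five scores are hypothetical.

```python
from statistics import mean, pvariance, variance

scores = [4, 8, 6, 5, 7]             # hypothetical scores

m = mean(scores)                     # 6
squared_deviations = [(x - m) ** 2 for x in scores]   # 4, 4, 0, 1, 1

# Divide by n: population variance.
population_variance = sum(squared_deviations) / len(scores)        # 2.0
# Divide by n - 1: sample variance.
sample_variance = sum(squared_deviations) / (len(scores) - 1)      # 2.5

# The standard library gives the same results.
assert population_variance == pvariance(scores)
assert sample_variance == variance(scores)
```

When reporting a variance from a package such as SPSS, it helps to state which denominator was used.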
are aliens and are confronting the product for the first time; they then describe their reactions,
questions, and attitudes about purchase or retrial.
Visual aids
Presentation tools used to facilitate understanding of content (e.g., chalkboard, whiteboards,
handouts, flip chart, overhead transparencies, slides, computer-drawn visuals, computer
animation).
Vocational filter
Phenomenon of people removing themselves from desirable career paths due to math
avoidance.
Voice recognition
Computer systems programmed to record verbal answers to questions.
Voluntary Participation
The principle that study participants choose to participate of their own free will, rather than being
coerced or forced to participate. For IRB purposes, this is a key part of your study proposal; you
must demonstrate that participants will be participating voluntarily for a study to be approved by
the IRB. [#] For ethical reasons, researchers must ensure that study participants are taking part
in a study voluntarily and are not coerced.
Weak inference
Conclusion of causality based on the regularity theory of causation. Involves finding
unconfounded covariation between the variables in question that is consistent with a model:
equally good-fitting alternative models are not ruled out.
Web site
Site accessible on the Internet or an intranet, created by individuals and organizations for the
purpose of sharing information.
Web-based questionnaire
A measurement instrument both delivered and collected via the Internet; data processing is
ongoing. Two options currently exist: proprietary solutions offered through research firms and
off-the-shelf software for researchers who possess the necessary knowledge and skills; a.k.a. online
survey, online questionnaire, Internet survey.
Weighted mean
The mean for a data set obtained by assigning each data value a weight that reflects its
importance within the set.
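A weighted mean can be computed as the sum of each value multiplied by its weight, divided by the sum of the weights. The sketch below is a minimal illustration; the function name `weighted_mean`, the scores, and the weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Mean of `values`, each counted in proportion to its weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

scores = [80, 90, 70]      # hypothetical data values
weights = [3, 1, 1]        # hypothetical importance of each value

result = weighted_mean(scores, weights)   # (240 + 90 + 70) / 5 = 80.0
print(result)
```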
Weighted moving average
A method of forecasting or smoothing a time series by computing a weighted average of past data
values. The sum of the weights must equal one.
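The following Python sketch shows one way to compute a weighted moving average. It assumes that the last weight applies to the most recent value in each window, which is a common convention but not stated in the definition; the function name, the sales figures, and the weights are hypothetical.

```python
def weighted_moving_average(series, weights):
    """Weighted average of each run of len(weights) consecutive values.

    Assumption for this sketch: weights[-1] applies to the most recent
    value in each window. The weights must sum to one.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("the sum of the weights must equal one")
    w = len(weights)
    return [
        sum(x * wt for x, wt in zip(series[i - w:i], weights))
        for i in range(w, len(series) + 1)
    ]

sales = [10, 12, 14, 16, 18]                 # hypothetical time series
smoothed = weighted_moving_average(sales, [0.2, 0.3, 0.5])
print([round(v, 1) for v in smoothed])
```

Used for forecasting, the last smoothed value serves as the estimate for the next period.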
Weighting
Stratifying your sampling to achieve an outcome which is typical of the research population can
result in some of the strata being too small for tabular or statistical presentation. For example,
carers stratified by gender may give a very small group of males. In this instance you may choose
to increase the size of the male stratum solely in order to get a viable number. This is weighting
your sample. In the same way you may want to weight clusters.
White noise
In time series analysis the residuals between the estimated and actual values that have no
correlation among themselves at any lag.
Within-Design Consistency
The consistency of the procedures of the study from which the inferences emerged.
Within-participants Design
A research design in which each participant experiences, at different times, all levels of
the independent variable (or both the experimental and control treatment). Thus, each participant
is tested once in each condition.
Word association
A projective method of identifying respondents' attitudes and feelings by asking them to associate
a specified word with the first thing that comes to their mind.
X chart
A control chart used when the output of a process is measured in terms of the mean value of a
variable such as a length, weight, temperature, and so on.
Yahoo
An acronym for yet another hierarchically officious oracle, a worldwide directory of Web sites
developed in 1994 by two Stanford University engineering students to organize Web content in
a hierarchical system of subject categories. Yahoo! also provides other Web-based services
(news, weather, travel, e-mail, shopping, games, etc.). It uses a smaller database than most other
Web search engines, but searches in Yahoo! usually have high precision because the Web sites
it lists are selected by human beings rather than robot software. Jonathan Swift coined
the term "yahoo" in Gulliver’s Travels (1726) to refer to an imaginary race of coarse, brutish
creatures in human form. Mark Twain later applied it to any boorish person.
Yapp binding
A form of limp or semi-limp leather binding with rounded corners and bent-in
edges that overlap the sections, sometimes by as much as half the thickness
of the text block, named after William Yapp, the 19th-century bookseller who
designed the style for pocket bibles sold in England. Geoffrey
Glaister notes in Encyclopedia of the Book (Oak Knoll/British Library, 1996) that
a similar style of binding with tooled edges was used in the mid-16th century.
Yearbook
An annual documentary, historical, or memorial compendium of facts, photographs, statistics,
etc., about the events of the preceding year, often limited to a specific country,
institution, discipline, or subject. Optional yearbooks are offered by some publishers of
general encyclopedias. Most libraries place yearbooks on continuation order and shelve them in
the reference collection. Yearbooks of historical significance may be stored in archives or special
collections.
Yellow press
A popular name for newspapers and periodicals of the early 20th century that published news
stories of a vulgarly sensational nature, comparable to the modern tabloid.
z-score
A value found by dividing the deviation about the mean by the standard deviation s. A z-score is
referred to as a standard value and denotes the number of standard deviations a data value x is
from the mean.
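In symbols, z = (x - mean) / s. The Python sketch below applies this to a small set of hypothetical data values, using the sample standard deviation for s.

```python
from statistics import mean, stdev

data = [4, 8, 6, 5, 7]          # hypothetical data values

m = mean(data)                  # mean of the data set
s = stdev(data)                 # sample standard deviation s

# Number of standard deviations each value lies from the mean.
z_scores = [(x - m) / s for x in data]

for x, z in zip(data, z_scores):
    print(f"x = {x}: z = {z:+.2f}")
```

Values above the mean receive positive z-scores, values below it negative ones, and the z-scores of a data set always sum to zero.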
Z distribution
The normal distribution of measurements assumed for comparison.
Z test
A parametric test to determine the statistical significance between a sample distribution mean and
a population parameter; employs the Z distribution.
A2Z
PhD
Thesis
Reflections on Academic Research
Appendix - I
Detailed Guidelines for Chapters
The following pages describe a prototype format with details of the subsections of each chapter
in a five-chapter structure model of a PhD thesis. This illustration forms a basis for
developing a draft thesis and guides scholars in the process of completing the final
thesis.
Tips to develop the subsections are provided for Chapter I. Using these tips,
research scholars can develop the subsections of the other chapters based on the
requirements of the individual research study.
The format gives the model for the contents of the first page of each chapter.
The first page of each chapter will contain a brief summary of the chapter.
The format offers very high flexibility so that, within the five-chapter structure, even the
title of a chapter can be changed; for example, the chapter ‘Literature Review’
may be renamed as either ‘Theoretical Framework’ or ‘Conceptual Framework’.
Research scholars have ample freedom in naming these chapters.
Chapter I
Introduction
1.4 Methodology
[An introductory overview of the methodology is placed here. This section should refer to
sections in chapters 2 and 3 where the methodology is justified and described.]
1.6 Definitions
[Definitions adopted by researchers are often not uniform, so key and controversial terms are
defined to establish positions taken in the PhD research. Definitions should match the underlying
assumptions of the research and scholars may need to justify some of their definitions.]
1.8 Conclusion
[A summary of the discussions and achievements of this chapter is provided here.]
------------------------------------------------------------------------------------------------------------
Brief description of Chapter I [in 10 – 15 sentences]
Chapter II
Literature Review
2.1 Introduction
2.2 Parent disciplines and classification models
2.3 Developing and Current Literature
2.4 Earlier Literature
2.5 Immediate discipline and analytical models
2.6 Research Gap in the available Literature
2.7 Area identified for the Research Study
2.8 Conclusion
------------------------------------------------------------------------------------------------------------
Brief description of Chapter II [in 10 – 15 sentences]
Chapter III
Research Methodology
3.1 Introduction
3.2 Justification of Methodology
3.3 Details of Research Procedures
3.4 Ethical considerations
3.5 Conclusion
------------------------------------------------------------------------------------------------------------
Brief description of Chapter III [in 10 – 15 sentences]
Chapter IV
Analysis of Data
4.1 Introduction
4.2 Statistical Tools used
4.3 Data about subjects
4.4 Detailed Pattern of data
4.5 Conclusion
------------------------------------------------------------------------------------------------------------
Brief description of Chapter IV [in 10 – 15 sentences]
Chapter V
Conclusions
------------------------------------------------------------------------------------------------------------
Brief description of Chapter V [in 10 – 15 sentences]
Typical Model of Chapter I
[to be positioned as the first page of Chapter I]
Chapter I
Introduction
Contents
------------------------------------------------------------------------------------------------------------
1.4 Methodology 8
1.6 Definitions 15
1.8 Conclusion 19
------------------------------------------------------------------------------------------------------------
The background of the research is described, leading to identification of the research problem
and allied hypotheses. Based on these, conducting a research study is justified,
followed by a brief account of the methodology to be adopted. The method and manner in which
the research scholar proposes to carry out the research study are clearly
specified and explained. A bird’s-eye view of the proposed thesis is given, providing a
full picture of the research. New words, phrases and jargon required for
the research study are introduced, and definitions to be used in the research are identified
and explained. As in any research study, delimitations [or limitations!] contemplated and
anticipated are clearly explained. Finally, a summary listing the key achievements in this
chapter is provided.
Appendix II
Simple Guide to SPSS
SIMPLE GUIDE - SPSS
When conducting any statistical analysis, you need to get familiar with your data and
perform an examination of it in order to lessen the odds of having biased results that can
make all of your hard work essentially meaningless or substantially weak.
Source: www.sociology-data.sju.edu/documents/data_analysis_guide_spss.doc
TIP 1
You can change the appearance of the variables so that they appear as variable
names rather than variable labels [see above], which is the default option. You can also
make the variables appear alphabetical. I recommend switching to the variable names
option and having them listed alphabetically so that you can more easily find the
variables of interest to you. By selecting this method, you can type the first letter of the
name of the variable that you want in the variable display section of the dialog box and
SPSS will jump to the first variable that starts with that letter, and every subsequent
variable that starts with that letter as well.
Directions: Pull down the Edit menu, select Options, select the General tab,
and under Variable Lists select Display names and Alphabetical.
TIP 2 [What is this variable?]
In case you forget the label that you gave to variables when you go to the dialog box,
such as the frequency dialog box above, highlight the variable that you are interested in
and click the right-mouse button. This will provide a pop-up window that offers the
Variable Information section. In other words, if you were presented with the frequency
table above, you might highlight “size of the company [size]”, click on the right-mouse
button and select variable information. This action provides you with the name of the
variable, its label, measurement setting [e.g. ordinal], and value labels for the variable
[i.e. categories].
If you are unsure of what a particular statistic is used for, then highlight the particular
item, right-click on the selected statistics [e.g. mean] and you will receive a brief
description of what the statistics available in the dialog box provide. If the statistic is one
that seems useful to you, then you select it by placing a check in the available box
next to the statistic. Then click OK and you will return to the main frequencies
box.
To obtain help on the output screen [i.e. the SPSS viewer], you need to double click a
pivot table in order to activate it so that you can make modifications. When activated, it
will appear to have “railroad track” lines surrounding it. See Table a below.
Table a
Favor or Oppose Death Penalty for Murder
Once you have activated the pivot table, then you should right-click a row or
column header for a pop up menu, such as the column labeled valid percent.
Choose what’s this? This will bring up a pop-up window explaining what the
particular column or row is addressing. If you forget to activate the pivot table
and simply right-click on a column or row, you will get the following message:
[Displays output. Click once to select an object (for example, so that you can copy it to
the clipboard). Double-click to activate an object for editing. If the object is a pivot table,
you can obtain detailed help on items within the table by right-clicking on row and
column labels after the table is activated.]
If you want more than a pop-up delivers, choose results coach from the list
instead of what’s this? Essentially, this will take you through a subsection of the
SPSS tutorial.
Changing Text
Activate the pivot table in the SPSS viewer [see Output screen] and then double-
click on the text that you wish to change. Enter the new text and then follow the
same procedure as needed. If you wish to get rid of the title, then you can select
the title and then hit the delete button on your keyboard and you will obtain a table
like the one below.
Select the cell entries with too many [or too few] decimal places. From the format
menu, choose Cell properties, select the number of decimal points that you want
and click OK.
Showing/Hiding Cells
Activate the table and then select the row or column you wish to hide by using Ctrl-Alt-
click in the column heading or row label. From the view menu, choose hide. To
resurrect the hidden cells at a later time, activate the table and then from the view menu,
choose show all. If you’re sure that you never want to resurrect the information,
then you can simply delete the cells and they will be permanently removed. See
Table a above, which shows a table with no hidden columns. See Table b for an
example of a table with the valid percent column hidden.
Table b
Favor or Oppose Death Penalty for Murder
Activate the pivot table, and from the pivot menu, choose pivoting trays. A
schematic representation of a pivot table appears with 3 areas [trays] labeled layer, row,
and column. Colored icons in these trays represent the contents of the table, one for
each variable and one for statistics. Place your mouse pointer over one of them to
see what it represents and if you wish to change the structure of the table, then
you can drag an icon and the table will rearrange itself. See Table b above for a
pre-modification version of the table and Table c below for a post-modification
version.
Table c
Favor or Oppose Death Penalty for Murder
Valid     Favor    Frequency             1074
                   Percent               71.6
                   Cumulative Percent    77.4
          Oppose   Frequency             314
                   Percent               20.9
                   Cumulative Percent    100.0
          Total    Frequency             1388
                   Percent               92.5
Missing   DK       Frequency             106
                   Percent               7.1
          NA       Frequency             6
                   Percent               .4
          Total    Frequency             112
                   Percent               7.5
Total              Frequency             1500
                   Percent               100.0
Double-click the chart in the viewer to open it in a new chart editor window. Note:
To access some chart editing capabilities, such as identifying points on a scatterplot or
changing the width of bars in a histogram, you must click an element of the chart to
select it. For example, you must click any point in a scatterplot or any bar in a
histogram or bar chart. You can change labels [double-click any text and
substitute your own], create reference lines, and change colors, line types and
sizes. When you close the Chart editor window, the original chart in the viewer
updates to show any changes that you made.
Using Syntax
You should ALWAYS use syntax when running statistical analyses. There are 2 ways
that you can do this. You can click the paste button when you set up a statistical
analysis using one of the dialog boxes, such as that for frequencies. However,
when you use the paste function, the analysis does not run on its own: you have to
go to the newly created syntax window, or one that you created in a previous session,
highlight the commands, and run them.
The other method is to open a new syntax file so that you can type in any
commentary and syntax or copy and paste from an already existing syntax file.
The syntax below is what you would receive if you used the paste button in SPSS after
setting up an analysis in a dialog box, such as that for frequencies. You would have the
same command if you copied and pasted it from an already existing syntax file into
a newly created one.
FREQUENCIES
VARIABLES=cappun
/PIECHART PERCENT
/ORDER= ANALYSIS .
Whichever method you use to create syntax, you MUST always type in commentary that
explains what the command does. This ensures that you have a way of checking back
to see the methodology that you used and the steps that were taken when you
conducted your analysis. This is useful in case something goes wrong and you need to
make corrections and just to provide you and others with a guide for how the analyses
occurred in case replications need to be done. Commentary should be written in the
following way when dealing with commands:
*frequencies of attitudes toward capital punishment and gun laws.*
Notice that there are asterisks at either end and that a period (.) is just before the closing
asterisk. This tells the computer that this is not command text, so that while the
computer may highlight it during a run of the analysis, it will not view it as command text.
If you were going to combine the commentary and the command syntax in a syntax file,
it would appear as you see it below.
*frequencies of attitudes toward capital punishment and gun laws.*
FREQUENCIES
VARIABLES=cappun
/PIECHART PERCENT
/ORDER= ANALYSIS .
In addition, you MUST keep a Log of the analyses that you run, which will appear in the
output [SPSS viewer] file. To do this, you need to go to Edit, then options, and select
the viewer tab. Under that tab, be sure that initial output state has “log” listed in the pull-
down tab and that display commands in the log is checked. This ensures that the
program enters the syntax of any analysis that you run right before it displays the
results of that analysis, which is another way to let yourself and others know
what type of analysis you did and to evaluate whether it is the appropriate analysis and
whether it has been done properly. See the information just below this text
for a sample.
FREQUENCIES
VARIABLES=cappun
/ORDER= ANALYSIS .
Frequencies
Statistics
Favor or Oppose Death Penalty for Murder
N    Valid      1388
     Missing    112
Introducing Data
If your data aren’t already in a computer-readable SPSS format, you can enter the
information directly into the SPSS Data Editor. From the menus, choose file, then
new, then data, which opens the data editor in data view. If you type a number
into the first cell, SPSS will label that column with the variable name VAR00001.
To create your own variable names, click the variable view tab.
In the name column, enter a unique name for each variable in the order in which
you want to enter the variables. The name must start with a letter, but the remaining
part of the name can be letters or digits. A name can’t end with a period, contain
blanks or special characters, or be longer than 64 characters.
Variable Labels: Assign descriptive text to a variable by clicking the cell and
then entering the label. For instance, for the variable “cappun” the label says
“favor or oppose death penalty for murder.”
Value Labels: To label individual values, click the button in the Value column.
This opens its dialog box. For cappun, the values are labeled 1 = favor, 2 = oppose.
The sequence of operations is to: enter the value, enter its label, click add,
and repeat this process for each value.
Note: Labels for individual values are useful only for variables with a
limited number of categories whose codes aren’t self-explanatory. You
don’t want to attach value labels to individual ages; however, you should
label the missing value codes for all variables if you use more than one
code.
To indicate which codes were used for each variable when information is not available,
click in the missing column, and assign missing values. Cases with these codes will be
treated differently during statistical analysis. If you don’t assign codes for missing
values, even nonsensical values are accepted. A value of -1 for age would be
considered a real age. The missing-value codes that you assign to a variable are
called user-missing values. System-missing values are assigned by SPSS to any
blank numeric cell in the Data Editor or to any calculated value that is not defined. A
system-missing value is indicated with a period (.).
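To see why declaring missing-value codes matters, here is a minimal sketch in Python (used only as a neutral illustration; it is not SPSS syntax). The ages and the code -1 for “age not available” are hypothetical, and None stands in for a system-missing value.

```python
# Hypothetical ages; -1 is assumed to be the user-missing code for "age not available".
# None plays the role of a system-missing value (a blank numeric cell).
ages = [34, 27, -1, 45, None, 52]
USER_MISSING = {-1}


def valid_values(values, user_missing):
    """Keep only values that are neither system-missing nor user-missing."""
    return [v for v in values if v is not None and v not in user_missing]


def mean(values):
    return sum(values) / len(values)


# If -1 is not declared as missing, it is averaged in as if it were a real age.
naive_mean = mean([v for v in ages if v is not None])
# With -1 declared as missing, only the four real ages are used.
correct_mean = mean(valid_values(ages, USER_MISSING))

print(naive_mean, correct_mean)
```

The undeclared code drags the mean down, which is exactly the kind of silent error that assigning missing values prevents.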
Note: You can’t assign missing values to a string variable that is more than
8 characters in width. For string variables, uppercase and lowercase
letters are treated as distinct characters. This means that if you use the code
NA (not available) as a missing value code, entries coded as na will not be
treated as missing. Also, if a string variable is 3 characters wide and the missing
value code is only 2 characters wide, the placement of the two characters in the
field of 3 affects what’s considered missing. Blanks at the end of the field (trailing
blanks) are ignored in missing-value specifications.
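The matching rules in this note can be sketched as follows. This is a Python illustration of the rules as described here, not a copy of how SPSS works internally; the function name is hypothetical.

```python
MISSING_CODE = "NA"


def is_user_missing(value, code=MISSING_CODE):
    """Mimic the rules described above for string missing-value codes.

    Trailing blanks are ignored, but upper/lowercase and leading blanks matter.
    """
    return value.rstrip(" ") == code.rstrip(" ")


print(is_user_missing("NA "))   # code followed by a trailing blank in a 3-character field
print(is_user_missing("na "))   # lowercase entry
print(is_user_missing(" NA"))   # code placed at the right of the field
```

Only the first entry is treated as missing; the lowercase entry and the one with a leading blank are treated as valid data.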
Warning: DON’T use a blank space as a missing value. Use a specific number
or character to signify “I looked for this value and I don’t know what it is.”
DON’T use missing-value codes that are between the smallest and largest valid
values, even if these particular codes don’t occur in the data.
Click in a cell in the Measure column to assign a level of measurement to each variable.
You have 3 choices: nominal, ordinal, and scale.
Warning 1: If you don’t specify the level of measurement, SPSS attempts to divine it
based on characteristics of the data, but its judgment in this matter is fallible. For
example, string variables are always designated as nominal. In some procedures, SPSS
uses different icons for the 3 types of variables. The scale on which a variable is
measured doesn’t necessarily dictate the appropriate statistical analysis for a
variable. For example, an ID number assigned to subjects in an experiment is
usually classified as a nominal variable. If the numbers are assigned
sequentially, however, they can be plotted on a scale to see if subject responses
change with time. Velleman and Wilkinson (1993) discuss the problems
associated with stereotyping variables.
Warning 2: Although SPSS assigns a level of measurement to each variable,
this information is seldom used to guide you. SPSS will let you calculate means
for nominal variables as long as they have numeric values. Certain statistical
procedures don’t allow string variables in particular fields in the dialog boxes.
For example, you can’t calculate the mean of a string variable.
You MUST always save your data periodically so that you don’t have to start from
scratch if anything goes wrong. You can also include text information in an SPSS data
file by choosing utilities and data file comments, which will appear in the syntax
screen. Anyone using the file can read the text associated with it. You can also elect to
have the comments displayed in the output. This is similar to typing your own
comments in the syntax file to record the steps you are taking in your data analysis.
I recommend typing comments in the syntax file because you will already be in the
syntax window rather than having to switch back and forth, but data file comments are
a possible option.
PRESERVE.
ADD DOCUMENT
RESTORE.
If you want to perform the same analysis for several groups of cases, choose Split File
from the Data menu. A separate analysis is done for each combination of values of the
variables specified in the Split File Dialog box.
SPLIT FILE
LAYERED BY sex .
Frequencies
Statistics
You can also select how you want the output displayed: all output for each subgroup
together, or the same output for all subgroups together.
SPLIT FILE
SEPARATE BY sex .
Frequencies

Statistics
Favor or Oppose Death Penalty for Murder
N    Valid      607
     Missing    34

Statistics
Favor or Oppose Death Penalty for Murder
N    Valid      781
     Missing    78
Choose utilities and then variables to get data-definition information for each variable
in your data file. Make sure that all of your missing-value codes are correctly identified.
TIP 5
If you click Go To, you find yourself in the column of the Data Editor for the
selected variable if the data editor is in data view. To edit the variable information
from the data editor in data view, double-click the variable name at the top of that
column. This takes you to the variable view for that variable.
TIP 6
To get a listing of the information for all of the variables without having to select
the variables individually, choose File, then display data file information, then
working file. This lists variable information for the whole data file. The
disadvantage is that you can’t quickly go back to the data editor to fix mistakes.
An advantage is that codes that are defined as missing are identified, so it’s easier
to check the labels.
If you have entered your own data, it is possible that you entered the same case
twice or even more. To find any duplicates, choose data, then identify duplicate
cases. If you entered a supposedly unique ID variable for each case, move the
name of that ID variable into the Define Matching Cases By list. If it takes more
than one variable to guarantee uniqueness (for example, college and student ID),
move all of these variables into the list. When you click OK, SPSS checks the file
for cases that have duplicate values of the ID variables.
TIP 7
DON’T automatically discard cases with the same ID number unless all of the other
values also match. It’s possible that the problem is merely that a wrong ID number was
entered.
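The logic behind the duplicate check and TIP 7 can be sketched in a few lines of Python (an illustration only, not SPSS syntax). The cases and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical cases: (id, age, sex).
cases = [
    (101, 34, "F"),
    (102, 27, "M"),
    (101, 34, "F"),   # exact duplicate of the first case
    (103, 45, "M"),
    (102, 51, "F"),   # same ID but different values: possibly a mistyped ID
]


def find_duplicate_ids(rows):
    """Group rows by ID and keep only the IDs that occur more than once."""
    by_id = defaultdict(list)
    for row in rows:
        by_id[row[0]].append(row)
    return {case_id: group for case_id, group in by_id.items() if len(group) > 1}


def classify(group):
    """Rows that agree on every value are true duplicates; otherwise check the ID."""
    return "exact duplicate" if len(set(group)) == 1 else "conflicting values"


report = {case_id: classify(group) for case_id, group in find_duplicate_ids(cases).items()}
print(report)
```

Only the “exact duplicate” group is a safe candidate for deletion; the “conflicting values” group should be checked against the source records first.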
Run any procedure and look at the count of the total cases processed. That’s always
the first piece of output. Table d shows the summary from the Crosstabs procedure for
sex by cappun.
Table d
Crosstabs Case Processing Summary
                                                    Cases
                                  Valid             Missing           Total
                                  N      Percent    N      Percent    N      Percent
Respondent's Sex * Favor or
Oppose Death Penalty for Murder   1388   92.5%      112    7.5%       1500   100.0%
You see that the data file has 1500 cases, but only 1388 have valid (nonmissing) values
for the sex and cappun variables. If the count isn’t what you think it should be and if you
assigned sequential numbers to cases, you can look for missing ID numbers.
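If your cases were numbered sequentially, looking for missing ID numbers amounts to finding the gaps in the sequence. A minimal Python sketch, with a hypothetical list of IDs:

```python
def missing_ids(ids):
    """Return the ID numbers absent between the smallest and largest ID present."""
    present = set(ids)
    return [i for i in range(min(present), max(present) + 1) if i not in present]


ids = [1, 2, 3, 5, 6, 9]   # hypothetical sequential IDs with some cases missing
gaps = missing_ids(ids)
print(gaps)
```

Note that this only finds gaps inside the range of IDs present; it cannot tell you about cases missing from the very beginning or end of the sequence.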
Warning: Data checking is not an excuse to get rid of data values that you don’t like.
You are looking for values that are obviously in error and need to be corrected or
replaced with missing values. This is not the time to deal with unusual but correct data
points. You’ll deal with those during the actual data analysis phase.
Use the frequency procedure to count the number of times each value of a variable
occurs in your data. For example, how many people in the GSS support capital
punishment? You can also graph this information using pie charts, bar charts, or
histograms.
TIP 8
You can acquire the information that you need for the descriptive statistics [e.g. mean,
minimum, maximum] through the frequency dialog box by selecting the statistics tab and
checking on those statistics of interest to you.
When conducting frequency analyses, you want to consider the following questions
when reviewing the results presented in your output.
Are the codes that you used for missing values labeled as missing values
in the frequency table? If the codes are not labeled, go back to the data editor
and specify them as missing values.
Do the value labels correctly match the codes? For example, if you see that
50% of your customers are very dissatisfied with your product, make sure that
you haven’t made a mistake in assigning the labels.
Are all of the values in the table possible? For example, if you asked the
number of times a person has been married and you see values of -2, you know
that’s an error. Go back to the source and see if you can figure out what the
correct values are. If you can’t, replace them with codes for missing values.
Are there values that are possible, but highly unlikely? For example, if you
see a subject who claims to own 11 toasters, you want to check whether the
value is correct. If the value is incorrect, you’ll have to take that into account
when analyzing the data.
Are there unexpectedly large or small counts for any of the values? If you’re
studying the relationship of highest educational degree to subscription to Web
services offered by your company and you see that no one in your sample has a
college degree, suspect problems.
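Two of the checks in the list above, impossible values and possible but unlikely values, can be sketched in Python as a simple count followed by range tests. The data are hypothetical, and the cut-off of 5 for “unlikely” is an arbitrary assumption chosen for illustration.

```python
from collections import Counter

# Hypothetical answers to "How many times have you been married?"
times_married = [0, 1, 1, 2, -2, 1, 0, 3, 11, 1]

# The frequency table: how often each value occurs.
counts = Counter(times_married)

# Negative counts of marriages are impossible and must be corrected or set to missing.
impossible = sorted(v for v in counts if v < 0)
# Values above the assumed cut-off are possible but should be checked against the source.
unlikely = sorted(v for v in counts if v > 5)

print(sorted(counts.items()))
print("impossible:", impossible, "unlikely:", unlikely)
```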
TIP 9
To search for a particular data value for a variable, go to data view, highlight the
column of the variable that you are interested in, choose edit, then find, and then
type in the value that you are interested in finding.
Table e [using the Explore command]
Extreme Values
You can see that one of the respondents claims to watch television 24 hours a day. You
know that’s not correct. It’s possible that he or she understood the question to mean
how many hours is the TV set on. When analyzing the TV variable, you’ll have to decide
what to do with people who have reported impossible values. In Table e, you see that
there are only 4 cases with values of 16 hours or greater and then there is a gap until 12
hours. You might want to set values greater than 12 hours to 12 hours when analyzing
the data. This is similar to what many people do when dealing with a variable for “age.”
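Setting values greater than 12 hours to 12 hours can be sketched as follows. This is a Python illustration with hypothetical data, not SPSS syntax; the original list is kept so that you can always see what was changed.

```python
tv_hours = [2, 0, 4, 12, 16, 24, 3, 20, 18]   # hypothetical hours of TV per day
CAP = 12

# Keep the original values and store the capped version in a new list.
capped = [min(h, CAP) for h in tv_hours]
n_changed = sum(1 for h in tv_hours if h > CAP)

print(capped, n_changed)
```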
and the duration of her first marriage, you know that the current age must be
greater than or equal to the age at first marriage. You also know that the age at
first marriage plus the duration of first marriage cannot exceed the current age.
Start by looking at the simplest relationship: Is the age at first marriage less than
the current age? You can plot the two variables on a scatterplot and look for
cases that have unacceptable values. You know that all of the points must fall on
or above the identity line.
TIP 10
For large data files, the drawback to this approach is that it’s tedious and prone to error.
A better way is to create a new variable that is the difference between the current age
and the age at first marriage. Then use data, select cases to select cases with
negative values and analyze, then reports, then case summaries to list the
pertinent information. Once you’ve remedied the age problem, you can create a
new variable that is the sum of the age at first marriage and the duration of first
marriage. You can then find the difference between this sum and the current age.
Reset the select cases criteria and use case summaries to list cases with
offending values.
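The two difference variables described in TIP 10 can be sketched in Python (for illustration only). The cases and the field names are hypothetical.

```python
# Hypothetical cases: current age, age at first marriage, duration of first marriage.
cases = [
    {"id": 1, "age": 40, "age_married": 25, "duration": 10},
    {"id": 2, "age": 30, "age_married": 32, "duration": 2},    # married "after" current age
    {"id": 3, "age": 50, "age_married": 30, "duration": 25},   # 30 + 25 exceeds 50
]


def flag(rows, difference):
    """List the IDs of cases whose computed difference is negative."""
    return [c["id"] for c in rows if difference(c) < 0]


# First check: current age minus age at first marriage.
bad_age = flag(cases, lambda c: c["age"] - c["age_married"])
# Second check: current age minus (age at first marriage + duration of first marriage).
bad_duration = flag(cases, lambda c: c["age"] - (c["age_married"] + c["duration"]))

print(bad_age, bad_duration)
```

Case 2 fails both checks, which is why it helps to remedy the age problem first and then rerun the second check.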
Is there consistency?
For a survey, you often have questions that are conditional. For example, first
you ask Do you have a car? and then, if the answer is Yes, you ask insightful
questions about the car. You can make Crosstabs tables of the responses to the
main question with those to the subquestions to look for inconsistencies, such as
people who answered No to the main question but still answered the subquestions.
You have to decide how to deal with these inconsistencies: do you impute answers
to the main question, or do you discard answers to subquestions? It’s your call.
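A crosstab of the main question against a subquestion is just a count of each combination of answers. A minimal Python sketch, with hypothetical responses:

```python
from collections import Counter

# Hypothetical responses: (answer to "Do you have a car?", answered the car subquestion?)
responses = [
    ("Yes", True),
    ("Yes", True),
    ("No", False),
    ("No", True),     # inconsistent: no car, but the subquestion was answered
    ("Yes", False),   # has a car, but the subquestion was skipped
]

crosstab = Counter(responses)
inconsistent = crosstab[("No", True)]
unanswered = crosstab[("Yes", False)]

print(dict(crosstab))
print("inconsistent:", inconsistent, "unanswered:", unanswered)
```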
Is there agreement?
This refers to whether you have pairs of variables that convey similar information
in different ways. For example, you may have recorded both years of education
and highest degree earned. Or, you may have created a new variable that
groups age into 3 categories, such as less than 25, 25 to 50, and older than 50.
Compare the values of the 2 variables using crosstabs. The table may be large,
but it’s easy to check the correspondence between the 2 variables. You can also
identify problems by plotting the values of the 2 variables.
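Checking agreement between age and a grouped age variable can be sketched in Python by recomputing the group from the age and listing any case where the two disagree. The records are hypothetical.

```python
def age_group(age):
    """Group age into the 3 categories used in the example above."""
    if age < 25:
        return "less than 25"
    if age <= 50:
        return "25 to 50"
    return "older than 50"


# Hypothetical records: (age, age group as stored in the data file).
records = [
    (19, "less than 25"),
    (25, "25 to 50"),
    (50, "25 to 50"),
    (63, "25 to 50"),        # disagrees with the recorded age
    (51, "older than 50"),
]

mismatches = [(age, group) for age, group in records if age_group(age) != group]
print(mismatches)
```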
Are there unusual combinations of values?
Identify any outliers so that you can make sure the values of these variables are
correct and make any necessary adjustments. What counts as an outlier
depends on the variables that are being considered together.
TIP 11
You can identify points in a scatterplot by specifying a variable in the Label Cases By
text box in the Scatterplot dialog box. Double-click the plot to activate it in the Chart
Editor. From the Elements menu, choose Data Label Mode or click on the Data
Label Mode icon on the toolbar. This changes your cursor to a black box. Click
the cursor over the point that you want identified by the value of the labeling
variable. To go to that case in the Data Editor, right click on the point, and then
left click. Make sure that the Data Editor is in Data View. To turn Data Label Mode
off, click on the Data label Mode icon on the toolbar.
Before you transform your data, be sure that you know the values of the variables that
you are interested in so that you know how to code the information. See earlier
instructions about how to use the utilities menu to obtain information on the variables
either individually or for the entire working data file.
If you want to perform the same calculation for all of the cases in your data file, the
transformation is called unconditional. If you want to perform different computations
based on the values of 1 or more variables, the transformation is conditional. For
example, if you compute an index differently for men and women, the transformation is
conditional. Both types of transformations can be performed in the Compute Variable
dialog box.
Choose compute from the transform menu to open the compute variable dialog
box. At the top left, assign a new name to the variable that you will be computing.
To do so, click in the target variable box and type in the desired name. You must
follow the same rules for assigning variable names as you did when naming variables in
the Data Editor. Also, don’t forget to enter the information in the type and label tab in the
dialog box.
Warning: You MUST use a new variable name rather than one already in use. If you
reuse the same name and make a mistake specifying the transformation, you’ll replace
the values of the original variable with values that you don’t want. If you don’t catch the
mistake right away, and you save the data file, the original values of the variable are lost.
SPSS will ask you for permission to proceed if you try to use an existing variable name.
To specify the formula for the calculations that you want to perform, either type directly in
the Numeric Expression text box or use the calculator pad. Each time you want to refer
to an existing variable, click it in the variable list and then click the arrow button. The
variable name will appear in the formula at the blinking insertion point. Once you click
ok, the variable is added to your data file as the last variable. However, remember that
you want to click the paste button and then run the syntax command from the syntax
window so that you know what commands you specified. You also want to remember to
use commentary information above the pasted syntax in order to tell yourself and the
reviewer, in this case me, what you did to conduct your analysis.
TIP 12
Right-click your mouse on any button (except the #s) on the calculator pad or any
function for an explanation of what it means.
The function groups are located in the dialog box and can be used to perform your
calculations, if necessary. There are 7 main groups of functions: arithmetic, statistical,
string, date and time, distribution, random-variable, and missing-values. If you wish to
use a function, click it when the blinking insertion point is placed where you want to
insert the function into your formula, and then click the up arrow button. The function
will appear in your formula, but it will have question marks for the arguments. The
arguments of a function are the numbers or strings that it operates on. In the expression
SQRT(25), 25 is the sole argument of this function. Enter a value for the argument, or
double-click a variable to move it into the argument list. If there are more
question-mark arguments, select them in turn and enter a value, move a variable,
or somehow supply whatever suits the needs of the function.
TIP 12
For detailed information about any function and its arguments, from the Help menu,
choose Topics, click the index tab, and type the word functions. You can then select the
type of function that you want.
If you want to use different formulas, depending on the values of one or more existing
variables, you have to enter the formula and then click the button labeled if at the bottom
of the compute variable dialog box. This will take you to a secondary dialog box in which
you choose “include if case satisfies condition” and then enter your conditional
expression. For example, if you wish to compute a new variable, you would specify how
the new target variable is coded in reference to the “if, then” expression.
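The difference between an unconditional and a conditional transformation can be sketched in Python (an illustration only). The index and its weights are entirely hypothetical. Returning None when no condition is met mirrors the fact that a newly computed variable is left missing for cases that do not satisfy the condition.

```python
def unconditional_index(score):
    """Same formula for every case."""
    return score * 10


def conditional_index(score, sex):
    """Different formula depending on the value of another variable.

    The weights 10 and 12 are hypothetical.
    """
    if sex == "M":
        return score * 10
    if sex == "F":
        return score * 12
    return None   # no condition satisfied: the new variable stays missing


print(unconditional_index(2))
print(conditional_index(2, "M"), conditional_index(2, "F"), conditional_index(2, "?"))
```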
If you wish to change the coding of a variable but not create a totally different variable,
you would select transform, recode, into same variables, and click on the variable
or variables of interest and move them into the variable box by clicking the arrow.
To specify how you wish to recode the values within a variable, select old
and new values. On the left side of the dialog box, choose the values that
you wish to change; on the right side of the dialog box, choose what you want
them to become, and click add. When done, select continue to go back to the
previous dialog box and paste the command syntax so that you can run it. Again,
don’t forget to type in a commentary of what the command is doing. In other
cases, you might choose the “IF” button to specify the conditions under which a recode
will take place.
TIP 13
If you wish to recode a group of variables using the same coding scheme, such as
recode a 2 into a 1 for a set of variables even if the numbers stand for different value
labels, you can enter several variables into the dialog box at once.
If you want to recode an existing variable into a new one, every original value
has to be transformed into a value of the new variable. Click transform, recode, into
different variables and you will get a dialog box. In this dialog box, select the
name of the variable that will be recoded. Then in the output variable name text
box, enter a name for the new variable. Click the change button and the new
name appears after the arrow in the central list. Once this is done, click “old and
new values” and enter the recode criteria that will comprise the command syntax.
SPSS carries out the recode specifications in the order they are listed in the old to new
list.
TIP 14
Always specify all of the values even if you’re leaving them unchanged. To do so, select
all other values and then copy old value(s). Remember to click the add button after
entering each specification to move it into the old to new list; otherwise, it is
ignored.
The easiest method of checking that a recode worked as intended is to make a crosstabs
table of the original variable with the new variable containing recoded values.
Warning: After you’ve created a new variable with recode, go to the variable view in the
Data Editor and set the missing values for each newly created variable.
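Recoding into a new variable and then checking the result with a crosstab can be sketched in Python (for illustration only; this is not SPSS syntax). The coding scheme, which collapses 5 categories into 3, is hypothetical. Values not listed are copied unchanged, in the spirit of TIP 14.

```python
from collections import Counter

# Hypothetical scheme: collapse a 5-point item into 3 categories.
# Every old value that should change or stay is listed explicitly.
RECODE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def recode(value):
    """Return the new code; any value not listed is copied unchanged."""
    return RECODE.get(value, value)


original = [1, 2, 3, 4, 5, 5, 2, 8]
recoded = [recode(v) for v in original]   # stored separately, so the original is kept

# Crosstab of old value by new value: each old value should pair with one new value.
check = Counter(zip(original, recoded))
print(sorted(check.items()))
```

The unexpected value 8 shows up in the crosstab as the pair (8, 8), which is your cue to go back to the data and decide whether it should be a missing-value code.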
Frequency Tables
Rap Music
Imagine that you were interested in analyzing respondents’ views regarding rap music.
You would run a frequency table like the one above to find a count of the level of like or
dislike of rap music reported by respondents. Each row of the table corresponds to one
of the recorded answers. Check that the counts presented appear to be
correct, including those for the missing data listing.
The 3rd-5th columns contain percentages. The 3rd column labeled simply percent is the
% of all cases in the data file with that value. 9% of respondents reported that they like
rap music. However, the 4th column, labeled valid percent indicates that 10% of
respondents like rap music. Why the difference? The 4th column bases the % only on
people who actually responded to the question.
Warning: A large difference between the % and valid % columns can signal big
problems for your study. If the missing values result from people not being asked the
question because that’s the design of the study, you don’t have to worry. If people
weren’t asked because the interviewer decided not to ask them or if they refused to
answer, that’s a different matter.
The 5th column, labeled cumulative percent is the sum of the valid % for that row and all
of the rows before it. It’s useful only if the variable is measured at least on an ordinal
scale. For example, the cumulative % for “like” tells you that 13% of respondents either
reported that they like rap music or that they like it very much. The valid data value that
occurs most frequently is called the mode. For these data, “dislike very much” is the
modal category since 578 of the respondents reported that they disliked rap music very
much. The mode is not a particularly good summary measure, and if you report it, you
should always indicate the percentage of cases with that value. For variables measured
on a nominal scale, the mode is the only summary statistic that makes sense, but that
isn’t the case for this variable because there is a natural order to the responses [i.e.
ordinal variable].
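The percent, valid percent and cumulative percent columns, and the mode, can all be derived from the counts. The following Python sketch (an illustration only, not SPSS output) uses the counts from the capital punishment table earlier in this chapter, because that table lists both the valid and the missing counts.

```python
# Counts from the "Favor or Oppose Death Penalty for Murder" table earlier in this chapter.
valid_counts = {"Favor": 1074, "Oppose": 314}
missing_counts = {"DK": 106, "NA": 6}


def frequency_table(valid, missing):
    """Return rows of (label, count, percent, valid percent, cumulative percent)."""
    n_valid = sum(valid.values())
    n_total = n_valid + sum(missing.values())
    rows = []
    cumulative = 0.0
    for label, count in valid.items():
        percent = 100.0 * count / n_total         # based on all cases in the file
        valid_percent = 100.0 * count / n_valid   # based only on cases that responded
        cumulative += valid_percent               # running sum of the valid percents
        rows.append((label, count, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows


table = frequency_table(valid_counts, missing_counts)
for row in table:
    print(row)

# The mode is the valid category with the largest count.
mode = max(valid_counts, key=valid_counts.get)
print("Mode:", mode)
```

The gap between 71.6 and 77.4 for “Favor” is the difference between the percent and valid percent columns: the first divides by all 1500 cases, the second only by the 1388 who answered.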
Frequency Tables as Charts
You can display the numbers in a frequency table in a pie chart or a bar chart, although
prominent statisticians advise that one should “never use a pie chart.”
[Pie chart: Rap Music. Like Very Much 2.87%; Like It 10.13%; Mixed Feelings 18.59%;
Dislike It 28.02%; Dislike Very Much 40.39%]
Warning: If you create a pie chart by choosing Descriptive Statistics, then frequencies, a
slice for missing values is always included. Use graph, then select pie if you don’t want
to include a slice for missing values. This was the way that I obtained the pie chart
above.
[Bar chart: Rap Music. Horizontal axis: Like Very Much, Like It, Mixed Feelings,
Dislike It, Dislike Very Much. Vertical axis: Percent, 0.0% to 50.0%]
Now you know how people as a group feel about rap music, but what about more
nuanced information about the kinds of people who hold these views? Are they male?
College educated? Racial and ethnic minorities? To find out this information, you need
to look at attitudes regarding rap music in conjunction with other variables. A
crosstabulation is a 2-way table of counts, here for attitudes toward rap music and
gender. Gender is the row variable since it defines the rows of the table, and attitudes
toward rap music is the column variable since it defines the columns. Each of the
unique combinations of the values of the 2 variables defines a cell of the table. The
numbers in the total row and column are called marginals because they are in the
margins of the table. They are frequency tables for the individual variables.
TIP 15
The table below shows a crosstabulation that contains information solely on the number
of cases that meet both criteria, but not a % distribution.
Respondent's Sex * Rap Music Crosstabulation
Count
                                            Rap Music
                      Like Very            Mixed                 Dislike
                      Much       Like It   Feelings  Dislike It  Very Much  Total
Respondent's  Male    17         62        97        181         258        615
Sex           Female  24         83        169       220         320        816
Total                 41         145       266       401         578        1431
Percentages
The above information, i.e. the counts in the cell are the basic elements of the table, but
they are usually not the best choice for reporting findings because they cannot be easily
compared if there are different totals in the rows and columns of the table. For example,
if you know that 17 Males and 24 Females like rap music very much, you can conclude
little about the relationship between the 2 variables unless you also know the total of
men and women in the sample.
Row %: the cell count divided by the number of cases in the row times 100
Column %: the cell count divided by the number of cases in the column times
100
Total %: the cell count divided by the total number of cases in the table times
100
The 3 % convey different information, so be sure to choose the correct one for your
problem. If one of the 2 variables in your table can be considered an independent
variable and the other a dependent variable, make sure the % sum up to 100 for each
category of the independent variable.
                                                           Rap Music
                                           Like Very            Mixed                 Dislike
                                           Much       Like It   Feelings  Dislike It  Very Much  Total
Respondent's Sex
Male    Count                              17         62        97        181         258        615
        % within Respondent's Sex          2.8%       10.1%     15.8%     29.4%       42.0%      100.0%
        % within Rap Music                 41.5%      42.8%     36.5%     45.1%       44.6%      43.0%
        % of Total                         1.2%       4.3%      6.8%      12.6%       18.0%      43.0%
Female  Count                              24         83        169       220         320        816
        % within Respondent's Sex          2.9%       10.2%     20.7%     27.0%       39.2%      100.0%
        % within Rap Music                 58.5%      57.2%     63.5%     54.9%       55.4%      57.0%
        % of Total                         1.7%       5.8%      11.8%     15.4%       22.4%      57.0%
Total   Count                              41         145       266       401         578        1431
        % within Respondent's Sex          2.9%       10.1%     18.6%     28.0%       40.4%      100.0%
        % within Rap Music                 100.0%     100.0%    100.0%    100.0%      100.0%     100.0%
        % of Total                         2.9%       10.1%     18.6%     28.0%       40.4%      100.0%
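The three kinds of % in the table above can be reproduced from the cell counts alone. The following Python sketch (an illustration only, not SPSS output) uses the counts from the rap music crosstabulation; the variable and function names are hypothetical.

```python
# Cell counts from the Respondent's Sex * Rap Music crosstabulation.
categories = ["Like Very Much", "Like It", "Mixed Feelings", "Dislike It", "Dislike Very Much"]
counts = {
    "Male":   [17, 62, 97, 181, 258],
    "Female": [24, 83, 169, 220, 320],
}

row_totals = {sex: sum(row) for sex, row in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(row_totals.values())


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the table."""
    return round(100.0 * part / whole, 1)


# Row %: cell count divided by the number of cases in the row.
row_pct = {sex: [pct(c, row_totals[sex]) for c in row] for sex, row in counts.items()}
# Column %: cell count divided by the number of cases in the column.
col_pct = {sex: [pct(c, t) for c, t in zip(row, col_totals)] for sex, row in counts.items()}
# Total %: cell count divided by the total number of cases in the table.
total_pct = {sex: [pct(c, grand_total) for c in row] for sex, row in counts.items()}

for sex in counts:
    print(sex, "row %:", row_pct[sex])
    print(sex, "column %:", col_pct[sex])
    print(sex, "total %:", total_pct[sex])
```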
Since gender is the independent variable here, you want to calculate the row %, because
they tell you what % of women and what % of men fall into each of the attitudinal
categories. These % aren’t affected by unequal numbers of males and females in your
sample. From the row % displayed above, you find that 2.8% of males like rap music very
much, as do 2.9% of females. So with regard to strong positive feelings about rap music,
there are no visible differences. Note: No statistical differences have been examined
yet. From the column % displayed above, you find that among those who like rap music
very much, 41.5% are men and 58.5% are women. This does not tell you that females are
significantly more likely than males to report liking rap music very much. It tells you
only that, among the people who like rap music very much, there are more women than
men, which partly reflects the fact that the sample contains more women (57.0%) than
men (43.0%). Note: The column % depend on the number of men and women in the sample as
well as on how they feel about rap music. If men and women have identical attitudes but
there are twice as many men in the survey as women, the column % for men will be twice
as large as the column % for women. You can’t draw any conclusions about the
relationship based on only the column %.
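The dependence of the column % on the make-up of the sample can be illustrated with made-up numbers. In the sketch below (hypothetical counts, not taken from the survey), men and women have identical attitudes, but there are twice as many men as women.

```python
# Hypothetical sample: 10% of each sex like rap very much, but there are
# twice as many men as women. All numbers are invented for illustration.
n_men, n_women = 200, 100
men_like, women_like = 20, 10  # 10% of each group

# Row %: within each sex, the share who like rap very much.
row_pct_men = 100 * men_like / n_men
row_pct_women = 100 * women_like / n_women

# Column %: among those who like rap very much, the share of each sex.
col_pct_men = 100 * men_like / (men_like + women_like)
col_pct_women = 100 * women_like / (men_like + women_like)

print("row %:", row_pct_men, row_pct_women)
print("column %:", round(col_pct_men, 1), round(col_pct_women, 1))
```

The row % are equal for the two groups, while the column % for men is twice the column % for women purely because of the group sizes.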
TIP 16
If you use row %, compare the % within a column. If you use column %, compare the %
within a row.
Multiway Tables of Counts as Charts
You can plot the % in the table above by using a clustered bar chart like the one below.
For each attitudinal category regarding rap music, there are separate bars for men and
women since gender is the cluster variable. The values plotted are the % of all men and
the % of all women who gave each response. You can easily see that females are about as
likely as males to like rap music very much. Although the same information is in the
crosstabulation, it is easier to see in the bar chart.
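[Figure: clustered bar chart. Horizontal axis: Rap Music (Like Very Much, Like It, Mixed
Feelings, Dislike It, Dislike Very Much). Vertical axis: Percent (0.0% to 50.0%).
Clusters: Respondent's Sex (Male, Female). Bar labels, Male / Female: 2.76% / 2.94%,
10.08% / 10.17%, 15.77% / 20.71%, 29.43% / 26.96%, 41.95% / 39.22%.]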
TIP 17
Always select % in the clustered bar chart dialog boxes; otherwise, you’ll have a difficult
time making comparisons within a cluster, since the height of the bars will depend on the
number of cases in each subgroup. For example, you won’t be able to tell if the bar for
men who always read newspapers is higher because men are more likely to read a
newspaper daily or because there are more men in the sample.
Control Variables
You can examine the relationship between gender and attitudes toward rap music
separately for each category of another variable, such as education [i.e., the control
variable]. The crosstabulation below shows how the results look when a control variable
is entered in the crosstabulation dialog box.
Respondent's Sex * Rap Music * RS Highest Degree Crosstabulation
                                                                          Rap Music
RS Highest      Respondent's                          Like Very     Like It       Mixed  Dislike It     Dislike       Total
Degree          Sex                                        Much                Feelings               Very Much
Less than HS    Male    Count                                 5          11          14          30          55         115
                        % within Respondent's Sex          4.3%        9.6%       12.2%       26.1%       47.8%      100.0%
                Female  Count                                10          18          19          35          59         141
                        % within Respondent's Sex          7.1%       12.8%       13.5%       24.8%       41.8%      100.0%
                Total   Count                                15          29          33          65         114         256
                        % within Respondent's Sex          5.9%       11.3%       12.9%       25.4%       44.5%      100.0%
High school     Male    Count                                 9          36          50          87         110         292
                        % within Respondent's Sex          3.1%       12.3%       17.1%       29.8%       37.7%      100.0%
                Female  Count                                11          45          95         134         175         460
                        % within Respondent's Sex          2.4%        9.8%       20.7%       29.1%       38.0%      100.0%
                Total   Count                                20          81         145         221         285         752
                        % within Respondent's Sex          2.7%       10.8%       19.3%       29.4%       37.9%      100.0%
Junior college  Male    Count                                 1           4           4          13          14          36
                        % within Respondent's Sex          2.8%       11.1%       11.1%       36.1%       38.9%      100.0%
                Female  Count                                 1           3          13          15          18          50
                        % within Respondent's Sex          2.0%        6.0%       26.0%       30.0%       36.0%      100.0%
                Total   Count                                 2           7          17          28          32          86
                        % within Respondent's Sex          2.3%        8.1%       19.8%       32.6%       37.2%      100.0%
Bachelor        Male    Count                                 2           8          22          32          41         105
                        % within Respondent's Sex          1.9%        7.6%       21.0%       30.5%       39.0%      100.0%
                Female  Count                                 2          11          30          27          52         122
                        % within Respondent's Sex          1.6%        9.0%       24.6%       22.1%       42.6%      100.0%
                Total   Count                                 4          19          52          59          93         227
                        % within Respondent's Sex          1.8%        8.4%       22.9%       26.0%       41.0%      100.0%
Graduate        Male    Count                                             3           7          19          38          67
                        % within Respondent's Sex                      4.5%       10.4%       28.4%       56.7%      100.0%
                Female  Count                                             5          12           9          16          42
                        % within Respondent's Sex                     11.9%       28.6%       21.4%       38.1%      100.0%
                Total   Count                                             8          19          28          54         109
                        % within Respondent's Sex                      7.3%       17.4%       25.7%       49.5%      100.0%
You see that the largest difference between men and women in strong dislike of rap music
occurs among those with a graduate degree: 56.7% of males strongly dislike rap compared
to 38.1% of females. The % are almost equal for those with a high school education
(37.7% of males and 38.0% of females).
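The comparison across education levels can be checked from the counts. The following Python sketch is an illustration only: the counts are those of the "Dislike Very Much" column and the Total column in the crosstabulation above, while the names (dislike_very_much, gaps) are assumptions of this example.

```python
# (count who dislike rap very much, group total) by degree and sex,
# taken from the three-way crosstabulation.
dislike_very_much = {
    "Less than HS":   {"Male": (55, 115),  "Female": (59, 141)},
    "High school":    {"Male": (110, 292), "Female": (175, 460)},
    "Junior college": {"Male": (14, 36),   "Female": (18, 50)},
    "Bachelor":       {"Male": (41, 105),  "Female": (52, 122)},
    "Graduate":       {"Male": (38, 67),   "Female": (16, 42)},
}

def pct(count, total):
    """Row percentage: count as a share of the group total."""
    return 100 * count / total

# Male row % minus female row %, within each education level.
gaps = {
    degree: pct(*cells["Male"]) - pct(*cells["Female"])
    for degree, cells in dislike_very_much.items()
}

largest = max(gaps, key=lambda d: abs(gaps[d]))
smallest = min(gaps, key=lambda d: abs(gaps[d]))
for degree, gap in gaps.items():
    print(f"{degree:15s} {gap:+.1f} percentage points")
print("largest gap:", largest, "| smallest gap:", smallest)
```

These are descriptive differences only; no significance test has been applied to them.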
T-Tests
When using these statistical tests, you are testing the null hypothesis that 2 population
means are equal. The alternative hypothesis is that they are not equal. There are 3
different ways to go about this, depending on how the data were obtained.
Neither the one-sample t test nor the paired samples t test requires any assumption
about the population variances, but the 2-sample t test does.
TIP 18
When reporting the results of a t test, make sure to include the actual means,
differences, and standard errors. Don’t give just a t value and the observed significance
level.
One-sample T test
If you have a single sample of data and want to know whether it might be from a
population with a known mean, you have what’s termed a one-sample design, which can
be analyzed with a one-sample t test.
Examples
You want to know whether CEOs have the same average score on a personality
inventory as the population on which it was normed. You administer the test to a
random sample of CEOs. The population value is assumed to be known in
advance. You don’t estimate it from your data.
You’re suspicious of the claim that the normal body temperature is 98.6 degrees.
You want to test the null hypothesis that the average body temperature for
human adults is the long assumed value of 98.6, against the alternative
hypothesis that it is not. The value 98.6 isn’t estimated from the data; it is a
known constant. You take a single random sample of 1,000 adult men and
women and obtain their temperatures.
You think that 40 hours no longer defines the traditional work week. You want to
test the null hypothesis that the average work week is 40 hours, against the
alternative that it isn’t. You ask a random sample of 500 full-time employees how
many hours they worked last week.
You want to know whether the average IQ score for children diagnosed with
schizophrenia differs from 100, the average for the population of all children.
You administer an IQ test to a random sample of 700 schizophrenic children.
Your null hypothesis is that the population value for the average IQ score for
schizophrenic children is 100, and the alternative hypothesis is that it isn’t.
Data Arrangement
For the one-sample t test, you have one variable that contains the values for
each case. For example:
A manufacturer of high-performance automobiles produces disc brakes that must
measure 322 millimeters in diameter. Quality control randomly draws 16 discs
made by each of eight production machines and measures their diameters. This
example uses the file brakes.sav . Use One Sample T Test to determine whether
or not the mean diameters of the brakes in each sample significantly differ from
322 millimeters. A nominal variable, Machine Number, identifies the production
machine used to make the disc brake. Because the data from each machine
must be tested as a separate sample, the file must first be split into groups by
Machine Number.
In the split file dialog box, select “compare groups.” Select machine number from the
variable listing and move it into the box for “groups based on.” Since the file isn’t
already sorted, make sure that “sort the file by grouping variables” is selected.
Next select one-sample T test from the analyze tab.
Select the test variable, i.e. disc brake diameter (mm), type 322 as the test value,
and click options.
In the options dialog box for the one-sample T test, type 90 as the confidence interval
%, make sure that missing values are set to “exclude cases analysis by analysis,” and
click continue. Then click paste so that the syntax is entered in the syntax viewer, and
then select ok.
Note: A 95% confidence interval is generally used, but the examples below
reflect a 90% confidence interval.
The Descriptives table displays the sample size, mean, standard deviation, and standard
error for each of the eight samples. The sample means disperse around the 322mm
standard by what appears to be a small amount of variation.
The test statistic table shows the results of the one-sample T test.
The t column displays the observed t statistic for each sample, calculated as the mean
difference divided by the standard error of the sample mean.
The df column displays degrees of freedom. In this case, this equals the number of cases
in each group minus 1.
The column labeled Sig. (2-tailed) displays a probability from the t distribution with 15
degrees of freedom. The value listed is the probability of obtaining an absolute value
greater than or equal to the observed t statistic, if the difference between the sample
mean and the test value is purely random.
The Mean Difference is obtained by subtracting the test value (322 in this example) from
each sample mean.
The 90% Confidence Interval of the Difference gives a range of plausible values for the
true mean difference. Intervals constructed in this way contain the true mean difference
in 90% of all possible random samples of 16 disc brakes produced by this machine.
Since their confidence intervals lie entirely above 0.0, you can safely say that machines 2,
5 and 7 are producing discs that are significantly wider than 322mm on the average.
Similarly, because its confidence interval lies entirely below 0.0, machine 4 is producing
discs that are not wide enough.
The one-sample t test can be used whenever sample means must be compared to a
known test value. As with all t tests, the one-sample t test assumes that the data are
reasonably normally distributed, especially with respect to skewness. Extreme or
outlying values should be carefully checked; boxplots are very handy for this.
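The quantities in the one-sample test table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS procedure: the 16 diameters are invented (they are not the contents of brakes.sav), the function name one_sample_t is hypothetical, and the critical value 1.753 is the two-sided 90% value for 15 degrees of freedom read from a t table, because the Python standard library has no t distribution.

```python
import math
import statistics

def one_sample_t(values, test_value, t_crit):
    """Return the quantities reported in a one-sample t test table.

    t_crit is the two-sided critical value of the t distribution for the
    chosen confidence level and len(values) - 1 degrees of freedom.
    """
    n = len(values)
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)  # s / sqrt(n)
    mean_diff = mean - test_value
    return {
        "n": n,
        "mean": mean,
        "std_error": std_error,
        "mean_diff": mean_diff,
        "t": mean_diff / std_error,
        "df": n - 1,
        "ci": (mean_diff - t_crit * std_error,
               mean_diff + t_crit * std_error),
    }

# Hypothetical diameters (mm) for 16 discs from one machine.
diameters = [322.01, 322.03, 321.99, 322.02, 322.04, 322.00, 322.03, 322.01,
             322.02, 322.05, 321.98, 322.02, 322.03, 322.01, 322.04, 322.02]

T_CRIT_90_DF15 = 1.753  # from a t table: two-sided 90%, 15 df
result = one_sample_t(diameters, test_value=322, t_crit=T_CRIT_90_DF15)
lower, upper = result["ci"]
print(f"t = {result['t']:.3f}, df = {result['df']}, "
      f"90% CI of the difference = ({lower:.4f}, {upper:.4f})")
if lower > 0:
    print("Interval lies above 0: discs wider than 322 mm on average")
elif upper < 0:
    print("Interval lies below 0: discs narrower than 322 mm on average")
else:
    print("Interval includes 0: no significant difference at the 10% level")
```

With the file split by machine number, the same calculation would simply be repeated once per machine.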
Paired-Samples T test
You use a paired-samples (also known as matched-cases) T test if you want to test
whether 2 population means are equal, and you have 2 measurements from pairs of
people or objects that are similar in some important way. For example, you’ve observed
the same person before and after treatment, or you have personality measures for each
CEO and their non-CEO sibling. Each “case” in this data file represents a pair of
observations.
Examples
You are interested in determining whether self-reported weights and actual
weights differ. You ask a random sample of 200 people how much they weigh
and then you weigh them on a scale. You want to compare the means of the 2
related sets of weights.
You want to test the null hypothesis that husbands and wives have the same
average years of education. You take a random sample of married couples and
compare their average years of education.
You want to compare 2 methods for teaching reading. You take a random
sample of 50 pairs of twins and assign each member of a pair to one of the 2
methods. You compare average reading scores after completion of the program.
Data Arrangement
In a paired-samples design, both members of a pair must be on the same data
record. Different variable names are used to distinguish the 2 members of a pair.
For example:
A physician is evaluating a new diet for her patients with a family history of heart
disease. To test the effectiveness of this diet, 16 patients are placed on the diet
for 6 months. Their weights and triglyceride levels are measured before and after
the study, and the physician wants to know if either set of measurements has
changed.
This example uses the file dietstudy.sav . Use Paired-Samples T Test to
determine whether there is a statistically significant difference between the pre-
and post-diet weights and triglyceride levels of these patients.
Select Triglyceride and Final Triglyceride as the first set of paired variables.
Select Weight and final weight as the second pair and click ok.
The Descriptives table displays the mean, sample size, standard deviation, and standard
error for both variables in each pair. The results are listed by pair, with pair 1 first
and pair 2 second in the table.
Across all 16 subjects, triglyceride levels dropped between 14 and 15 points on average
after 6 months of the new diet.
The subjects clearly lost weight over the course of the study; on average, about 8 pounds.
The standard deviations for pre- and post-diet measurements reveal that subjects were
more variable with respect to weight than to triglyceride levels.
At -0.286, the correlation between the baseline and six-month triglyceride levels is not
statistically significant. Levels were lower overall, but the change was inconsistent across
subjects. Several lowered their levels, but several others either did not change or
increased their levels.
On the other hand, the Pearson correlation between the baseline and six-month weight
measurements is 0.996, almost a perfect correlation. Unlike the triglyceride levels, all
subjects lost weight and did so quite consistently.
The Mean column in the paired-samples t test table displays the average difference
between triglyceride and weight measurements before the diet and six months into the
diet.
The Std. Deviation column displays the standard deviation of the difference scores.
The Std. Error Mean column provides an index of the variability one can expect in
repeated random samples of 16 patients similar to the ones in this study.
The 95% Confidence Interval of the Difference gives a range of plausible values for the
true mean difference. Intervals constructed in this way contain the true mean difference
in 95% of all possible random samples of 16 patients similar to the ones participating
in this study.
The t statistic is obtained by dividing the mean difference by its standard error.
The Sig. (2-tailed) column displays the probability of obtaining a t statistic whose absolute
value is equal to or greater than the obtained t statistic.
Since the significance value for change in weight is less than 0.05, you can conclude that
the average loss of 8.06 pounds per patient is not due to chance variation, and can be
attributed to the diet.
However, the significance value greater than 0.10 for change in triglyceride level shows
the diet did not significantly reduce their triglyceride levels.
Warning: When you click the first variable of a pair, it doesn’t move to the list
box; instead, it moves to the lower left box labeled Current Selections. Only
when you click a second variable and move it into Current Selections can you
move the pair into the Paired Variable list.
Two-Independent-Samples T test
If you have 2 independent groups of subjects, such as CEOs and non-CEOs, men and
women, or people who received a treatment and people who didn’t, and you want to test
whether they come from populations with the same mean for the variable of interest, you
have a 2-independent samples design. In an independent-samples design, there is no
relationship between people or objects in the 2 groups. The T test you use is called an
independent-samples T test.
Examples
You want to test the null hypothesis that, in the U.S. population, the average
hours spent watching TV per day is the same for males and females.
You want to compare 2 teaching methods. One group of students is taught by
one method, while the other group is taught by the other method. At the end of
the course, you want to test the null hypothesis that the population values for the
average scores are equal.
You want to test the null hypothesis that people who report their incomes in a
survey have the same average years of education as people who refuse.
Data Arrangement
If you have 2 independent groups of subjects, e.g., boys and girls, and want to
compare their scores, your data file must contain two variables for each child:
one that identifies whether a case is a boy or a girl, and one with the score. The
same variable name is used for the scores for all cases. To run the 2
independent samples T test, you have to tell SPSS which variable defines the
groups. That’s the variable Gender, which is moved into the Grouping Variable
box. Notice the 2 question marks after a variable name. They will disappear
after you use the Define Groups dialog box to tell SPSS which values of the
variable should be used to form the 2 groups.
TIP 18
Right-click the variable name in the Grouping Variable box and select variable
information from the pop-up menu. Now you can check the codes and value labels that
you’ve defined for that variable.
Warning: In the define groups dialog box, you must enter the actual values that you
entered into the data editor, not the value labels. If you used the codes of 1 for male and
2 for female and assigned them value labels of m and f, then you enter the values 1 and
2, not the labels m and f, into the define groups dialog box.
Select money spent during the promotional period as the test variable. Select type of
mail insert received as the grouping variable. Then click define groups.
Type 0 as the group 1 value and 1 as the group 2 value under define groups. By default,
the program should have “use specified values” selected. Then click continue and ok.
The Descriptives table displays the sample size, mean, standard deviation, and standard
error for both groups. On average, customers who received the interest-rate promotion
charged about $70 more than the comparison group, and they vary a little more around
their average.
The procedure produces two tests of the difference between the two groups. One test
assumes that the variances of the two groups are equal. The Levene statistic tests this
assumption.
In this example, the significance value of the statistic is 0.276. Because this value is
greater than 0.10, you can assume that the groups have equal variances and ignore the
second test. Using the pivoting trays, you can change the default layout of the table so
that only the "equal variances" test is displayed.
Activate the pivot table. Then under pivot, select pivoting trays.
Drag assumptions from the row to the layer and close the pivoting trays window.
With the test table pivoted so that assumptions are in the layer, the Equal variances
assumed panel is displayed.
The df column displays degrees of freedom. For the independent samples t test, this
equals the total number of cases in both samples minus 2.
The column labeled Sig. (2-tailed) displays a probability from the t distribution with 498
degrees of freedom. The value listed is the probability of obtaining an absolute value
greater than or equal to the observed t statistic, if the difference between the sample
means is purely random.
The Mean Difference is obtained by subtracting the sample mean for group 2 (the New
Promotion group) from the sample mean for group 1.
The 95% Confidence Interval of the Difference gives a range of plausible values for the
true mean difference. Intervals constructed in this way contain the true mean difference
in 95% of all possible random samples of 500 cardholders.
Since the significance value of the test is less than 0.05, you can safely conclude that the
average of 71.11 dollars more spent by cardholders receiving the reduced interest rate is
not due to chance alone. The store will now consider extending the offer to all credit
customers.
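The "equal variances assumed" test can be sketched as follows. This is a minimal Python illustration, not the SPSS procedure: the ten cases are invented, the function name pooled_t is hypothetical, and the data are laid out as described under Data Arrangement, with one grouping variable (codes 0 and 1, not value labels) and one score per case.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Equal-variances two-sample t statistic, its df, and the mean difference."""
    n1, n2 = len(group1), len(group2)
    # Mean difference: group 1 mean minus group 2 mean.
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / std_error, n1 + n2 - 2, mean_diff

# Hypothetical cases: (grouping code, amount spent).
cases = [(0, 1510), (0, 1620), (0, 1480), (0, 1555), (0, 1590),
         (1, 1640), (1, 1710), (1, 1585), (1, 1675), (1, 1690)]
group1 = [score for code, score in cases if code == 0]
group2 = [score for code, score in cases if code == 1]

t, df, mean_diff = pooled_t(group1, group2)
print(f"mean difference = {mean_diff:.1f}, t = {t:.3f}, df = {df}")
```

With 500 cases in total, the same rule (total number of cases minus 2) gives the 498 degrees of freedom quoted above. A negative mean difference here simply means that group 2 has the larger mean.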
Churn propensity scores are applied to accounts at a cellular phone company. Ranging
from 0 to 100, an account scoring 50 or above may be looking to change providers. A
manager with 50 customers above the threshold randomly samples 200 below it,
wanting to compare them on average minutes used per month.
The Descriptives table shows that customers with propensity scores of 50 or more are
using their cell phones about 78 minutes more per month on the average than
customers with scores below 50.
The significance value of the Levene statistic is greater than 0.10, so you can assume
that the groups have equal variances and ignore the second test. Using the pivoting
trays, change the default layout of the table so that only the "equal variances" test is
displayed. Play around with the pivot tray link if you wish.
Analyzing Truancy Data: The Example
To perform this analysis in order to test your skills using a T test, please see the spss file
on the course blackboard page.
One-Sample T test
Consider whether the observed truancy rate before intervention [the % of school days
missed because of truancy] differs from an assumed nationwide truancy rate of 8%. You
have one sample of data [students enrolled in the TRP program-truancy reduction
program] and you want to compare the results to a fixed, specified in-advance
population value.
The null hypothesis is that the sample comes from a population with an average truancy
rate of 8%. [Another way of stating the null hypothesis is that the difference in the
population means between your population and the nation as a whole is 0.] The
alternative hypothesis is that your sample doesn’t come from a population with a truancy
rate of 8%.
To obtain the table below, do one of the following. Go to Analyze, choose
descriptive statistics, then descriptives, select the variable to be examined (in
this case prepct), then go to options in the descriptives dialog box and select
mean, minimum, maximum, and standard deviation, then select continue and ok.
Alternatively, choose frequencies under the descriptive statistics link, select
the variable to be examined, go to statistics and pick the same statistics as
above, select continue, and then ok.
Descriptive Statistics
From the table above, you see that, for the 299 students in this sample, the average
truancy rate is 14.2%. You know that even if the sample is selected from a population in
which the true rate is 8%, you don’t expect your sample to have an observed rate of
exactly 8%. Samples from the population vary. What you want to determine is whether
it’s plausible for a sample of 299 students to have an observed truancy rate of 14.2% if
the population value is 8%.
TIP 19
Before you embark on actually computing a one-sample T test, make certain checks.
Look at the histogram of the truancy rates to make sure that all of the values make
sense. Are there percentages smaller than 0 or greater than 100? Are there values that
are really far from the rest? If so, make sure they’re not the result of errors. If you have
a small number of cases, outliers can have a large effect on the mean and the standard
deviation.
To use the one-sample T test, you have to make certain assumptions about the data:
The observations must be independent of each other. In this data file, students
came from 17 schools, so it’s possible that students in the same school may be
more similar than students in different schools. If that’s the case, the estimated
significance level may be smaller than it should be, since you don’t have as much
information as the sample size indicates. [If you have 10 students from 10
different schools, that’s more information than having 10 students from the same
school because it’s plausible that students in the same school are more similar
than students from different schools.] Independence is one of the most important
assumptions that you have to make when analyzing data.
In the population, the distribution of the variable must be normal, or the sample
size must be large enough so that it doesn’t matter. The assumption of normally
distributed data is required for many statistical tests. The importance of the
assumption differs, depending on the statistical test. In the case of a one-sample
T test, the following guidelines are suggested: If the number of cases is < 15, the
data should be approximately normally distributed; if the number of cases is
between 15 and 40, the data should not have outliers or be very skewed; for
samples of 40 or more, even markedly skewed distributions are acceptable.
Because you have close to 300 observations, there’s little need to worry about
the assumption of normality.
TIP 20
If you have reason to believe that the assumptions required for the T test are violated in
an important way, you can analyze the data using a nonparametric test.
Compute the difference between the observed sample mean and the hypothesized
population value. [14.2%-8% = 6.2%]
Compute the standard error of the difference. This is a measure of how much you
expect sample means, based on the same number of cases from the same population,
to vary. The hypothetical population value is a constant and doesn’t contribute to the
variability of the differences, so the standard error of the difference is just the standard
error of the mean. Based on the standard deviation (13.07160) and the number of cases
(299), the standard error equals the standard deviation divided by the square root of
the number of cases: 13.07160 / sqrt(299) = 0.75595.
prepct Percent truant days pre intervention
N      Valid                    299
       Missing                    0
Mean                        14.2038
Std. Error of Mean           .75595
Std. Deviation             13.07160
You can calculate the t statistic by hand if you divide the observed difference by the
standard error of the difference.
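The hand calculation can be sketched in Python from the summary statistics listed above (N, mean, and standard deviation). The critical value 1.968 is an assumption of this sketch: it is the approximate two-sided 95% value for 298 degrees of freedom read from a t table, since the Python standard library has no t distribution.

```python
import math

n = 299             # valid cases
mean = 14.2038      # mean percent truant days before intervention
std_dev = 13.07160  # sample standard deviation
test_value = 8      # assumed nationwide truancy rate (%)

std_error = std_dev / math.sqrt(n)  # standard error of the mean
mean_diff = mean - test_value       # observed minus hypothesized value
t = mean_diff / std_error
df = n - 1

# Approximate two-sided 95% critical value for 298 df (from a t table).
T_CRIT = 1.968
ci = (mean_diff - T_CRIT * std_error, mean_diff + T_CRIT * std_error)

print(f"SE = {std_error:.5f}, t = {t:.3f}, df = {df}")
print(f"95% CI of the difference: {ci[0]:.2f} to {ci[1]:.2f}")
```

The standard error, t statistic, degrees of freedom, and interval computed this way can be compared with the one-sample test output produced by SPSS for the same variable.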
You can also conduct a one-sample T test using SPSS: go to analyze, compare
means, one-sample T test, move the relevant variable [i.e. prepct] into the test
variable box, enter the number 8 in the test value box at the bottom of the
dialog box, and run the analysis. You will get the output shown below.
T-TEST
/TESTVAL = 8
/MISSING = ANALYSIS
/VARIABLES = prepct
/CRITERIA = CI(.95) .
T-Test
One-Sample Statistics
One-Sample Test
                                                  Test Value = 8
                                                                               95% Confidence Interval
                                                                                  of the Difference
                                    t       df    Sig. (2-tailed)    Mean Difference     Lower      Upper
prepct Percent truant days
pre intervention                8.207      298         .000              6.20378         4.7161     7.6915
Use the T distribution to determine if the observed t statistic is unlikely if the null
hypothesis is true. To calculate the observed significance level for a T statistic, you
have to take into account both how large the actual T value is and how many degrees of
freedom it has. For a one-sample T test, the degrees of freedom [dof] is one fewer than
the number of cases. From the table above, you see that the observed significance level
is < .0001. Your observed results are very unlikely if the true rate is 8%, so you reject
the null hypothesis. Your sample probably comes from a population with a mean larger
than 8%.
TIP 21
If you look at the 95% Confidence Interval for the population difference, you see that it
ranges from 4.7% to 7.7%. You don’t know whether the true population difference is in
this particular interval, but you know that 95% of the time, 95% confidence intervals
include the true population values. Note that the value of 0 is not included in the
confidence interval. If your observed significance level had been larger than 0.05, 0
would have been included in the 95% confidence interval.
TIP 22
There is a close relationship between hypothesis testing and confidence intervals. You
can reject the null hypothesis that your sample comes from a population with any value
outside of the 95% confidence interval. For any such value, the observed significance
level for the hypothesis test will be less than 0.05.
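The link between the interval and the test can be checked numerically. The sketch below uses the summary statistics for prepct reported above; the hypothesized values other than 8 are arbitrary choices for illustration, and the critical value 1.968 (two-sided 95%, 298 degrees of freedom) is read from a t table rather than computed.

```python
import math

# Summary statistics for prepct from the output above.
n, mean, std_dev = 299, 14.2038, 13.07160
std_error = std_dev / math.sqrt(n)
T_CRIT = 1.968  # approximate two-sided 95% critical value for 298 df

# 95% confidence interval for the population mean itself.
ci_lower = mean - T_CRIT * std_error
ci_upper = mean + T_CRIT * std_error

def rejected_at_05(hypothesized_mean):
    """True if |t| exceeds the critical value for this hypothesized mean."""
    t = (mean - hypothesized_mean) / std_error
    return abs(t) > T_CRIT

def outside_ci(value):
    """True if the value falls outside the 95% confidence interval."""
    return not (ci_lower <= value <= ci_upper)

# A hypothesized value is rejected exactly when it lies outside the interval.
for value in (8, 12, 14, 15, 16):
    print(value, "outside CI:", outside_ci(value),
          "| rejected at 0.05:", rejected_at_05(value))
```

For every hypothesized value the two columns agree, which is the relationship described in the tip.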
Paired-Samples T test
You’ve seen that your students have a higher truancy rate than the country as a whole.
Now the question is whether there is a statistically significant difference in the truancy
rates before and after the truancy reduction programs. For each student, you have 2
values for unexcused absences. One is for the year before the student enrolled in the
program; the other is for the year in which the student was enrolled in the program.
Since there are two measurements for each subject, a before and an after, you want to
use a paired-samples T test to test the null hypothesis that the average before and
after rates are equal in the population.
TIP 23
The reason for doing a paired-samples design is to make the 2 groups as comparable as
possible on characteristics other than the one being studied. By studying the same
students before and after intervention, you control for differences in gender,
socioeconomic status, family supervision, and so on. Unless you have pairs of
observations that are quite similar to each other, pairing has little effect and may, in fact,
hurt your chances of rejecting the null hypothesis when it is false.
Before running the paired-samples T test procedure, look at the histogram of the
differences shown. You should see that the shape of the distribution is symmetrical [i.e.
not too far from normal]. Many of the cases cluster around 0, indicating that the
difference in the before and after scores is small for these students.
The same assumptions about the distributions of the data are required for this test as
those in the one-sample T test. The observations should be independent; if the sample
size is small, the distribution of differences should be approximately normal. Note that
the assumptions are about the differences, not the original observations. That’s
because a paired-samples T test is nothing more than a one-sample T test on the
differences. If you calculate the differences between the pre- and post-values and use
the one-sample T test with a population value of 0, you’ll get exactly the same statistic
as using the paired-samples T test.
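This equivalence can be checked numerically. The sketch below uses invented pre- and post-intervention rates for eight students (not the truancy data file); the function names are hypothetical. The paired statistic is built from the two variables and their covariance, and it is then compared with a one-sample t statistic computed on the differences with a test value of 0.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """t statistic for H0: the population mean equals test_value."""
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return (statistics.mean(values) - test_value) / std_error

def paired_t(post, pre):
    """Paired-samples t built from the two variables and their covariance."""
    n = len(post)
    mean_post, mean_pre = statistics.mean(post), statistics.mean(pre)
    covariance = sum((a - mean_post) * (b - mean_pre)
                     for a, b in zip(post, pre)) / (n - 1)
    # Variance of the differences: var(post) + var(pre) - 2 * cov(post, pre)
    var_diff = (statistics.variance(post) + statistics.variance(pre)
                - 2 * covariance)
    return (mean_post - mean_pre) / math.sqrt(var_diff / n)

# Invented truancy rates (%) for eight students.
pre = [12.0, 20.5, 8.0, 15.5, 30.0, 5.5, 18.0, 10.0]
post = [10.0, 16.0, 9.0, 12.5, 24.0, 6.0, 15.5, 9.5]
diffs = [after - before for after, before in zip(post, pre)]

t_paired = paired_t(post, pre)
t_one_sample = one_sample_t(diffs, 0)
print(f"paired t = {t_paired:.6f}")
print(f"one-sample t on differences = {t_one_sample:.6f}")
```

The two statistics agree up to floating-point rounding, and both have the number of pairs minus one degrees of freedom.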
From the table below, you see that the average truancy rate before intervention is 14.2%
and the average truancy rate after intervention is 11.4%. That’s a difference of about
2.8%. To get the table below, go to descriptives, select the prepct and postpct
variables and enter them into the variable list, make sure that the right
statistics are checked [e.g. standard deviation], and then select ok.
Paired Samples Statistics
To see how often you would expect to see a difference of at least 2.8% when the null
hypothesis of no difference is true, look at the paired-samples T test table below.
To obtain the table below, do the following: go to analyze, then select compare
means, then select paired-samples T test, choose the 2 variables that make up the
pair, i.e., prepct and postpct, and then select ok.
Paired Samples Test
                                             Paired Differences
                                                                       95% Confidence Interval
                                                                          of the Difference
                    Mean    Std. Deviation   Std. Error Mean       Lower         Upper          t       df   Sig. (2-tailed)
Pair 1  postpct Percent truant days post intervention
        - prepct Percent truant days pre intervention
                -2.76602       12.69355          .73409          -4.21067      -1.32137      -3.768    298        .000
The T statistic, 3.8 in absolute value, is computed by dividing the average difference
[2.77%] by the standard error of the mean difference [0.73]. Its sign in the table is
negative because the difference is computed as post-intervention minus pre-intervention.
The degrees of freedom is the number of pairs minus one. The observed significance level
is < .001, so you can reject the null hypothesis that the pre-intervention and
post-intervention truancy rates are equal in the population. Intervention appears to
have reduced the truancy rate.
Warning: The conclusions you can draw about the effectiveness of truancy reduction
programs from a study like this are limited. Even if you restrict your conclusions to the
schools from which these children are a sample, there are many problems. Since you
are looking at differences in truancy rates between adjacent years, you aren’t controlling
for possible increases or decreases in truancy that occur as children grow older. For
example, if truancy increases with age, the effect of the truancy reduction program may
be larger than it appears. There is also potential bias in the determination of what is
considered an “excused” absence.
The 95% confidence interval for the population change is from 1.3% to 4.2%. It appears
that if the program has an effect, it is not a very large one. On average, assuming a
180-day school year, students in the truancy reduction program attended school five
more days after the program than before. The 95% confidence interval for the number
of days “saved” is from 2.3 days to 7.6 days.
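The conversion from percentages to school days is simple arithmetic. A minimal sketch, assuming the 180-day school year stated in the text and the rounded percentages quoted there (the function name is hypothetical):

```python
SCHOOL_YEAR_DAYS = 180  # school-year length assumed in the text

def pct_to_days(pct, year_days=SCHOOL_YEAR_DAYS):
    """Convert a percentage of the school year into a number of days."""
    return pct / 100 * year_days

mean_days = pct_to_days(2.77)   # average reduction in truancy
low_days = pct_to_days(1.3)     # lower confidence limit, rounded as in the text
high_days = pct_to_days(4.2)    # upper confidence limit, rounded as in the text

print(f"{mean_days:.1f} days, 95% CI {low_days:.1f} to {high_days:.1f} days")
```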
A paired-samples design is effective only if you have pairs of similar cases. If your
pairing does not result in a positive correlation coefficient of close to 0.5 between the
2 measurements, you may lose power [your computer stays on, but your ability to reject the
null hypothesis when it is false fizzles] by analyzing the data as a paired-samples
design. The Paired Samples Correlations table shows that the correlation coefficient
between the pre- and post-intervention rates is close to 0.5, so pairing was probably
effective. See below.
Paired Samples Correlations
Warning: Although well-intentioned, paired designs often run into trouble. If you give a
subject the same test before and after an intervention, the practice effect, instead of the
intervention, may be responsible for any observed change. You must also make sure
that there is no carryover effect; that is, the effect of one intervention must be completely
gone before you impose another.
You’ve seen that intervention seems to have had a small, although statistically
significant effect. One of the questions that remains is whether the effect is similar for
boys and girls. Is the average truancy rate the same for boys and girls prior to
intervention? Is the average truancy rate the same for boys and girls after intervention?
Is the change in truancy rates before and after intervention the same for boys and girls?
Group Statistics
The table above shows summary statistics for the 2 groups for all 3 variables. Boys had
somewhat larger average truancy scores prior to intervention than did girls. The
average scores after intervention were similar for the 2 groups. The difference between
the average pre- and post-intervention is larger for boys. You must determine whether
these observed differences are large enough for you to conclude that, in the population,
boys and girls differ in average truancy rates. You can use the 2 independent-samples
T test to test all 3 hypotheses.
You must assume that all observations are independent. If the sample sizes in the
groups are small, the data must come from populations that have normal distributions. If
the sum of the sample sizes in the 2 groups is greater than 40, you don’t have to worry
about the assumption of normality. The 2-independent-samples T test also requires
assumptions about the variances in the 2 groups. If the 2 samples come from
populations with the same variance, you should use the “pooled” or equal-variance T
test. If the variances are markedly different, you should use the separate-variance T
test. Both of these are shown below.
You can test the null hypothesis that the population variances in the 2 groups are equal
using the Levene test, shown above. If the observed significance level is small [in the
column labeled sig. under Levene’s Test], you reject the null hypothesis that the
population variances are equal. For this example, you can reject the null hypothesis that
the pre-intervention truancy variances are equal in the 2 groups. For the other 2
variables, you can’t reject the null hypothesis that the variances are equal.
In the 2-independent-samples T test, the T statistic is computed the same as for the
other 2 tests. It is the difference between the 2 sample means divided by the
standard error of the difference. The standard error of the difference is computed
differently, depending on whether the 2 variances are assumed to be equal or not.
That’s why you see 2 sets of T values in the table above. In this example, the 2 T values
and confidence intervals based on them are very similar. That will always be the case
when the sample size in the 2 groups is almost the same.
The degrees of freedom for the t statistic also depends on whether you assume that the
2 variances are equal. If the variances are assumed to be equal, the degrees of
freedom is 2 fewer than the sum of the number of cases in the 2 groups. If you don’t
assume that the variances are equal, the degrees of freedom is calculated from the
actual variances and the sample sizes in the groups. The result is usually not an
integer.
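The two degrees-of-freedom rules can be sketched as follows. The text does not name the formula used when the variances are not assumed equal; this sketch assumes the usual Welch-Satterthwaite approximation, and the sample sizes and variances are hypothetical, not the truancy data.

```python
def pooled_df(n1, n2):
    """Degrees of freedom for the equal-variance (pooled) t test."""
    return n1 + n2 - 2

def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite degrees of freedom for the separate-variance t test."""
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical groups: sizes 150 and 149, sample variances 180 and 140.
print(pooled_df(150, 149))
print(round(welch_df(180.0, 150, 140.0, 149), 1))
```

When the two groups have the same size and the same variance, the two rules give the same value; otherwise the separate-variance value is smaller and usually not an integer.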
From the column labeled Sig. [2-tailed], you can’t reject any of the 3 hypotheses of
interest. The observed results are not incompatible with the null hypotheses that boys
and girls are equally truant before and after the program and that intervention affects
both groups similarly.
Warning: When you compare 2 independent groups, one of which has a factor of
interest and the other that doesn’t, you must be very careful about drawing conclusions.
For example, if you compare people enrolled in a weight-loss program to people who
aren’t, you cannot attribute observed differences to the program unless the people have
been randomly assigned to two programs.
T-Tests
Crosstabulations
You classify cases based on values for 2 or more categorical variables [e.g. type of
health insurance coverage and satisfaction with health care.] Each combination of
values is called a cell. To test whether the two variables that make up the rows and
columns are independent, you calculate how many cases you expect in each cell if the
variables are independent, and compare these expected values to those actually
observed using the chi-square statistic. If your observed results are unlikely if the null
hypothesis of independence is true, you reject the null hypothesis. You can measure
how strongly the row and column variables are related by computing measures of
association. There are many different measures, and they define association in different
ways. In selecting a measure of association, you should consider the scale on which the
variables are measured, the type of association you want to detect, and the ease of
interpretation of the measure. You can study the relationship between a dichotomous
[2-category] risk factor and a dichotomous outcome [e.g. family history of a disease and
development of the disease], controlling for other variables [e.g. gender] by computing
special measures based on the odds.
If you think that 2 variables are related, the null hypothesis that you want to test is that
they are not related. Another way of stating the null hypothesis is that the 2 variables
are independent. Independence has a very precise meaning in this situation. It means
that the probability that a case falls into a particular cell of a table is the product of the
probability that a case falls into that row and the probability that a case falls into that
column.
Warning: The word independent as used here has nothing to do with dependent and
independent variables. It refers to the absence of a relationship between 2 variables.
As an example of testing whether 2 variables are independent, look at the table below, a
crosstabulation of highest educational attainment [degree] and perception of life’s
excitement [life] based on the gssdata file posted on Blackboard. From the row %, you see
that the % of people who find life exciting is not exactly the same in the 5 degree groups,
although it is fairly similar for the 1st 2 degree groups. Slightly less than half of those
with less than a high school education or with a high school education find life exciting.
However, you see that there are substantial differences between those with some
exposure to college and those with a post-graduate degree. For those respondents,
almost 2/3 find that life is exciting.
degree Highest degree * life Is life exciting, routine or dull? Crossta
Warning: The chi-square test requires that all observations be independent. This
means that each case can appear in only one cell of the table. For example, if you apply
2 different treatments to the same patients and classify them both times as improved or
not improved, you can’t analyze the data with the chi-square test of independence.
You use the chi-square test to determine if your observed results are unlikely if the 2
variables are independent in the population. 2 variables are independent if knowing the
value of one variable tells you nothing about the value of the other variable. The level of
education one attains and one’s perception of life are independent if the probability of
any level of educational attainment/perception of life combination is the product of the
probability of that level of educational attainment times the probability of that perception
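of life. For example, under the independence assumption, the probability of being a
college graduate and finding life exciting is:
$$P(\text{college graduate and life exciting}) = P(\text{college graduate}) \times P(\text{life exciting})$$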
If the null hypothesis is true, you expect to find in your table 74 excited people with
bachelor’s degrees. You see this expected value in the row labeled Expected Count in
the table above.
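Expected counts under independence follow directly from the product rule: each cell's expected count is its row total times its column total divided by the table total. A minimal sketch with hypothetical counts (not the GSS table discussed here):

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 2 counts, used only to illustrate the calculation.
observed = [[30, 20],
            [10, 40]]
for row in expected_counts(observed):
    print(row)
```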
The chi-square test is based on comparing these 2 counts: the observed number of
cases in a cell and the expected number of cases in a cell if the 2 variables are
independent. The Pearson chi-square statistic is:
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
TIP 24
By examining the differences between observed and expected values in the cells [the
residuals], you can see where the independence model fails. You can examine actual
residuals and residuals standardized by estimates of their variability to help you pinpoint
departures from independence by requesting them in the Cells dialog box of the
Analyze/Descriptive Statistics/Crosstabs procedure.
From the calculated chi-square value, you can estimate how often in a sample you
would expect to see a chi-square value at least as large as the one you observed if the
independence hypothesis is true in the population. If the observed significance level is
small enough, you reject the null hypothesis that the 2 variables are independent. The
value of chi-square depends on the number of rows and columns in the table. The
degrees of freedom for the chi-square statistic is calculated by finding the product of one
fewer than the number of rows and one fewer than the number of columns. [The degrees
of freedom is the number of cells in a table that can be arbitrarily filled when the row and
column totals are fixed.] In this example, the degrees of freedom is 6.
From the table below, you see that the observed significance level for the Pearson chi-
square is 0.000, so you can reject the null hypothesis that level of educational attainment
and perception of life are independent.
Chi-Square Tests
Warning: A conservative rule for use of the chi-square test requires that the expected
values in each cell be greater than 1 and that most cells have expected values greater
than 5. After SPSS displays the pivot table with the statistics, it displays the number of
cells with expected values less than 5 and the minimum expected count. If more than
20% of your cells have expected values less than 5, you should combine categories, if
that makes sense for your table, so that most expected values are greater than 5.
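The rule in this warning can be checked mechanically once you have the expected counts. A minimal sketch, assuming hypothetical expected values; the function name and dictionary keys are illustrative and do not come from SPSS.

```python
def check_expected_counts(expected):
    """Summarize expected counts in the spirit of the SPSS table footnote."""
    cells = [value for row in expected for value in row]
    small = sum(1 for value in cells if value < 5)
    return {
        "cells_below_5": small,
        "share_below_5": small / len(cells),
        "minimum_expected": min(cells),
    }

# Hypothetical expected counts for a 2 x 3 table.
expected = [[12.0, 4.0, 8.0],
            [18.0, 6.0, 12.0]]
summary = check_expected_counts(expected)

# Conservative rule: every expected count above 1, at most 20% of cells below 5.
rule_ok = summary["minimum_expected"] > 1 and summary["share_below_5"] <= 0.20
print(summary, rule_ok)
```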
SPSS displays several statistics in addition to the Pearson chi-square when you ask for
a chi-square test as shown above.
Fisher’s exact test [not shown here] is calculated if any expected value in a 2 by
2 table is < 5. You get exact probabilities of obtaining the observed table or one
more extreme if the 2 variables are independent and the marginals are fixed.
That is, the number of cases in the rows and columns of the table are determined
in advance by the researcher.
Warning: The Mantel-Haenszel test is calculated using the actual values of the row and column
variables, so if you coded 3 unevenly spaced dosages of a drug as 1, 2, and 3, those values are
used for the computations.
A special case of the chi-square test for independence is the test that several
proportions are equal. For example, you want to test whether the % of people who
report themselves to be very happy has changed during the time that the GSS has been
conducted. The figure below is a crosstabulation of the % of people who say they were very
happy for each of the decades. This uses the aggregatedgss.sav file. Almost 35% of
the people questioned in the 1970s claimed that they were very happy, compared to
31% in this millennium.
happy GENERAL HAPPINESS * decade decade of survey Crosstabulation
If the null hypothesis is true, you expect 32.1% of people to be very happy in each
decade, the overall very happy rate. You calculate the expected number in each decade
by multiplying the total number of people questioned in each decade by 32.1%. The
expected number of not very happy people is 67.9% multiplied by the number of people
in each decade. These values are shown in the table above. The chi-square statistic is
calculated in the usual fashion.
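The expected numbers for a test of equal proportions are obtained by applying the overall rate to each group. In the sketch below, 32.1% is the overall very happy rate quoted above, while the group sizes are hypothetical placeholders for the decade totals in the table.

```python
def expected_by_group(group_sizes, overall_rate):
    """Expected (in category, not in category) counts for each group."""
    return {
        group: (overall_rate * n, (1 - overall_rate) * n)
        for group, n in group_sizes.items()
    }

# Hypothetical group sizes; the actual decade totals are in the crosstabulation.
sizes = {"decade A": 10000, "decade B": 12000}
expected = expected_by_group(sizes, 0.321)
for group, (happy, not_happy) in expected.items():
    print(f"{group}: very happy {happy:.0f}, not very happy {not_happy:.0f}")
```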
From the table below, you see that the observed significance level for the chi-square
statistic is < .001, leading you to reject the null hypothesis that in each decade people
are equally likely to describe themselves as very happy. Notice that the difference
between decades isn’t very large; the largest % is 34.3% for the 1970s, while the smallest
is 30.9% for the 1990s. The sample sizes in each group are very large, so even small
differences are statistically significant, although they may have little practical implication.
Chi-Square Tests
To see whether both men and women experienced changes in happiness during this
time period, you can compute the chi-square statistic separately for men and for women,
as shown below:
RESPONDENTS SEX                                  Value      df   Asymp. Sig. (2-sided)
1 Male     Pearson Chi-Square                    3.677 [a]   3   .298
           Likelihood Ratio                      3.668       3   .300
           Linear-by-Linear Association           .901       1   .343
           N of Valid Cases                      18442
2 Female   Pearson Chi-Square                   42.987 [b]   3   .000
           Likelihood Ratio                     42.712       3   .000
           Linear-by-Linear Association         35.904       1   .000
           N of Valid Cases                      23538
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 586.01.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 742.96.
You see that for men, you can’t reject the null hypothesis that happiness has not
changed with time. You can reject the null hypothesis for women. From the line plot in
the graph below, you see that in the sample, happiness decreases with time for women,
but not for men. To obtain the graph, do the following:
Go to the Graphs menu, choose Line, then select the Multiple icon and
Summaries for groups of cases, and then click Define. Next move decade
into the Category Axis box and sex into the Define Lines by box in the dialog
box that appears. Select Other statistic, then move happy into the Variable
list, and then click Change Statistic. In the Statistic subdialog box, select %
inside and type 1 into both the Low and High text boxes. Click Continue,
and then click OK.
[Figure: multiple line chart of %in(1,1) GENERAL HAPPINESS (vertical axis, ticks from 30 to 36) by decade of survey (horizontal axis), with separate lines for Male and Female (RESPONDENTS SEX).]
Measuring Change: McNemar Test
The chi-square test can also be used to test hypotheses about change when the same
people or objects are observed at two different times. For example, the table below is a
crosstabulation of whether a person voted in 1996 and whether he or she voted in 2000.
[See gssdata.sav file]
vote00 DID R VOTE IN 2000 ELECTION * vote96 DID R VOTE IN 1996 ELECTION Crosstabulation
Count
                                       vote96 DID R VOTE IN 1996 ELECTION
vote00 DID R VOTE IN 2000 ELECTION     1 VOTED    2 DID NOT VOTE    Total
1 VOTED                                   1539               151     1690
2 DID NOT VOTE                             187               502      689
Total                                     1726               653     2379
An interesting question is whether people were more likely to vote in one of the years
than the other. The cases on the diagonal of the table don’t provide any information
because those people behaved the same way in both elections. You have to look at the
off-diagonal cells, which correspond to people who voted in one election but not the
other. If the null hypothesis that likelihood of voting did not change is true, a case
should be equally likely to fall into either of the 2 off-diagonal cells. The binomial
distribution is used to calculate the exact probability of observing a split between the
2 off-diagonal cells at least as unequal as the one observed, if cases in the population
are equally likely to fall into either off-diagonal cell. This test is called the McNemar test.
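The exact binomial calculation described above can be sketched in a few lines of Python. This is an illustration of the idea, not a reproduction of the SPSS routine; the two counts are the off-diagonal cells of the voting crosstabulation, and the function name is hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two off-diagonal counts."""
    n = b + c
    k = min(b, c)
    # P(X <= k) for X ~ Binomial(n, 0.5), computed with exact integers.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)

# Off-diagonal cells: voted in 2000 only (151) and voted in 1996 only (187).
p_value = mcnemar_exact(151, 187)
print(f"p = {p_value:.3f}")
```

A p-value above 0.05 means the split between the two off-diagonal cells is not unusual enough to reject the null hypothesis at the 5% level.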
Chi-Square Tests
McNemar’s test can be calculated for a square table of any size to test whether the
upper half and the lower half of a square table are symmetric. This test is labeled
McNemar Test in the table above. For tables with more than 2 rows and columns, it is
labeled the McNemar-Bowker test. From the figure below, you see that you can’t reject the
null hypothesis that people who voted in only one of the 2 elections were equally likely
to have voted in either one.
Warning: Since the same person is asked whether he or she voted in 1996 and whether
he or she voted in 2000, you can’t make a table in which the rows are years and the
columns are whether he or she voted. Each case would appear twice in such a table.
If you reject the null hypothesis that 2 variables are independent, you may want to
describe the nature and strength of the relationship between the 2 variables. There are
many statistical indexes that you can use to quantify the strength of the relationship
between 2 variables in a cross-classification. No single measure adequately
summarizes all possible types of association. Measures vary in the way they define
perfect and intermediate association and in the way they are interpreted. Some
measures are used only when the categories of the variables can be ordered from
lowest to highest on some scale.
Warning: Don’t compute a large number of measures and then report the most
impressive as if it were the only one examined.
You can test the null hypothesis that a particular measure of association is 0 based on
an approximate T statistic shown in the output. If the observed significance level is small
enough, you reject the null hypothesis that the measure is 0.
TIP 25
Appendix III
APA CITATION STYLE
An Annotated Guide
Introduction
This guide provides a basic introduction to the APA citation style. It is based on the 6th edition of
the Publication Manual of the American Psychological Association, published in 2009 (copyright 2010).
The Publication Manual is generally used for academic writing in the social sciences. The manual
itself covers many aspects of research writing including selecting a topic, evaluating sources,
taking notes, plagiarism, the mechanics of writing, the format of the research paper as well as the
way to cite sources.
This guide provides basic explanations and examples for the most common types of citations
used by research scholars. For additional information and examples, refer to the Publication
Manual.
Never--NEVER--simply drop a quotation into your text without indicating in some way why
it is there. The easiest way to incorporate quotations gracefully is with “signal phrases,” which
serve to link the quotation with your sentences and to name the author of the quoted material.
There are many “signaling verbs” one can use in a signal phrase:
[a] Use signal phrases to introduce quotations which support your view:
Body modification appears to be universal among human species, perhaps due to the
nature of the human body itself. As Germaine Greer writes, “Humans are the only animals
which can consciously and deliberately change their appearance according to their own
whims”
Many more mundane elements of our social life can be thought of as body modification
rituals. For instance, as Greer argues, “Fashion, because it is beyond logic, is deeply
revealing”
While Greer argues that “beautification and mutilation are the same activity”, I wish to
suggest that some body modifications should be clearly seen and judged to be harmful
and cruel by civilized societies--in effect, that beautification and mutilation must be kept
separate.
Quotations of fewer than 40 words should be incorporated in the text and enclosed
in double quotation marks. Provide the author, publication year and a page number.
Compare the following two examples.
She stated, "The 'placebo effect,' ...disappeared when behaviors were studied in this
manner" (Miele, 1993, p. 276), but she did not clarify which behaviors were studied.
[Quotation, truncated, ends within the sentence; page number immediately after the year]
Miele (1993) found that "the 'placebo effect,' which had been verified in previous
studies, disappeared when [only the first group's] behaviors were studied in this
manner" (p. 276).
[Quotation, in full, closes at the end of the sentence; page number at the end]
When quoting 40 or more words, use a free-standing "block quotation" on
a new line, indented five spaces, and omit the quotation marks. The quotation may consist of two or
more sentences.
For electronic sources such as Web pages, provide a reference to the author, the year
and the page number (if it is a PDF document), the paragraph number if visible or a heading
followed by the paragraph number.
"The current system of managed care and the current approach to defining empirically
supported treatments are shortsighted" (Beutler, 2000, Conclusion section, ¶ 1)
[Note how para number is cited]
Sometimes you may have to draw on another author's view or opinion without using the author's
original words. When using your own words to refer indirectly to another author's work, you must
identify the original source. A complete reference must appear in the Reference List at the end of
your paper.
In most cases, providing the author's last name and the publication year is sufficient:
Smith (1997) compared reaction times...
Within a paragraph, you need not include the year in subsequent references.
If there are two authors, include the last name of each and the publication year:
If there are three to five authors, cite all authors the first time; in subsequent citations,
include only the last name of the first author followed by "et al." and the year:
The names of groups that serve as authors (e.g. corporations, associations, government
agencies, and study groups) are usually spelled out each time they appear in a text citation. If
it will not cause confusion for the reader, names may be abbreviated thereafter:
To cite a specific part of a source, indicate the page, chapter, figure, table or equation at
the appropriate point in the text:
For electronic sources that do not provide page numbers, use the paragraph number, if
available, preceded by the ¶ symbol or abbreviation para. If neither is visible, cite the heading
and the number of the paragraph following it to direct the reader to the quoted material.
(Myers, 2000, ¶ 5)
(Beutler, 2000, Conclusion section, para. 1)
When citing a work which is discussed in another work, include the original author's name
in an explanatory sentence, and then include the source you actually consulted in your
parenthetical reference and in your reference list.[here, Andrews, 2007]
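The author-count rules for in-text citations given above (one, two, or three to five authors) can be sketched as a small Python function. The function name is hypothetical, the ampersand before the final name follows the usual APA parenthetical convention rather than anything stated in this guide, and the sample names are taken from the reference examples later in this appendix.

```python
def in_text_citation(last_names, year, first_mention=True):
    """Build a parenthetical APA (6th ed.) citation for one to five authors."""
    if not 1 <= len(last_names) <= 5:
        raise ValueError("this sketch covers one to five authors only")
    if len(last_names) >= 3 and not first_mention:
        # Three to five authors, subsequent citation: first author plus "et al."
        authors = f"{last_names[0]} et al."
    elif len(last_names) == 1:
        authors = last_names[0]
    elif len(last_names) == 2:
        authors = f"{last_names[0]} & {last_names[1]}"
    else:
        authors = ", ".join(last_names[:-1]) + f", & {last_names[-1]}"
    return f"({authors}, {year})"

print(in_text_citation(["Beck", "Sales"], 2001))
print(in_text_citation(["Fisher", "Russell", "Williams", "Fisher"], 2008))
print(in_text_citation(["Fisher", "Russell", "Williams", "Fisher"], 2008,
                       first_mention=False))
```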
REFERENCE LIST
Your reference list should appear at the end of your paper. In some cases, a list of all references
may find a place at the end of the book. It provides the information necessary for a reader to
locate and retrieve any source you cite in the body of the paper. Each source you cite in the
paper must appear in your reference list; likewise, each entry in the reference list must be cited in
your text.
Your references should begin on a new page separate from the text of the essay; label this page
"References" centered at the top of the page (do NOT bold, underline, or use quotation marks for
the title). All text should be double-spaced just like the rest of your essay.
Important Features
All lines after the first line of each entry in your reference list should be indented one-half
inch from the left margin. This is called hanging indentation.
Authors' names are inverted (last name first); give the last name and initials for all
authors of a particular work if it has up to seven authors. If the work has more than
seven authors, list the first six authors and then use an ellipsis after the sixth author's name.
After the ellipsis, list the last author of the work.
Reference list entries should be alphabetized by the last name of the first author of each
work.
If you have more than one article by the same author, single-author references or
multiple-author references with the exact same authors in the exact same order are listed
in order by the year of publication, starting with the earliest.
When referring to any work that is NOT a journal, such as a book, article, or Web page,
capitalize only the first letter of the first word of a title and subtitle, the first word after a
colon or a dash in the title, and proper nouns. Do not capitalize the first letter of the
second word in a hyphenated compound word.
Do not italicize, underline, or put quotes around the titles of shorter works such as journal
articles or essays in edited collections.
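Two of the features above, the author-list rule and the ordering of entries, can be sketched in Python. The function name is hypothetical, names are assumed to be already inverted ("Last, F. M."), and sorting on the inverted name string is only a rough stand-in for full APA alphabetizing.

```python
def reference_authors(authors):
    """Join inverted author names ('Last, F. M.') for a reference-list entry."""
    if len(authors) == 1:
        return authors[0]
    if len(authors) <= 7:
        return ", ".join(authors[:-1]) + ", & " + authors[-1]
    # More than seven authors: first six, an ellipsis, then the last author.
    return ", ".join(authors[:6]) + ", . . . " + authors[-1]

print(reference_authors(["Beck, C. A. J.", "Sales, B. D."]))

# Order entries by first author, then by year of publication (earliest first).
entries = [("Postman, N.", 1985), ("Beck, C. A. J.", 2001), ("Postman, N.", 1979)]
ordered = sorted(entries, key=lambda entry: (entry[0].lower(), entry[1]))
print(ordered)
```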
Below are some examples of the most common types of sources including online sources.
I NON-ELECTRONIC SOURCES
Traditionally, books, articles and papers from print media, i.e., non-electronic sources, are quoted in the
text of a research work or paper. To make the process systematic and universal,
several citation methods were devised during the early part of the 20th century. The ultimate aim
of these citation methods is to standardize the citation process.
[A] Books
Books form the bulk of cited sources, a practice that has been followed for more than a century.
The following examples cover the various forms used in APA style when citing books.
Bernstein, T.M. (1965). The careful writer: A modern guide to English usage (2nd ed.).
New York, NY: Atheneum.
Beck, C. A. J., & Sales, B. D. (2001). Family mediation: Facts, myths, and future
prospects. Washington, DC: American Psychological Association.
Postman, N. (1979). Teaching as a conserving activity. New York, NY: Delacorte Press.
Postman, N. (1985). Amusing ourselves to death: Public discourse in the age of show
business. New York, NY: Viking.
If works by the same author are published in the same year, arrange alphabetically by title and
add a letter after the year as indicated below
McLuhan, M. (1970b). From cliche to archetype. New York, NY: Viking Press.
Gibbs, J. T., & Huang, L. N. (Eds.). (1991). Children of color: Psychological interventions
with minority youth. San Francisco, CA: Jossey-Bass.
[B] Articles
Next to books, articles and papers are the second most common sources for citation. The following
examples will help the research scholar understand the intricacies of citing articles in a research
paper.
[b] Article in a journal
Note: List only the volume number if the periodical uses continuous pagination throughout a
particular volume. If each issue begins with page 1, then list the issue number as well.
Klimoski, R., & Palmer, S. (1993). The ADA and the hiring process in
organizations. Consulting Psychology Journal: Practice and Research, 45(2), 10-36.
MacIntyre, L. (Reporter). (2002, January 23). Scandal of the Century [Television series
episode]. In H. Cashore (Producer), The fifth estate. Toronto, Canada: Canadian
Broadcasting Corporation.
Kubrick, S. (Director). (1980). The Shining [Motion picture]. United States: Warner
Brothers.
II ELECTRONIC SOURCES
We are in the era of ICT. Publishing online, whether an article, a paper or a book, has become a
fast-growing trend and, quite often, the most popular and useful development in the world of
publishing. This guide provides basic guidelines and examples for citing electronic sources using
the Publication Manual of the American Psychological Association, 6th edition and the APA Style
for Electronic Sources (2008). APA style requires that sources receive attribution in the text by
the use of parenthetical in-text references.
Where available, the doi (digital object identifier) number should be used to provide access
information for electronic materials. URLs may be included for resources that do not have a doi
number. The names of full-text databases are rarely necessary in an APA citation. Retrieval date
information should only be included when the page/site/information is likely to change.
Format:
Author Last, First Initial. (Year of Publication). Article title. Journal Title, volume #(issue number),
start page-end page. doi: alphanumeric string
Citation:
Welch, K.E. (2005). Technical communication and physical location: Topoi and
architecture in computer classrooms. Technical Communication Quarterly, 14(3), 335-344. doi:
10.1207/s15427625tcq1403_12
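The Format line for a journal article can be treated as a template. The sketch below assembles an entry from its parts; the function and parameter names are hypothetical, and plain text cannot show the italics that APA style applies to the journal title and volume number.

```python
def journal_reference(authors, year, title, journal, volume, issue, pages,
                      doi=None, url=None):
    """Assemble a journal-article reference following the Format line."""
    entry = f"{authors} ({year}). {title}. {journal}, {volume}({issue}), {pages}."
    if doi:
        entry += f" doi: {doi}"            # prefer the doi when one exists
    elif url:
        entry += f" Retrieved from {url}"  # otherwise fall back to the URL
    return entry

entry = journal_reference(
    authors="Welch, K.E.",
    year=2005,
    title=("Technical communication and physical location: "
           "Topoi and architecture in computer classrooms"),
    journal="Technical Communication Quarterly",
    volume=14,
    issue=3,
    pages="335-344",
    doi="10.1207/s15427625tcq1403_12",
)
print(entry)
```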
Format:
Author Last, First Initial. (Year of Publication). Article title. Journal Title, volume #(issue number),
start page-end page. Retrieved from URL
Citation:
Fisher, D., Russell, D., Williams, J., & Fisher, D. (2008). Space, time & transfer in
virtual case environments. Kairos, 12(2), 127-165. Retrieved from
http://kairos.technorhetoric.net/12.2/binder.html?topoi/fisher-etal/articleIntro.html
Format:
Author Last, First Initial. (Year of Publication). Article title. Journal Title,
volume #(issue number), start page-end page. Advance online
publication. doi: alphanumeric string or URL
Note: In the following citation, the source includes neither page numbers nor a doi number. Therefore, the
page number component is not included and the URL is substituted for the doi.
Citation:
St. John, J., & Quinn, T.W. (2008). Rapid capture of DNA targets.
Biotechniques, 44(2). Advance online publication. Retrieved
from http://www.biotechniques.com/default.asp?page=aop&subsection
=article_display&display=full&id=112633
Format:
Author Last, First Initial. (in press). Article title. Journal Title. Retrieved from doi or URL
Citation:
Papini, P., Adriani, O., Ambriola, M., Barbarino, G.C., Basili, A., Bazilevskaja, G.A.,et al. (in
press). In-flight performances of the PAMELA satellite experiment. Nuclear Instruments and
Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors, and Associated
Equipment. Retrieved from http://www.sciencedirect.com
Format:
Author Last, First Initial. (n.d.). Manuscript title. Manuscript in preparation. Retrieved from URL
Note: The initials n.d. (no date) are included here in lieu of the publication date.
Citation:
Goggans, H. (n.d.). The “Floating Bear” as zine precursor. Manuscript in preparation. Retrieved
from http://www.heathergoggans.com
[b] Electronic Books
Format:
Author Last, First Initial. (Year). Title. Retrieved from URL
Citation:
Dickens, C. (1910). A tale of two cities. Retrieved from
http://books.google.com/books?id=Pm0AAAAAYAAJ
Format:
Author Last, First Initial. (Year). Chapter title. In First Initial Last Name & First Initial Last Name
(Eds.), Book title (pp. start page-end page). doi: alphanumeric string
Note: Use a doi number if available. If a number is not available, do not provide retrieval
information for book chapters.
Citation:
Shun, I. (1998). The invention of the martial arts: Kanō Jigorō and Kōdōkan judo. In S.
Vlastos (Ed.), Mirror of modernity: Invented traditions of modern Japan (pp. 163-173).
Format:
Author Last, First Initial. (Year). Title. Retrieved from database name. (accession number if
available)
Citation:
Houck, A.M. If God is God: Laughter and the divine in ancient Greek and modern Christian
literature. Retrieved from ProQuest Digital Dissertations. (AAI9990560)
Format:
Author Last, First Initial. (Year, Month Day of Pub). Title [Format of defense] (Dissertation
defense, University Name). Name. Retrieved from URL
Citation:
Boardman, R. (2004, September 24). Improving tool support for personal information
management [PowerPoint slides](Dissertation defense, Imperial College of London Department of
Electrical and Electronic Engineering). Retrieved from http://www.slideshare.net/rick/phd-
defenseimproving-tool-support-for-personal-information-management/
[d] Abstracts
[d1] Abstract as Original Source
Format:
Author Last, First Initial. (Year, Month Day of Pub). Title. [Abstract]. Retrieved from name of
database.
Note: If a publication number is assigned to the abstract, it may be included in parentheses after
the title.
Citation:
Berman, L.M., & Letellier, B. (1996). Pharaohs: Treasures of Egyptian art from
the Louvre (AEB 1996.0572) [Abstract]. Retrieved from Annual Egyptological Bibliography
database.
Format:
Author Last, First Initial. Title of Article. (Year, Month Day). Title of abstract. In First Initial Last
Name of authority (Title of Authority), Title of Meeting, Symposium, or Poster Session. Type of
meeting conducted at the name of sponsoring meeting or conference. Abstract retrieved from
URL
Citation:
Miller, C. (2007, June 25). Preserving soil survey data with GIS. In J.J. Meier (Web Editor),
Issues and trends in digital repositories of non-textual information: Support for research and
teaching. Poster session conducted at the ACRL Science and Technology Section conference.
Abstract retrieved from http://www.ala.org/ala/acrl/aboutacrl/acrlsections/sciencetech/
stsconferences/posters07.cfm#poster5
Format:
Author Last, First Initial. (Year). Article title. Journal Title volume #(issue number if available),
start page-end page. Abstract retrieved from secondary source
name.
Citation:
Chung, D.S., & Kim, S. (2008). Blogging activity among cancer patients and their companions:
Uses, gratifications, and predictors of outcomes. Journal of the American Society for Information
Science and Technology 59(2), 297-306. Abstract retrieved from Wiley InterScience database.
[e] Bibliographies
Format:
Author Last, First Initial. (Year of Pub). Title. Retrieved from Name of Source:URL.
Citation:
de Zepetnek, S.T., Nielsen, W.C., & Aoun, S. (n.d.). Selected bibliography for
work in comparative cultural studies (history, theory, method). Retrieved from CLCWeb:
Comparative Literature and Culture: http://clcwebjournal.lib.purdue.edu/library/
comparativeculturalstudies(biblio).html
Format:
Last Name, First Initials of Author. (Year of Pub). Title of course [Bibliography]. Retrieved from
Name of University and Courseware Product/Site: URL.
Citation:
Johnston, S.L. (2004). French resources on the web [Bibliography]. Retrieved from Trinity
University BlackBoard site: http://bb.trinity.edu
Format:
Author Last, First Initial. (Year of Publication). Title of review [Review of the book Title of book].
Journal Title, volume #(issue # if available), inclusive page numbers or location on the web page.
Retrieved from URL
Citation:
Ferrer, H. (2006). The case of the disappearing genres [Review of the book Best American
mystery stories 2005]. American Book Review, 27(4), 8-9. Retrieved from
http://americanbookreview.org
Format:
Author Last, First Initial. (Year of Pub). Title of commentary [Peer commentary on the journal
article “Title of article”]. Retrieved Month Day, Year, from URL
Note: If the title of the item under review is clear from the title of the review, then the bracketed
explanation is not necessary.
Citation:
Bizzell, P., & Herzberg, B. (1988). A response to Kathleen E. Welch [Peer commentary on the
journal article “A critique of classical rhetoric: The contemporary appropriation of ancient
discourse”]. Rhetoric Review 6(2): 246. Retrieved from http://www.jstor.org/stable/465942
Format:
Author Last, First Initial. (Year of Publication). Title. Retrieved from host web site name: URL.
Citation:
National Park Service, U.S. Department of the Interior. (2007). Imagining the corps of discovery:
Visual art of and about the Lewis and Clark expedition. Retrieved from Lewis and Clark Journey
of Discovery web site: http://www.nps.gov/archive/jeff/LewisClark2/Education/VisualArt/
VisualArtLessonPlan.htm
Format:
Author Last, First Initial. (Year of Pub). Title [format of notes]. Retrieved from host web site name:
URL.
Citation:
Johannesson, C. (2008). The mole [PowerPoint slides]. Retrieved from Communication Arts High
School web site: http://www.nisd.net/communicationsarts/pages/chem/ppt/molarconv_pres.ppt
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title. [Format of data].
Available from Name of source site: URL
Citation:
Chris Bell U.S. Congress Committee. (2004). FEC-116877 Form F3 [Data file]. Retrieved from
Federal Election Commission web site: http://www.fec.org/finance/disclosure/efile_search.shtml
Format:
Author Last, First Initial or Corporate Author Name. (Year of Pub). [Description of graphic
representation of data]. Title of source. Retrieved from URL
Citation:
Sullivan, R.D. (2007). [Map depicting 10 different political regions in the United States for the
2008 election year]. Beyond red & blue. Retrieved from
http://massinc.typepad.com/beyondredandblue/2007/09/beyond-red-blue.html
Format:
Author Last, First Initial (Responsibility) and Subject Last, First Initial (Responsibility).
(Year of Publication). Title of collected data [Format of data]. Retrieved from Name of web site:
URL
Note: Interviews conducted one-on-one that have not been preserved in print or other formats
should be cited in text as a personal communication. Data that cannot be retrieved is not included
in the list of references.
Citation:
Quintard, T. (Interviewer) & Monroe, E. (Interviewee). (1974). Ethel Monroe, April
5, 1974 [Audio file]. Retrieved from Black Oral History web site: http://www.wsulibs.
wsu.edu/holland/masc/xblackoralhistory.html
Format:
Author Last, First Initial. (Year of Publication). Title of entry. In First Initial Last Name
of editor (Ed.), Title. Retrieved Month Day, Year, from URL of index or main page of encyclopedia
Note: If no author is listed for an entry, include the title of the entry first in the citation.
Citation:
Kania, A. (2007). Philosophy of music. In E.N. Zalta (Ed.), The Stanford encyclopedia of
philosophy. Retrieved from http://plato.stanford.edu/entries/music/
Format:
Title of entry. (Year of Publication). In Title of dictionary. Retrieved Month Day, Year,
from URL of index or main page of dictionary
Citation:
German shepherd. (n.d.). In Merriam-Webster’s collegiate dictionary. Retrieved from http://
www.britannica.com/dictionary?book=Dictionary
Format:
Author Last, First Initial. (Year of Publication). Title of entry. In First Initial Last Name
of editor (Ed.), Title. Retrieved Month Day, Year, from URL of index or main page of handbook
Note: If no author is listed for an entry, include the title of the entry first in the citation.
Citation:
Wallace, E. (n.d.). Fort Stockton, TX. In R.R. Barkley (Ed.), The Handbook of
Texas online. Retrieved from http://www.tshaonline.org/handbook/online/index.html
[i4] Wiki
Format:
Title of entry. (n.d.). Retrieved Month Day, Year, from Title of Wiki: URL
Note: Wikis are collaboratively authored, rarely signed, and always changing. Therefore, author
and publication date are not required.
Citation:
Judo. (n.d.). Retrieved August 29, 2007, from Wikipedia: http://en.wikipedia.org/wiki/Judo
[j1] Annual Report
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title. Retrieved from
URL
Citation:
Honda Motor Co. (2004). Annual report. Retrieved from
http://world.honda.com/investors/annualreport/2004/07.html
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
[Fact sheet]. Retrieved from URL
Citation:
Boy Scouts of America. (n.d.). Merit badge program [Fact sheet]. Retrieved from http://www.
scouting.org/factsheets/02-500.html
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
[Brochure]. Retrieved from URL
Citation:
First Five Oral Health. (2005). Healthy teeth begin at birth [Brochure]. Retrieved from
http://www.first5oralhealth.org/page.asp?page_id=439
Format:
Author Last, First Initial or Corporate Author Name (Responsibility of Author).
(Year of Publication). Title [Format of announcement]. Retrieved from URL
Citation:
Lynch, D. (Director). (2007). Clean up New York [Video file]. Retrieved from http://www.
youtube.com/watch?v=ZSWv90msTUc
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title [Format of slides].
Retrieved from URL
Citation:
Rutter, R., & Boulton, M. (2007). Web typography sucks [PowerPoint slides]. Retrieved from
http://webtypography.net/sxsw2007/
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title (Report No. if
available). Retrieved from URL
Citation:
Miller, D.C., Sen, A., & Malley, L.B. (2007). Comparative indicators of education in the United
States and other G8 countries (Report No. GPO ED003826P). Retrieved from
http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2007006
Format:
Author Last, First Initial or Corporate Author Name. (Year, Month Day of Publication). Title [Press
release]. Retrieved from URL
Citation:
Department of Athletics, Trinity University. (2008, January 7). Trinity wins Pontiac game changing
performance of the year award [Press release]. Retrieved from
http://www.trinity.edu/departments/athletics/Football/Pontiac_GCPOY.htm
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
(Policy No. if available). Retrieved from URL
Citation:
Organization for Economic Cooperation and Development. (2007). Climate
change: Meeting the challenge to 2050. Retrieved from http://www.oecd.org/dataoecd/
6/21/39762914.pdf
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title. Retrieved from
URL
Citation:
Texas Education Agency. (1998). Texas essential knowledge and skills for social
studies, subchapter c., high school. Retrieved from http://www.tea.state.tx.us/
rules/tac/chapter113/ch113c.pdf
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title
[White paper]. Retrieved from URL
Citation:
OCLC Online Computer Library Center. (2002). OCLC white paper on the
information habits of college students [White paper]. Retrieved from
http://www5.oclc.org/downloads/community/informationhabits.pdf
[j11] Newsletter Article
Format:
Author Last, First Initial or Corporate Author Name. (Year of Publication). Title of
Article. Title of Newsletter, volume # (issue #). Retrieved from URL
Citation:
Seiden, P. (2006). Library survey evaluates service. @library.edu, the
Swarthmore College library newsletter, 8(2). Retrieved from http://
www.swarthmore.edu/Documents/library/spring06.pdf
Format:
Author Last, First Initial. (Year, Month Day of Publication). Article title. Newspaper Name.
Retrieved from URL
Citation:
Mapes, L.V. (2005, May 25). Unearthing Tse-whit-zen. Seattle Times. Retrieved from
http://seattletimes.nwsource.com
Format:
Author Last, First Initial, & Author Last, First Initial (Author Responsibility). (Year of Pub). Title of
television feature [Motion picture]. In First Initial Last Name (Role of Presenter), Title of program.
Podcast retrieved from Name of host site: URL
Citation:
Schultz, D. (Producer/Writer). (2007). Silence of the bees [Motion picture]. In Kaufman, F.
(Executive Producer), Nature. Podcast retrieved from PBS:
http://www.pbs.org/wnet/nature/rss/podcast.xml
Format:
Author Last, First Initial (Author Responsibility). (Year, Month Day of Publication).
Title of podcast [Podcast identification number if available]. Podcast series Name. Podcast
retrieved from URL
Citation:
Hinze, S. (Host). (2007, December 25). Robots! [Show 440]. Fanboy radio. Podcast retrieved
from http://media.libsyn.com/media/fanboyradio/fbr_440.mp3
Format:
Author Last, First Initial. (Year of Pub). Title of article [Online exclusive]. Title of
Magazine. Retrieved Month Day, Year, from URL
Citation:
Millet, M. (2005). NextGen: Is this the ninth circle of hell? [Online exclusive]. Library Journal.
Retrieved December 7, 2007, from http://www.libraryjournal.com/article/CA509641.html
Format:
Author Last, First Initial. (Year, Month Day of posting). Title of post [post number if available].
Message posted to URL
Citation:
Epstein, P. (2005, November 20). Dice manipulation. Message posted to
http://www.bkgm.com/rgb/rgb.cgi?menu
Format:
Author Last, First Initial. (Year, Month Day of Pub). Title of post [Msg. # if available]. Message
posted to Name of List, archived at URL
Note: Since messages in an e-mail list are posted through email, the URL should direct readers
to the web site or page where the messages have been archived.
Citation:
Bennick, T. (2007, December 28). Speedball press [Msg. #00189]. Message posted to Book Arts
Web electronic mailing list, archived at http://palimpsest.stanford.edu/byform/mailing-
lists/bookarts/#archive
Format:
Author Last, First Initial. (Year, Month Day of Publication). Title. Message posted to URL
Citation:
Rush, W. (2007, July 12). Four stars! Message posted to
http://wilhelminarush.livejournal.com
Format:
Author Last, First Initial. (Year, Month Day of Publication). Title [Format of post]. Video posted to
URL
Note: If the author’s name is not provided, the screen name of the posting author may be used
instead.
Citation:
Rjsivey. (2007, July 27). Narcoleptic Chihuahua [Video file]. Video posted to
http://www.youtube.com/watch?v=XyzeCiW-nn0
[m] Computer Programs, Software, and Programming Languages
Format:
Author Last, First Initial. (Year of Publication). Name of product [format of product]. Available from
Name of source: URL
Citation:
Elizabeth Huth Coates Library, Trinity University. (2007). Coates library toolbar [Software].
Retrieved from Coates Library: http://lib.trinity.edu/toolbar/index.shtml
Conclusion
The Internet and other digital sources of information are widely used tools for research. Because
they are still relatively new, the various disciplines are still deciding how electronic sources
should be documented, and their recommendations change frequently. The situation is still at a
formative stage.
To ensure accuracy, it is always best to consult the style manual and/or accompanying website
for your discipline before consulting other sources. Another way to determine the style you
should use is to ask your research supervisor for guidelines or resources, or to locate the official
website for publications in your discipline and see whether it offers any guidelines or style
manuals.
---------------------------------------------------------------------------------------------------------------------------------
References
02] http://www.dianahacker.com/resdoc/home.html
03] http://owl.english.purdue.edu/owl/section/2/
04] http://www.apastyle.org/apa-style-help.aspx
05] http://www.apastyle.org/learn/index.aspx
06] http://www.apastyle.org/products/index.aspx
07] http://owl.english.purdue.edu/owl/section/2/10/
08] http://www.apastyle.org/
09] http://www.lib.usm.edu/help/style_guides/apa.html
10] http://www.nlight.com/Success/Research/8cite.html
“The proper words in the proper places are the true definition of style.”
<< Jonathan Swift
Appendix IV
Excel for Statistical Data Analysis
MENU
1. Introduction
2. Entering Data
3. Descriptive Statistics
4. Normal Distribution
5. Confidence Interval for the Mean
6. Test of Hypothesis Concerning the Population Mean
7. Difference Between Mean of Two Populations
8. ANOVA: Analysis of Variances
9. Goodness-of-Fit Test for Discrete Random Variables
10. Test of Independence: Contingency Tables
11. Test Hypothesis Concerning the Variance of Two Populations
12. Linear Correlation and Regression Analysis
13. Moving Average and Exponential Smoothing
14. Applications and Numerical Examples
15. E-Labs to Fully Understand Statistical Concepts
16. Interesting and Useful Sites
Introduction
This site provides illustrative experience in the use of Excel for data summary,
presentation, and other basic statistical analysis. I believe Excel is most popular
in the areas where it really can excel: organizing data, i.e. basic data
management, tabulation and graphics. For real statistical analysis one must learn
to use professional commercial statistical packages such as SAS and SPSS.
Microsoft Excel 2000 (version 9) provides a set of data analysis tools called
the Analysis ToolPak which you can use to save steps when you develop
complex statistical analyses. You provide the data and parameters for each
analysis; the tool uses the appropriate statistical macro functions and then
displays the results in an output table. Some tools generate charts in addition to
output tables.
If the Data Analysis command is selectable on the Tools menu, then the
Analysis ToolPak is installed on your system. However, if the Data Analysis
command is not on the Tools menu, you need to install the Analysis ToolPak
by doing the following:
Step 1: On the Tools menu, click Add-Ins.... If Analysis ToolPak is not listed
in the Add-Ins dialog box, click Browse and locate the drive, folder name, and
file name for the Analysis ToolPak Add-in — Analys32.xll — usually located
in the Program Files\Microsoft Office\Office\Library\Analysis folder. Once
you find the file, select it and click OK.
Step 2: If you don't find the Analys32.xll file, then you must install it.
1. Insert your Microsoft Office 2000 Disk 1 into the CD ROM drive.
2. Select Run from the Windows Start menu.
3. Browse and select the drive for your CD. Select Setup.exe, click
Open, and click OK.
4. Click the Add or Remove Features button.
5. Click the + next to Microsoft Excel for Windows.
6. Click the + next to Add-ins.
7. Click the down arrow next to Analysis ToolPak.
8. Select Run from My Computer.
9. Select the Update Now button.
10. Excel will now update your system to include Analysis ToolPak.
11. Launch Excel.
12. On the Tools menu, click Add-Ins... and select the Analysis
ToolPak check box.
Step 3: The Analysis ToolPak Add-In is now installed and Data Analysis... will
now be selectable on the Tools menu.
Excel is available on all public-access PCs (e.g., those in the Library and
PC Labs). It can be opened either by selecting Start - Programs - Microsoft
Excel or by clicking on the Excel shortcut, which is on the desktop of any PC
or on the Office toolbar.
Opening a Document:
Click on File-Open (Ctrl+O) to open/retrieve an existing
workbook; change the directory area or drive to look for files in
other locations
To create a new workbook, click on File-New-Blank Document.
To save your document with its current filename, location and file format,
click on File - Save. If you are saving for the first time, click File - Save;
choose/type a name for your document; then click OK. Use File - Save As if
you want to save to a different filename/location.
When you have finished working on a document you should close it. Go to the
File menu and click on Close. If you have made any changes since the file was
last saved, you will be asked if you wish to save them.
Your work is stored in an Excel file called a workbook. Each workbook may
contain several worksheets and/or charts - the current worksheet is called the
active sheet. To view a different worksheet in a workbook click the appropriate
Sheet Tab.
You can access and execute commands directly from the main menu or you can
point to one of the toolbar buttons (the display box that appears below the
button, when you place the cursor over it, indicates the name/action of the
button) and click once.
To move from one worksheet to another click the sheet tabs. (If your workbook
contains many sheets, right-click the tab scrolling buttons then click the sheet
you want.) The name of the active sheet is shown in bold.
To move between cells on a worksheet, click any cell or use the arrow keys. To
see a different area of the sheet, use the scroll bars and click on the arrows or
the area above/below the scroll box in either the vertical or horizontal scroll
bars.
Note that the size of a scroll box indicates the proportional amount of the used
area of the sheet that is visible in the window. The position of a scroll box
indicates the relative location of the visible area within the worksheet.
Entering Data
A new worksheet is a grid of rows and columns. The rows are labeled with
numbers, and the columns are labeled with letters. Each intersection of a row
and a column is a cell. Each cell has an address, which is the column letter and
the row number. The arrow on the worksheet to the right points to cell A1,
which is currently highlighted, indicating that it is an active cell. A cell must
be active to enter information into it. To highlight (select) a cell, click on it.
Click on a cell (e.g. A1), then hold the shift key while you click
on another (e.g. D4) to select all cells between and including A1
and D4.
Click on a cell (e.g. A1) and drag the mouse across the desired
range, unclicking on another cell (e.g. D4) to select all cells
between and including A1 and D4.
To select several cells which are not adjacent, press "control" and
click on the cells you want to select. Click a number or letter
labeling a row or column to select that entire row or column.
One worksheet can have up to 256 columns and 65,536 rows, so it'll be a while
before you run out of space.
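The addressing scheme described above (column letter followed by row number) can be sketched in a few lines of Python. This is an illustration only; the helper names `column_letter` and `cell_address` are invented for this example and are not Excel functions.

```python
def column_letter(index):
    """Convert a 1-based column number to its letter label (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        # Shift to 0-based so that 26 maps to "Z" rather than rolling over
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(column, row):
    """Join the column letter and the row number into an address such as D4."""
    return f"{column_letter(column)}{row}"


print(cell_address(1, 1))        # A1
print(cell_address(4, 4))        # D4
print(cell_address(256, 65536))  # IV65536, the last cell of a 256 x 65,536 sheet
```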
To enter information into a cell, select the cell and begin typing.
Note that as you type information into the cell, the information you enter also
displays in the formula bar. You can also enter information into the formula
bar, and the information will appear in the selected cell.
Press "Enter" to move to the next cell below (in this case, A2)
Press "Tab" to move to the next cell to the right (in this case, B1)
Click in any cell to select it
Entering Labels
If you are creating a long worksheet and you will be repeating the same label
information in many different cells, you can use the AutoComplete function.
This function will look at other entries in the same column and attempt to
match a previous entry with your current entry. For example, if you have
already typed "Wesleyan" in another cell and you type "W" in a new cell, Excel
will automatically enter "Wesleyan." If you intended to type "Wesleyan" into
the cell, your task is done, and you can move on to the next cell. If you
intended to type something else, e.g. "Williams," into the cell, just continue
typing to enter the term.
To turn on the AutoComplete function, click on "Tools" in the menu bar, then
select "Options," then select "Edit," and click to put a check in the box beside
"Enable AutoComplete for cell values."
Another way to quickly enter repeated labels is to use the Pick List feature.
Right click on a cell, then select "Pick From List." This will give you a menu of
all other entries in cells in that column. Click on an item in the menu to enter it
into the currently selected cell.
Entering Values
Dates are stored as MM/DD/YYYY, but you do not have to enter them precisely in
that format. If you enter "jan 9" or "jan-9", Excel will recognize it as January 9
of the current year and store it as, for example, 1/9/2002 when the current year
is 2002. Enter the four-digit year for a year other than the current year (e.g.
"jan 9, 1999"). To enter the current day's date, press "control" and ";" at the
same time.
Times default to a 24 hour clock. Use "a" or "p" to indicate "am" or "pm" if
you use a 12 hour clock (e.g. "8:30 p" is interpreted as 8:30 PM). To enter the
current time, press "control" and ":" (shift-semicolon) at the same time.
1. Select a cell in the region, and press Ctrl+Shift+* (in Excel 2003,
press this or Ctrl+A) to select the Current Region.
2. From the Format menu, select Conditional Formatting.
3. In Condition 1, select Formula Is, and type =MAX($F:$F) =$F1.
4. Click Format, select the Font tab, select a color, and then click
OK.
5. In Condition 2, select Formula Is, and type =MIN($F:$F) =$F1.
6. Repeat step 4, select a different color than you selected for
Condition 1, and then click OK.
Solution: Use the IF, MOD, and ROUND functions in the following formula:
=IF(MOD(A2,1)=0.5,A2,ROUND(A2,0))
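As written, the formula leaves a value untouched when its fractional part is exactly 0.5 and rounds it to the nearest whole number otherwise. A minimal Python sketch of the same logic follows; the function name `round_unless_half` is an assumption made for this illustration.

```python
def round_unless_half(value):
    """Mirror =IF(MOD(A2,1)=0.5, A2, ROUND(A2,0)) for a single number."""
    # MOD(A2,1) is the fractional part; keep values that end in exactly .5
    if value % 1 == 0.5:
        return value
    # Every other value is rounded to the nearest whole number
    return round(value)


for number in (2.5, 3.7, 3.2, 6.0):
    print(number, "->", round_unless_half(number))
```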
1. Select the cells in the sheet by pressing Ctrl+A (in Excel 2003,
select a cell in a blank area before pressing Ctrl+A, or from a
selected cell in a Current Region/List range, press Ctrl+A+A).
OR
Click Select All at the top-left intersection of rows and columns.
2. Press Ctrl+C.
3. Press Ctrl+Page Down to select another sheet, then select cell A1.
4. Press Enter.
Copying the entire sheet means copying the cells, the page setup parameters,
and the defined range Names.
Option 1:
Option 2:
location in the current workbook or to a different workbook. Be
sure to mark the Create a copy checkbox.
Option 3:
Sorting by Columns
Descriptive Statistics
The Data Analysis ToolPak has a Descriptive Statistics tool that provides you
with an easy way to calculate summary statistics for a set of sample data.
Summary statistics include Mean, Standard Error, Median, Mode, Standard
Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum,
and Count. This tool eliminates the need to type individual functions to find
each of these results. Excel includes elaborate and customisable toolbars, for
example the "standard" toolbar shown here:
Excel can be used to generate measures of location and variability for a
variable. Suppose we wish to find descriptive statistics for a sample data: 2, 4,
6, and 8.
Step 1. Select the Tools pull-down menu. If you see Data Analysis, click on
this option; otherwise, click on the Add-Ins... option to install the Analysis
ToolPak.
Enter A1:A4 in the Input Range box. A1 is the value in column A and row 1,
which in this case is 2; the remaining values are entered in the same way down
to the last one, A4. If a sample consists of 20 numbers in cells A1 to A20,
enter A1:A20 as the input range.
Step 5. Select an output range, in this case B1. Click on Summary Statistics to
see the results.
Select OK.
When you click OK, you will see the result in the selected range.
As you will see, the mean of the sample is 5, the median is 5, the standard
deviation is 2.581989, the sample variance is 6.666667, the range is 6, and so on.
Each of these quantities might be important in your calculations
for different statistical procedures.
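The figures reported for the sample 2, 4, 6 and 8 can be cross-checked outside Excel. The sketch below uses Python's standard `statistics` module as an independent check; it is not a reproduction of what the Analysis ToolPak does internally, and the variable names are chosen only for this illustration.

```python
import statistics
from math import sqrt

sample = [2, 4, 6, 8]

mean = statistics.mean(sample)
median = statistics.median(sample)
variance = statistics.variance(sample)     # sample variance, divides by n - 1
std_dev = statistics.stdev(sample)         # square root of the sample variance
std_error = std_dev / sqrt(len(sample))    # standard error of the mean
data_range = max(sample) - min(sample)

print("Mean:", mean)
print("Median:", median)
print("Sample variance:", round(variance, 6))
print("Standard deviation:", round(std_dev, 6))
print("Standard error:", round(std_error, 6))
print("Range:", data_range)
```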
Normal Distribution
Consider the problem of finding the probability of getting less than a certain
value under any normal probability distribution. As an illustrative example, let
us suppose the SAT scores nationwide are normally distributed with a mean
and standard deviation of 500 and 100, respectively. Answer the following
questions based on the given information:
A: What is the probability that a randomly selected student score will be less
than 600 points?
B: What is the probability that a randomly selected student score will exceed
600 points?
C: What is the probability that a randomly selected student score will be
between 400 and 600?
Hint: Using Excel you can find the probability of getting a value approximately
less than or equal to a given value. In a problem, when the mean and the
standard deviation of the population are given, you have to use common sense
to find different probabilities based on the question since you know the area
under a normal curve is 1.
Solution:
Step 1. In the worksheet, select the cell where you want the answer to appear.
Suppose you chose the first cell, A1.
Steps 2-3. From the menus, select Insert, then click on the Function option.
Step 4. After clicking on the Function option, the Paste Function dialog appears.
From the Function Category box, choose Statistical, then choose NORMDIST from
the Function Name box; click OK.
Step 5. Enter 600 as x, 500 as the mean, 100 as the standard deviation and TRUE
as the cumulative argument, i.e. =NORMDIST(600,500,100,TRUE); click OK.
As you see, the value 0.84134474 appears in A1, indicating the probability that
a randomly selected student's score is below 600 points. Using common sense
we can answer part "B" by subtracting 0.84134474 from 1. So the part "B"
answer is 1 - 0.84134474, or 0.158655. This is the probability that a randomly
selected student's score is greater than 600 points. To answer part "C", use the
same technique to find the probabilities, or areas, to the left of the values 600
and 400. Since these areas overlap each other, subtract the smaller probability
from the larger probability to answer the question.
The answer equals 0.84134474 - 0.15865526, that is, 0.68269.
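The three SAT probabilities can also be verified with a short Python sketch. It relies on `statistics.NormalDist` from the standard library (Python 3.8 or later) rather than Excel's NORMDIST, and the variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

# SAT scores assumed normal with mean 500 and standard deviation 100
sat = NormalDist(mu=500, sigma=100)

p_below_600 = sat.cdf(600)                  # part A: P(X < 600)
p_above_600 = 1 - p_below_600               # part B: P(X > 600)
p_between = sat.cdf(600) - sat.cdf(400)     # part C: P(400 < X < 600)

print(f"A: {p_below_600:.4f}")   # about 0.8413
print(f"B: {p_above_600:.4f}")   # about 0.1587
print(f"C: {p_between:.4f}")     # about 0.6827
```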
Inverse Case
Calculating the value of a random variable, often called the "x" value
You can use NORMINV from the function box to calculate a value for the
random variable if the probability to the left of this value is given.
In practice, you use this function to calculate percentiles. In this
problem one could ask: what is the score of a student whose percentile is 90?
This means approximately 90% of students' scores are less than this number. If
we were asked to do this problem by hand, we would have to calculate the x
value using the normal distribution formula \( x = \mu + z\sigma \), where
\( \mu \) is the mean, \( \sigma \) is the standard deviation and z is the
standard normal value for the given probability. Now let's use Excel to
calculate P90. In the Paste Function dialog, click on Statistical, then click on
NORMINV.
The formula result shown at the bottom of the dialog is approximately
628 points. This means the top 10% of the students scored better than 628.
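The 90th percentile can be checked in Python as well. The sketch below uses `NormalDist.inv_cdf` from the standard library as a stand-in for NORMINV and also repeats the by-hand calculation of the mean plus z times the standard deviation; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)

# Direct lookup: the score with 90% of the distribution to its left
p90 = sat.inv_cdf(0.90)

# By hand: mean + z * standard deviation, z taken from the standard normal
z = NormalDist().inv_cdf(0.90)
p90_by_hand = 500 + z * 100

print(f"z for the 90th percentile: {z:.4f}")
print(f"P90 score: {p90:.1f}")
```

Both routes give a score of roughly 628 points.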
Confidence Interval for the Mean
Suppose we wish to estimate a confidence interval for the mean of a
population. Depending on the size of your sample, you may use one of the
following cases:
Large Sample Size (n larger than, say, 30): The general formula for developing
a confidence interval for a population mean is:
\[ \bar{x} \pm Z \frac{S}{\sqrt{n}} \]
In this formula, \( \bar{x} \) is the mean of the sample; Z is the interval
coefficient, which can be found from the normal distribution table (for example,
the interval coefficient for a 95% confidence level is 1.96); S is the standard
deviation of the sample; and n is the sample size.
Now we would like to show how Excel is used to develop a confidence
interval for a population mean based on sample information. As you see, in
order to evaluate this formula you need the mean of the sample, \( \bar{x} \),
and the margin of error, \( Z \frac{S}{\sqrt{n}} \). Excel will automatically
calculate these quantities for you.
The only things you have to do are: add the margin of error to the mean of the
sample to find the upper limit of the interval, and subtract the margin of error
from the mean to find the lower limit of the interval. To demonstrate how Excel
finds these quantities we will use a data set that contains the hourly incomes of
36 work-study students here at the University of Baltimore. These numbers appear
in cells A1 to A36 on an Excel worksheet.
After entering the data, select Tools, then Data Analysis, choose Descriptive
Statistics, then click OK.
On the Descriptive Statistics dialog, enter A1:A36 in the Input Range box and
click on Summary Statistics. After you have done that, click on Confidence
Level for Mean and type 95%, or in other problems whatever confidence level
you desire. In the Output Range box enter B1 or whatever location you desire.
Now click on OK.
As you see, the spreadsheet shows that the mean of the sample is
6.902777778 and the absolute value of the margin of error is
0.231678109. This mean is based on this sample information. A 95%
confidence interval for the hourly income of the UB work-study students has an
upper limit of 6.902777778 + 0.231678109 and a lower limit of 6.902777778 -
0.231678109.
In other words, of all the intervals formed this way, 95% contain the mean of
the population. Or, for practical purposes, we can be 95% confident that the
mean of the population is between 6.902777778 - 0.231678109 and
6.902777778 + 0.231678109. We can be at least 95% confident that the interval
[$6.67, $7.13] contains the average hourly income of a work-study student.
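The large-sample procedure (sample mean plus or minus the interval coefficient times the standard deviation over the square root of the sample size) can be sketched in Python. The function name `large_sample_ci` and the income figures below are hypothetical, since the original 36 observations are not reproduced here. The sketch takes its coefficient from the normal distribution, so it need not agree with Excel's output to the last digit.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def large_sample_ci(data, confidence=0.95):
    """Return (lower, upper) limits of a z-based confidence interval for the mean."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation
    # Interval coefficient, e.g. about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical hourly incomes for 36 students (not the data set from the text)
incomes = [6.5, 7.0, 7.25, 6.75, 7.5, 6.25] * 6

lower, upper = large_sample_ci(incomes)
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```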
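Small Sample Size (say, less than 30): If the sample size n is less than 30, we
must use the small sample procedure to develop a confidence interval for the
mean of a population. The general formula for developing confidence intervals
for the population mean based on a small sample is:
\[ \bar{x} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} \]
where t is the coefficient from the t distribution with n - 1 degrees of freedom.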
Now you would like to see how Excel is used to develop a certain confidence
interval of a population mean based on this small sample information.
As you see, to evaluate this formula you need "the mean of the sample" and
the margin of error Excel will automatically calculate these
quantities the way it did for large samples.
Again, the only things you have to do are: add the margin of
error to the mean of the sample, , find the upper limit of the
interval and to subtract the margin of error from the mean to find the lower
limit of the interval.
To demonstrate how Excel finds these quantities we will use a data set that
contains the hourly incomes of 10 work-study students here at the University
of Baltimore. These numbers appear in cells A1 to A10 on an Excel worksheet.
After entering the data we follow the Descriptive Statistics procedure to
calculate the unknown quantities (exactly the way we found them for the large
sample). Here are the procedures in step-by-step form:
Now, as in the large-sample case, calculate the confidence interval for the
population mean based on this small-sample information. The confidence
interval is:
6.8 ± 0.414426102
or
$6.39 to $7.21.
We can be 90% confident that the interval [$6.39, $7.21] contains the true
mean of the population.
Again, we must distinguish two cases with respect to the size of your sample.
Large Sample Size (say, over 30): In this section you wish to know how Excel
can be used to conduct a hypothesis test about a population mean. We will use
the hourly incomes of different work-study students than those introduced
earlier in the confidence interval section. Data are entered in cells A1 to
A36. The objective is to test the following null and alternative hypotheses:
H0: μ = 7
Ha: μ ≠ 7
The null hypothesis indicates that the average hourly income of a work-study
student is equal to $7 per hour; the alternative hypothesis indicates that
the average hourly income is not equal to $7 per hour.
I will repeat the steps taken in descriptive statistics and at the very end
will show how to find the value of the test statistic, in this case z, using
a cell formula.
Step 3. Click on Data Analysis, then choose the Descriptive Statistics option
and click OK.
On the Descriptive Statistics dialog, click on Summary statistics. Select
the Output Range box and enter B1 or whichever location you desire. Now
click OK.
(To calculate the value of the test statistic, look for the mean of the
sample and then the standard error. In this output, these values are in cells
C3 and C4.)
Step 4. Select cell D1 and enter the cell formula =(C3 - 7)/C4. The screen
shot should look like the following:
The value in cell D1 is the value of the test statistic. Since this value
falls in the acceptance range of -1.96 to 1.96 (from the normal distribution
table), we fail to reject the null hypothesis.
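The cell formula and the decision rule can be mirrored outside Excel. The sketch below is illustrative only: the function name is hypothetical, and the sample mean (6.9) and standard error (0.12) are made-up stand-ins for cells C3 and C4, whose actual values are not reproduced in the text.

```python
from statistics import NormalDist

def one_sample_z_test(sample_mean, standard_error, mu0, alpha=0.05):
    """Two-tail z test of H0: mu = mu0.

    Returns the test statistic, the critical value and the decision.
    """
    z = (sample_mean - mu0) / standard_error        # same as =(C3 - 7)/C4
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z, critical, abs(z) > critical

# Hypothetical values standing in for cells C3 (mean) and C4 (standard error).
z, critical, reject = one_sample_z_test(6.9, 0.12, 7)
print(f"z = {z:.4f}, critical = {critical:.4f}, reject H0: {reject}")
```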
Using the steps taken in the large-sample case, Excel can be used to conduct
a hypothesis test for the small-sample case. Let's use the hourly income of
10 work-study students at UB to test the following hypothesis.
I will repeat the steps taken in descriptive statistics and at the very end
will show how to find the value of the test statistic, in this case t, using
a cell formula.
Step 4. Select cell D1 and enter the cell formula =(C3 - 7)/C4. The screen
shot would look like the following:
Since the value of the test statistic t = -0.66896 falls in the acceptance
range -2.262 to +2.262 (from the t table, where α/2 = 0.025 and the degrees
of freedom is 9), we fail to reject the null hypothesis.
In this section we will show how Excel is used to conduct a hypothesis test
about the difference between two population means, assuming that the
populations have equal variances. The data in this case are taken from
various offices here at the University of Baltimore. I collected the hourly
income data of 36 randomly selected work-study students and 36 student
assistants. The hourly income range for work-study students was $6 - $8,
while the hourly income range for student assistants was $6 - $9. The main
objective in this hypothesis test is to see whether there is a significant
difference between the means of the two populations. The null hypothesis is
that the means are equal and the alternative hypothesis is that the means are
not equal.
Data for Work Study Student: 6, 6, 6, 6, 6, 6, 6, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 7,
7, 7, 7, 7, 7, 7, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 8, 8, 8, 8, 8, 8, 8, 8, 8.
Use the Descriptive Statistics procedure to calculate the variances of the
two samples. The Excel procedure for testing the difference between the two
population means requires information on the variances of the two
populations. Since the variances of the two populations are unknown, they
should be replaced with the sample variances. The descriptive statistics for
both samples show that the variance of the first sample is s1² = 0.55546218,
while the variance of the second sample is s2² = 0.969748.
To conduct the desired hypothesis test with Excel the following steps can be
taken:
Step 1. From the menus select Tools then click on the Data Analysis option.
Step 3. When the z-Test: Two Sample for means dialog box appears:
The value of the test statistic z = -1.9845824 appears, in our case, in cell
D24. The rejection rule for this test is z < -1.96 or z > 1.96 from the
normal distribution table. In the Excel output these values for a two-tail
test are z < -1.959961082 and z > +1.959961082. Since the value of the test
statistic z = -1.9845824 is less than -1.959961082, we reject the null
hypothesis. We can also draw this conclusion by comparing the p-value for a
two-tail test with the alpha value. Since the p-value 0.047190813 is less
than α = 0.05, we reject the null hypothesis. Overall we can say, based on
the sample results, that the two populations' means are different.
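As a cross-check on the Excel output, the two-sample z statistic and its two-tail p-value can be computed directly. This is a sketch with hypothetical function names; the sample means are not reproduced in the text, so only the reported z value is fed to the p-value function.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, var1, n1, mean2, var2, n2):
    """z statistic for H0: mu1 = mu2, with sample variances standing in
    for the unknown population variances (large samples)."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

def two_tail_p_value(z):
    """Two-tail p-value of a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# p-value for the z statistic reported in the Excel output.
p_value = two_tail_p_value(-1.9845824)
print(f"p-value = {p_value:.6f}, reject at 0.05: {p_value < 0.05}")
```

The result should agree with the reported p-value of 0.047190813 up to rounding.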
In this section we will show how Excel is used to conduct a hypothesis test
about the difference between two population means, given that the populations
have equal variances and that two small independent samples are taken from
the populations. As in the case above, the data are taken from various
offices here at the University of Baltimore. I collected hourly income data
of 11 randomly selected work-study students and 11 randomly selected student
assistants. The hourly income ranges for the two groups were similar, $6 - $8
and $6 - $9. The main objective in this hypothesis test is similar too: to
see whether there is a significant difference between the means of the two
populations. The null hypothesis is that the means are equal and the
alternative hypothesis is that they are not equal.
As in the previous case, but with a slightly different step 2, the following
steps can be taken to conduct the desired hypothesis test with Excel:
Step 1. From the menus select Tools then click on the Data Analysis option.
Step 3 When the t-Test: Two Sample Assuming Equal Variances dialog box
appears:
Enter A1:A12 in the variable 1 range box (work-study student hourly income)
Enter B1:B12 in the variable 2 range box (student assistant hourly income)
Enter 0 in the Hypothesized Mean Difference box (if you desire to test a mean
difference other than zero, enter that value), then select Labels
Enter 0.05, or whatever level of significance you desire, in the Alpha box
Select a suitable Output Range for the results (I chose C1), then click OK.
The value of the test statistic t = -1.362229828 appears, in our case, in
cell D10. The rejection rule for this test is t < -2.086 or t > +2.086 from
the t distribution table, where the t value is based on a t distribution with
n1 + n2 - 2 degrees of freedom and where the area of the upper tail is 0.025
(that is, alpha/2).
In the Excel output the values for a two-tail test are t < -2.085962478 and
t > +2.085962478. Since the value of the test statistic, t = -1.362229828,
lies in the acceptance range between -2.085962478 and +2.085962478, we fail
to reject the null hypothesis.
We can also draw this conclusion by comparing the p-value for a two-tail test
with the alpha value.
Since the p-value 0.188271278 is greater than α = 0.05, again we fail to
reject the null hypothesis.
Overall we can say, based on the sample results, that there is not sufficient
evidence that the two populations' means are different.
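The pooled-variance t statistic that Excel reports can be sketched as follows. The function name and the two small samples are hypothetical (the original data columns are not reproduced in the text); the critical value 2.086 is the table value quoted above for 20 degrees of freedom.

```python
import math
import statistics

def pooled_t_statistic(sample1, sample2):
    """t statistic and degrees of freedom for H0: mu1 = mu2,
    assuming the two populations have equal variances."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / standard_error
    return t, df

# Hypothetical samples, used only to exercise the function.
t_demo, df_demo = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(f"demo: t = {t_demo:.4f} with {df_demo} degrees of freedom")

# Decision for the statistic reported above, using the table value 2.086.
reported_t, critical_t = -1.362229828, 2.086
reject = abs(reported_t) > critical_t
print("reject H0" if reject else "fail to reject H0")
```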
ANOVA: Analysis of Variances
In this section the objective is to see whether or not the means of three or
more populations, based on random samples taken from those populations, are
equal. Assuming independent samples are taken from normally distributed
populations with equal variances, Excel will do this analysis if you choose
Anova: Single Factor from the menus. We can also choose the Anova: Two-Factor
With Replication or Without Replication option and see whether there is a
significant difference between means when different factors are involved.
Enter data in an Excel worksheet starting with cell A2 and ending with cell
C8. The following steps should be taken to find the proper output for
interpretation.
Step 1. From the menus select Tools and click on the Data Analysis option.
Step 2. When the Data Analysis dialog appears, choose the Anova: Single
Factor option; enter A2:C8 in the input range box. Select Labels in first row.
Step 3. Select any cell as output (here we selected A11). Click OK.
Suppose the test is done at a level of significance of α = 0.05. We reject
the null hypothesis, which means there is a significant difference between
the mean hourly incomes of student assistants in these departments.
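The F statistic in the single-factor ANOVA output comes from comparing between-group and within-group variation. A minimal sketch, with a hypothetical function name and made-up groups (the worksheet data in A2:C8 are not reproduced in the text):

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within

# Hypothetical data for three departments.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

The computed F is then compared with the F-table critical value (or Excel's reported p-value is compared with α) to reach the decision.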
In this section, the study involves six students who were offered different
hourly wages in three different department services here at the University of
Baltimore. The objective is to see whether the hourly incomes are the same.
Therefore, we can consider the following:
Factor: Department
Blocks: Each student is a block, since each student has worked in the three
different departments
To find the Excel output for the above data the following steps can be
taken:
Step 1. From the menus select Tools and click on the Data Analysis option.
Step 2. When the Data Analysis box appears, select Anova: Two-Factor Without
Replication, then enter A2:D8 in the input range. Select Labels in first row.
Step 3. Select an output range (here we selected A11), then click OK.
Excel output (partially recovered):
SUMMARY   Count   Sum   Average    Variance
2         3       21    7          1
5         3       23    7.666667   2.333333
6         3       22    7.333333   1.333333
ANOVA
Total: SS = 21.06944, df = 17
Conclusion: There is not sufficient evidence to conclude that hourly rates
differ for the three departments.
Referring to the student assistant and work-study hourly wages here at the
University of Baltimore, the following data show the hourly wages for the two
categories in three different departments:
Factors
Factor A: Student job category (here two different job categories exist)
Interaction:
ANOVA
Total 4.245 17
Conclusion:
Mean hourly income differs by job category.
Mean hourly income differs by department.
Interaction is not significant.
In this example the objective is to see whether or not, based on information
from a randomly selected sample, the standards set for a population are met.
There are many practical examples of this situation. For example, assume the
guidelines for hiring people of different ethnic backgrounds for the US
government are set at 70% (White), 20% (African American) and 10% (others),
respectively. A randomly selected sample of 1000 US employees shows the
following results, which are summarized in a table.
As you see, the observed sample numbers for groups two and three are lower
than their expected values, unlike group one, whose observed number is higher
than expected. Is this a clear sign of discrimination with respect to ethnic
background? Well, it depends on how much lower the observed values are. The
difference might not be statistically significant. To see whether these
differences are significant we can use Excel and find the value of the
CHI-SQUARE. If this value falls within the acceptance region we can assume
that the guidelines are met; otherwise they are not. Now let's enter these
numbers into an Excel spreadsheet.
We used cells B7-B9 for the expected proportions, C7-C9 for the observed
values and D7-D9 for the expected frequencies. To calculate the expected
frequency for a category, multiply the proportion of that category by the
sample size (here 1000). The formula for the first cell of the expected value
column, D7, is =1000*B7. To find the other entries in the expected value
column, use the copy and paste menu as shown in the following picture.
These are important values for the chi-square test. The observed range in
this case is C7:C9 while the expected range is D7:D9. The null and the
alternative hypotheses for this test are as follows:
H0: The population proportions are PW = 0.70, PA = 0.20 and PO = 0.10
HA: The population proportions are not PW = 0.70, PA = 0.20 and PO = 0.10
Now let's use Excel to calculate the p-value in a CHI-SQUARE test.
Step 1. Select a cell in the worksheet where you would like the p-value of
the CHI-SQUARE to appear. We chose cell D12.
Step 2. From the menus, select Insert, then click on the Function option. The
Paste Function dialog box appears.
Step 3. In the function category box choose Statistical; from the function
name box select CHITEST and click on OK.
As you see, the p-value is 0.002392, which is less than the level of
significance (in this case, α = 0.10). Hence the null hypothesis should be
rejected. This means that, based on the sample information, the guidelines
are not met. Notice that if you type "=CHITEST(C7:C9,D7:D9)" in the formula
bar, the p-value will show up in the designated cell.
NOTE: Excel can also find the value of the CHI-SQUARE. To find this value,
first select an empty cell on the spreadsheet, then in the formula bar type
"=CHIINV(D12,2)". D12 designates the p-value found previously and 2 is the
degrees of freedom (number of rows minus one). The CHI-SQUARE value in this
case is 12.07121. If we refer to the CHI-SQUARE table we will see that the
cutoff is 4.60517; since 12.07121 > 4.60517 we reject the null. The
following screen shot shows you how to find the CHI-SQUARE value.
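The goodness-of-fit calculation behind CHITEST can be sketched directly. Note the assumptions: the observed counts (750, 170, 80) are hypothetical, chosen only because they are consistent with the reported statistic, since the original table is not reproduced here; the function names are also illustrative. The p-value uses the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability is exp(-x/2).

```python
import math

def chi_square_goodness_of_fit(observed, proportions):
    """Return the chi-square statistic and the expected frequencies."""
    total = sum(observed)
    expected = [p * total for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected

def chi_square_p_value_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of
    freedom (closed form, valid only for df = 2)."""
    return math.exp(-statistic / 2)

# Hypothetical observed counts for a sample of 1000 (an assumption).
observed = [750, 170, 80]
statistic, expected = chi_square_goodness_of_fit(observed, [0.70, 0.20, 0.10])
p_value = chi_square_p_value_df2(statistic)
print(f"chi-square = {statistic:.4f}, p-value = {p_value:.6f}")
print("reject H0" if p_value < 0.10 else "fail to reject H0")
```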
The CHI-SQUARE distribution is also used to test whether two variables are
independent or not. For example, based on sample data you might want to see
whether smoking and gender are independent events for a certain population.
The variables of interest in this case are smoking and the gender of an
individual. Another example could involve the age range of an individual and
his or her smoking habit. As in case one, the data may appear in a table, but
unlike case one this table may contain several columns in addition to rows.
The initial table contains the observed values. To find the expected values
for this table we set up another table similar to this one. To find the value
of each cell in the new table we multiply the sum of the cell's column by the
sum of the cell's row and divide the result by the grand total. The grand
total is the total number of observations in a study. Now, based on the
following table, test whether or not smoking habit and gender are independent
in the population from which the sample was taken. In other words, is it true
that males in this population smoke more than females?
You could use the formula bar to calculate the expected values for the
expected range. For example, to find the expected value for cell C5, which is
placed in C11, click on the formula bar, type =C6*D5/D6, and press Enter in
cell C11.
Observed values:
          yes        no         total
male      31         69         100
female    45         122        167
total     76         191        267
Expected values:
male      28.46442   71.53558
female    47.53558   119.4644
When the CHITEST box appears, enter b4:c5 for the actual range, then b10:c11
for the expected range.
Conclusion: Since the p-value is greater than the level of significance
(0.05), we fail to reject the null. This means the sample gives no evidence
that smoking and gender are dependent. Based on the sample information one
cannot assert that females smoke more than males or the other way around.
Step 6. To find the chi-square value, use the CHIINV function. When the
CHIINV box appears, enter 0.477395 for the probability, then 1 for the
degrees of freedom.
CHI-SQUARE=0.504807
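The expected-value rule (row total times column total, divided by the grand total) and the resulting statistic can be checked with the smoking table given above. The function names are illustrative; the p-value formula used here holds only for 1 degree of freedom, which is the case for a 2 x 2 table.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence:
    row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_df1(statistic):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Observed smokers (yes, no) for males and females, from the table above.
observed = [[31, 69], [45, 122]]
expected = expected_counts(observed)
statistic = chi_square_statistic(observed)
p_value = p_value_df1(statistic)
print(f"chi-square = {statistic:.6f}, p-value = {p_value:.6f}")
```

The values should agree with the reported CHI-SQUARE of 0.504807 and p-value of 0.477395 up to rounding.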
In this section we would like to examine whether or not the variances of two
populations are equal. Whenever independent simple random samples of equal or
different sizes, n1 and n2, are taken from two normal distributions with
equal variances, the sampling distribution of s1²/s2² has an F distribution
with n1 - 1 degrees of freedom for the numerator and n2 - 1 degrees of
freedom for the denominator. In the ratio s1²/s2², the numerator s1² and the
denominator s2² are the variances of the first and the second sample,
respectively. The following figure shows the graph of an F distribution with
10 degrees of freedom for both the numerator and the denominator. Unlike the
normal distribution, as you see, the F distribution is not symmetric. The
shape of an F distribution is positively skewed and depends on the degrees of
freedom for the numerator and the denominator. The value of F is always
positive.
Now let's see whether or not the variances of the hourly incomes of student
assistants and work-study students, based on the samples taken previously
from the populations, are equal. Assume that the hypothesis test in this case
is conducted at α = 0.10. The null and the alternative are:
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²
Rejection Rule: Reject the null hypothesis if F < F0.95 or F > F0.05, where
F, the value of the test statistic, is equal to s1²/s2², with 10 degrees of
freedom for both the numerator and the denominator. We can find the value of
F0.05 from the F distribution table. If s1²/s2² > 1, we do not need to know
the value of F0.95; otherwise, F0.95 = 1/F0.05 for equal sample sizes.
A survey of eleven student assistants and eleven work-study students shows
the following descriptive statistics. Our objective is to find the value of
s1²/s2², where s1² is the variance of the student assistant sample and s2² is
the variance of the work-study student sample. As you see, these values are
in cells F8 and D8 of the descriptive statistics output.
To calculate the value of s1²/s2², select a cell such as A16, enter the cell
formula =F8/D8 and press Enter. This is the value of F in our problem. Since
this value, F = 1.984615385, falls in the acceptance area, we fail to reject
the null hypothesis. Hence, the sample results are consistent with the
hypothesis that the variance of student assistants' hourly income is equal to
the variance of work-study students' hourly income. The following screen shot
shows how to find the F value. We can follow the same format for one-tail
test(s).
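The two-tail rejection rule can be written out as a short check. This is a sketch: the function name is hypothetical, and the upper critical value 2.98 is the F-table value for 10 and 10 degrees of freedom at an upper-tail area of 0.05, entered by hand. The lower critical value uses the reciprocal rule, which applies here because both degrees of freedom are equal.

```python
def f_test_two_tail(f_statistic, f_upper):
    """Two-tail F test decision when numerator and denominator have the
    same degrees of freedom, so the lower critical value is 1 / f_upper."""
    f_lower = 1 / f_upper
    reject = f_statistic < f_lower or f_statistic > f_upper
    return f_lower, reject

# F statistic reported above; 2.98 is taken from an F table (10, 10 df).
f_lower, reject = f_test_two_tail(1.984615385, 2.98)
print(f"acceptance region: {f_lower:.4f} to 2.98")
print("reject H0" if reject else "fail to reject H0")
```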
Linear Correlation and Regression Analysis
In this section the objective is to see whether there is a correlation
between two variables and to find a model that predicts one variable in terms
of the other. There are many examples that we could mention, but we will use
a popular one from the world of business. Usually the independent variable is
denoted by the letter x and the dependent variable by the letter y. A
businessman would like to see whether there is a relationship between the
number of cases of soda sold and the temperature on a hot summer day, based
on information taken from the past. He would also like to estimate the number
of cases of soda that will be sold on a particular hot summer day at a ball
game. He recorded the temperature and the number of cases of soda sold on
those particular days. The following table shows the recorded data from
June 1 through June 13. The weatherman predicts a temperature of 94 degrees F
for June 14. The businessman would like to meet all demand for the cases of
soda ordered by customers on June 14.
DAY       Cases of Soda   Temperature
1-Jun     57              56
2-Jun     59              58
3-Jun     65              63
4-Jun     67              66
5-Jun     75              73
6-Jun     81              78
7-Jun     86              85
8-Jun     88              85
9-Jun     88              87
10-Jun    84              84
11-Jun    82              88
12-Jun    80              84
13-Jun    83              89
Now let's use Excel to find the linear correlation coefficient and the
regression line equation. The linear correlation coefficient is a quantity
between -1 and +1. This quantity is denoted by R. The closer R is to +1, the
stronger the positive (direct) correlation; similarly, the closer R is to -1,
the stronger the negative (inverse) correlation between the two variables.
The general form of the regression line is y = mx + b. In this formula, m is
the slope of the line and b is the y-intercept. You can find these quantities
from the Excel output. In this situation the variable y (the dependent
variable) is the number of cases of soda and x (the independent variable) is
the temperature. To find the Excel output the following steps can be taken:
Step 1. From the menus choose Tools and click on Data Analysis.
Step 3. When the Correlation dialog box appears, enter B1:C14 in the input
range box. Click on Labels in first row and enter A16 in the output range
box. Click on OK.
As you see, there is a very strong positive correlation between the number of
cases of soda demanded and the temperature. This means that as the
temperature increases, the demand for cases of soda also increases. The
linear correlation coefficient is 0.966598577, which is very close to +1.
Now let's follow similar steps to find the regression equation.
Step 1. From the menus choose Tools and click on Data Analysis.
Step 3. When the Regression dialog box appears, enter B1:B14 in the Y range
box and C1:C14 in the X range box. Click on Labels.
Note: The regression equation in general should look like Y = mX + b. In this
equation m is the slope of the regression line and b is its y-intercept.
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.966598577
R Square             0.934312809
Adjusted R Square    0.928341246
Standard Error       2.919383191
Observations         13
ANOVA
             df   SS            MS            F             Significance F
Regression   1    1333.479989   1333.479989   156.4603497   7.58511E-08
Residual     11   93.75078034   8.522798213
Total        12   1427.230769
             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
The relationship between the number of cases of soda (Y) and the temperature
(X) is: Y = 0.879202711 X + 9.17800767
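The reported coefficients can be reproduced from the recorded data with the usual least-squares formulas. One assumption should be flagged: the sketch takes the first numeric column of the table as cases of soda (y) and the second as temperature (x), which is the reading that reproduces the reported slope, intercept and correlation. The function name is illustrative.

```python
import math

# Assumed column assignment (see the note above).
cases = [57, 59, 65, 67, 75, 81, 86, 88, 88, 84, 82, 80, 83]
temperature = [56, 58, 63, 66, 73, 78, 85, 85, 87, 84, 88, 84, 89]

def linear_fit(x, y):
    """Least-squares slope and intercept of y on x, plus the correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = linear_fit(temperature, cases)
forecast = slope * 94 + intercept  # predicted cases at 94 degrees F
print(f"Y = {slope:.6f} X + {intercept:.6f}, R = {r:.6f}")
print(f"forecast for 94 degrees F: {forecast:.1f} cases")
```

Under this reading, the fitted line suggests stocking roughly 92 cases for June 14.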
Moving Average Models: Use the Add Trendline option to analyze a moving
average forecasting model in Excel. You must first create a graph of the time
series you want to analyze. Select the range that contains your data and make
a scatter plot of the data. Once the chart is created, follow these steps:
1. Click on the chart to select it, and click on any point on the line to
select the data series. When you click on the chart to select it, a
new option, Chart, is added to the menu bar.
2. From the Chart menu, select Add Trendline.
Exponential Smoothing Models: The simplest way to analyze a time series using
an Exponential Smoothing model in Excel is to use the data analysis tool.
This tool works almost exactly like the one for Moving Average, except that
you will need to input the value of α instead of the number of periods, k.
Once you have entered the data range and the damping factor, 1 - α, and
indicated what output you want and a location, the analysis is the same as
the one for the Moving Average model.
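Both smoothing methods are simple enough to sketch directly. The function names are illustrative, and the initial smoothed value is assumed to equal the first observation; Excel's tool may align or initialize its output differently. Note that Excel asks for the damping factor 1 - α, whereas this sketch takes α itself.

```python
def moving_average(series, k):
    """Average of each window of k consecutive observations."""
    return [
        sum(series[i - k + 1 : i + 1]) / k
        for i in range(k - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing with smoothing constant alpha.

    The first smoothed value is set to the first observation (an
    assumption); each later value is
    alpha * observation + (1 - alpha) * previous smoothed value.
    """
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Small made-up series, used only to show the mechanics.
print(moving_average([1, 2, 3, 4, 5], 3))
print(exponential_smoothing([10, 20, 30], 0.5))
```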
1.2, 1.5, 2.6, 3.8, 2.4, 1.9, 3.5, 2.5, 2.4, 3.0
a. for the "input range" enter "A1:An", assuming you typed the data into
cells A1 to An.
b. click on the "output range" button and enter the output range "C1:C16".
Sorted data: 1.2, 1.5, 1.9, 2.4, 2.4, 2.5, 2.6, 3.0, 3.5, 3.8
Mean: (1.2 + 1.5 + 2.6 + 3.8 + 2.4 + 1.9 + 3.5 + 2.5 + 2.4 + 3.0) / 10 = 2.48
The mode is 2.4, since it is the only value that occurs twice.
Note that the mean, median and mode of this set of data are very close to
each other. This suggests that the data are roughly symmetrically
distributed.
1.2, 1.5, 1.9, 2.4, 2.4, 2.5, 2.6, 3.0, 3.5, 3.8; mean = 2.48 as computed
earlier.
An estimate for the population variance is:
s² = 1 / (10 - 1) [ (1.2 - 2.48)² + (1.5 - 2.48)² + (1.9 - 2.48)² +
(2.4 - 2.48)² + (2.4 - 2.48)² + (2.5 - 2.48)² + (2.6 - 2.48)² +
(3.0 - 2.48)² + (3.5 - 2.48)² + (3.8 - 2.48)² ]
= (1 / 9) (1.6384 + 0.9604 + 0.3364 + 0.0064 + 0.0064 + 0.0004 + 0.0144 +
0.2704 + 1.0404 + 1.7424) = 0.6684
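These hand calculations can be verified with Python's standard `statistics` module, shown here only as a cross-check on the figures above.

```python
import statistics

data = [1.2, 1.5, 2.6, 3.8, 2.4, 1.9, 3.5, 2.5, 2.4, 3.0]

mean = statistics.mean(data)
median = statistics.median(data)      # average of the two middle values
mode = statistics.mode(data)          # the most frequent value
variance = statistics.variance(data)  # sample variance, divisor n - 1

print(f"mean = {mean:.2f}, median = {median:.2f}, mode = {mode}")
print(f"sample variance = {variance:.4f}")
```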
To construct the probability function for bank robberies, first define the random
variable x, bank robbery take. If the robber is not caught, x = $3,244. If the
robber is caught and manages to keep half, x = $1,622. If the robber is caught
and loses it all, then x = 0. The associated probabilities for these x values are
0.15 = (1 - 0.85), 0.34 = (0.85)(0.4), and 0.51 = (0.85)(0.6). After entering the x
values in cells A1, A2 and A3 and after entering the associated probabilities in
B1, B2, and B3, the following steps lead to the probability mass function:
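The same probability mass function can be built and checked in a few lines. The expected take computed at the end is an added illustration, not a figure stated in the text.

```python
# Probability of being caught, and of keeping half once caught.
p_caught = 0.85
p_keep_half = 0.4

# Probability mass function of the robbery take x, in dollars.
pmf = {
    3244: 1 - p_caught,                # not caught
    1622: p_caught * p_keep_half,      # caught, keeps half
    0: p_caught * (1 - p_keep_half),   # caught, loses it all
}

total_probability = sum(pmf.values())
expected_take = sum(x * p for x, p in pmf.items())

for x, p in pmf.items():
    print(f"P(x = {x}) = {p:.2f}")
print(f"expected take = {expected_take:.2f}")
```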
Discrete & Continuous Random Variables:
Solution: Letting Y stand for a correct answer and N for a wrong answer,
where the probability of Y is 0.2 and the probability of N is 0.8 for each of
the four questions, the probability tree diagram is shown in the textbook on
page 182. This probability tree diagram shows the "branches" that must be
followed to show the calculations captured in the binomial mass function for
n = 4 and a success probability of 0.2. For example, the tree diagram shows
the six different branch systems that yield two correct and two wrong answers
(which corresponds to 4!/(2!2!) = 6). The binomial mass function shows the
probability of two correct answers as [4!/(2!2!)] (0.2)² (0.8)² = 0.1536.
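The binomial probability of two correct answers out of four can be checked with a short function. The function name is illustrative; `math.comb` counts the branch systems.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

branches = math.comb(4, 2)             # number of ways to place 2 correct answers
p_two_correct = binomial_pmf(2, 4, 0.2)
print(f"{branches} branches, P(2 correct) = {p_two_correct:.4f}")
```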
Solution: Because the average and standard deviation are known, what needs to
be established is the amount of time, above the mean time, such that 99
percent of the distribution is lower. This is a distance measured in standard
deviations, given by the Z value corresponding to the 0.99 probability found
in the body of Appendix B, Table 5, as shown in the textbook. Alternatively,
the command entered into any cell of Excel to find this Z value is
=NORMINV(0.99,0,1), which gives 2.326342.
The closest cumulative probability that can be found is 0.9901, in the row
labeled 2.3 and the column headed by .03, Z = 2.33, which is only an
approximation for the more exact 2.326342 found in Excel. Using this more
exact value, the calculation with mean μ and standard deviation σ in the
following formula would be
Z = (X - μ)/σ
That is, Z = (x - 65)/15
Thus, x = 65 + 15(2.32634) = 99.9 minutes.
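The same percentile can be found without a table using Python's standard library, shown here as a cross-check on the NORMINV result. The last digits of the Z value may differ slightly from the Excel figure quoted above.

```python
from statistics import NormalDist

# Z value with 99 percent of the standard normal distribution below it.
z = NormalDist().inv_cdf(0.99)

# Convert back to minutes using mean 65 and standard deviation 15.
x = 65 + 15 * z
print(f"Z = {z:.4f}, x = {x:.1f} minutes")

# Equivalent one-step form: the 99th percentile of N(65, 15).
x_direct = NormalDist(mu=65, sigma=15).inv_cdf(0.99)
```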
Sampling Distribution and the Central Limit Theorem: A bakery sells an
average of 24 loaves of bread per day. Sales (x) are normally distributed
with a standard deviation of 4.
Solutions:
1. The sampling distribution of the sample mean xbar is normal with a mean of
24 and a standard error of the mean of 4. Thus, using Excel,
=1-NORMDIST(28,24,4,1) gives 0.15866.
2. The sampling distribution of the sample mean xbar is normal with a mean of
24 and a standard error of the mean of 2. Using Excel,
=1-NORMDIST(28,24,2,1) gives 0.02275.
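The two NORMDIST calculations translate directly into Python. The function name is illustrative, and the sample sizes are assumptions inferred from the stated standard errors (a standard error of 4 corresponds to n = 1 and a standard error of 2 to n = 4, since the standard error is 4 divided by the square root of n).

```python
import math
from statistics import NormalDist

def prob_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean > threshold) when the sample mean is normal with
    mean mu and standard error sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, standard_error).cdf(threshold)

p1 = prob_mean_exceeds(28, 24, 4, n=1)  # standard error 4 (assumed n = 1)
p2 = prob_mean_exceeds(28, 24, 4, n=4)  # standard error 2 (assumed n = 4)
print(f"{p1:.5f} {p2:.5f}")
```

Averaging over more days shrinks the standard error, which is why the second probability is so much smaller.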
Regression Analysis: The highway deaths per 100 million vehicle miles and
highway speed limits for 10 countries are given below:
(Death, Speed) = (3.0, 55), (3.3, 55), (3.4, 55), (3.5, 70), (4.1, 55),
(4.3, 60), (4.7, 55), (4.9, 60), (5.1, 60), and (6.1, 75).
From this we can see that five countries with the same speed limit have very
different positions on the safety list. For example, Britain ... with a speed
limit of 70 is demonstrably safer than Japan, at 55. Can we argue that speed
has little to do with safety? Use regression analysis to answer this
question.
Solution: Enter the ten paired y and x data into cells A2 to A11 and B2 to
B11, with the "death" rate label in A1 and the "speed" limit label in B1. The
following steps produce the regression output.
Note: Use the mouse to move between the boxes and buttons. Click on the
desired box or button. The large rectangular boxes require a range from the
worksheet. A range may be typed in or selected by highlighting the cells with
the mouse after clicking on the box. If the dialog box blocks the data, it can be
moved on the screen by clicking on the title bar and dragging.
For the "Input Y Range," enter A1 to A11, and for the "Input X Range" enter
B1 to B11.
Because the Y and X ranges include the "Death" and "Speed" labels in A1 and
B1, select the "Labels" box with a click.
Click the "Output Range" button and type reference cell, which in this
demonstration is A13.
To get the predicted values of Y (Death rates) and residuals select the
"Residuals" box with a click.
Your screen display should show a table; clicking "OK" will give the
"SUMMARY OUTPUT," "ANOVA" and "RESIDUAL OUTPUT."
The first section of the Excel printout gives the "SUMMARY OUTPUT." The
"Multiple R" is the square root of the "R Square," the computation and
interpretation of which we have already discussed. The "Standard Error" of
estimate (which will be discussed in the next chapter) is s = 0.86423, which
is the square root of "Residual SS" = 5.97511 divided by its degrees of
freedom, df = 8, as given in the "ANOVA" section. We will also discuss the
adjusted R-square of 0.21325 in the following chapters.
Under the "ANOVA" section are the estimated regression coefficients and
related statistics that will be discussed in detail in the next chapter. For
now it is sufficient to recognize that the calculated coefficient values for
the slope and y-intercept are provided (b = 0.07556 and a = -0.29333). Next
to these coefficient estimates is information on the variability in the
distribution of the least-squares estimators from which these specific
estimates were drawn: the column titled "Std. Error" contains the standard
deviations (standard errors) of the intercept and slope distributions; the
"t-ratio" and "p" columns give the calculated values of the t statistics and
associated p-values. As shown in Chapter 13, the t statistic of 1.85458 and
p-value of 0.10077, for example, indicate that the sample slope (0.07556) is
marginally different from zero: the two-tail p-value sits just above the 0.10
Type I error level (and is about 0.05 for a one-tail test of a positive
slope), which points to a positive relationship between deaths and speed
limits in the population. This conclusion is contrary to the assertion that
"speed has little to do with safety."
ANOVA df SS MS F P-value
Regression 1 2.56889 2.56889 3.43945 0.10077
Residual 8 5.97511 0.74689
Total 9 8.54400
Residual Output:
Predicted Residuals
3.86222 -0.86222
3.86222 -0.56222
3.86222 -0.46222
4.99556 -1.49556
3.86222 0.23778
4.24000 0.06000
3.86222 0.83778
4.24000 0.66000
4.24000 0.86000
5.37333 0.72667
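The coefficients, standard error of estimate, t statistic and residuals in this output can be reproduced from the ten data pairs. The sketch below uses the standard simple-regression formulas; the function name is illustrative.

```python
import math

speed = [55, 55, 55, 70, 55, 60, 55, 60, 60, 75]
death = [3.0, 3.3, 3.4, 3.5, 4.1, 4.3, 4.7, 4.9, 5.1, 6.1]

def slope_t_test(x, y):
    """Least-squares fit of y on x with the t statistic of the slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(res ** 2 for res in residuals)
    s = math.sqrt(sse / (n - 2))          # standard error of estimate
    t_stat = slope / (s / math.sqrt(sxx))  # slope divided by its standard error
    return slope, intercept, s, t_stat, residuals

slope, intercept, s, t_stat, residuals = slope_t_test(speed, death)
print(f"b = {slope:.5f}, a = {intercept:.5f}")
print(f"s = {s:.5f}, t = {t_stat:.5f}")
```

The t statistic is then compared with a t-table value for 8 degrees of freedom, or equivalently the reported p-value is compared with the chosen Type I error level.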
failures in their business environment is low. This is the realm of simulations-a
safe place to fail.
The appearance of statistical software is one of the most important events in
the process of decision making under uncertainty. Statistical software
systems are used to construct examples, to understand existing concepts, and
to find new statistical properties. Conversely, new developments in the
process of decision making under uncertainty often motivate new approaches
and revisions of existing software systems. Statistical software systems rely
on the cooperation of statisticians and software developers.
Besides statistical software, Java applets and online statistical
computation, the use of a scientific calculator is required for the course. A
scientific calculator is one that can give you, say, the square root of 5.
Any calculator that goes beyond the four basic operations is fine for this
course. These calculators allow you to perform the simple calculations you
need in this course, for example, taking a square root or raising e to the
power of, say, 0.36. These types of calculators are called general scientific
calculators. There are also more specific and advanced calculators for
mathematical computations in other areas such as finance, accounting, civil
engineering, and even statistics. The last one, for example, computes the
mean, variance, skewness, and kurtosis of a sample by simply entering all the
data one by one and then pressing the mean, variance, skewness, or kurtosis
key.
The Copyright Statement: The fair use, according to the 1996 Fair
Use Guidelines for Educational Multimedia, of materials presented on
this Web site is permitted for non-commercial and classroom
purposes only. This site may be mirrored intact (including these
notices), on any server with public access. All files are available
at http://www.mirrorservice.org/sites/home.ubalt.edu/ntsbarsh/Business-stat
for mirroring.