CONTENTS

CHAPTER  TOPIC  PAGES
1  Introduction to Educational Research  1
2  Quantitative, Qualitative, and Mixed Research  10
3  Developing Research Questions and Proposal Preparation  19
4  Research Ethics  29
5  Standardized Measurement and Assessment  34
6  Methods of Data Collection  45
7  Sampling  58
8  Validity of Research Results  65
9  Experimental Research  78
10  Quasi-Experimental and Single-Case Designs  94
11  Non-experimental Quantitative Research  105
12  Qualitative Research  113
13  Historical Research  119
14  Mixed Model and Mixed Method Research  122
15  Descriptive Statistics  132
16  Inferential Statistics  144
17  Data Analysis in Qualitative Research  162
18  Preparation of the Research Report  174

Page 1 of 179

Chapter 1 Introduction to Educational Research

The purpose of Chapter One is to provide an overview of educational research and introduce you to some important terms and concepts. My discussion in this set of lectures will usually center around the same headings that are used in the book chapters. You might want to have your book open as you read through my lectures. My goal is to help you to better understand the material in the book.

Why Study Educational Research?
Here are a few reasons to take this course and learn about educational research:
• To become "research literate."
• Because we live in a society that's driven by research.
• To improve your critical thinking skills.
• To learn how to read and critically evaluate published research.
• To learn how to design and conduct research in case the need arises one day.

Areas of Educational Research
There are many areas in educational research. As you can see in Table 1.1 (reproduced here for your convenience), there are 10 major divisions in our largest association, the AERA, and there are many special interest groups (SIGs). Do you see any areas that are of interest to you?

To learn more about the areas of educational research and current issues, we recommend that you explore the AERA website at http://aera.net . By the way, the AERA has great student membership rates.

Examples of Educational Research
Many examples of educational research are discussed throughout your textbook. To get you started, we have reproduced the abstracts from four journal articles in this section of the book.

An excellent way to see examples of recent educational research articles is to browse through educational journals. One excellent journal to get you started is entitled the "Journal of Educational Psychology."

General Kinds of Research
In this section we discuss five general kinds of research: basic research, applied research, evaluation research, action research, and orientational research.

Basic and Applied Research
Basic research is research aimed at generating fundamental knowledge and theoretical understanding about basic human and other natural processes. Applied research is focused on answering practical questions to provide relatively immediate solutions. Basic and applied research can be viewed as two endpoints on a research continuum, with the center representing the idea that applied research can contribute to basic research and vice versa. Here is the continuum:

Basic ...... Mixed ...... Applied

Research examining the process of cognitive "priming" is an example of relatively basic research; a comparison of the effectiveness of two approaches to counseling is an example of relatively applied research. Basic and applied research are generally conducted by researchers at universities.

Evaluation Research
Evaluation involves determining the worth, merit, or quality of an evaluation object. Evaluation is traditionally classified according to its purpose:
• Formative evaluation is used for the purpose of program improvement.
• Summative evaluation is used for the purpose of making summary judgments about a program and decisions to continue or discontinue the program.

A newer and currently popular way to classify evaluation is to divide it into five types:
• Needs assessment, which asks this question: Is there a need for this type of program?
• Theory assessment, which asks this question: Is this program conceptualized in a way that it should work?
• Implementation assessment, which asks: Was this program implemented properly and according to the program plan?
• Impact assessment, which asks: Did this program have an impact on its intended targets?
• Efficiency assessment, which asks: Is this program cost effective?

Evaluation is generally done by program evaluators and is focused on specific programs or products.

Action Research
Action research focuses on solving practitioners' local problems. It is generally conducted by the practitioners after they have learned about the methods of research and research concepts that are discussed in your textbook. It is important to understand that action research is also a state of mind; for example, teachers who are action researchers are constantly observing their students for patterns and thinking about ways to improve instruction, classroom management, and so forth. We hope you get this "state of mind" as you read our textbook!

Orientational Research
Orientational research is done for the purpose of advancing an ideological position. It is traditionally called critical theory. We use the broader term orientational research because critical theory was originally concerned only with class inequalities and was based on Karl Marx's theory of economics, society, and revolution. Orientational research is focused on some form of inequality, discrimination, or stratification in society. Some areas in which inequality manifests itself are large differences in income, wealth, access to high quality education, power, and occupation. Here are some major areas of interest to orientational researchers:
• Class stratification (i.e., inequality resulting from one's economic class in society).
• Gender stratification (i.e., inequality resulting from one's gender).
• Ethnic and racial stratification (i.e., inequality resulting from one's ethnic or racial grouping).
• Sexual orientation stratification (i.e., inequality and discrimination based on one's sexual orientation).

Many orientational researchers work for universities or interest group organizations.

Sources of Knowledge
In this section we discuss how people learn about the world around them and gain knowledge. The major ways we learn can be classified into experience, expert opinion, and reasoning.

Experience
The idea here is that knowledge comes from experience. Historically, this view was called empiricism (i.e., original knowledge comes from experience). The term empirical means "based on observation, experiment, or experience."

Expert Opinion
Because we don't want to and don't have time to conduct research on everything, people frequently rely on expert opinion as they learn about the world. Note, however, that if you rely on an expert's opinion it is important to make sure that the expert is an expert in the specific area under discussion, and you should check to see if the expert has a vested interest in the issue.

Reasoning

Historically, this idea was called rationalism (i.e., original knowledge comes from thought and reasoning). There are two main forms of reasoning:
• Deductive reasoning (i.e., the process of drawing a specific conclusion from a set of premises). Deductive reasoning is the classical approach used by the great rationalists in the history of western civilization. Note that, in formal logic and mathematics, a conclusion from deductive reasoning will necessarily be true if the argument form is valid and if the premises are true.
• Inductive reasoning (i.e., reasoning from the particular to the general). The conclusion from inductive reasoning is probabilistic (i.e., you make a statement about what will probably happen). The so-called "problem of induction" is that the future might not resemble the present.

The Scientific Approach to Knowledge Generation
Science is also an approach for the generation of knowledge. It relies on a mixture of empiricism (i.e., the collection of data) and rationalism (i.e., the use of reasoning and theory construction and testing).

Dynamics of science. Science has many distinguishing characteristics:
• Science is progressive. In other words, "We stand on the shoulders of giants" (Newton).
• Science is rational.
• Science is creative.
• Science is dynamic.
• Science is open.
• Science is "Critical."
• Science is never-ending.

Basic Assumptions of Science
In order to do science, we usually make several assumptions. Here they are as summarized in Table 1.3.

Scientific Methods
There are many scientific methods. The two major methods are the inductive method and the deductive method.
• The deductive method involves the following three steps:
1. State the hypothesis (based on theory or research literature).
2. Collect data to test the hypothesis.
3. Make decision to accept or reject the hypothesis.
• The inductive method. This approach also involves three steps:
1. Observe the world.
2. Search for a pattern in what is observed.
3. Make a generalization about what is occurring.
• Virtually any application of science includes the use of both the deductive and the inductive approaches to the scientific method either in a single study or over time. This idea is demonstrated in Figure 1.1.

The inductive method is a "bottom up" method that is especially useful for generating theories and hypotheses; the deductive method is a "top down" method that is especially useful for testing theories and hypotheses.

Theory
The word "theory" most simply means "explanation." Theories explain "How" and "Why" something operates as it does. Some theories are highly developed and encompass a large terrain (i.e., "big" theories or "grand" theories); other theories are "smaller" theories or briefer explanations. We have summarized the key criteria to use in evaluating a theory in Table 1.4 and reproduced it here for your convenience.

The Principle of Evidence
According to the principle of evidence, what is gained in empirical research is evidence, NOT proof. This means that knowledge based on educational research is ultimately tentative. Empirical research provides evidence; it does not provide proof. Therefore, please eliminate the word "proof" from your vocabulary when you talk about research results. Also note that evidence increases when a finding has been replicated. Hence, you should NOT draw firm conclusions from a single research study.

Objectives of Educational Research
There are five major objectives of educational research.
1. Exploration. This is done when you are trying to generate ideas about something.
2. Description. This is done when you want to describe the characteristics of something or some phenomenon.
3. Explanation. This is done when you want to show how and why a phenomenon operates as it does. If you are interested in causality, you are usually interested in explanation.
4. Prediction. This is your objective when your primary interest is in making accurate predictions. Note that the advanced sciences make much more accurate predictions than the newer social and behavioral sciences.
5. Influence. This objective is a little different. It involves the application of research results to impact the world. A demonstration program is an example of this.

One convenient and useful way to classify research is into exploratory research, descriptive research, explanatory research, predictive research, and demonstration research.

Chapter 2 Quantitative, Qualitative, and Mixed Research

This chapter is our introduction to the three research methodology paradigms. A paradigm is a perspective based on a set of assumptions, concepts, and values that are held by a community of researchers. For most of the 20th century the quantitative paradigm was dominant. During the 1980s, the qualitative paradigm came of age as an alternative to the quantitative paradigm, and it was often conceptualized as the polar opposite of quantitative research. Finally, although the modern roots of mixed research go back to the late 1950s, I think that it truly became the legitimate third paradigm with the publication of the Handbook of Mixed Methods in Social and Behavioral Research (2003, by Tashakkori and Teddlie). At the same time, mixed research has been conducted by practicing researchers throughout the history of research.

Characteristics of the Three Research Paradigms
There are currently three major research paradigms in education (and in the social and behavioral sciences). They are quantitative research, qualitative research, and mixed research. Here are the definitions of each:
• Quantitative research – research that relies primarily on the collection of quantitative data. (Note that pure quantitative research will follow all of the paradigm characteristics of quantitative research shown in the left column of Table 2.1.)
• Qualitative research – research that relies on the collection of qualitative data. (Note that pure qualitative research will follow all of the paradigm characteristics of qualitative research shown in the right column of Table 2.1.)
• Mixed research – research that involves the mixing of quantitative and qualitative methods or paradigm characteristics. Later in the lecture you will learn about the two major types of mixed research, mixed method and mixed model research. For now, keep in mind that the mixing of quantitative and qualitative research can take many forms. In fact, the possibilities for mixing are almost infinite.

Here is Table 2.1 for your convenience and review.

Quantitative Research Methods: Experimental and Nonexperimental Research

The basic building blocks of quantitative research are variables. Variables (something that takes on different values or categories) are the opposite of constants (something that cannot vary, such as a single value or category of a variable). Many of the important types of variables used in quantitative research are shown, with examples, in Table 2.2. Here is that table for your review:

In looking at the table note that when we speak of measurement, the most simple classification is between categorical and quantitative variables. As you can see, quantitative variables vary in degree or amount (e.g., annual income) and categorical variables vary in type or kind (e.g., gender).
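To make this vocabulary concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny data set `records` and the helper names `column`, `is_constant`, and `variable_kind` are hypothetical and do not come from the textbook. The sketch treats a column as a constant when every case has the same value, and otherwise labels it quantitative when its values are numbers and categorical when they are not.

```python
from numbers import Number

# Hypothetical data: every participant is a fifth grader,
# so "grade" cannot vary in this particular data set.
records = [
    {"gender": "female", "annual_income": 42000, "grade": 5},
    {"gender": "male", "annual_income": 38500, "grade": 5},
    {"gender": "female", "annual_income": 51000, "grade": 5},
]


def column(name):
    """Collect the values of one variable across all cases."""
    return [row[name] for row in records]


def is_constant(values):
    """A constant takes on a single value or category."""
    return len(set(values)) == 1


def variable_kind(values):
    """Label a column as constant, quantitative, or categorical."""
    if is_constant(values):
        return "constant"
    # Numbers vary in degree or amount; anything else varies in type or kind.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "categorical"


for name in ("gender", "annual_income", "grade"):
    print(name, "->", variable_kind(column(name)))
```

Note that this simple rule would mislabel a categorical variable that has been stored as numeric codes (for example, 1 and 2 for two groups), so in practice the researcher, not the data type, decides how a variable is classified.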

The other set of variables in the table (under the heading role taken by the variable) are the kinds of variables we talk about when explaining how the world operates and when we design a quantitative research study. As you can see, independent variables (symbolized by "IV") are the presumed cause of another variable. Dependent variables (symbolized by "DV") are the presumed effect or outcome. Dependent variables are influenced by one or more independent variables. What are the IV and DV in the relationship between smoking and lung cancer? (Smoking is the IV and lung cancer is the DV.)

Sometimes we want to understand the process or variables through which one variable affects another variable. This brings us to the idea of intervening variables (also called mediator or mediating variables). Intervening variables are variables that occur between two other variables. For example, tissue damage is an intervening variable in the smoking and lung cancer relationship. We can use arrows (which mean causes or affects) and draw the relationship that includes an intervening variable like this: Smoking---->Tissue Damage---->Lung Cancer.

Sometimes a relationship does not generalize to everyone; therefore, researchers often use moderator variables to show how the relationship changes across the levels of an additional variable. For example, perhaps behavioral therapy works better for males and cognitive therapy works better for females. In this case, gender is the moderator variable. The relationship between type of therapy (behavioral versus cognitive) and psychological relief is moderated by gender.

Now, I will talk about the major types of quantitative research: experimental and nonexperimental research.

Experimental Research
The purpose of experimental research is to study cause and effect relationships. Its defining characteristic is active manipulation of an independent variable (i.e., it is only in experimental research that "manipulation" is present). Also, random assignment (which creates "equivalent" groups) is used in the strongest experimental research designs. Here is an example of an experiment.

Pretest   Treatment   Posttest
O1        XE          O2
O1        XC          O2

Where:

• E stands for the experimental group (e.g., new teaching approach)
• C stands for the control or comparison group (e.g., the old or standard teaching approach)

Because the best way to make the two groups similar in the above research design is to randomly assign the participants to the experimental and control groups, let's assume that we have a convenience sample of 50 people and that we randomly assign them to the two groups in our experiment.

Here is the logic of this experiment. First, we made our groups approximately the same at the start of the study by using random assignment (i.e., the groups are "equated"). You pretest the participants to see how much they know. Next, you manipulate the independent variable by using the new teaching approach with the experimental group and using the old teaching approach for the control group. Now (after the manipulation) you measure the participants' knowledge to see how much they know after having participated in our experiment. Let's say that the people in the experimental group show more knowledge improvement than those in the control group. What would you conclude? In this case, we can conclude that there is a causal relationship between the IV, teaching method, and the DV, knowledge, and specifically we can conclude that the new teaching approach is better than the old teaching approach. Make sense?

Now, let's say that in the above experiment we could not use random assignment to equate our groups. Let's say that, instead, we had our best teacher (Mrs. Smith) use the new teaching approach with her students in her 5th period class and we had a newer and less experienced teacher (Mr. Turner) use the old teaching approach with his 5th period class. Let's again say that the experimental group did better than the control group. Do you see any problems with claiming that the reason for the difference between the two groups is because of the teaching method? The problem is that there are alternative explanations. First, perhaps the difference is because Mrs. Smith is the better teacher. Second, perhaps Mrs. Smith had the smarter students (remember the students were not randomly assigned to the two groups; instead, we used two intact classrooms).

We have a name for the problems just mentioned. It is the problem of alternative explanations. In particular, it is very possible that the difference we saw between the two groups was due to variables other than the IV. In particular, the difference might have been due to the teacher (Mrs. Smith vs. Mr. Turner) or to the IQ levels of the groups (perhaps Mrs. Smith's students had higher IQs than Mr. Turner's students). We have a special name for these kinds of variables. They are called extraneous variables. An extraneous variable is a variable that may compete with the independent variable in explaining the outcome. It is important to remember the definition of an extraneous variable because extraneous variables can destroy the integrity of a research study that claims to show a cause and effect relationship. Remember this: if you are ever interested in identifying cause and effect relationships, you must always determine whether there are any extraneous variables you need to worry about. If an extraneous variable really is the reason for an outcome (rather than the IV) then we sometimes like to call it a confounding variable because it has confused or confounded the relationship we are interested in.
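The logic of the randomized experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `randomly_assign` and `mean_gain` are hypothetical, and the pretest and posttest scores are made up so that the arithmetic is easy to follow. The sketch shuffles the 50 participants, splits them into the experimental and control groups, and compares the average improvement (O2 minus O1) of the two groups.

```python
import random
from statistics import mean


def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def mean_gain(group, pretest, posttest):
    """Average posttest-minus-pretest improvement (O2 - O1) for one group."""
    return mean(posttest[p] - pretest[p] for p in group)


# Hypothetical convenience sample of 50 people, identified by number.
participants = list(range(1, 51))
experimental, control = randomly_assign(participants, seed=1)

# Made-up scores for illustration only: everyone starts at 60,
# the experimental group ends at 75 and the control group at 70.
pretest = {p: 60 for p in participants}
posttest = {p: 75 if p in experimental else 70 for p in participants}

difference = (mean_gain(experimental, pretest, posttest)
              - mean_gain(control, pretest, posttest))
print(len(experimental), len(control), difference)
```

Because chance alone decides who lands in which group, teacher quality, IQ, and other extraneous variables should be spread roughly evenly across the two groups, which is why a difference in gains can be attributed to the teaching method.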

Nonexperimental Research
Remember that the defining characteristic of experimental research was manipulation of the IV. Well, in nonexperimental research there is no manipulation of the independent variable. There also is no random assignment of participants to groups. What this means is that if you ever see a relationship between two variables in nonexperimental research you cannot jump to a conclusion of cause and effect because there will be too many other alternative explanations for the relationship.

In the chapter, we make a distinction between two examples of nonexperimental research.

In the "basic case" of causal-comparative research, there is one categorical IV and one quantitative DV.
• Example: Gender (IV) and class performance (DV).
• You would look for the relationship by comparing the male and female average performance levels.

In the simple case of correlational research, there is one quantitative IV and one quantitative DV.
• Example: Self-esteem (IV) and class performance (DV).
• You would look for the relationship by calculating the correlation coefficient.
• The correlation coefficient is a number that varies between –1 and +1, and 0 stands for no relationship. The farther the number is from 0, the stronger the relationship.
• If the sign of the correlation coefficient is positive (e.g., +.65) then you have a positive correlation, which means the two variables move in the same direction (as one variable increases, so does the other variable). Education level and annual income are positively correlated (i.e., the higher the education, the higher the annual income).
• If the sign of the correlation coefficient is negative (e.g., -.71) then you have a negative correlation, which means the two variables move in opposite directions (as one variable increases, the other decreases). Smoking and life expectancy are negatively correlated (i.e., the higher the smoking, the lower the life expectancy).

We will show you how to improve on the two basic nonexperimental designs in later chapters, but for now, please remember these important points:
1) You can obtain much stronger evidence for causality from experimental research than from nonexperimental research (e.g., a strong experiment is better than causal-comparative and correlational research).
2) You cannot conclude that a relationship is causal when you only have one IV and one DV in nonexperimental research (without controls). Therefore, the basic cases of both causal-comparative and correlational research are severely flawed!
3) In later chapters we explain three necessary conditions for causality (relationship, temporal order, and lack of alternative explanations).
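As a rough illustration of the two basic cases, the Python sketch below compares group averages (the causal-comparative case) and computes a correlation coefficient (the correlational case). The lecture does not say which coefficient is meant; this sketch assumes the common Pearson coefficient. The function name `pearson_r` and all of the numbers are hypothetical and chosen only to show a positive and a negative correlation.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled to lie in [-1, +1]."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Correlational case: one quantitative IV and one quantitative DV.
# All values below are invented for illustration.
education_years = [10, 12, 12, 14, 16, 18]
annual_income = [25, 30, 34, 41, 52, 60]      # thousands of dollars
cigarettes_per_day = [0, 5, 10, 20, 30, 40]
life_expectancy = [82, 80, 78, 75, 71, 68]    # years

r_positive = pearson_r(education_years, annual_income)
r_negative = pearson_r(cigarettes_per_day, life_expectancy)
print(round(r_positive, 2), round(r_negative, 2))

# Causal-comparative case: one categorical IV and one quantitative DV.
performance = {"female": [82, 90, 77], "male": [79, 85, 80]}
group_means = {group: mean(scores) for group, scores in performance.items()}
print(group_means)
```

Neither number says anything about cause and effect on its own; both only describe whether, and how strongly, the two variables are related.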

For a preview of these three necessary conditions required to make a firm statement of cause and effect, read this next section. It is provided as supplemental or preview material for this topic, which occurs in many chapters of the book. If you have had enough for now, just skip to the next section of this lecture entitled Qualitative Research.

There are three necessary conditions that you must establish whenever you want to conclude that a relationship is causal. They are shown in the following Table:

Our experiment met these criteria quite nicely. That is, we had a relationship between teaching method and knowledge, the manipulation occurred before the posttest, and because we randomly assigned the people to the two groups, there should be no other variables that can explain away the relationship. On the other hand, in the basic cases of causal-comparative and correlational research, where we only observed a relationship between two variables (we had no manipulation or random assignment), we have only established condition 1. We can only conclude that the two variables are related.

In chapter 11 we will show you how to design nonexperimental research that performs better than the basic cases on the three above conditions. Still, even when these basic cases are improved, experimental research with random assignment is better for studying cause and effect than nonexperimental research. Another way of saying this is, remember, if you want to show that one thing causes another thing, then, if it is feasible, you will want to CONDUCT AN EXPERIMENT.

Qualitative Research Methods
We described qualitative research earlier, in Table 2.1. There are five major types of qualitative research: phenomenology, ethnography, case study research, grounded theory, and historical research. All of the approaches are similar in that they are qualitative approaches. Each approach, however, has some distinct characteristics and tends to have its own roots and following.

Here are the definitions and an example of the different types of qualitative research:
• Phenomenology – a form of qualitative research in which the researcher attempts to understand how one or more individuals experience a phenomenon. For example, you might interview 20 widows and ask them to describe their experiences of the deaths of their husbands.
• Ethnography – is the form of qualitative research that focuses on describing the culture of a group of people. Note that a culture is the shared attitudes, values, norms, practices, language, and material things of a group of people. For an example of an ethnography, you might decide to go and live in a Mohawk community and study the culture and their educational practices.
• Case study research – is a form of qualitative research that is focused on providing a detailed account of one or more cases. For an example, you might study a classroom that was given a new curriculum for technology use.
• Grounded theory – is a qualitative approach to generating and developing a theory from data that the researcher collects. For an example, you might collect data from parents who have pulled their children out of public schools and develop a theory to explain how and why this phenomenon occurs, ultimately developing a theory of school pull-out.
• Historical research – research about events that occurred in the past. For example, you might study the use of corporal punishment in schools in the 19th century.

Mixed Research Methods
Mixed research is a general type of research (it's one of the three paradigms) in which quantitative and qualitative methods, techniques, or other paradigm characteristics are mixed in one overall study. Earlier we showed the major characteristics of mixed research in Table 2.1. Now the two major types of mixed research are distinguished: mixed method versus mixed model research.
• Mixed method research – is research in which the researcher uses the qualitative research paradigm for one phase of a research study and the quantitative research paradigm for another phase of the study. For example, a researcher might conduct an experiment (quantitative) and after the experiment conduct an interview study with the participants (qualitative) to see how they viewed the experiment and to see if they agreed with the results. Mixed method research is like conducting two mini-studies within one overall research study.
• Mixed model research – is research in which the researcher mixes both qualitative and quantitative research approaches within a stage of the study or across two of the stages of the research process. For example, a researcher might conduct a survey and use a questionnaire that is composed of multiple closed-ended or quantitative type items as well as several open-ended or qualitative type items. For another example, a researcher might collect qualitative data but then try to quantify the data.

The Advantages of Mixed Research
First of all, we advocate the use of mixed research when it is feasible. We are excited about this new movement in educational research and believe it will help qualitative and quantitative researchers to get along better and, more importantly, it will promote the conduct of excellent educational research.
• Perhaps the major goal for researchers who design and conduct mixed research is to follow the fundamental principle of mixed research. According to this principle, the researcher should mix quantitative and qualitative research methods, procedures, and paradigm characteristics in a way that the resulting mixture or combination has complementary strengths and nonoverlapping weaknesses. The examples just listed for mixed method and mixed model research can be viewed as following this principle. Can you see how?
• Here is a metaphor for thinking about mixed research: Construct one fish net out of several fish nets that have holes in them by laying them on top of one another. The "new" net will not have any holes in it. The use of multiple methods or approaches to research works the same way.
• When different approaches are used to focus on the same phenomenon and they provide the same result, you have "corroboration," which means you have superior evidence for the result. Other important reasons for doing mixed research are to complement one set of results with another, to expand a set of results, or to discover something that would have been missed if only a quantitative or a qualitative approach had been used.
• Some researchers like to conduct mixed research in a single study, and this is what is truly called mixed research. However, it is interesting to note that virtually all research literatures would be mixed at the aggregate level, even if no single researcher uses mixed research. That's because there will usually be some quantitative and some qualitative research studies in a research literature.

Our Research Typology
We have now covered the essentials of the three research paradigms and their subtypes. Let's put it all together in the following picture of our research typology:

Chapter 3 Problem Identification and Hypothesis Formation

The purpose of Chapter Three is to help you to learn how to come up with a research topic, refine it, and develop a research proposal.

Sources of Research Ideas
Research ideas and research problems originate from many sources. We discuss four of these sources in the text: everyday life, practical issues, past research, and theory. Regardless of the source of your idea, a key point is that you must develop a questioning and inquisitive approach to life when you are trying to come up with research ideas.
• Everyday life is one common source of research ideas. Based on a questioning and inquisitive approach, you can draw from your experiences and come up with many research topics. For example, think about what educational techniques or practices you believe work well, or do not work well. Would you be interested in doing a research study on one or more of those techniques or practices?
• Practical issues can be a source of research ideas. What are some current problems facing education (e.g., facing administrators, teachers, students, parents)? What research topics do you think can address some of these current problems?
• Past research can be an excellent source of research ideas. In my opinion (BJ), past research is probably the most important source of research ideas. That's because a great deal of educational research has already been conducted on a multitude of topics, and, importantly, research usually generates more questions than it answers. This is also the best way to come up with a specific idea that will fit into and extend the research literature. For students planning on writing a thesis or dissertation, the use of past research is extremely helpful, and remember to not just look at the variables and the results, but also carefully examine how they conducted the study (i.e., examine the methods). When you read a research article, it will be helpful for you to think about the ideas shown in Table 3.1.

• Theory (i.e., explanations of phenomena) can be a source of research ideas.
o Can you summarize and integrate a set of past studies into a theory?
o Are there any theoretical predictions needing empirical testing?
o Do you have any "theories" that you believe have merit? Test them!
o If there is little or no theory in the area of interest to you, then think about collecting data to help you generate a theory using the grounded theory technique.

Ideas that Can't Be Researched Empirically

The point in this section is that empirical research (i.e., research that is based on the collection of observable data) cannot provide answers to "ultimate," "metaphysical," or "ethical" questions. If a question is asking which value is true or correct, empirical research can't offer the solution. For example: Is school prayer good? Should homosexuals be allowed to legally marry? Should the teaching of Christianity (and no other religion) be provided in public schools? These are moral and legal issues which cannot be directly addressed or resolved by empirical research in the social or behavioral sciences. John Dewey made the point that empirical research can provide answers about how to get to valued endpoints, but he took the valued endpoints for granted (e.g., democracy, equality, education for all). So do not expect to conduct an empirical research study that will "show whether school prayer should be adopted."

Review of the Literature
After you have identified your research idea, and identified a general problem that sounds interesting to you, the next step is to become familiar with the published information on your topic. Conducting a literature review will help you to see if your topic has already been researched, help you to see how you might need to revise your research idea, and show methodological techniques and problems specific to your research problem that will help you in designing a study. Most importantly, after conducting a thorough literature review, your specific research questions and hypotheses will become clearer to you.
A literature review can take a different form in qualitative and quantitative research:
• In qualitative research (which often means exploratory research), little prior literature may be available. Furthermore, too much review may make a researcher "myopic." Literature is especially important during the later stages (e.g., interpreting results, discussion) of exploratory research. Still, for much qualitative research, we recommend that a literature review be conducted to see what has been done and to provide sensitizing concepts. Then when data are collected, the researcher can use strategies (discussed in chapter 8) to minimize the researcher's biases.
• In quantitative research, the researcher directly "builds" on past research. Therefore, review of prior research must be done before conducting the study. In quantitative research, the literature review will help you to see if your research problem has already been done, show you data collection instruments that have been used, show designs that have been used, and show theoretical and methodological issues that have arisen.

Sources of Information
There are several major sources of information for you to use when conducting a literature review.
• Books are a good starting point. They give you an overview and a summary of relevant research and theory.
• Journals are another excellent source. Journals provide more recent information than books and provide full length empirical research articles for you to carefully examine.
• Computer databases are excellent sources for locating information. The most important computer database in education is ERIC. Other important databases are PsycINFO or PsycLIT (for psychological research), SocioFILE and Sociological Abstracts (for sociological research), and Dissertation Abstracts (for summaries of doctoral dissertations in education and related fields).

Conducting the Literature Search
In this section, we have included some practical material on conducting the literature search.
• Because ERIC is the most important database in education, we have included a Table (3.4) that shows exactly how to search ERIC. You should access ERIC through your library to get the full version. We strongly recommend that you do not limit your search to a single computer database. Also, we strongly recommend that you do not search only for full-text articles because this will eliminate most of the best published research.
• Using the Public Internet. The Internet has obviously become extremely important. It is important for you to understand that the quality of material on the Internet varies widely and it must be evaluated before use.
• In Table 3.6 we explain how to evaluate the quality of Internet resources. Below is a list of some useful subject directories, search engines, and meta-search engines that is from your chapter:

Feasibility of the Study
So far we have discussed how to come up with your research topic and how to find the needed information. Before deciding whether to carry out your research project, you must decide whether it would be feasible to conduct. You should do this as early as possible so you don't waste your time. This means that you must design a research study that can be carried out given your available resources (e.g., time, money, people). Interviewing all children with ADHD in your state probably would not be feasible for a single research study. Interviewing a set of children with ADHD at your school would be more feasible. Furthermore, part of determining feasibility involves making sure that the study can be carried out ethically. The Institutional Review Board will help you with this decision.

As seen in the following figure (Figure 3.1 from your book), after you get your topic, you need to move to determining your research problem, your statement of the purpose of your study, your statement of the research questions, and if you are conducting a quantitative study you will also need to state your hypotheses. Note that movement from the top to the bottom of Figure 3.1 involves a movement from the general to the specific (e.g., a hypothesis is much more specific than a research topic). Also note that as you move from the top to the bottom, you will need to conduct your literature review so that you can determine what specific research questions and/or hypotheses need to be addressed. In fact, it is usually helpful (when conducting basic or applied research) to start your literature review right at the beginning of the process shown in Figure 3.1.
(Note: to see the full process that is explained in this chapter, we recommend that you also view the concept map for chapter 3 on the companion website.)

Statement of the Research Problem

As seen in the above figure, you start with your topic and then try to identify one or more research problems that you believe need to be solved in that topic area. In other words, the research problem is the educational issue or problem within your broad topic area.
• In quantitative research, research problems tend to emphasize the need to explain, predict, or describe something.
• In qualitative research, research problems tend to focus on exploring a process, an event, or a phenomenon.

Statement of the Purpose of the Study
As seen in the figure, your research purpose follows from the problem you have selected, and it is your statement of your intent or objective for your research study. It is important to include this in your proposals and final reports because it helps orient your reader to your study.
• In quantitative research, the purpose identifies the specific type of relationship being investigated using a specific set of variables.
• In qualitative research, the purpose focuses on exploring or understanding a phenomenon.

Statement of Research Questions
After you have completed your literature review and have digested the literature, you will need to make an exact statement of the specific research questions you want to pursue. This will help ensure that you have a good grasp of what you want to do, it will enable you to communicate your idea to others, and it will help guide the research process (e.g., what variables will be examined, what methods will be needed). A good literature review will logically end with your specific research questions.
• In quantitative research, a research question typically asks about a relationship that may exist between or among two or more variables. It should identify the variables being investigated and specify the type of relationship (descriptive, predictive, or causal) to be investigated. For example: What effect does playing football have on students' overall grade point average during the football season?
o We have included scripts for writing quantitative research questions in Table 3.7.
• In qualitative research, a research question asks about the specific process, issue, or phenomenon to be explored or described. For example: What are the social and cultural characteristics of a highly successful school where students and teachers get along well and students work hard and achieve highly? Here is another research question: How does the social context of a school influence preservice teachers' beliefs about teaching? Here is another: What is the experience of a teacher being a student like?

Formulating Hypotheses
• If you are conducting a quantitative research study, you will typically state your specific hypotheses that you have developed from your literature review. A hypothesis is the researcher's prediction of the relationship that exists among the variables being investigated.

• If you wrote a research question, the hypothesis will be your tentative answer to your question. For the quantitative research question stated above (i.e., What effect does playing football have on students' overall grade point average during the football season?) the related hypothesis might go like this: Students who play football during the football season will experience a decrease in their GPAs as compared to students not playing football.
• Unlike in quantitative research (where hypotheses are stated before collecting the data), hypotheses in qualitative research are often generated as the data are collected and as the researcher gains insight into what is being studied.

The Research Proposal
After you have identified your research idea, reviewed the research literature, determined the feasibility of your study, and made a formal statement of the research questions (and hypotheses for a quantitative study), you are ready to develop a research proposal to guide your research study. It is essential that you develop your research proposal before conducting a research study. This will force you to carefully spell out the rationale for your research study, and it will make you think about and specify each step of your study. Here are the major sections for a typical research proposal:
Title Page
Abstract
Introduction
• Include a statement of the research topic.
• Include a statement of the research problem(s).
• Include a summary of the prior literature.
• Include the purpose of the study.
• Include the research question(s).
• Include the hypotheses for quantitative studies.
Method
• Research Participants
• Apparatus and/or Instruments
• Procedure
Data Analysis
References

The following briefly explains what goes in the major sections just shown:
I. Introduction
This section is "V shaped," moving from general to specific. It includes a statement of the topic, problem, and purpose. It includes a discussion of the prior relevant research. Finally, it ends with the research questions and hypotheses of the study.

II. Method
This section typically includes a discussion of the following:
• The research participants (e.g., Who are they? What are their characteristics? How many will there be? Where are they located? How will they be selected? What kind of response rate are you planning for?).
• The apparatus (e.g., Is any special equipment needed for your study?).
• The instruments to be used in the study (i.e., What are your specific variables and how will you measure those variables? What specific data collection instruments will you use? What kinds of reliability and validity evidence are available for the instruments? Why are the instruments appropriate for your study and your particular participants?).
• The procedure (this is a narrative outline of the specific steps you intend to follow to carry out your data collection; it should be clear enough for someone to replicate your study).
A section on design is sometimes included (often in the procedure section), describing the research design used (e.g., a nonequivalent comparison group design or a longitudinal design).
III. Data Analysis
This section includes a discussion of how you intend to organize and analyze the data that you collect.
• Quantitative studies use statistical data analysis procedures (e.g., ANOVA and regression).
• Qualitative research studies are based on inductive data analysis (e.g., searching for categories, patterns, and themes present in the transcribed data).
Note that some research proposals include a separate section or "Chapter" for the literature review (especially dissertations). Also, some prefer to include the data analysis section in the Method section. For example, the research proposal for a dissertation might include the following three chapters:
1. Introduction
2. Literature Review
3. Method

Consumer Use of the Literature
Frequently there will be no need to conduct an empirical research study because the necessary research will have already been done. In other words, many times, only a literature review will be needed to answer your questions.
• We have provided checklists for evaluating research studies in Tables 3.8 and 3.9. These will help you to evaluate each study you review.
Don't forget this point that we want to continue to emphasize: never place too much confidence in a single research study. That is, you should place much more confidence in a research finding that has been replicated (i.e., shown in many different research studies).

Because of the importance of viewing the full set of studies on an issue, and the built-in benefit of replication when this is done, you can see why we recommend that you pay special attention to meta-analyses when you find them in your literature searches.
• A meta-analysis is a quantitative technique for summarizing the results of multiple studies on a specific topic. It will tell you if a variable consistently has been shown to have an effect as well as the average size of the effect.
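To make the idea of an "average size of effect" concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the study names, effect sizes, and variances are invented, the function names are hypothetical, and the inverse-variance weighted average shown is one common (fixed-effect) way of pooling studies, not necessarily the method used in any particular meta-analysis you will read.

```python
from math import sqrt

# Hypothetical studies: "effect" is a standardized effect size and
# "variance" is its sampling variance. These numbers are made up.
studies = [
    {"name": "Study A", "effect": 0.30, "variance": 0.04},
    {"name": "Study B", "effect": 0.45, "variance": 0.02},
    {"name": "Study C", "effect": 0.10, "variance": 0.05},
    {"name": "Study D", "effect": 0.25, "variance": 0.01},
]


def fixed_effect_summary(studies):
    """Return the inverse-variance weighted mean effect and its standard error."""
    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / s["variance"] for s in studies]
    total_weight = sum(weights)
    mean_effect = sum(w * s["effect"] for w, s in zip(weights, studies)) / total_weight
    standard_error = sqrt(1.0 / total_weight)
    return mean_effect, standard_error


def share_same_direction(studies):
    """Return the fraction of studies whose effect has the same sign as the pooled effect."""
    mean_effect, _ = fixed_effect_summary(studies)
    agreeing = sum(1 for s in studies if s["effect"] * mean_effect > 0)
    return agreeing / len(studies)


mean_effect, standard_error = fixed_effect_summary(studies)
print(f"Average effect size: {mean_effect:.2f} (standard error {standard_error:.2f})")
print(f"Share of studies pointing the same way: {share_same_direction(studies):.0%}")
```

The first function answers "how large is the effect on average?" and the second gives a rough answer to "has the effect been shown consistently?", which are the two questions a meta-analysis is said to address above.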


Chapter 4 Research Ethics
Note: as you read this lecture, it's a good idea to also look at the concept map for the chapter. Remember that you can click on different parts of the concept map to move upward or downward. Here is the link: http://www.southalabama.edu/coe/bset/johnson/dr_johnson/clickmaps/ch4/fr_ch4.htm

What Are Research Ethics?
Ethics is the division in the field of philosophy that deals with values and morals. It is a topic that people may disagree on because it is based on people's personal value systems. What one person or group considers to be good or right might be considered bad or wrong by another person or group. In this chapter, we define ethics as the principles and guidelines that help us to uphold the things we value. There are three major approaches to ethics that are discussed in the chapter.
1. Deontological Approach - This approach states that we should identify and use a universal code when making ethical decisions. An action is either ethical or not ethical, without exception.
2. Ethical skepticism - This viewpoint states that concrete and inviolate ethical or moral standards cannot be formulated. In this view, ethical standards are not universal but are relative to one's particular culture, time, and even individual.
3. Utilitarianism - This is a very practical viewpoint, stating that decisions about ethics should be based on an examination and comparison of the costs and benefits that may arise from an action.
Note that the utilitarian approach is used by most people in academia (such as Institutional Review Boards) when making decisions about research studies.

Ethical Concerns
There are three primary areas of ethical concern for researchers:
1. The relationship between society and science.
• Should researchers study what is considered important in society at a given time?
• Should the federal government and other funding agencies use grants to affect the areas researched in a society?
• Should researchers ignore societal concerns?
2. Professional issues.
• The primary ethical concern here is fraudulent activity (fabrication or alteration of results) by scientists. Obviously, cheating or lying are never defensible.
• Duplicate publication (publishing the same data and results in more than one journal or other publication) should be avoided.
• Partial publication (publishing several articles from the data collected in one study). This is allowable as long as the different publications involve different research questions and different data, and as long as it facilitates scientific communication. Otherwise, it should be avoided.
3. Treatment of Research Participants
• This is probably the most fundamental ethical issue in the field of empirical research.
• It is essential that one ensures that research participants are not harmed physically or psychologically during the conduct of research.
• In the next section, we will go into the issue of treatment of research participants in depth.

Ethical Guidelines for Research with Humans
One set of guidelines specifically developed to guide research conducted by educational researchers is the AERA Guidelines. The AERA, also known as the American Educational Research Association, is the largest professional association in the field of education. Here is the link to the American Educational Research Association's Code of Ethics: http://www.aera.net/about/policy/ethics.htm
Here are some of the most important issues discussed in the chapter (and in the AERA Guidelines).
1. Informed Consent. Potential research participants must be provided with information that enables them to make an informed decision as to whether they want to participate in the research study.
• Here (shown in Table 4.1) is the information that you (the researcher) must put in a consent form so that potential participants are able to provide informed consent.
• An actual consent form is shown in Exhibit 4.3.

2. Informed Consent with Minors as Research Participants.
• Informed consent must be obtained from parents or guardians of minors.
• Also, assent must be obtained from minors who are old enough or have enough intellectual capacity to say they are willing to participate. Assent means the minor agrees to participate after being informed of all the features of the study that could affect the participant's willingness to participate.
3. Passive versus Active Consent
So far we have only talked about active consent (i.e., when consent is provided by the potential participant signing the consent form). Active consent is usually the preferred form of consent.
• Passive consent is the process whereby consent is given by not returning the consent form. An example is shown in Exhibit 4.5. Here is the key passage in the passive consent form: "Participation in this study is completely voluntary. All students in the class will take the test. If you do not wish for your child to be in this study, please fill out the form at the bottom of this letter and return it to me. Also, please tell your child to hand in a blank test sheet when the class is given the mathematics test so that your child will not be included in the study."
4. Deception

Deception is present when the researcher provides misleading information or when the researcher withholds information from participants about the nature and/or purpose of the study. Deception is allowable when the benefits outweigh the costs. However, the researcher is ethically obligated not to use any more deception than is needed to conduct a valid study.
• If deception is used, debriefing should be used. Debriefing is a poststudy interview in which all aspects of the study are revealed, any reasons for deception are explained, and any questions the participant has about the study are answered.
• Debriefing has two goals:
1. Dehoaxing: informing study participants about deception that was used and the reasons for its use.
2. Desensitizing: helping study participants deal with and eliminate any stress or other undesirable feelings that the study might have created.
5. Freedom to Withdraw
Participants must be informed that they are free to withdraw from the study at any time without penalty.
• If you have a power relationship with the participants (e.g., if you are their teacher or employer) you must be extra careful to make sure that they really do feel free to withdraw.
6. Protection from Mental and Physical Harm
This is the most fundamental ethical issue confronting the researcher. Fortunately, much educational research poses minimal risk to participants (as compared, for example, to medical research).
7. Confidentiality and Anonymity
Confidentiality is a basic requirement in all studies. It means that the researcher agrees not to reveal the identity of the participant to anyone other than the researcher and his or her staff. A stronger and even better condition (if it can be met) is called anonymity. Anonymity means that the identity of the participant is not known by anyone in the study, including the researcher. An example would be where the researcher has a large group of people fill out a questionnaire but NOT write their names on it. In this way, the researcher ends up with data, but no names.

Institutional Review Board
The IRB is a committee consisting of professionals and lay people who review research proposals to ensure that the researcher adheres to federal and local ethical standards in the conduct of the research. Virtually every university in the U.S. has an IRB.
• Researchers must submit a Research Protocol to the IRB for review. A full example of a research protocol submitted to the IRB is shown in Exhibit 4.6.
• Three of the most important categories of review are exempt studies (i.e., studies involving no risk to participants and not requiring full IRB review), expedited review (i.e., the process by which a study is rapidly reviewed by fewer members than constitute the full IRB board), and full board review (i.e., review by all members of the IRB).
• Although many educational studies fall into the exempt category, it is essential that you understand that it is the IRB staff and not the researcher that makes the decision as to whether a research protocol is exempt. The IRB will provide the formal documentation of this status for your study.
For more information than is provided in the text about IRB regulations, go here: http://ori.dhhs.gov/
Also, for your convenience, we have included in Table 4.3 the exempt categories used by the IRB.

Chapter 5 Standardized Measurement and Assessment
(A concept map goes with this chapter.)

Defining Measurement
When we measure, we attempt to identify the dimensions, quantity, capacity, or degree of something.
• Measurement is formally defined as the act of measuring by assigning symbols or numbers to something according to a specific set of rules.
Measurement can be categorized by the type of information that is communicated by the symbols or numbers assigned to the variables of interest. In particular, there are four levels or types of information, which are discussed next in the chapter. They are called the four "scales of measurement."

Scales of Measurement
1. Nominal Scale. This is a nonquantitative measurement scale.
• It is used to categorize, label, classify, name, or identify variables. It classifies groups or types.
• Numbers can be used to label the categories of a nominal variable but the numbers serve only as markers, not as indicators of amount or quantity (e.g., if you wanted to, you could mark the categories of the variable called "gender" with 1=female and 2=male).
• Some examples of nominal level variables are the country you were born in, college major, personality type, and experimental group (e.g., experimental group or control group).
2. Ordinal Scale. This level of measurement enables one to make ordinal judgments (i.e., judgments about rank order).
• Any variable where the levels can be ranked (but you don't know if the distance between the levels is the same) is an ordinal variable.
• Some examples are order of finish position in a marathon, billboard top 40, and rank in class.
3. Interval Scale. This scale or level of measurement has the characteristics of rank order and equal intervals (i.e., the distance between adjacent points is the same). It does not possess an absolute zero point.
• Some examples are Celsius temperature, Fahrenheit temperature, and IQ scores.
• Here is the idea of the lack of a true zero point: zero degrees Celsius does not mean no temperature at all; in a Fahrenheit scale, it is equal to the freezing point or 32 degrees. Zero degrees in these scales does not mean zero or no temperature.

4. Ratio Scale. This is a scale with a true zero point.
• It also has all of the "lower level" characteristics (i.e., the key characteristic of each of the lower level scales) of equal intervals (interval scale), rank order (ordinal scale), and ability to mark a value with a name (nominal scale).
• Some examples of ratio level scales are number correct, weight, height, response time, Kelvin temperature, and annual income.
• Here is an example of the presence of a true zero point: If your annual income is exactly zero dollars then you earned no annual income at all. (You can buy absolutely nothing with zero dollars.) Zero means zero.

Assumptions Underlying Testing and Measurement
Before I list the assumptions, note the difference between testing and assessment. According to the definitions that we use:
• Testing is the process of measuring variables by means of devices or procedures designed to obtain a sample of behavior and
• Assessment is the gathering and integration of data for the purpose of making an educational evaluation, accomplished through the use of tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatus and measurement procedures.
In this section of the text, we also list the twelve assumptions that Cohen, et al. consider basic to testing and assessment:
1. Psychological traits and states exist.
• A trait is a relatively enduring (i.e., long lasting) characteristic on which people differ; a state is a less enduring or more transient characteristic on which people differ.
• Traits and states are actually social constructions, but they are real in the sense that they are useful for classifying and organizing the world, they can be used to understand and predict behavior, and they refer to something in the world that we can measure.
2. Psychological traits and states can be quantified and measured.
• For nominal scales, the number is used as a marker. For the other scales, the numbers become more and more quantitative as you move from ordinal scales (shows ranking only) to interval scales (shows amount, but lacks a true zero point) to ratio scales (shows amount or quantity as we usually understand this concept in mathematics or everyday use of the term).
• Most traits and states measured in education are taken to be at the interval level of measurement.
3. Various approaches to measuring aspects of the same thing can be useful.
• For example, different tests of intelligence tap into somewhat different aspects of the construct of intelligence.
4. Assessment can provide answers to some of life's most momentous questions.

• It is important that the users of assessment tools know when these tools will provide answers to their questions.
5. Assessment can pinpoint phenomena that require further attention or study.
• For example, assessment may identify someone as having dyslexia or low self-esteem or at-risk for drug use.
6. Various sources of data enrich and are part of the assessment process.
• Information from several sources usually should be obtained in order to make an accurate and informed decision. For example, the idea of portfolio assessment is useful.
7. Various sources of error are always part of the assessment process.
• There is no such thing as perfect measurement. All measurement has some error.
• We defined error as the difference between a person's true score and that person's observed score.
• The two main types of error are random error (e.g., error due to transient factors such as being sick or tired) and systematic error (e.g., error present every time the measurement instrument is used such as an essay exam being graded by an overly easy grader). (Later when we discuss reliability and validity, you might note that unreliability is due to random error and lack of validity is due to systematic error.)
8. Tests and other measurement techniques have strengths and weaknesses.
• It is essential that users of tests understand this so that they can use them appropriately and intelligently.
• In this chapter, we will be talking about the two major characteristics: reliability and validity.
9. Test-related behavior predicts non-test-related behavior.
• The goal of testing usually is to predict behavior other than the exact behaviors required while the exam is being taken.
• For example, paper-and-pencil achievement tests given to children are used to say something about their level of achievement.
• Another paper-and-pencil test (also called a self-report test) that is popular in counseling is the MMPI (i.e., the Minnesota Multiphasic Personality Inventory). Clients' scores on this test are used as indicators of the presence or absence of various mental disorders.
• The point here is that the actual mechanics of measurement (e.g., self-reports, behavioral performance, projective) can vary widely and still provide good measurement of educational, psychological, and other types of variables.
10. Present-day behavior sampling predicts future behavior.
• Perhaps the most important reason for giving tests is to predict future behavior.
• Tests provide a sample of present-day behavior. However, this "sample" is used to predict future behavior.
• For example, an employment test given by someone in a Personnel Office may be used as a predictor of future work behavior.

• Another example: the Beck Depression Inventory is used to measure depression and, importantly, to predict test taker's future behavior (e.g., are they a risk to themselves?).
11. Testing and assessment can be conducted in a fair and unbiased manner.
• This requires careful construction of test items and testing of the items on different types of people.
• Test makers always have to be on the alert to make sure tests are fair and unbiased.
• This assumption also requires that the test be administered to those types of people for whom it has been shown to operate properly.
12. Testing and assessment benefit society.
• Many critical decisions are made on the basis of tests (e.g., employability, presence of a psychological disorder, degree of teacher satisfaction, degree of student satisfaction, teacher competency, etc.).
• Without tests, the world would be much more unpredictable.

Identifying A Good Test or Assessment Procedure
As mentioned earlier in the chapter, good measurement is fundamental for research. If we do not have good measurement then we cannot have good research. That's why it's so important to use testing and assessment procedures that are characterized by high reliability and high validity.

Overview of Reliability and Validity
As an introduction to reliability and validity and how they are related, note the following:
• Reliability refers to the consistency or stability of test scores.
• Validity refers to the accuracy of the inferences or interpretations we make from test scores.
• Reliability is a necessary but not sufficient condition for validity (i.e., if you are going to have validity, you must have reliability but reliability in and of itself is not enough to ensure validity).
• Assume you weigh 125 pounds. If you weigh yourself five times and get 135, 134, 134, 135, 136 then your scales are reliable but not valid. The scores were consistent but wrong! Again, you want your scales to be both reliable and valid.

Reliability
Reliability refers to consistency or stability. In psychological and educational testing, it refers to the consistency or stability of the scores that we get from a test or assessment procedure.
• Reliability is usually determined using a correlation coefficient (it is called a reliability coefficient in this context).
• Remember (from chapter two) that a correlation coefficient is a measure of relationship that varies from -1 to 0 to 1 and the farther the number is from zero, the stronger the correlation. For example, minus one (-1.00) indicates a perfect negative correlation, zero indicates no correlation at all, and positive one (+1.00) indicates a perfect positive correlation. Regarding strength, -.85 is stronger than +.55, and +.75 is stronger than +.35.
• When you have a negative correlation, the variables move in opposite directions (e.g., poor diet and life expectancy); when you have a positive correlation, the variables move in the same direction (e.g., education and income).
• With reliability coefficients, we are only interested in positive correlations. Note that zero means no reliability, and +1.00 means perfect reliability.
• Reliability coefficients of .90 or higher are needed to make decisions that have impacts on people's lives.
• Reliability is empirically determined; that is, we must obtain the reliability coefficients of interest to us.
There are four primary ways to measure reliability.
1. The first type of reliability is called test-retest reliability.
• This refers to the consistency of test scores over time.
• It is measured by correlating the test scores obtained at one point in time with the test scores obtained at a later point in time for a group of people.
• The longer the time interval between the two testing occasions, the lower the reliability coefficient tends to be.
2. The second type of reliability is called equivalent-forms reliability.
• This refers to the consistency of test scores obtained on two equivalent forms of a test designed to measure the same thing.
• It is measured by correlating the scores obtained by giving two forms of the same test to a group of people.
• The success of this method hinges on the equivalence of the two forms of the test.
3. The third type of reliability is called internal consistency reliability.
• It refers to the consistency with which the items on a test measure a single construct.
• Internal consistency reliability only requires one administration of the test, which makes it a very convenient form of reliability.
• The measure of internal consistency that we emphasize in the chapter is coefficient alpha. (It is also sometimes called Cronbach's alpha.)
The beauty of coefficient alpha is that it is readily provided by statistical analysis packages and it can be used when test items are quantitative and when they are dichotomous (as in right or wrong). Researchers use coefficient alpha when they want an estimate of the reliability of a homogeneous test (i. When looking at reliability coefficients we are interested in the values ranging from 0 to 1. which involves splitting a test into two equivalent halves and checking the consistency of the scores obtained from the two halves. A primary issue is identifying the appropriate time interval between the two testing occasions.70 or higher are generally considered to be acceptable for research purposes. That is. the clinical uses of tests).. One type of internal consistency reliability is split-half reliability. The second type of reliability is called equivalent forms reliability. a test that measures only one construct or trait) or an estimate of Page 38 of 179 .e. we must check the reliability of test scores with specific sets of people.

4. interpretations. self-efficacy).70) when the items on a test are correlated with one another.g. the higher coefficient alpha will be). For example. • Technically speaking.. To make a decision about content-related evidence. You could have two judges rate one set of papers.. the more items you have on a test. depression. showing the consistency of the two judges’ ratings. There are three main methods of collecting validity evidence. Evidence Based on Internal Structure Some tests are designed to measure one general construct. • • Validity Validity refers to the accuracy of the inferences. Inter-Scorer Reliability refers to the consistency or degree of agreement between two or more scorers. or questions on a test adequately represent the domain of interest. judges. have you included any irrelevant items)? 2. • All of the ways of collecting validity evidence are really forms of what used to be called construct validity. The fourth and last major type of reliability is called inter-scorer reliability. we are always measuring something (e. Expert judgment is used to provide evidence of content validity. or actions made on the basis of test scores. Then you would just correlate their two sets of ratings to obtain the inter-scorer reliability coefficient. you should try to answer these three questions: • Do the items appear to represent the thing you are trying to measure? • Does the set of items underrepresent the construct’s content (i.• the reliability of each dimension on a multidimensional test. IQ. tasks. Evidence Based on Content Content-related evidence is based on a judgment of the degree to which the items. All that means is that in testing and assessment.e. 1.g. Coefficient alpha will be high (e. the Rosenberg SelfPage 39 of 179 .e. You will see it commonly reported in empirical research articles.e. but other tests are designed to measure several components or dimensions of a construct.. age. have you excluded any important content areas or topics)? 
• Do any of the items represent something other than what you are trying to measure (i.. Validation refers to gathering evidence supporting some inference made on the basis of test scores. or raters. gender. But note that the number of items also affects the strength of coefficient alpha (i.. It is the interpretations and actions taken based on the test scores that are valid or invalid. it is incorrect to say that a test is valid or invalid. greater than . This latter point is important because it shows that it is possible to get a large alpha coefficient even when the items are not very homogeneous or internally consistent.

Esteem Scale is a 10 item scale designed to measure the construct of global self-esteem. In contrast, the Harter Self-Esteem Scale is designed to measure global self-esteem as well as several separate dimensions of self-esteem. • The use of the statistical technique called factor analysis tells you the number of dimensions (i.e., factors) that are present. That is, it tells you whether a test is unidimensional (just measures one factor) or multidimensional (i.e., measures two or more dimensions). • When you examine the internal structure of a test, you can also obtain a measure of test homogeneity (i.e., how well the different items measure the construct or trait). • The two primary indices of homogeneity are the item-to-total correlation (i.e., correlate each item with the total test score) and coefficient alpha (discussed earlier under reliability). 3. Evidence Based on Relations to Other Variables This form of evidence is obtained by relating your test scores with one or more relevant criteria. A criterion is the standard or benchmark that you want to predict accurately on the basis of the test scores. Note that when using correlation coefficients for validity evidence we call them validity coefficients. There are several different kinds of relevant validity evidence based on relations to other variables. The first is called criterion-related evidence which is validity evidence based on the extent to which scores from a test can be used to predict or infer performance on some criterion such as a test or future performance. Here are the two types of criterion-related evidence: • Concurrent evidence—validity evidence based on the relationship between test scores and criterion scores obtained at the same time. • Predictive evidence—validity evidence based on the relationship between test scores collected at one point in time and criterion scores obtained at a later time. 
Here are three more types of validity evidence researchers should provide:
• Convergent evidence: validity evidence based on the relationship between the focal test scores and independent measures of the same construct. The idea is that you want your test (the one you are trying to validate) to strongly correlate with other measures of the same thing.
• Divergent evidence: evidence that the scores on your focal test are not highly related to the scores from other tests that are designed to measure theoretically different constructs. This kind of evidence shows that your test is not a measure of those other things (i.e., other constructs).
• Putting the ideas of convergent and divergent evidence together, the point is that to show that a new test measures what it is supposed to measure, you want it to correlate with other measures of that construct (convergent evidence) but you also want it NOT to correlate strongly with measures of other things (divergent evidence). You want your test to overlap with similar tests and to diverge from tests of different things. In short, both convergent and divergent evidence are desirable.
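To make the convergent and divergent ideas concrete, here is a minimal Python sketch. It assumes invented scores for eight people on three measures; the variable names and the numbers are hypothetical and are not taken from the textbook. The validity coefficients are ordinary Pearson correlations.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same eight people on three measures.
focal_test = [12, 15, 9, 20, 17, 11, 14, 18]        # the new test being validated
same_construct = [30, 36, 25, 44, 40, 27, 33, 41]   # established measure of the same construct
other_construct = [7, 3, 6, 5, 8, 4, 6, 5]          # measure of a theoretically different construct

convergent_r = pearson_r(focal_test, same_construct)
divergent_r = pearson_r(focal_test, other_construct)

print(f"convergent validity coefficient: {convergent_r:+.2f}")
print(f"divergent validity coefficient:  {divergent_r:+.2f}")
```

With these made-up numbers the first coefficient is close to +1 and the second is close to zero, which is the pattern you hope to see: overlap with a similar test and divergence from a test of something different.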

Known groups evidence is also useful in demonstrating validity. This is evidence that groups that are known to differ on the construct do differ on the test in the hypothesized direction. For example, if you develop a test of gender roles, you would hypothesize that females will score higher on femininity and males will score higher on masculinity. Then you would test this hypothesis to see if you have evidence of validity.
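As a rough illustration of the known groups idea, the sketch below compares group means on a hypothetical gender-roles test. All scores are invented, and the helper name `mean_difference` is an assumption made for this example. A real study would follow this with an appropriate significance test rather than looking only at the direction of the difference.

```python
from statistics import mean

# Hypothetical scale scores for two known groups on each subscale.
scores = {
    "femininity": {"female": [42, 38, 45, 40, 44], "male": [30, 33, 28, 35, 31]},
    "masculinity": {"female": [29, 32, 27, 31, 30], "male": [41, 39, 44, 37, 43]},
}


def mean_difference(expected_higher, expected_lower):
    """Mean of the group hypothesized to score higher minus the other group's mean."""
    return mean(expected_higher) - mean(expected_lower)


femininity_diff = mean_difference(scores["femininity"]["female"],
                                  scores["femininity"]["male"])
masculinity_diff = mean_difference(scores["masculinity"]["male"],
                                   scores["masculinity"]["female"])

# Both differences should be positive if the groups differ in the hypothesized direction.
in_hypothesized_direction = femininity_diff > 0 and masculinity_diff > 0
print(femininity_diff, masculinity_diff, in_hypothesized_direction)
```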

Now, to summarize these three major methods for obtaining evidence of validity, look again at Table 5.6 (also shown below). Please note that, if you think we have spent a lot of time on validity and measurement, the reason is that validity is so important in empirical research. Remember, without good measurement we end up with GIGO (garbage in, garbage out).

Using Reliability and Validity Information

You must be careful when interpreting the reliability and validity evidence provided with standardized tests and in empirical research journal articles.
• With standardized tests, the reported validity and reliability data are typically based on a norming group (which is an actual group of people). If the people with whom you intend to use a test are very different from those in the norming group, then the validity and reliability evidence provided with the test becomes questionable. Remember that what you need to know is whether a test will work with the people in your classroom or in your research study.
• When reading journal articles, you should view an article positively to the degree that the researchers provide reliability and validity evidence for the measures that they use. Two related questions to ask when reading and evaluating an empirical research article are "Is this research study based on good measurement?" and "Do I believe that these researchers used good measures?" If the answers are yes, then give the article high marks for measurement. If the answers are no, then you should invoke the GIGO principle (garbage in, garbage out).
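If you have item-level data from your own participants, you can estimate reliability for that group instead of relying only on the norming group. The following is a minimal Python sketch under stated assumptions: the responses are invented, the function names are hypothetical, and the alpha computation uses the standard textbook formula (k / (k - 1) times one minus the ratio of summed item variances to total-score variance), which is not spelled out in these notes. The `interpret` helper simply encodes the .70 and .90 guidelines given earlier in the chapter.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def coefficient_alpha(rows):
    """Coefficient (Cronbach's) alpha; each row is one person's item responses."""
    k = len(rows[0])                                   # number of items
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def interpret(coefficient):
    """Apply the rule-of-thumb guidelines from the chapter."""
    if coefficient >= 0.90:
        return "high enough for decisions about individuals"
    if coefficient >= 0.70:
        return "acceptable for research purposes"
    return "below the usual .70 guideline"


# Hypothetical data: six people answering a four-item test (1-5 ratings).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 4],
]
# Hypothetical total scores for the same six people at two points in time.
time1_totals = [18, 9, 13, 19, 6, 14]
time2_totals = [17, 10, 12, 19, 8, 13]

alpha = coefficient_alpha(responses)
retest_r = pearson_r(time1_totals, time2_totals)

print(f"coefficient alpha:       {alpha:.2f} ({interpret(alpha)})")
print(f"test-retest reliability: {retest_r:.2f} ({interpret(retest_r)})")
```

Six people is far too few for a trustworthy estimate; the small data set is only there to keep the arithmetic easy to follow.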

Educational and Psychological Tests

Three primary types of educational and psychological tests are discussed in your textbook: intelligence tests, personality tests, and educational assessment tests.

1) Intelligence Tests
Intelligence has many definitions because a single prototype does not exist. Although far from being a perfect definition, here is our definition: intelligence is the ability to think abstractly and to learn readily from experience.
• Although the construct of intelligence is hard to define, it still has utility because it can be measured and it is related to many other constructs.
For some examples of intelligence tests, click here.

2) Personality Tests
Personality is a construct similar to intelligence in that a single prototype does not exist. Here is our definition: personality is the relatively permanent patterns that characterize and can be used to classify individuals.
• Most personality tests are self-report measures. A self-report measure is a test-taking method in which the participants check or rate the degree to which various characteristics are descriptive of themselves.
• Performance measures of personality are also used. A performance measure is a test-taking method in which the participants perform some real-life behavior that is observed by the researcher.
• Personality has also been measured with projective tests. A projective test is a test-taking method in which the participants provide responses to ambiguous stimuli. The test administrator searches for patterns in participants' responses. Projective tests tend to be quite difficult to interpret and are not commonly used in quantitative research.
For some examples of personality tests, click here.

3) Educational Assessment Tests.

There are four subtypes of educational assessment tests:
• Preschool Assessment Tests. These are typically screening tests because the predictive validity of many of these tests is weak.
• Achievement Tests. These are designed to measure the degree of learning that has taken place after a person has been exposed to a specific learning experience. They can be teacher constructed or standardized tests. For some examples of achievement tests, click here.
• Aptitude Tests. These focus on information acquired through the informal learning that goes on in life. They are often used to predict future performance whereas achievement tests are used to measure current performance.
• Diagnostic Tests. These tests are used to identify the locus of academic difficulties in students.

Sources of Information about Tests

The two most important sources of information about tests are the Mental Measurements Yearbook (MMY) and Tests in Print (TIP). Some additional sources are provided in Table 5.7. Also, some useful internet links are listed in Table 5.8.


Chapter 6 Methods of Data Collection

(Note: For the concept map that goes with this lecture, click here. Remember: concept maps help provide the big picture as well as show how the parts are interrelated.)

The purpose of Chapter 6 is to help you to learn how to collect data for a research project.
• The term method of data collection simply refers to how the researcher obtains the empirical data to be used to answer his or her research questions.
• Once data are collected they are analyzed and interpreted and turned into information and results or findings.
• All empirical research relies on one or more methods of data collection.
• The focus in this chapter is on methods of data collection, not methods of research (which are covered in later chapters).

It is important to consider and utilize the fundamental principle of mixed research during the planning of a research study.
• The principle states that researchers should mix methods (including methods of data collection as well as methods of research) in a way that is likely to provide complementary strengths and nonoverlapping weaknesses.
• We will provide you with additional tables (not in the chapter because of space limitations) for each method of data collection so that you can compare the strengths and weaknesses of each method of data collection and attempt to put together the match that will best serve your purpose and will follow the fundamental principle of mixed research.

There are six major methods of data collection. We will briefly summarize each of these in this lecture:
• Tests (i.e., includes standardized tests that usually include information on reliability, validity, and norms as well as tests constructed by researchers for specific purposes, skills tests, etc.).
• Questionnaires (i.e., self-report instruments).
• Interviews (i.e., situations where the researcher interviews the participants).
• Focus groups (i.e., a small group discussion with a group moderator present to keep the discussion focused).
• Observation (i.e., looking at what people actually do).
• Existing or Secondary data (i.e., using data that are originally collected and then archived or any other kind of "data" that was simply left behind at an earlier time for some other purpose).

Tests

Tests are commonly used in research to measure personality, aptitude, achievement, and performance. The last chapter discussed standardized tests; therefore, we only have a brief discussion in this chapter.
• In addition to the tests discussed in the last chapter, note that sometimes a researcher must develop a new test to measure the specific knowledge, skills, behavior, or cognitive activity that is being studied. For example, a researcher might need to measure response time to a memory task using a mechanical apparatus or develop a test to measure a specific mental or cognitive activity (which obviously cannot be directly observed).
• Remember that if a test has already been developed that purports to measure what you want to measure, then you should strongly consider using it rather than developing a new one.
• We listed the major internet sources for finding tests in Table 5.8.
• We list the major sources of tests and test reviews in Table 5.7.
• An excellent source of tests (and other measures) (that we didn't get into the chapter in time) is called The Directory of Unpublished Experimental Mental Measures (2003) edited by Goldman and Mitchell, published by the American Psychological Association.

Note that tests can also be used to complement other measures (following the fundamental principle of mixed research).

The following table lists the strengths and weaknesses of tests. It, in conjunction with the tables for the other five major methods of data collection, will help you in applying the fundamental principle of mixed research:

Strengths and Weaknesses of Tests

Strengths of tests (especially standardized tests)
• Can provide measures of many characteristics of people.
• Often standardized (i.e., the same stimulus is provided to all participants).
• Allows comparability of common measures across research populations.
• Strong psychometric properties (high measurement validity).
• Availability of reference group data.
• Many tests can be administered to groups, which saves time.
• Can provide "hard," quantitative data.
• Tests are usually already developed.
• A wide range of tests is available (most content can be tapped).
• Response rate is high for group administered tests.
• Ease of data analysis because of quantitative nature of data.

Weaknesses of tests (especially standardized tests)
• Can be expensive if test must be purchased for each research participant.
• Reactive effects such as social desirability can occur.
• Test may not be appropriate for a local or unique population.
• Open-ended questions and probing not available.
• Tests are sometimes biased against certain groups of people.
• Nonresponse to selected items on the test.
• Some tests lack psychometric data.

Questionnaires

A questionnaire is a self-report data collection instrument that is filled out by research participants. Questionnaires are usually paper-and-pencil instruments, but they can also be placed on the web for participants to go to and "fill out." Questionnaires are sometimes called survey instruments, which is fine, but the actual questionnaire should not be called "the survey." The word "survey" refers to the process of using a questionnaire or interview protocol to collect data. For example, you might do a survey of teacher attitudes about inclusion; the instrument of data collection should be called the questionnaire or the survey instrument.
• A questionnaire is composed of questions and/or statements.
• Because one way to learn to write questionnaires is to look at other questionnaires, here is an example of a typical questionnaire that has mostly quantitative items, click here.
• For an example of a qualitative questionnaire, click here.
• When developing a questionnaire make sure that you follow the 15 Principles of Questionnaire Construction. I will briefly review the 15 principles now.

Principle 1: Make sure the questionnaire items match your research objectives.

Principle 2: Understand your research participants.
• Your participants (not you!) will be filling out the questionnaire.
• Consider the demographic and cultural characteristics of your potential participants so that you can make it understandable to them.

Principle 3: Use natural and familiar language.
• Familiar language is comforting; jargon is not.

Principle 4: Write items that are clear, precise, and relatively short.
• If your participants don't understand the items, your data will be invalid (i.e., your research study will have the garbage in, garbage out, GIGO, syndrome).
• Short items are more easily understood and less stressful than long items.

Principle 5: Do not use "leading" or "loaded" questions.
• Leading questions lead the participant to where you want him or her to be.
• Loaded questions include loaded words (i.e., words that create an emotional reaction or response by your participants).
• Always remember that you do not want the participant's response to be the result of how you worded the question. Always use neutral wording.

Principle 6: Avoid double-barreled questions.
• A double-barreled question combines two or more issues in a single question (e.g., here is a double-barreled question: "Do you elicit information from parents and other teachers?" It's double-barreled because if someone answered it, you would not know whether they were referring to parents or teachers or both).

• Does the question include the word "and"? If yes, it might be a double-barreled question.
• Answers to double-barreled questions are ambiguous because two or more ideas are confounded.

Principle 7: Avoid double negatives.
• Does the answer provided by the participant require combining two negatives? (e.g., "I disagree that teachers should not be required to supervise their students during library time"). If yes, rewrite it.

Principle 8: Determine whether an open-ended or a closed-ended question is needed.
• Open-ended questions provide qualitative data in the participants' own words. Here is an open-ended question:
How can your principal improve the morale at your school? _______________________________________________
• Closed-ended questions provide quantitative data based on the researcher's response categories. Here is an example of a closed-ended question:
• Open-ended questions are common in exploratory research and closed-ended questions are common in confirmatory research.

Principle 9: Use mutually exclusive and exhaustive response categories for closed-ended questions.
• Mutually exclusive categories do not overlap (e.g., ages 0-10, 10-20, 20-30 are NOT mutually exclusive and should be rewritten as less than 10, 10-19, 20-29, 30-39, ...).
• Exhaustive categories include all possible responses (e.g., if you are doing a national survey of adult citizens (i.e., 18 or older) then these categories (18-19, 20-29, 30-39, 40-49, 50-59, 60-69) are NOT exhaustive because there is nowhere to put someone who is 70 years old or older).

Principle 10: Consider the different types of response categories available for closed-ended questionnaire items.
• Rating scales are the most commonly used, including:
o Numerical rating scales (where the endpoints are anchored; sometimes the center point or area is also labeled).
    1          2     3     4     5     6     7
    Very Low                                 Very High

o Fully anchored rating scales (where all the points on the scale are anchored).
    1 Strongly Agree   2 Agree   3 Neutral   4 Disagree   5 Strongly Disagree
    1 Strongly Agree   2 Agree   3 Disagree   4 Strongly Disagree
o Omitting the center point on a rating scale (e.g., using a 4-point rather than a 5-point rating scale) does not appreciably affect the response pattern. Some researchers prefer 5-point rating scales; other researchers prefer 4-point rating scales. Both generally work well.
o You should use somewhere from four to eleven points on your rating scale. Personally, I like the 4 and 5-point scales because all of the points are easily anchored.
o I do not recommend a 1 to 10 scale because too many respondents mistakenly view the 5 as the center point. If you want to use a wide scale like this, use a 0 to 10 scale (where the 5 is the middle point) and label the 5 with the anchor "medium" or some other appropriate anchor.
• Rankings (i.e., where participants put their responses into rank order, such as most important, second most important, and third most important).
• Semantic differential (i.e., where one item stem and multiple scales, that are anchored with polar opposites or antonyms, are included and are rated by the participants).
• Checklists (i.e., where participants "check all of the responses in a list that apply to them").

Principle 11: Use multiple items to measure abstract constructs.
• This is required if you want your measures to have high reliability and validity.
• One approach is to use a summated rating scale (such as the Rosenberg Self-Esteem Scale that is composed of 10 items, with each item measuring self-esteem).
• Another name for a summated rating scale is a Likert Scale because the summated rating scale was pretty much invented by the famous social psychologist named Rensis Likert.
• Here is the Rosenberg Self-Esteem Scale, which is a summated rating scale:

Principle 12: Consider using multiple methods when measuring abstract constructs.
• The idea here is that if you only use one method of measurement, then your measurement may be an artifact of that method of measurement.
• On the other hand, if you use two or more methods of measurement you will be able to see whether the answers depend on the method (i.e., are the answers corroborated across the methods of measurement or do you get different answers for the different methods?). For example, you might measure students' self-esteem via the Rosenberg Scale just shown (which is used in a self-report form) as well as using teachers' ratings of the students' self-esteem; you might even want to observe the students in situations that should provide indications of high and low self-esteem.
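Principles 11 and 12 can be illustrated together with a short Python sketch. It assumes five hypothetical students who each answered ten self-report items on a 1-4 scale, already coded so that a higher number means higher self-esteem, plus an invented teacher rating for each student. The data and names are made up for illustration and are not the actual Rosenberg items or scoring key.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Method 1: self-report. Each list holds one student's answers to ten items.
self_report_items = [
    [4, 4, 3, 4, 4, 3, 4, 4, 3, 4],
    [2, 1, 2, 2, 1, 2, 2, 1, 2, 2],
    [3, 3, 3, 2, 3, 3, 3, 3, 2, 3],
    [1, 2, 1, 1, 2, 1, 1, 1, 2, 1],
    [3, 4, 3, 3, 4, 3, 4, 3, 3, 4],
]
# Method 2: a teacher's 1-5 rating of the same five students' self-esteem.
teacher_ratings = [5, 2, 4, 1, 4]

# A summated rating scale score is simply the sum of the item responses.
summated_scores = [sum(items) for items in self_report_items]

# If the two methods corroborate each other, the correlation should be
# clearly positive.
cross_method_r = pearson_r(summated_scores, teacher_ratings)

print(summated_scores)
print(f"self-report vs. teacher rating: r = {cross_method_r:.2f}")
```

A high positive correlation here suggests the scores are not just an artifact of the self-report method; a low one would be a signal to look more closely at one or both measures.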

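The checks described under Principle 9 are mechanical enough to automate. The sketch below is one possible way to do it, assuming response categories are whole-number ranges; the function name `check_categories` and the upper limit of 100 used for ages are assumptions made for this example.

```python
def check_categories(categories, lowest, highest):
    """Find overlaps and gaps in a set of numeric response categories.

    categories: list of (low, high) inclusive ranges; high=None means "or older".
    lowest, highest: the range of whole-number answers that should be covered.
    Returns (overlaps, gaps): values that fall in more than one category
    (not mutually exclusive) and values that fall in none (not exhaustive).
    """
    overlaps, gaps = [], []
    for value in range(lowest, highest + 1):
        hits = sum(
            1 for low, high in categories
            if low <= value and (high is None or value <= high)
        )
        if hits > 1:
            overlaps.append(value)
        elif hits == 0:
            gaps.append(value)
    return overlaps, gaps


# The overlapping age categories from Principle 9: 0-10, 10-20, 20-30.
bad_overlaps, _ = check_categories([(0, 10), (10, 20), (20, 30)], 0, 30)

# The adult-survey categories that stop at 60-69 (ages 18 to 100 assumed).
adult = [(18, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]
_, adult_gaps = check_categories(adult, 18, 100)

# Adding a "70 or older" category makes the set exhaustive.
fixed = check_categories(adult + [(70, None)], 18, 100)

print(bad_overlaps)                   # ages that fit two categories
print(adult_gaps[0], adult_gaps[-1])  # first and last uncovered ages
print(fixed)
```

An empty pair of lists means the categories are both mutually exclusive and exhaustive over the range you checked.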
Principle 13: Use caution if you reverse the wording in some of the items to prevent response sets. (A response set is the tendency of a participant to respond in a specific direction to items regardless of the item content.)
• Reversing the wording of some items can help ensure that participants don't just "speed through" the instrument, checking "yes" or "strongly agree" for all the items.
• On the other hand, you may want to avoid reverse wording if it creates a double negative.
• Also, recent research suggests that the use of reverse wording reduces the reliability and validity of scales. Therefore, you should generally use reverse wording sparingly, if at all.

Principle 14: Develop a questionnaire that is easy for the participant to use.
• The participant must not get confused or lost anywhere in the questionnaire.
• Make sure that the directions are clear and that any filter questions used are easy to follow.

Principle 15: Always pilot test your questionnaire.
• You will always find some problems that you have overlooked!
• The best pilot tests are with people similar to the ones to be included in your research study.
• After pilot testing your questionnaire, revise it and pilot test it again, until it works correctly.

The following table lists the strengths and weaknesses of questionnaires. It, in conjunction with the tables for the other five major methods of data collection, will help you in applying the fundamental principle of mixed research:

Strengths and Weaknesses of Questionnaires

Strengths of questionnaires
• Good for measuring attitudes and eliciting other content from research participants.
• Inexpensive (especially mail questionnaires and group administered questionnaires).
• Can provide information about participants' internal meanings and ways of thinking.
• Can administer to probability samples.
• Quick turnaround.
• Can be administered to groups.
• Perceived anonymity by respondent may be high.
• Moderately high measurement validity (i.e., high reliability and validity) for well constructed and validated questionnaires.
• Closed-ended items can provide exact information needed by researcher.
• Open-ended items can provide detailed information in respondents' own words.
• Ease of data analysis for closed-ended items.
• Useful for exploration as well as confirmation.

Weaknesses of questionnaires
• Usually must be kept short.

• Reactive effects may occur (e.g., respondents may try to show only what is socially desirable).
• Nonresponse to selected items.
• People filling out questionnaires may not recall important information and may lack self-awareness.
• Response rate may be low for mail and email questionnaires.
• Open-ended items may reflect differences in verbal ability, obscuring the issues of interest.
• Data analysis can be time consuming for open-ended items.
• Measures need validation.

Interviews

In an interview, the interviewer asks the interviewee questions (in-person or over the telephone).
• Trust and rapport are important.
• Probing is available (unlike in paper-and-pencil questionnaires) and is used to reach clarity or gain additional information.
• Here are some examples of standard probes:
- Anything else?
- Any other reason?
- What do you mean?

Interviews may be quantitative or qualitative.

Quantitative interviews:
• Are standardized (i.e., the same information is provided to everyone).
• Use closed-ended questions.
• Exhibit 6.3 has an example of an interview protocol. Note that it looks very much like a questionnaire! The key difference between an interview protocol and a questionnaire is that the interview protocol is read by the interviewer who also records the answers (you have probably participated in telephone surveys before...you were interviewed).

Qualitative interviews
• They are based on open-ended questions.
• There are three types of qualitative interviews.

1) Informal Conversational Interview.
- It is spontaneous.
- It is loosely structured (i.e., no interview protocol is used).

2) Interview Guide Approach.
• It is more structured than the informal conversational interview.
• It includes an interview protocol listing the open-ended questions.
• The questions can be asked in any order by the interviewer.

• Question wording can be changed by the interviewer if it is deemed appropriate.

3) Standardized Open-Ended Interview.
• Open-ended questions are written on an interview protocol, and they are asked in the exact order given on the protocol.
• The wording of the questions cannot be changed.

The following table lists the strengths and weaknesses of interviews. It, in conjunction with the tables for the other five major methods of data collection, will help you in applying the fundamental principle of mixed research:

Strengths and Weaknesses of Interviews

Strengths of interviews
• Good for measuring attitudes and most other content of interest.
• Allows probing and posing of follow-up questions by the interviewer.
• Can provide in-depth information.
• Can provide information about participants' internal meanings and ways of thinking.
• Closed-ended interviews provide exact information needed by researcher.
• Telephone and e-mail interviews provide very quick turnaround.
• Moderately high measurement validity (i.e., high reliability and validity) for well constructed and tested interview protocols.
• Can use with probability samples.
• Relatively high response rates are often attainable.
• Useful for exploration as well as confirmation.

Weaknesses of interviews
• In-person interviews usually are expensive and time consuming.
• Reactive effects (e.g., interviewees may try to show only what is socially desirable).
• Investigator effects may occur (e.g., untrained interviewers may distort data because of personal biases and poor interviewing skills).
• Interviewees may not recall important information and may lack self-awareness.
• Perceived anonymity by respondents may be low.
• Data analysis can be time consuming for open-ended items.
• Measures need validation.

Focus Groups

A focus group is a situation where a focus group moderator keeps a small and homogeneous group (of 6-12 people) focused on the discussion of a research topic or issue.
• Focus group sessions generally last between one and three hours and they are recorded using audio and/or videotapes.
• Focus groups are useful for exploring ideas and obtaining in-depth information about how people think about an issue.

The following table lists the strengths and weaknesses of focus groups. It, in conjunction with the tables for the other five major methods of data collection, will help you in applying the fundamental principle of mixed research:

Strengths and Weaknesses of Focus Groups

Strengths of focus groups
• Useful for exploring ideas and concepts.
• Provides window into participants' internal thinking.
• Can obtain in-depth information.
• Can examine how participants react to each other.
• Allows probing.
• Most content can be tapped.
• Allows quick turnaround.

Weaknesses of focus groups
• Sometimes expensive.
• May be difficult to find a focus group moderator with good facilitative and rapport building skills.
• Reactive and investigator effects may occur if participants feel they are being watched or studied.
• May be dominated by one or two participants.
• Difficult to generalize results if small, unrepresentative samples of participants are used.
• May include large amount of extra or unnecessary information.
• Measurement validity may be low.
• Usually should not be the only data collection method used in a study.
• Data analysis can be time consuming because of the open-ended nature of the data.

Observation

In the method of data collection called observation, the researcher observes participants in natural and/or structured environments.
• It is important to collect observational data (in addition to attitudinal data) because what people say is not always what they do!

Observation can be carried out in two types of environments:
• Laboratory observation (which is done in a lab set up by the researcher).
• Naturalistic observation (which is done in real-world settings).

There are two important forms of observation: quantitative observation and qualitative observation.

1) Quantitative observation involves standardization procedures, and it produces quantitative data.
• The following can be standardized:
- Who is observed.
- What is observed.
- When the observations are to take place.
- Where the observations are to take place.
- How the observations are to take place.
• Standardized instruments (e.g., checklists) are often used in quantitative observation.

• Sampling procedures are also often used in quantitative observation:
- Time-interval sampling (i.e., observing during time intervals, e.g., during the first minute of each 10 minute interval).
- Event sampling (i.e., observing after an event has taken place, e.g., observing after the teacher asks a question).

2) Qualitative observation is exploratory and open-ended, and the researcher takes extensive field notes.

The qualitative observer may take on four different roles that make up a continuum:
• Complete Participant (i.e., becoming a full member of the group and not informing the participants that you are studying them).
• Participant-as-Observer (i.e., spending extensive time "inside" and informing the participants that you are studying them).
• Observer-as-Participant (i.e., spending a limited amount of time "inside" and informing them that you are studying them).
• Complete Observer (i.e., observing from the "outside" and not informing the participants that you are studying them).

The following table lists the strengths and weaknesses of observational data. Together with the tables for the other five major methods of data collection, it will help you apply the fundamental principle of mixed research:

Strengths and Weaknesses of Observational Data

Strengths of observational data
• Allows one to directly see what people do without having to rely on what they say they do.
• Provides firsthand experience, especially if the observer participates in activities.
• Can provide relatively objective measurement of behavior (especially for standardized observations).
• Observer can determine what does not occur.
• Observer may see things that escape the awareness of people in the setting.
• Excellent way to discover what is occurring in a setting.
• Helps in understanding importance of contextual factors.
• Can be used with participants with weak verbal skills.
• May provide information on things people would otherwise be unwilling to talk about.
• Observer may move beyond selective perceptions of people in the setting.
• Good for description.
• Provides moderate degree of realism (when done outside of the laboratory).

Weaknesses of observational data
• Reasons for observed behavior may be unclear.
• Reactive effects may occur when respondents know they are being observed (e.g., people being observed may behave in atypical ways).
• Investigator effects (e.g., personal biases and selective perception of observers).
• Observer may "go native" (i.e., over-identifying with the group being studied).

• Sampling of observed people and settings may be limited.
• Cannot observe large or dispersed populations.
• Some settings and content of interest cannot be observed.
• Collection of unimportant material may be moderately high.
• More expensive to conduct than questionnaires and tests.
• Data analysis can be time consuming.

Secondary/Existing Data

Secondary data (i.e., data originally used for a different purpose) are contrasted with primary data (i.e., original data collected for the new research study). The most commonly used secondary data are documents, physical data, and archived research data.

1. Documents. There are two main kinds of documents.
• Personal documents (i.e., things written or recorded for private purposes). Examples: letters, diaries, family pictures.
• Official documents (i.e., things written or recorded for public or private organizations). Examples: newspapers, annual reports, yearbooks, minutes.

2. Physical data (i.e., any material thing created or left by humans that might provide information about a phenomenon of interest to a researcher).

3. Archived research data (i.e., research data collected by other researchers for other purposes; these data are often saved on tape or CD so that others might later use them).

The following table lists the strengths and weaknesses of secondary/existing data. Together with the tables for the other five major methods of data collection, it will help you apply the fundamental principle of mixed research:

Strengths and Weaknesses of Secondary Data

Strengths of documents and physical data:
• Can provide insight into what people think and what they do.
• Unobtrusive, making reactive and investigator effects very unlikely.
• Can be collected for time periods occurring in the past (e.g., historical data).
• Provides useful background and historical data on people, groups, and organizations.
• Useful for corroboration.
• Grounded in local setting.
• Useful for exploration.

Strengths of archived research data:
• Archived research data are available on a wide variety of topics.
• Inexpensive.
• Often are reliable and valid (high measurement validity).
• Can study trends.

• Ease of data analysis.
• Often based on high quality or large probability samples.

Weaknesses of documents and physical data:
• May be incomplete.
• May be representative only of one perspective.
• Access to some types of content is limited.
• May not provide insight into participants' personal thinking for physical data.
• May not apply to general populations.

Weaknesses of archived research data:
• May not be available for the population of interest to you.
• May not be available for the research questions of interest to you.
• Data may be dated.
• Open-ended or qualitative data usually not available.
• Many of the most important findings have already been mined from the data.

Chapter 7 Sampling

(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)

The purpose of Chapter 7 is to help you to learn about sampling in quantitative and qualitative research. In other words, you will learn how participants are selected to be part of empirical research studies.

Sampling refers to drawing a sample (a subset) from a population (the full set).
• The usual goal in sampling is to produce a representative sample (i.e., a sample that is similar to the population on all characteristics, except that it includes fewer people because it is a sample rather than the complete population).
• Metaphorically, a perfect representative sample would be a "mirror image" of the population from which it was selected (again, except that it would include fewer people).

Terminology Used in Sampling

Here are some important terms used in sampling:
• A sample is a set of elements taken from a larger population.
• The sample is a subset of the population, which is the full set of elements or people or whatever you are sampling.
• A statistic is a numerical characteristic of a sample, but a parameter is a numerical characteristic of a population.
• Sampling error refers to the difference between the value of a sample statistic, such as the sample mean, and the true value of the population parameter, such as the population mean. Note: some error is always present in sampling. With random sampling methods, the error is random rather than systematic.
• The response rate is the percentage of people in the sample selected for the study who actually participate in the study.
• A sampling frame is just a list of all the people that are in the population. The chapter's example sampling frame is a numbered list of all the names in my population; it also includes information on age and gender in case you want to draw some samples and do some calculations.
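To make the terminology concrete, here is a minimal Python sketch. The sampling frame is entirely made up (the names, ages, and genders are hypothetical, not the frame from the book), and the number of people who agree to participate is an assumed figure. It shows a parameter, a statistic, the sampling error between them, and a response rate.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: a numbered list of everyone in the population,
# with age and gender attached (all values are made up for illustration).
sampling_frame = [
    {"id": i, "name": f"Person {i}", "age": rng.randint(18, 65),
     "gender": rng.choice(["F", "M"])}
    for i in range(1, 61)
]

# Parameter: a numerical characteristic of the population.
population_mean_age = statistics.mean(p["age"] for p in sampling_frame)

# Statistic: the same characteristic computed on a sample of 10 people.
sample = rng.sample(sampling_frame, 10)
sample_mean_age = statistics.mean(p["age"] for p in sample)

# Sampling error: sample statistic minus the true population parameter.
sampling_error = sample_mean_age - population_mean_age

# Response rate: percentage of the selected sample who actually participate.
participated = 8  # assumed number of people who agreed to take part
response_rate = 100 * participated / len(sample)

print(f"population mean age (parameter): {population_mean_age:.2f}")
print(f"sample mean age (statistic):     {sample_mean_age:.2f}")
print(f"sampling error:                  {sampling_error:+.2f}")
print(f"response rate:                   {response_rate:.0f}%")
```

Changing the seed gives a different sample and a different sampling error, which illustrates the point that with random sampling the error is random rather than systematic.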

Random Sampling Techniques

The two major types of sampling in quantitative research are random sampling and nonrandom sampling.
• The former produces representative samples.
• The latter does not produce representative samples.

Simple Random Sampling

The first type of random sampling is called simple random sampling.
• It's the most basic type of random sampling.
• It is an equal probability sampling method (which is abbreviated by EPSEM).
• Remember that EPSEM means "everyone in the sampling frame has an equal chance of being in the final sample."
• You should understand that using an EPSEM is important because that is what produces "representative" samples (i.e., samples that represent the populations from which they were selected)!

You will see below that simple random samples are not the only equal probability sampling method (EPSEM). Simple random sampling is, however, the most basic and well known.
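As an illustration of what "equal probability" means, the following Python sketch draws many simple random samples (without replacement) from a small hypothetical frame of 20 numbered people and counts how often each person is selected. The frame size, sample size, and number of repetitions are arbitrary choices for the demonstration.

```python
import random
from collections import Counter

def simple_random_sample(frame, n, rng):
    """Draw n elements without replacement; each element is equally likely."""
    return rng.sample(frame, n)

rng = random.Random(42)       # fixed seed so the sketch is repeatable
frame = list(range(1, 21))    # hypothetical frame: people numbered 1 to 20
n = 5                         # desired sample size

# Repeat the draw many times and count how often each person is chosen.
draws = 20000
counts = Counter()
for _ in range(draws):
    counts.update(simple_random_sample(frame, n, rng))

# Under an EPSEM, each person's chance of selection is n / N.
expected = n / len(frame)
print(f"expected inclusion probability: {expected:.3f}")
for person in frame:
    print(person, round(counts[person] / draws, 3))
```

Each person's observed selection rate should sit close to n/N, apart from chance fluctuation.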

• Sampling experts recommend random sampling "without replacement" rather than random sampling "with replacement" because the former is a little more efficient in producing representative samples (i.e., it requires slightly fewer people and is therefore a little cheaper).

"How do you draw a simple random sample?"
• One way is to put all the names from your population into a hat and then select a subset (e.g., pull out 100 names from the hat).
• In the chapter we demonstrate the use of a table of random numbers.
• These days, researchers often use computer programs to randomly select their samples.
• To use a computer program (called a random number generator) you must make sure that you give each of the people in your population a number. Then the program will give you a list of randomly selected numbers within the range you give it. After getting the random numbers, you identify the people with those randomly selected numbers and try to get them to participate in your research study!
• If you decide to use a table of random numbers such as the one shown on page 201 of the book, here's what you need to do. First, pick a place to start, and then move in one direction (e.g., move down the columns). Use the number of digits in the table that is appropriate for your population size (e.g., if there are 2500 people in the population then use 4 digits). Once you get the set of randomly selected numbers, find out who those people are and try to get them to participate in your research study. Also, if you get the same number twice, just ignore it and move on to the next number.

Systematic Sampling

Systematic sampling is the second type of random sampling.
• It is an equal probability sampling method (EPSEM).
• Remember simple random sampling was also an EPSEM.

Systematic sampling involves three steps:
• First, determine the sampling interval, which is symbolized by "k" (it is the population size divided by the desired sample size).
• Second, randomly select a number between 1 and k, and include that person in your sample.
• Third, also include each kth element in your sample. For example, if k is 10 and your randomly selected number between 1 and 10 was 5, then you will select persons 5, 15, 25, 35, 45, etc.
• When you get to the end of your sampling frame you will have all the people to be included in your sample.
• One potential (but rarely occurring) problem is called periodicity (i.e., there is a cyclical pattern in the sampling frame). It could occur when you attach several ordered lists to one another (e.g., if you took lists from multiple teachers who had all ordered their lists on some variable such as IQ). On the other hand, stratification within one overall list is not a problem at all (e.g., if you have one list and have it ordered by gender, or by IQ). Basically, if you are attaching multiple lists to one another, there could be a problem. It would be better to reorganize the lists into one overall list (i.e., sampling frame).

Stratified Random Sampling

The third type of random sampling is called stratified random sampling.
• First, stratify your sampling frame (e.g., divide it into the males and the females if you are using gender as your stratification variable).
• Second, take a random sample from each group (i.e., take a random sample of males and a random sample of females). Put these two sets of people together and you now have your final sample. (Note that you could also take a systematic sample from the joined lists if that's easier.)

There are actually two different types of stratified sampling.

The first type of stratified sampling, and most common, is called proportional stratified sampling.
• In proportional stratified sampling you must make sure the subsamples (e.g., the samples of males and females) are proportional to their sizes in the population.
• Note that proportional stratified sampling is an equal probability sampling method (i.e., it is EPSEM), which is good!

The second type of stratified sampling is called disproportional stratified sampling.
• In disproportional stratified sampling, the subsamples are not proportional to their sizes in the population.

Here is an example showing the difference between proportional and disproportional stratified sampling:
• Assume that your population is 75% female and 25% male. Assume also that you want a sample of size 100 and you want to stratify on the variable called gender.
• For proportional stratified sampling, you would randomly select 75 females and 25 males from the population.
• For disproportional stratified sampling, you might randomly select 50 females and 50 males from the population.

Cluster Random Sampling

In this type of sampling you randomly select clusters rather than individual type units in the first stage of sampling.
• A cluster has more than one unit in it (e.g., a school, a classroom, a team).

We discuss two types of cluster sampling in the chapter, one-stage and two-stage (note that more stages are possible in multistage sampling but are left for books on sampling).

The first type of cluster sampling is called one-stage cluster sampling.
• To select a one-stage cluster sample, you first select a random sample of clusters.

• Then you include in your final sample all of the individual units that are in the selected clusters.

The second type of cluster sampling is called two-stage cluster sampling.
• In the first stage you take a random sample of clusters (i.e., just like you did in one-stage cluster sampling).
• In the second stage, you take a random sample of elements from each of the clusters you selected in stage one (e.g., in stage two you might randomly select 10 students from each of the 15 classrooms you selected in stage one).

Important points about cluster sampling:
• Cluster sampling is an equal probability sampling method (EPSEM) ONLY if the clusters are approximately the same size. (Remember that EPSEM is very important because that is what produces representative samples.)
• When clusters are not the same size, you must fix the problem by using the technique called "probability proportional to size" (PPS) for selecting your clusters in stage one. This will make your cluster sampling an equal probability sampling method (EPSEM), and it will, therefore, produce representative samples.

Nonrandom Sampling Techniques

The other major type of sampling used in quantitative research is nonrandom sampling (i.e., when you do not use one of the random sampling techniques). There are four main types of nonrandom sampling:
• The first type of nonrandom sampling is called convenience sampling (i.e., it simply involves using the people who are the most available or the most easily selected to be in your research study).
• The second type of nonrandom sampling is called quota sampling (i.e., it involves setting quotas and then using convenience sampling to obtain those quotas). A set of quotas might be given to you as follows: find 25 African American males, 25 European American males, 25 African American females, and 25 European American females. You use convenience sampling to actually find the people, but you must make sure you have the right number of people for each quota.
• The third type of nonrandom sampling is called purposive sampling (i.e., the researcher specifies the characteristics of the population of interest and then locates individuals who match those characteristics). For example, you might decide that you want to only include "boys who are in the 7th grade and have been diagnosed with ADHD" in your research study. You would then try to find 50 students who meet your "inclusion criteria" and include them in your research study.
• The fourth type of nonrandom sampling is called snowball sampling (i.e., each research participant is asked to identify other potential research participants who have a certain characteristic). You start with one or a few participants, ask them for more, find those, ask them for some, and continue until you have a sufficient sample size. This technique might be used for a hard to find population (e.g., where no sampling frame exists). For example, you might want to use snowball sampling if you wanted to do a study of people in your city who have a lot of power in the area of educational policy making (in addition to the already known positions of power, such as the school board and the school system superintendent).

Random Selection and Random Assignment

In random selection (using an equal probability selection method), you select a sample from a population using one of the random sampling techniques discussed earlier.
• The resulting random sample will be like a "mirror image" of the population, except for chance differences.
• For example, if you randomly select (e.g., using simple random sampling) 1000 people from the adult population in Ann Arbor, Michigan, the sample will look like the adult population of Ann Arbor.

In random assignment, you start with a set of people (you already have a sample, which very well may be a convenience sample), and then you randomly divide that set of people into two or more groups (i.e., you take the full set and randomly divide it into subsets).
• You are taking a set of people and "assigning" them to two or more groups.
• The groups or subsets will be "mirror images" of each other (except for chance differences).
• For example, if you start with a convenience sample of 100 people and randomly assign them to two groups of 50 people, the two groups will be "equivalent" on all known and unknown variables.
• Random assignment generates similar groups, and it is used in the strongest of the experimental research designs.
• You can also use a randomizer program for random assignment.

Determining the Sample Size When Random Sampling is Used

Would you like to know the answer to the question "How big should my sample be?" I will start with my four "simple" answers to your question:
• Try to get as big of a sample as you can for your study (i.e., because the bigger the sample the better).
• If your population is size 100 or less, then include the whole population rather than taking a sample (i.e., don't take a sample; include the whole population).
• Look at other studies in the research literature and see how many they are selecting.
• For an exact number, just look at Figure 7.5, which shows recommended sample sizes.
• There are many sample size calculators on the web but they generally require you to learn a little bit of statistics first. I'll list more when we get to the chapter on statistics.

I want to make a few more points about sample size in this chapter. In particular, note that you will need larger samples under these circumstances:
• When the population is very heterogeneous.
• When you want to break down the data into multiple categories.
• When you want a relatively narrow confidence interval (e.g., note that the estimate that 75% of teachers support a policy plus or minus 4% is more narrow than the estimate of 75% plus or minus 5%).

• When you expect a weak relationship or a small effect.
• When you use a less efficient technique of random sampling (e.g., cluster sampling is less efficient than proportional stratified sampling).
• When you expect to have a low response rate. The response rate is the percentage of people in your sample who agree to be in your study.

Sampling in Qualitative Research

Sampling in qualitative research is usually purposive (see the above discussion of purposive sampling). The primary goal in qualitative research is to select information rich cases.

There are several specific purposive sampling techniques that are used in qualitative research:
• Maximum variation sampling (i.e., you select a wide range of cases).
• Homogeneous sample selection (i.e., you select a small and homogeneous case or set of cases for intensive study).
• Extreme case sampling (i.e., you select cases that represent the extremes on some dimension).
• Typical-case sampling (i.e., you select typical or average cases).
• Critical-case sampling (i.e., you select cases that are known to be very important).
• Negative-case sampling (i.e., you purposively select cases that disconfirm your generalizations, so that you can make sure that you are not just selectively finding cases to support your personal theory).
• Opportunistic sampling (i.e., you select useful cases as the opportunity arises).
• Mixed purposeful sampling (i.e., you can mix the sampling strategies we have discussed into more complex designs tailored to your specific needs).
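Looking back at the random sampling techniques covered earlier in this chapter, the Python sketch below may help tie the procedures together. It is only an illustration under stated assumptions: the sampling frame of 1000 people (75% female, 25% male, ordered by gender), the 40 classrooms of 25 students, and all function names are hypothetical. It shows systematic sampling with interval k, proportional and disproportional stratified sampling, two-stage cluster sampling, and random assignment of a convenience sample to two groups.

```python
import random

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 1000 numbered people, 75% female, 25% male,
# kept in one overall list ordered by gender.
frame = [{"id": i, "gender": "F" if i <= 750 else "M"} for i in range(1, 1001)]

def systematic_sample(frame, n, rng):
    """Pick a random start between 1 and k, then take every kth element."""
    k = len(frame) // n          # sampling interval
    start = rng.randint(1, k)    # random start (1-based position)
    return frame[start - 1::k][:n]

def stratified_sample(frame, key, sizes, rng):
    """Take a simple random sample of the given size from each stratum."""
    sample = []
    for stratum, size in sizes.items():
        members = [p for p in frame if p[key] == stratum]
        sample.extend(rng.sample(members, size))
    return sample

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage one: sample clusters. Stage two: sample units within each."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [u for c in chosen for u in rng.sample(clusters[c], n_per_cluster)]

def random_assignment(people, rng):
    """Randomly divide an existing set of people into two equal groups."""
    shuffled = people[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

systematic = systematic_sample(frame, 100, rng)

# Population is 75% female, 25% male; desired sample size is 100.
proportional = stratified_sample(frame, "gender", {"F": 75, "M": 25}, rng)
disproportional = stratified_sample(frame, "gender", {"F": 50, "M": 50}, rng)

# Hypothetical clusters: 40 classrooms with 25 students each.
classrooms = {
    f"class_{c}": [f"class_{c}_student_{s}" for s in range(1, 26)]
    for c in range(1, 41)
}
cluster_sample = two_stage_cluster_sample(classrooms, 15, 10, rng)

# Random assignment starts from a sample you already have
# (here, a stand-in convenience sample of the first 100 people).
group_a, group_b = random_assignment(frame[:100], rng)

print(len(systematic), len(proportional), len(disproportional))
print(len(cluster_sample), len(group_a), len(group_b))
```

Because the hypothetical frame is one overall list ordered by gender, the systematic sample ends up stratified on gender automatically, which is consistent with the point that stratification within one overall list is not a problem. The classrooms here are all the same size; unequal clusters would call for PPS selection in stage one, which this sketch does not implement.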

Chapter 8 Validity of Research Results

(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)

In this chapter we discuss validity issues for quantitative research and for qualitative research.

Validity Issues in the Design of Quantitative Research

On page 228 we make a distinction between an extraneous variable and a confounding variable.
• An extraneous variable is a variable that MAY compete with the independent variable in explaining the outcome of a study.
• A confounding variable (also called a third variable) is an extraneous variable that DOES cause a problem because we know that it DOES have a relationship with the independent and dependent variables. A confounding variable is a variable that systematically varies with (or influences) the independent variable and also influences the dependent variable.
• When you design a research study in which you want to make a statement about cause and effect, you must think about what extraneous variables are probably confounding variables and do something about it.
• We gave an example of "The Pepsi Challenge" (on p. 228) and showed that anything that varies with the presentation of Coke or Pepsi is an extraneous variable that may confound the relationship (i.e., it may also be a confounding variable). For example, perhaps people are more likely to pick Pepsi over Coke if different letters are placed on the Pepsi and Coke cups (e.g., if Pepsi is served in cups with the letter "M" and Coke is served in cups with the letter "Q"). If this is true then the variable of cup letter (M versus Q) is a confounding variable.
• In short, we must always worry about extraneous variables (especially confounding variables) when we are interested in conducting research that will allow us to make a conclusion about cause and effect.
• There are four major types of validity in quantitative research: statistical conclusion validity, internal validity, construct validity, and external validity. We will discuss each of these in this lecture.

Statistical Conclusion Validity

Statistical conclusion validity refers to the ability to make an accurate assessment about whether the independent and dependent variables are related and about the strength of that relationship. So the two key questions here are 1) Are the variables related? and 2) How strong is the relationship?
• Typically, null hypothesis significance testing (discussed in Chapter 16) is used to determine whether two variables are related in the population from which the study data were selected. This procedure will tell you whether a relationship is statistically significant or not.
• For now, just remember that a relationship is said to be statistically significant when we do NOT believe that it is nothing but a chance occurrence, and a relationship is not statistically significant when the null hypothesis testing procedure says that any observed relationship is probably nothing more than normal sampling error or fluctuation.
• To determine how STRONG a relationship is, researchers use what are called effect size indicators. There are many different effect size indicators, but they all tell you how strong a relationship is.
• For now remember that the first key question (Are the variables related?) is answered using null hypothesis significance testing, and the second key question (How strong is the relationship?) is answered using an effect size indicator.
• The concepts of significance testing and effect size indicators are explained in Chapter 16.

Internal Validity

When I hear the term "internal validity" the word cause always comes into my mind. That's because internal validity is defined as the "approximate validity with which we infer that a relationship between two variables is causal" (Cook and Campbell, 1979, p. 37).
• A good synonym for the term internal validity is causal validity because that is what internal validity is all about.
• If you can show that you have high internal validity (i.e., high causal validity) then you can conclude that you have strong evidence of causality; however, if you have low internal validity then you must conclude that you have little or no evidence of causality.

Types of Causal Relationships

There are two different types of causal relationships: causal description and causal explanation.
• Causal description involves describing the consequences of manipulating an independent variable.
• In general, causal description involves showing that changes in variable X (the IV) cause changes in variable Y (the DV): X---->Y
• Causal explanation involves more than just causal description. Causal explanation involves explaining the mechanisms through which and the conditions under which a causal relationship holds. This involves the inclusion (in your research study) of mediating or intervening variables and moderator variables. Mediating and moderator variables are defined in Chapter Two in Table 2.2 (on page 36).

Criteria for Inferring Causation

There are three main conditions that are always required if you want to make a claim that changes in one variable cause changes in another variable. We call these the three necessary conditions for causality.
• These three conditions are summarized in Table 11.1.

• If you want to conclude that X causes Y you must make sure that the three necessary conditions above are met. It is also helpful if you have a theoretical rationale explaining the causal relationship.
• For example, there is a correlation between coffee drinking and likelihood of having a heart attack. One big problem with concluding that coffee drinking causes heart attacks is that cigarette smoking is related to both of these variables (i.e., we have a Condition 3 problem). In particular, people who drink little coffee are less likely to smoke cigarettes than are people who drink a lot of coffee. Therefore, perhaps the observed relationship between coffee drinking and heart attacks is the result of the extraneous variable of smoking. The researcher would have to "control for" smoking in order to determine if this rival explanation accounts for the original relationship.

Threats to Internal Validity

In this section, we discuss several threats to internal validity that have been identified by research methodologists (especially by Campbell and Stanley, 1963).
• These threats to internal validity usually call into question the third necessary condition for causality (i.e., the "lack of alternative explanation condition").

Before discussing the specific threats, I want you to get the basic idea of two weak designs in your head.
• The first weak design is the one-group pretest-posttest design, which is depicted like this:

O X O

In this design, a group is pretested, then a treatment is administered, and then the people are posttested. For example, you could measure your students' understanding of history at the beginning of the term, then teach them history for the term, and then measure them again on their understanding of history at the end of the term.

• The second weak design to remember for this chapter is called the posttest-only design with nonequivalent groups. In this lecture, I will also refer to this design as a two-group design and sometimes as a multigroup design (since it has more than one group).

X(Treatment)   O2
---------------------
X(Control)     O2

In this design, there is no pretest, one group gets the treatment and the other group gets no treatment or some different treatment, and both groups are posttested (e.g., you teach two classes history for a quarter and measure their understanding at the end for comparison). Furthermore, the groups are found wherever they already exist (i.e., participants are not randomly assigned to these groups).
• In comparing the two designs just mentioned, note that the comparison in the one-group design is the participants' pretest scores with their posttest scores. The comparison in the two-group design is between the two groups' posttest scores.
• Some researchers like to call the point of comparison the "counterfactual." In the one-group pretest-posttest design shown above the counterfactual is the pretest. In the two-group design shown above the counterfactual is the posttest of the control group.
• Remember this key point: In each of the multigroup research designs (designs that include more than one group of participants), you want the different groups to be the same on all extraneous variables and different ONLY on the independent variable (e.g., such that one group gets the treatment and the other group does not). In other words, you want the only systematic difference between the groups to be exposure to the independent variable.

The first threat to internal validity is called ambiguous temporal precedence.
• Ambiguous temporal precedence is defined as the inability of the researcher (based on the data) to specify which variable is the cause and which variable is the effect.
• If this threat is present then you are unable to meet the second of the three necessary conditions shown in Table 11.1. That is, you cannot establish proper time order so you cannot make a conclusion of cause and effect.

The second threat to internal validity is called the history threat.
• The history threat refers to any event, other than the planned treatment event, that occurs between the pretest and posttest measurement and has an influence on the dependent variable.
• In short, if both a treatment and a history effect occur between the pretest and the posttest, you will not know whether the observed difference between the pretest and the posttest is due to the treatment or due to the history event. These two events are confounded.
• For example, the principal may come into the experimental classroom during the research study, which alters the outcome.
• The history effect is a threat for the one-group design but it is not a threat for the multigroup design.

• You probably want to know why this is true. Well, in the one-group design (shown above) you take as your measure of the effect of the treatment the difference in the pretest and posttest scores. In this case, all or part of the difference could be due to a history effect; therefore, you don't know whether the change in scores is due to the treatment or to the history effect. They are confounded.
• The basic history effect is not a threat to the two-group design (shown above) because now you are comparing the treatment group to a comparison group, and as long as the history effect occurs for both groups the difference between the two groups will not be because of a history effect. The two groups do differ on exposure to the treatment (i.e., one group gets the treatment and the other group does not).

The third threat to internal validity is called maturation.
• Maturation is present when a physical or mental change occurs over time and it affects the participants' performance on the dependent variable.
• For example, if you measure first grade students' ability to perform arithmetic problems at the beginning of the year and again at the end of the year, some of their improvement will probably be due to their natural maturation (and not just due to what you have taught them during the year).
• Therefore in the one-group design, you will not know if their improvement is due to the teacher or if it is due to maturation.
• Maturation is not a threat in the two-group design because as long as the people in both groups mature at the same rate, the difference between the two groups will not be due to maturation.
If you are following this logic about why these first two threats to internal validity are a problem for the one-group design but not for the two-group design then you have one of the major points of this chapter. This same logic is going to apply to the next three threats of testing, instrumentation, and regression artifacts.

The fourth threat to internal validity is called testing.
• Testing refers to any change on the second administration of a test as a result of having previously taken the test.
• For example, let's say that you have a treatment that you believe will cause students to reduce racial stereotyping. You use the one-group design and you have your participants take a pretest and posttest measuring their agreement with certain racial stereotypes. The problem is that perhaps their scores on the posttest are the result of being sensitized to the issue of racial stereotypes because they took a pretest.
• Therefore in the one-group design, you will not know if their improvement from pretest to posttest is due to your treatment or if it is due to a testing effect.
• Testing is not a threat in the two-group design because as long as the people in both groups are affected equally by the pretest, the difference between the two groups will not be due to testing.

The fifth threat to internal validity is called instrumentation.
• Instrumentation refers to any change that occurs in the way the dependent variable is measured in the research study.

• For example, let's say that one person does your pretest assessment of students' racial stereotyping but you have a different person do your posttest assessment of students' stereotyping. Also assume that the second person tends to overlook much stereotyping but that the first person picks up on all stereotyping. The problem is that perhaps much of the positive gain occurring from the pretest to the posttest is due to the posttest assessment not picking up on the use of stereotyping.
• Therefore in the one-group design, you will not know if their improvement from pretest to posttest is due to your treatment for reducing stereotyping or if it is due to an instrumentation effect.
• Instrumentation is not a threat in the two-group design because as long as the people in both groups are affected equally by the instrumentation effect, the difference between the two groups will not be due to instrumentation.

The sixth threat to internal validity is called regression artifacts (or regression to the mean).
• Regression artifacts refers to the tendency of very high pretest scores to become lower and for very low pretest scores to become higher on posttesting.
• For example, let's say that you select people who have extremely high scores on your racial stereotyping test. Some of these scores are probably artificially high because of transient factors and a lack of perfect reliability. Therefore, if stereotyping goes down from pretest to posttest, some or all of the change may be due to a regression artifact.
• Therefore, in the one-group design you will not know if improvement from pretest to posttest is due to your treatment or if it is due to a regression artifact.
• Regression artifacts is not a threat in the two-group design because as long as the people in both groups are affected equally by the statistical regression effect, the difference between the two groups will not be due to regression to the mean.
• You should always be on the lookout for regression to the mean when you select participants based on extreme (very high or very low) test scores.

The seventh threat to internal validity is called differential selection.
• Differential selection only applies to multigroup designs. It refers to selecting participants for the various groups in your study that have different characteristics.
• Table 8.1 lists a few of the many characteristics on which the students in the different groups may differ (e.g., age, anxiety, gender, intelligence, reading ability, etc.).
• Remember, we want our groups to be the same on all variables except the treatment variable. That is, the treatment variable is the only variable that we want to be systematically different for the groups.
• As an example, assume that you select two classes for your study on reducing racial stereotyping. You use two fifth grade classes as your groups. One group will get your treatment and the other will act as a control. The problem is that these two groups of students may differ on variables other than your treatment variable and any differences found at the posttest may be due to these "differential selection" differences rather than being due to your treatment.
• Looking at the definition again, you can see that selection is defined for two or multigroup designs. It is not relevant to the internal validity of the single group design.
• Unlike the previous five threats, selection is not an internal validity problem for the one-group design but it is a problem for the two or multigroup design.
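Regression to the mean, discussed above as the sixth threat, can be seen in a small simulation. The following is only a sketch under stated assumptions: every score is a stable "true" score plus independent random error, no treatment at all is given, and the function name, score scale, and 10 percent cutoff are invented for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def simulate_regression(n=10_000, top_fraction=0.10):
    """Return (pretest mean, posttest mean) for people selected for extreme pretest scores."""
    # Assumption: each person has a stable true score, and each testing adds independent error.
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    pretest = [t + rng.gauss(0, 10) for t in true_scores]
    posttest = [t + rng.gauss(0, 10) for t in true_scores]  # no treatment is applied

    # Select only the people with the highest pretest scores.
    cutoff = sorted(pretest)[int(n * (1 - top_fraction))]
    selected = [i for i in range(n) if pretest[i] >= cutoff]

    return mean(pretest[i] for i in selected), mean(posttest[i] for i in selected)


pre_mean, post_mean = simulate_regression()
print("selected group pretest mean: ", round(pre_mean, 1))
print("selected group posttest mean:", round(post_mean, 1))
```

Even though nothing was done to these simulated participants, the selected group's posttest mean falls back toward the overall average. In a one-group design that drop could easily be mistaken for a treatment effect.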

The eighth threat to internal validity is called differential attrition (it is also sometimes called mortality).
• Attrition simply refers to participants dropping out of your research study.
• Differential attrition is the differential loss of participants from the various comparison groups. (Notice the word differential in the definition.)
• Just like the last threat, differential attrition is a problem for two or multigroup designs but not for the single group design.
• For example, assume again that you are doing a study on racial stereotyping. Do you see how your result would be compromised if the kind of children that are most likely to have racial stereotypes drop out of one of your groups but not the other group? Obviously, the difference observed at the posttest may now be the result of differential attrition.

The ninth threat to internal validity is actually a set of threats. This set is called additive and interactive effects.
• Additive and interactive effects refers to the fact that the threats to validity can combine to produce a bias in the study which threatens our ability to conclude that the independent variable is the cause of differences between groups on the dependent variable.
• These threats occur when the different comparison groups are affected differently (or differentially) by one of the earlier threats to internal validity (i.e., history, maturation, testing, instrumentation, or statistical regression).
• They only apply to two or multigroup designs; they do not apply to the one-group design.
• A selection-history effect occurs when an event occurring between the pretest and posttest differentially affects the different comparison groups. You can think of this as what could be called a differential history effect.
• A selection-maturation effect occurs if the groups mature at different rates. For example, first grade students may tend to naturally change in reading ability during the school year more than third grade students. Hence, part of any observed differences in the reading ability of the two groups at the posttest may be due to maturation. You can think of this as what could be called a differential maturation effect.
• You now should be able to construct similar examples demonstrating the following:
• Selection-testing effect (where testing affects the groups differently)
• Selection-instrumentation effect (where instrumentation occurs differentially)
• Selection-regression artifacts effect (where regression to the mean occurs differentially).
• Remember that the key for the selection effects is that the groups must be affected differently by the particular threat to internal validity.

External Validity
External validity has to do with the degree to which the results of a study can be generalized to and across populations of persons, settings, times, outcomes, and treatment variations.
• A good synonym for external validity is generalizing validity because it always has to do with how well you can generalize research results.

• The major types of external validity are population validity, ecological validity, temporal validity, treatment variation validity, and outcome validity. I will discuss each of these now.

Population Validity
The first type of external validity is called population validity.
• Population validity is the ability to generalize the study results to individuals who were not included in the study.
• The issues are how well you can generalize your sample results to a population, and how well you can generalize your sample results across the different kinds of people in the larger population.
• Generalizing from a sample to a population can be provided through random selection techniques (i.e., a good sample lets you generalize to a population, as you learned in the earlier chapter on sampling).
• Generalizing across populations is present when the result (e.g., the effectiveness of a particular teaching technique) works across many different kinds of people (it works for many subpopulations). This is the issue of "how widely does the finding apply?" If the finding applied to every single individual in the population then it would have full population validity. Research results that apply broadly are welcome to practitioners because it makes their jobs easier.
• Both of these two kinds of population validity are important; however, some methodologists (such as Cook and Campbell) are more concerned about generalizing across populations. That is, they want to know how widely a finding applies.

Ecological Validity
Ecological validity is present to the degree that a result generalizes across different settings.
• For example, let's say that you find that a new teaching technique works in urban schools. You might also want to know if the same technique works in rural schools and suburban schools. That is, you would want to know if the technique works across different settings.
• Reactivity is a threat to ecological validity. Reactivity is defined as an alteration in performance that occurs as a result of being aware of participating in a study. In other words, reactivity occurs sometimes because research study participants might change their performance because they know they are being observed.
• Reactivity is a problem of ecological validity because the results might only generalize to other people who are also being observed.
• A good metaphor for reactivity comes from television. Once you know that the camera is turned on to YOU, you might shift into your "television" behavior. This can also happen in research studies with human participants who know that they are being observed.
• Another threat to ecological validity (not mentioned in the chapter) is called experimenter effects. This threat occurs when participants alter their performance because of some unintentional behavior or characteristics of the researcher. Researchers should be aware of this problem and do their best to prevent it from happening.

Temporal Validity
Temporal validity is the extent to which the study results can be generalized across time.
• For example, assume you find that a certain discipline technique works well with many different kinds of children and in many different settings. After many years, you might note that it is not working any more. You will need to conduct additional research to make sure that the technique is robust over time, and if not to figure out why and to find out what works better.
• Likewise, findings from far in the past often need to be replicated to make sure that they still work.

Treatment Variation Validity
Treatment variation validity is the degree to which one can generalize the results of the study across variations of the treatment.
• For example, if the treatment is varied a little, will the results be similar?
• One reason this is important is because when an intervention is administered by practitioners in the field, it is unlikely that the intervention will be administered exactly as it was by the original researchers.
• This is, by the way, one reason that interventions that have been shown to work end up failing when they are broadly applied in the field.

Outcome Validity
Outcome validity is the degree to which one can generalize the results of a study across different but related dependent variables.
• For example, if a study shows a positive effect on self-esteem, will it also show a positive effect on the related construct of self-efficacy?
• A good way to understand the outcome validity of your research study is to include several outcome measures so that you can get a more complete picture of the overall effect of the treatment or intervention.

Here is a brief summary of external validity:
• Population validity = generalizing to and across populations.
• Ecological validity = generalizing across settings.
• Temporal validity = generalizing across time.
• Treatment variation validity = generalizing across variations of the treatment.
• Outcome validity = generalizing across related dependent variables.
As you can see, all of the forms of external validity concern the degree to which you can make generalizations.

Construct Representation
Educational researchers must measure or represent many different constructs (e.g., intelligence, ADHD, types of on-line instruction, academic achievement).
• The problem is that, usually, there is no single behavior or operation available that can provide a complete and perfect representation of the construct.

• The researcher should always clearly specify (in the research report) the way the construct was represented so that a reader of the report can understand what was done and be able to evaluate the quality of the measure(s).
• Operationalism refers to the process of representing a construct by a specific set of operations or measures.
• For example, you might choose to represent (or "operationalize") the construct of self-esteem by using the ten-item Rosenberg Self-Esteem Scale shown on page 165.
• Why do you think Rosenberg used 10 items to represent self-esteem? The reason is that it would be very hard to tap into this construct with a single item.

• Rosenberg used what is called multiple operationalism (i.e., the use of several measures to represent a construct).
• Think about it like this: Would you want to use a single item to measure intelligence (e.g., how do you spell the word "restaurant")? No! You might even decide to use more than one test of intelligence to tap into the different dimensions of intelligence.
• Whenever you read a research report, be sure to check out how they represent their constructs. Then you can evaluate the quality of their representations or "operationalizations."

Research Validity in Qualitative Research
Now we shift our attention to qualitative research! If you need a review of qualitative research, review pages 45-48 in Chapter 2 for a quick overview. Also look at the qualitative research article in Appendix B titled "You Don't Have to Be Sighted to Be a Scientist, Do You? Issues and Outcomes in Science Education."
Now I will briefly discuss the major types of validity in qualitative research, and I will list some very important and effective strategies that can be used to help you obtain high qualitative research validity or trustworthiness.
• One potential threat to watch out for is researcher bias (i.e., searching out and finding or confirming only what you want or expect to find).
• Two strategies for reducing researcher bias are reflexivity (constantly thinking about your potential biases and how you can minimize their effects) and negative-case sampling (attempting to locate and examine cases that disconfirm your expectations).

Descriptive validity
Descriptive validity is present to the degree that the account reported by the researcher is accurate and factual.
• One very useful strategy for obtaining descriptive validity is investigator triangulation (i.e., the use of multiple investigators to collect and interpret the data).
• When you have agreement among the investigators about the descriptive details of the account, readers can place more faith in that account.

Interpretive validity
Interpretive validity is present to the degree that the researcher accurately portrays the meanings given by the participants to what is being studied.
• Your goal here is to "get into the heads" of your participants and accurately document their viewpoints and meanings.
• One useful strategy for obtaining interpretive validity is obtaining participant feedback or "member checking" (i.e., discussing your findings with your participants to see if they agree and making modifications so that you represent their meanings and ways of thinking).
• Another useful strategy is to use low-inference descriptors in your report (i.e., description phrased very close to the participants' accounts and the researcher's field notes).

Theoretical validity
Theoretical validity is present to the degree that a theoretical explanation provided by the researcher fits the data.
• I listed four helpful strategies for this type of validity.
• The first strategy is extended fieldwork (collecting data in the field over an extended period of time).
• The second is theory triangulation (using multiple theories and perspectives to help you interpret the data).
• The third is pattern matching (making unique or complex predictions and seeing if they occur; that is, did the fingerprint that you predicted actually occur?).
• The fourth strategy is peer review (discussing your interpretations and conclusions with your peers or colleagues who are not as deep into the study as you are).

Internal validity
Internal validity is the same as it was for quantitative research. It is the degree to which a researcher is justified in concluding that an observed relationship is causal. It also refers to whether you can conclude that one event caused another event. The issue of causal validity is important if the qualitative researcher is interested in making any tentative statements about cause and effect.
• I have listed three strategies to use if you are interested in cause and effect in qualitative research.
• The first strategy is called researcher-as-detective (carefully thinking about cause and effect and examining each possible "clue" and then drawing a conclusion).
• The second is called methods triangulation (using multiple methods, such as interviews, questionnaires, and observations in investigating an issue).
• The third strategy is called data triangulation (using multiple data sources, such as interviews with different types of people or using observations in different settings). You do not want to limit yourself to a single data source.

External validity
External validity is pretty much the same as it was for quantitative research. That is, it is still the degree to which you can generalize your results to other people, settings, and times.
• Note that generalizing has traditionally not been a priority of qualitative researchers. However, in many research areas today, it is becoming an important goal.
• One form of generalizing in qualitative research is called naturalistic generalization (generalizing based on similarity).
• When you make a naturalistic generalization, you look at your students or clients and generalize to the degree that they are similar to the students or clients in the qualitative research study you are reading. In other words, the reader of the report is making the generalizations rather than the researchers who produced the report.
• Qualitative researchers should provide the details necessary so that readers will be in the position to make naturalistic generalizations.
• Another way to generalize qualitative research findings is through replication. This is where you are able to generalize when a research result has been shown with different sets of people, at different times, and in different settings.

• Yet another style of generalizing is theoretical generalization (generalizing the theory that is based on a qualitative study, such as a grounded theory research study). Even if the particulars do not generalize, the main ideas and the process observed might generalize.
Here is a summary of the strategies used in qualitative research. (Note: they are also used in mixed research and can be used creatively even in quantitative research.)
The bottom line of this chapter is this: You should always try to evaluate the research validity of empirical studies before trusting their conclusions. And, if you are conducting research you must use validity strategies if your research is going to be trustworthy and defensible.

Chapter 9 Experimental Research
(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)
In this chapter we talk about what experiments are, we talk about how to control for extraneous variables, and we talk about two sets of experimental designs (weak designs and strong designs). (Note: In the next chapter we will talk about middle of the road experimental designs; they are better than the weak designs discussed in this chapter, and they are not as good as the strong designs discussed in this chapter. The middle of the road, or medium quality, designs are called quasi-experimental designs.)
It is important for you to remember that whenever an experimental research study is conducted the researcher's interest is always in determining cause and effect.
• The causal variable is the independent variable (IV) and the effect or outcome variable is the dependent variable (DV).
• Experimental research allows us to identify causal relationships because we observe the result of systematically changing one or more variables under controlled conditions. This process is called manipulation.

The Experiment
Here is our definition of an experiment: The experiment is a situation in which a researcher objectively observes phenomena which are made to occur in a strictly controlled situation where one or more variables are varied and the others are kept constant.
• This means that we observe a person's response to a set of conditions that the experimenter presents.
• The observations are made in an environment in which all conditions other than the ones the researcher presents are kept constant or controlled.
• The conditions which the researcher presents are systematically varied to see if a person's responses change with the variation in these conditions.

Independent Variable Manipulation
The independent variable is the variable that is assumed to be the cause of the effect. It is the variable that the researcher varies or manipulates in a specific way in order to learn its impact on the outcome variable.

Ways of Manipulating the Independent Variable
In Figure 9.1 (on page 266) you can see three different ways to manipulate the independent variable.

• First, the independent variable can be manipulated by presenting a condition or treatment to one group of individuals and withholding the condition or treatment from another group of individuals. This is the presence or absence technique.
• Second, the independent variable can be manipulated by varying the amount of a condition or variable, such as varying the amount of a drug which is given to children with a learning disorder. This is the amount technique.
• A third way of manipulating the independent variable is to vary the type of the condition or treatment administered. One type of drug may be administered to one group of learning disabled children and another type of drug may be administered to another group of learning disabled children. This is the type technique.

Control of Confounding Variables

Potential confounding variables can be controlled for by using one or more of a variety of techniques that eliminate the differential influence an extraneous variable may have for the comparison groups in a research study.
• Differential influence occurs when the influence of an extraneous variable is different for the various comparison groups.
• For example, if one group is mostly females and the other group is mostly males, then gender may have a differential effect on the outcome. As a result, you will not know whether the outcome is due to the treatment or due to the effect of gender.
• If the comparison groups are the same on all extraneous variables at the start of the experiment, then differential influence is unlikely to occur.
• In experiments, we want our groups to be the same (or "equivalent") on all potentially confounding extraneous variables. The control techniques are essentially attempts to make the groups similar or equivalent.
Remember this important point: You want all of your comparison groups to be similar to each other (on all characteristics or variables) at the start of an experiment. Then, after manipulating the independent variable you will be better able to attribute the difference observed at the posttest to the independent variable because one group got a treatment and the other group did not.
• You want the only systematic difference between the groups in an experiment to be the variation of the independent variable. You want the groups to be the same on all other variables (i.e., the same on extraneous or confounding variables).
Now we will discuss these six techniques that are used to control for confounding variables: random assignment, matching, holding the extraneous variable constant, building the extraneous variable into the research design, counterbalancing, and analysis of covariance.

Random Assignment
Random assignment is the most important technique that can be used to control confounding variables because it has the ability to control for both known and unknown confounding extraneous variables. Because of this characteristic, you should randomly assign whenever and wherever possible.
• Random assignment makes the groups similar on all variables at the start of the experiment.
• If random assignment is successful, the groups will be mirror images of each other.
You must be careful not to confuse random assignment with random selection! The two techniques differ in purpose. (Note: I strongly recommend that you re-read the section titled Random Selection and Random Assignment on pages 216-217; it is only three paragraphs long, but will help you with this very important distinction!)
• The purpose of random selection is to generate a sample that represents a larger population. This topic was covered in our earlier chapter on Sampling (Chapter 7).
• The purpose of random assignment is to take a sample (usually a convenience sample) and use the process of randomization to divide it into two or more groups that represent each other. In other words, you use random assignment to create probabilistically "equivalent" groups.
• Note that random selection (randomly selecting a sample from a population) helps ensure external validity, and random assignment (randomly dividing a set of people into multiple groups) helps ensure internal validity.
• Because the primary goal in experimental research is to establish firm evidence of cause and effect, random assignment is more important than random selection in experimental research. If that is counterintuitive to you, then please reread it as many times as is necessary.
Random assignment controls for the problem of differential influence (that was discussed earlier). It does this by ensuring that each participant has an equal chance of being assigned to each comparison group.
• The equal probability of assignment means that not only are participants equally likely to be assigned to each comparison group but that the characteristics they bring with them are also equally likely to be assigned to each comparison group.
• This means that the research participants and their characteristics should be distributed approximately equally in all comparison groups!
• Again, random assignment is the best way to create equivalent groups for use in experimental research. That is, random assignment eliminates the problem of differential influence by making the groups similar on all extraneous variables.
One way to carry out random assignment was included in the first edition of our textbook (that figure is not reproduced here).
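As a minimal sketch of the idea that each participant has an equal chance of ending up in each comparison group, the Python below shuffles a list of participants and deals them out into groups. The function name randomly_assign and the participant numbers are hypothetical, chosen only for illustration; this is not the procedure from the textbook figure.

```python
import random


def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle the participants, then deal them into n_groups like a deck of cards."""
    rng = random.Random(seed)  # pass a seed only if you need a repeatable assignment
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Group 0 gets positions 0, n_groups, 2*n_groups, ...; group 1 gets 1, n_groups+1, ...
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical convenience sample of 20 participants, identified by number.
treatment_group, control_group = randomly_assign(range(1, 21), n_groups=2)
print("treatment:", sorted(treatment_group))
print("control:  ", sorted(control_group))
```

Because the shuffle ignores everything about the participants, any characteristic they bring with them (known or unknown) is equally likely to land in either group. With small samples the groups can still differ by chance, which is why the equivalence is only probabilistic.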


Another way to conduct random assignment is to assign each person in your sample a number and then use a random assignment computer program. Here is one: http://www.graphpad.com/quickcalcs/randomize1.cfm

Matching
Matching controls for confounding extraneous variables by equating the comparison groups on one or more variables that are correlated with the dependent variable.
• What you have to do is to decide what extraneous variables you want to match on (i.e., decide what specific variables you want to make your groups similar on). These variables that you decide to use are called the matching variables.
• Matching controls for the matching variables. That is, it eliminates any differential influence of the matching variables.
• You can match your groups on one or more extraneous variables.
• For example, let’s say that you decide to equate your two groups (treatment and control group) on IQ. That is, IQ is going to be your only matching variable. What you would do is to rank order all of the participants on IQ. Then select the first two (i.e., the two people with the two highest IQs) and put one in the experimental treatment group and the other in the control group. (The best way to do this is to use random assignment to make these assignments. If you do this then you have actually merged two control techniques: matching and random assignment.) Then take the next two highest IQ participants and assign one to the experimental group and one to the control group. Then just continue this process until you assign one of the lowest IQ participants to one group and the other lowest IQ participant to the other group. Once you have completed this, your two groups will be matched on IQ! If you use matching without random assignment, you run into the problem that although you know that your groups are matched on IQ you have not matched them on other potentially important variables.
• A weakness of matching when it is used alone (i.e., without also using random assignment) is that you will know that the groups are equated on the matching variable(s) but you will not know whether the groups are similar on other potentially confounding variables.
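The IQ matching procedure described above (rank order, take people two at a time, and let chance decide who goes where within each pair) can be sketched as follows. This is only an illustration under assumed data: the names, the IQ scores, and the function name matched_random_assignment are hypothetical.

```python
import random
from statistics import mean


def matched_random_assignment(iq_by_person, seed=None):
    """Match on IQ by ranking, then randomly assign within each adjacent pair."""
    rng = random.Random(seed)
    # Rank order all participants from highest to lowest IQ.
    ranked = sorted(iq_by_person, key=iq_by_person.get, reverse=True)
    experimental, control = [], []
    # Walk down the ranking two people at a time
    # (if the sample size is odd, the last person is left unassigned).
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment within the matched pair
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control


# Hypothetical IQ scores for eight participants.
iq_scores = {"Ana": 130, "Ben": 128, "Cal": 121, "Dee": 119,
             "Eli": 110, "Fay": 108, "Gus": 99, "Hal": 97}

experimental_group, control_group = matched_random_assignment(iq_scores, seed=1)
print("Experimental:", experimental_group,
      "mean IQ =", mean(iq_scores[p] for p in experimental_group))
print("Control:     ", control_group,
      "mean IQ =", mean(iq_scores[p] for p in control_group))
```

The two group means come out close because each pair contributes one member to each group. The shuffle inside each pair is what combines matching with random assignment.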

Holding the Extraneous Variable Constant
This technique controls for confounding extraneous variables by ensuring that the participants in the different treatment groups have the same amount or type of a variable.
• For example, you might use only people who have an IQ of 120-125 in your research study if you are worried about IQ as being a confounding variable.
• If you are worried about gender, then if you used this technique you would either study females only or males only, but not both.
• A problem with this technique is that it can seriously limit your ability to generalize your study results (because you have limited your participants to only one type).

Building the Extraneous Variable into the Research Design
This technique takes a confounding extraneous variable and makes it an additional independent variable in your research study.
• For example, you might decide to include females and males in your research study.
• This technique is especially useful when you want to study any effect that the potentially confounding extraneous variable might have (i.e., you will be able to study the effect of your original independent variable as well as the additional variable(s) that you built into your design).

Counterbalancing
Counterbalancing is a technique used to control for sequencing effects (the two sequencing effects are order effects and carry-over effects).
• Note that this technique is only relevant for a design in which the participants receive more than one treatment condition (e.g., the repeated measures design that is discussed later in the chapter).
• Sequencing effects are biasing effects that can occur when each participant must participate in each experimental treatment condition.
• Order effects are sequencing effects that arise from the order in which the treatments are administered. For example, as people complete their participation in their first treatment condition they will become more familiar with the setting and testing process. When these people participate, later, in their second treatment condition, they may perform better simply because of the familiarity with the setting and testing that they acquired earlier. This is how the order can have an effect on the outcome.
• Carry-over effects are sequencing effects that occur when the effect of one treatment condition carries over to a second treatment condition. That is, participants’ performance in a later treatment is different because of the treatment that occurred prior to it. When this occurs the responses in subsequent treatment conditions are a function of the present treatment condition as well as any lingering effect of the prior treatment condition. Learning from the earlier treatment might carry over to later treatments. Physical conditions caused by the earlier treatment might also carry over if the time elapsing between the treatments is not long enough for the earlier effect to dissipate.
• Here is the good news! Counterbalancing is a control technique that can be used to control for order effects and carry-over effects.

• You counterbalance by administering each experimental treatment condition to all groups of participants, but you do it in different orders for different groups of people.
• For example, if you just had two treatment conditions making up your independent variable you could counterbalance by dividing your sample into two groups and giving this order to the first group (treatment one followed by treatment two) and giving this order to the second group (treatment two followed by treatment one).

Analysis of Covariance
Analysis of covariance (ANCOVA) is a statistical control technique that is used to statistically equate groups that differ on a pretest or some other variable.
• For example, in a learning research study you might want to control for intelligence because if there are more bright students in one of two comparison groups (and these students are expected to learn faster) then the difference between the groups might be because the groups differ on IQ rather than the treatment variable; therefore, you would want to control for intelligence.
• Analysis of covariance statistically adjusts the dependent variable scores for the differences that exist on an extraneous variable (your control variable).
• When selecting variables to control for, note that the only relevant extraneous variables are those that also affect participants' responses to the dependent variable.
• As another example, in multigroup designs that have a pretest, ANCOVA is used to equate the groups on the pretest.

Experimental Research Designs
A research design is the outline, plan, or strategy that you are going to use to obtain an answer to your research question. Research designs can be weak or strong (or quasi, which are moderately strong; that is, in between the weak and the strong designs) depending on the extent to which they control for the influence of confounding variables.

Weak Experimental Research Designs
Some research designs are considered weak because they do not control for the influence of many confounding variables. The one-group posttest-only design is a very weak research design where one group of research participants receives an experimental treatment and is then posttested on the dependent variable.
• A serious problem with this design is that you do not know whether the treatment condition had any effect on the participants because you have no idea as to what their response would be if they were not exposed to the treatment condition. That is, you don’t have a pretest or a control group to make your comparison with.
• Another problem with this design is that you do not know if some confounding extraneous variable affected the participants' responses to the dependent variable.

• Because of the problems with this design it generally gives little evidence as to the effect of the treatment condition.
The next design is the one-group pretest-posttest design. Here is a depiction of it:
The one-group pretest-posttest design is a research design where one group of participants is pretested on the dependent variable and then posttested after the treatment condition has been administered.
• This is a better design than the one-group posttest-only design because it at least includes a pretest that indicates how the participants did prior to administration of the treatment condition.
• In this design, the effect is taken to be the difference between the pretest and posttest scores.
• It does not control for potentially confounding extraneous variables such as history, maturation, testing, instrumentation, and regression artifacts, so it is still difficult to identify the effect of the treatment condition.
The next of the weak experimental research designs is the posttest-only design with nonequivalent groups.
• Here is its depiction:
The posttest-only design with nonequivalent groups includes an experimental group that receives the treatment condition and a control group that does not receive the treatment condition or receives some standard condition, and both groups are posttested on the dependent variable.
• While this design includes a control group (which gives something to compare the treatment group with), the participants are not randomly assigned to the groups so there is little assurance that the two groups are equated on any potentially confounding variables prior to the administration of the treatment condition.
• Because the participants were not randomly assigned to the comparison groups, this design does not control for differential selection, differential attrition, and the various additive and interaction effects.

For a summary of the threats to validity for the weak experimental designs, you should study Table 9.1 on page 277.

Strong Experimental Research Designs
A research design is considered to be a "strong research design" if it controls for the influence of confounding extraneous variables. This is typically accomplished by including one or more control techniques in the research design.
• The most important of these control techniques is random assignment.
• In addition to including control techniques, strong research designs include a control group, which is the comparison group that either does not receive the experimental treatment condition or receives some standard treatment condition.
• I will briefly discuss these strong designs: the pretest-posttest control-group design, the posttest-only control-group design, the factorial design, the repeated measures design, and the factorial design based on a mixed model. (For a summary of all of these, look at and study Table 9.2 on page 281.)
The first strong experimental design is the pretest-posttest control-group design. Here is a picture of it in its basic form:
The pretest-posttest control-group design is a strong research design in which a group of research participants is randomly assigned to an experimental and control group. Both groups of participants are pretested on the dependent variable and then posttested after the experimental treatment condition has been administered to the experimental group.
• This is an excellent research design because it includes a control or comparison group and has random assignment.
• This design controls for all of the standard threats to internal validity. Differential attrition may or may not be a problem depending on what happens during the conduct of the experiment.
• Note that while this design is often presented as a two group design, it can be expanded to include a control group and as many experimental groups as are needed to test your research question.

The next strong experimental research design is the posttest-only control-group design. Here is a picture of it:
The posttest-only control-group design is a research design in which the research participants are randomly assigned to an experimental and control group and then posttested on the dependent variable after the experimental group has received the experimental treatment condition.
• This is an excellent research design because it includes a control or comparison group and has random assignment.
• This design does not include a pretest of the dependent variable, but this does not detract from its internal validity because it includes the control group and random assignment, which means that the experimental and control groups are equated at the outset of the experiment.
• Just like the previous design, it controls for all of the standard threats to internal validity. Differential attrition may or may not be a problem depending on what happens during the conduct of the experiment.
The next strong experimental research design is the factorial design. For a depiction of this design, please go to page 281 and look at it in Table 9.2. The layout for a factorial design with two independent variables (type of instruction and level of anxiety) is shown in Figure 9.14 (p. 287) and here for your convenience.

A factorial design is a design in which two or more independent variables are simultaneously investigated to determine the independent and interactive influence which they have on the dependent variable.
• Each combination of independent variables is called a "cell." Research participants are randomly assigned to as many groups as there are cells of the factorial design if both of the independent variables can be manipulated, so this design also has random assignment to the groups.
• The research participants are administered the combination of independent variables that corresponds to the cell to which they have been assigned and then they respond to the dependent variable.
• The data collected from this research give information on the effect of each independent variable separately and the interaction between the independent variables.
• The effect of each independent variable on the dependent variable is called a main effect. There are as many main effects in a factorial design as there are independent variables. If a research design included the independent variables of gender and type of instruction, then there would potentially be two main effects, one for gender and one for type of instruction.
• An interaction effect between two or more independent variables occurs when the effect which one independent variable has on the dependent variable depends on the level of the other independent variable. For example, if gender is one independent variable and method of teaching mathematics is another independent variable, an interaction would exist if the lecture method was more effective for teaching males mathematics and individualized instruction was more effective in teaching females mathematics.
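To make main effects and interaction effects concrete, here is a small sketch that computes them from the four cell means of a 2 x 2 factorial design (gender by type of instruction). The cell means are invented for illustration, equal cell sizes are assumed, and the helper name marginal_mean is hypothetical. The sketch only describes the pattern in the means; it is not a significance test.

```python
from statistics import mean

# Hypothetical mean mathematics scores for each cell (gender, type of instruction).
cell_means = {
    ("male", "lecture"): 80.0,
    ("male", "individualized"): 70.0,
    ("female", "lecture"): 70.0,
    ("female", "individualized"): 80.0,
}


def marginal_mean(cells, factor_index, level):
    """Average the cell means over the other factor (assumes equal cell sizes)."""
    return mean(value for key, value in cells.items() if key[factor_index] == level)


# Main effects: difference between the marginal means of each independent variable.
gender_main_effect = (marginal_mean(cell_means, 0, "male")
                      - marginal_mean(cell_means, 0, "female"))
instruction_main_effect = (marginal_mean(cell_means, 1, "lecture")
                           - marginal_mean(cell_means, 1, "individualized"))

# Interaction: does the effect of instruction depend on gender?
instruction_effect_males = (cell_means[("male", "lecture")]
                            - cell_means[("male", "individualized")])
instruction_effect_females = (cell_means[("female", "lecture")]
                              - cell_means[("female", "individualized")])
interaction = instruction_effect_males - instruction_effect_females

print("Main effect of gender:", gender_main_effect)
print("Main effect of instruction:", instruction_main_effect)
print("Interaction (difference of differences):", interaction)
```

In these made-up numbers neither independent variable has a main effect, yet the interaction is large: lecture works better for males and individualized instruction works better for females, which is the pattern described in the example above.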

The next strong experimental research design is the repeated-measures design. Here is a picture of it in its basic form with counterbalancing:
A repeated-measures design is a design in which all research participants receive all experimental treatment conditions.
• For example, if you were investigating the effect of type of instruction on learning mathematics and you used two types of instruction (lecture method and individualized instruction) the participants would experience both types of instruction, first one and then the other.
• This design has the advantage of requiring fewer participants than other designs because the same participants participate in all experimental conditions.
• This design also has the advantage of the participants in the various experimental groups being equated because they are the same participants in all of the treatment conditions.
• If you use counterbalancing with this design, then all of the standard threats to internal validity are controlled for. Differential attrition may or may not be a problem depending on what happens during the conduct of the experiment.
The last strong experimental research design discussed in this chapter is the factorial design based on a mixed model. Here is a picture of this design when it has two independent variables:
• The factorial design based on a mixed model is a factorial design in which different participants are randomly assigned to the different levels of one independent variable but all participants take all levels of another independent variable.

• In the depiction above, participants are randomly assigned to variable B, and all participants receive all levels of variable A.
• All of the standard threats to internal validity are controlled for with this design if counterbalancing is used for the repeated measures independent variable. Differential attrition may or may not be a problem depending on what happens during the conduct of the experiment.
As you study the designs in this chapter, two tables will be of maximum help.
• Table 9.1 on page 277 shows the depictions of all of the weak experimental research designs and the threats to internal validity for each of these designs.
• Table 9.2 on page 281 shows the depictions of all of the strong experimental research designs and the threats to internal validity for each of these designs.
Here are copies of these two tables for your convenience.



Chapter 10
Quasi-Experimental and Single-Case Designs
(Reminder: Don’t forget to utilize the concept maps and study questions as you study this and the other chapters.)

The experimental research designs discussed in this chapter are used when it is impossible to randomly assign participants to comparison groups (quasi-experimental designs) and when a researcher is faced with a situation where only one or two participants can participate in the research study (single-case designs).
• Like the designs in the last chapter, quasi-experimental and single-case designs do have manipulation of the independent variable (otherwise they would not be “experimental research” designs).

Quasi-Experimental Research Designs
These are designs that are used when it is not possible to control for all potentially confounding variables; in most cases this is because the participants cannot be randomly assigned to the groups.
• Causal explanations can be made when using quasi-experimental designs but only when you collect data that demonstrate that plausible rival explanations are unlikely, and the evidence will still not be as strong as with one of the strong designs discussed in the last chapter.
• You can view quasi-experiments as falling in the center of a continuum with weak experimental designs on the far left side and strong experimental designs on the far right side. (In other words, quasi designs are not the worst and they are not the best. They are in-between or moderately strong designs.)

/------------------------------------/------------------------------------/
Weak                               Quasi                               Strong
Designs                            Designs                             Designs

• Three quasi-experimental research designs are presented in the text: the nonequivalent comparison-group design, the interrupted time-series design, and the regression discontinuity design.

Nonequivalent Comparison-Group Design
This is a design that contains a treatment group and a nonequivalent untreated comparison group, both of which are administered pretest and posttest measures. The groups are “nonequivalent” because you lack random assignment (although there are some control techniques that can help make the groups similar such as matching and statistical control). Because of the lack of random assignment, there is no assurance that the groups are highly similar at the outset of the study. Here is a depiction of the nonequivalent comparison-group design:
• Because there is no random assignment to groups, confounding variables (rather than the independent variable) may explain any difference observed between the experimental and control groups.
• The most common threat to the internal validity of this type of design is differential selection. The problem is that the groups may be different on many variables that are also related to the dependent variable (e.g., age, gender, IQ, reading ability, attitude, etc.).
Here is a list of all of the primary threats to this design.
• It is a good idea to collect data that can be used to demonstrate that key confounding variables are not the cause of the obtained results. Hence, you will need to think about potential rival explanations during the planning phase of your research study so that you can collect the necessary data to control for these factors.
• You can eliminate the influence of many confounding variables by using the various control techniques, especially statistical control (where you measure the confounding variables at the pretest and control for them using statistical procedures after the study has been completed) and matching (where you select people to be in the groups so that the members in the different groups are similar on the matching variables).
• Only when you can rule out the effects of confounding variables can you confidently attribute the observed group difference at the posttest to the independent variable.

Interrupted Time-Series Design
This is a design in which a treatment condition is assessed by comparing the pattern of pretest responses with the pattern of posttest responses obtained from a single group of participants. In other words, the participants are pretested a number of times and then posttested a number of times after or during exposure to the treatment condition. Here is a depiction of the interrupted time-series design:
• The pretesting phase is called the baseline, which refers to the observation of a behavior prior to the presentation of any treatment designed to alter the behavior of interest.
• A treatment effect is demonstrated only if the pattern of posttreatment responses differs from the pattern of pretreatment responses. That is, the treatment effect is demonstrated by a discontinuity in the pattern of pretreatment and posttreatment responses.
• For example, an effect is demonstrated when there is a change in the level and/or slope of the posttreatment responses as compared to the pretreatment responses. Here is an example where both the level and slope changed during the intervention:
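To make "a change in level and/or slope" concrete, the sketch below fits a separate straight line to a made-up baseline phase and a made-up treatment phase and compares them. The scores, the session numbers, and the helper name fit_line are assumptions for illustration. This is a purely descriptive comparison that ignores issues such as autocorrelation, so it is not a substitute for a proper time-series analysis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical scores: six baseline (pretest) sessions, then six treatment sessions.
baseline_sessions = [1, 2, 3, 4, 5, 6]
baseline_scores = [10, 11, 10, 11, 10, 11]
treatment_sessions = [7, 8, 9, 10, 11, 12]
treatment_scores = [15, 17, 19, 21, 23, 25]

base_slope, base_intercept = fit_line(baseline_sessions, baseline_scores)
treat_slope, treat_intercept = fit_line(treatment_sessions, treatment_scores)

# Change in slope: how much steeper the trend is once the treatment starts.
slope_change = treat_slope - base_slope

# Change in level: the treatment-phase line at the first treatment session,
# compared with where the baseline trend would have been at that same session.
first_treatment_session = treatment_sessions[0]
level_change = ((treat_intercept + treat_slope * first_treatment_session)
                - (base_intercept + base_slope * first_treatment_session))

print(f"Baseline slope:  {base_slope:.2f}")
print(f"Treatment slope: {treat_slope:.2f}")
print(f"Change in slope: {slope_change:.2f}")
print(f"Change in level at session {first_treatment_session}: {level_change:.2f}")
```

In this invented data set both quantities are clearly positive, which corresponds to a discontinuity in both the level and the slope of the responses.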

• Many confounding variables are ruled out in the interrupted time-series design because they are present in both the pretreatment and posttreatment responses (i.e., the pretreatment and posttreatment responses will not differ on most confounding variables). However, the main potentially confounding variable that cannot be ruled out is a history effect. The history threat is a plausible rival explanation if some event other than the treatment co-occurs with the onset of the treatment.
• Bonus material (not required): Although not discussed in the text, there is an extension of the interrupted time-series design. It is called the multiple time-series design. It is the basic interrupted time-series design with a comparable control group added to it. I mention this design because I do want you to remember that YOU can put together different designs simply by using different combinations of pretests, posttests, different types of groups, varying the number of pretests and posttests, using a control group or not, including more than one outcome variable, and so forth.
• Both the experimental and control groups are repeatedly pretested in the multiple time-series design. Then the experimental group receives the treatment and the control group receives some standard treatment or no treatment, and, finally, both groups are repeatedly posttested.
• Here is a picture of the multiple time-series design:

• Including a control group provides control for the history effect, but only if the different groups are truly comparable and any history effect influences both groups to the same degree (i.e., as long as you don't have a selection-history effect). The various additive and interactive effects remain as potential threats to this design.

Regression Discontinuity Design
This is a design that is used to assess the effect of a treatment condition by looking for a discontinuity in regression lines between individuals who score lower and higher than some predetermined cutoff score on an assignment variable.
• Here is the depiction of the design:
• For example, you might use a standardized test as your assignment variable, set the cutoff at 50, and administer the treatment to those falling at 50 or higher and use those with scores lower than 50 as your control group.
• This is actually quite a strong design, and methodologists have, for a number of years, been trying to get researchers to use this design more frequently.
• One uses statistical techniques to control for differences on the assignment variable and then checks to see whether the groups significantly differ. Here is an example where a difference or “discontinuity” is easily seen:
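The cutoff-at-50 example can be illustrated with simulated data. In the sketch below, everything about the data is an assumption made for illustration: the outcome formula, the built-in 10-point treatment effect, and the small alternating "wobble" that stands in for noise. The helper fit_line is a plain least-squares fit. The sketch only estimates the size of the jump at the cutoff; checking whether the groups significantly differ would require an inferential test on top of this.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


CUTOFF = 50  # scores of 50 or higher receive the treatment


def simulated_outcome(score):
    """Made-up outcome: rises with the assignment score, plus a 10-point treatment effect."""
    treatment_effect = 10 if score >= CUTOFF else 0
    wobble = 0.5 if score % 2 == 0 else -0.5  # deterministic stand-in for noise
    return 20 + 0.5 * score + treatment_effect + wobble


scores = list(range(30, 70))  # hypothetical assignment-variable scores
outcomes = [simulated_outcome(s) for s in scores]

# Split at the cutoff: control group below 50, treatment group at 50 or above.
control = [(s, y) for s, y in zip(scores, outcomes) if s < CUTOFF]
treated = [(s, y) for s, y in zip(scores, outcomes) if s >= CUTOFF]

# Fit one regression line on each side of the cutoff.
control_slope, control_intercept = fit_line(*zip(*control))
treated_slope, treated_intercept = fit_line(*zip(*treated))

# The discontinuity is the gap between the two fitted lines at the cutoff score.
discontinuity = ((treated_intercept + treated_slope * CUTOFF)
                 - (control_intercept + control_slope * CUTOFF))

print(f"Estimated discontinuity at the cutoff: {discontinuity:.2f}")
```

Because the simulated data were built with a 10-point jump, the estimate should land close to 10. With real data the size of the jump is exactly what you are trying to find out.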

• If you cannot assign the participants to the treatment condition based on their assignment variable scores, you will not be able to use this design. On the other hand, if you can do this, then this is an excellent design.

Single-Case Experimental Designs
These are designs where the researcher attempts to demonstrate an experimental treatment effect using single participants, one at a time.
• We discuss several single-case designs: A-B-A design, A-B-A-B design, multiple-baseline design, and the changing-criterion design.

A-B-A and A-B-A-B Designs
The A-B-A design is a design in which the participant is repeatedly pretested (the first A phase or baseline condition), then the experimental treatment condition is administered and the participant is repeatedly posttested (the B phase or treatment phase). Following the posttesting stage, the pretreatment conditions are reinstated and the participant is again repeatedly tested on the dependent variable (the second A phase or the return to baseline condition).
• Here is a depiction of the A-B-A design:

• The effect of the experimental treatment is demonstrated if the pattern of the pre- and posttreatment responses (the first A phase and the B phase) differ and the pattern of responses reverts back to the original pretreatment level when the pretreatment conditions are reinstated (the second A or return to baseline phase).
• Basically, you are looking for the "fingerprint" of a stable baseline (during the first A phase), then a clear jump or change in level or slope (during the B phase), and then a clear reversal or return to the stable baseline (during the second A phase).
• For example, if you hope for high values on your dependent measure (e.g., attending to what the teacher says), you would hope to see a low-high-low pattern. Conversely, if you hope for low values on your dependent measure (e.g., talking out behavior), you would hope to see a high-low-high pattern.
• Including the second A phase controls for the potential rival hypothesis of history that is a problem in a basic time series design (i.e., in an A-B design).
• One limitation of the A-B-A design is that it ends with baseline condition or the withdrawal of the treatment condition so the participant does not receive the benefit of the treatment condition at the end of the experiment. This limitation can be overcome by including a fourth phase which adds a second administration of the treatment condition so the design becomes an A-B-A-B design.
• A limitation of both the A-B-A and the A-B-A-B designs is that they are dependent on the pattern of responses reverting to baseline conditions when the experimental treatment condition is withdrawn. This may not occur if the experimental treatment is so powerful that its effect continues even when the treatment is withdrawn. If a reversal to baseline conditions does not occur, another design (such as the multiple-baseline design) must be used to demonstrate the effectiveness of the treatment condition.

Multiple-Baseline Design
This is a design that investigates two or more people, behaviors, or settings to identify the effect of an experimental treatment. The key is that the treatment condition is successively administered to the different people, behaviors, or settings.
• Here is a depiction of the design:

• The multiple-baseline design requires that baseline behavior is collected on the several people, behaviors, or settings and then the experimental treatment is successively administered to the people, behaviors, or settings.
• The experimental treatment effect is demonstrated if a change in response occurs when the treatment is administered to each person, behavior, or setting (i.e., when the fingerprint you are looking for is observed). Here is an example where a treatment fingerprint is easily seen:

• Rival hypotheses are unlikely to account for the changes in the behavior if the behavior change only occurs after the treatment is administered to each successive person, behavior, or setting.
• This design avoids the problem of failure to revert to baseline that can exist with the A-B-A and A-B-A-B designs.

Changing-Criterion Design

This is a single-case design that is used when a behavior needs to be shaped over time or when it is necessary to gradually change a behavior through successive treatment periods to reach a desired criterion.
• This design involves collecting baseline data on the target behavior and then administering the experimental treatment condition across a series of intervention phases where each intervention phase uses a different criterion of successful performance until the desired criterion is reached.
• The criterion used in each successive intervention phase should be large enough to detect a change in behavior but small enough so that it can be achieved.
• Here is an example of this design.

Methodological Considerations in Using Single-Case Designs
The following table presents some major methodological issues you must consider when using single-case designs.


Chapter 11
Nonexperimental Quantitative Research
(Reminder: Don’t forget to utilize the concept maps and study questions as you study this and the other chapters.)

Nonexperimental research is research that lacks manipulation of the independent variable by the researcher; the researcher studies what naturally occurs or has already occurred, and the researcher studies how variables are related.
• Despite its limitations for studying cause and effect (compared to strong experimental research), nonexperimental research is very important in education.
Nonexperimental research is needed because there are many independent variables that we cannot manipulate for one reason or the other (e.g., for ethical reasons, for practical reasons, and for literal reasons such as it is impossible to manipulate some variables). Here’s an example of an experiment where you could not manipulate the independent variable (smoking) for ethical and practical reasons: Randomly assign 500 newborns to experimental and control groups (250 in each group), where the experimental group newborns must smoke cigarettes and the controls do not smoke.

Steps in Nonexperimental Research
The steps are pretty much the same as they were in experimental research; however, there are some new considerations to think about if you want to be able to make any cause and effect claims at all (i.e., that an IV--->DV).
1. Determine the research problem and hypotheses to be tested. Note: it is important to have or develop a theory to test in nonexperimental research if you are interested in making any claims of cause and effect. This can include identifying mediating and moderating variables (see Table 2.2 on page 36 for definitions of these two terms).
2. Select the variables to be used in the study. Note: in nonexperimental research you will need to include some control variables (i.e., variables in addition to your IV and DV that measure key extraneous variables). This will help you to rule out some alternative explanations.
3. Collect the data. Note: longitudinal data (i.e., collection of data at more than one time point) is helpful in nonexperimental research to establish the time ordering of your IV and DV if you are interested in cause and effect.
4. Analyze the data. Note: statistical control techniques will be needed because of the problem of alternative explanations in nonexperimental research.
5. Interpret the results. Note: conclusions of cause and effect will be much weaker in nonexperimental research as compared to strong experimental and quasi-experimental research because the researcher cannot manipulate the independent variable in nonexperimental research.

When examining or conducting nonexperimental research, it is important to watch out for the post hoc fallacy (i.e., arguing, after the fact, that A must have caused B simply because you have observed in the past that A preceded B).
• By the way, post hoc or inductive reasoning is fine (i.e., looking at your data and developing ideas to examine in future research), but you must always watch out for the fallacy just mentioned and you must remember to empirically test any hypotheses that you develop after the fact so that you can check to see whether your hypothesis holds true with new data. In other words, after generating a hypothesis, you must test it. (This last point goes back to Figure 1.1 on page 18 showing the research wheel.)

Independent Variables in Nonexperimental Research
This includes variables that cannot be manipulated, should not be manipulated, or were not manipulated.
• Here are some examples of categorical independent variables (IVs) that cannot be manipulated: gender, parenting style, learning style, ethnicity, retention in grade, personality type, drug use.
• Here are some examples of quantitative IVs that cannot be manipulated: intelligence, age, GPA, any personality trait that is operationalized as a quantitative variable (e.g., level of self-esteem).
• It is generally recommended that researchers should not turn quantitative independent variables into categorical variables.

Simple Cases of Causal-Comparative and Correlational Research
Although the terms causal-comparative research and correlational research are dated, it is still useful to think about the simple cases of these (i.e., studies with only two variables). There are four major points in this section:
1. In the simple case of causal-comparative research you have one categorical IV (e.g., gender) and one quantitative DV (e.g., performance on a math test).
• The researcher checks to see if the observed difference between the groups is statistically significant (i.e., not just due to chance) using a "t-test" or an "ANOVA" (these are statistical tests that tell you if the difference between the means is statistically significant; they are discussed in chapter 16).
2. In the simple case of correlational research you have one quantitative IV (e.g., level of motivation) and one quantitative DV (e.g., performance on a math test).
• The researcher checks to see if the observed correlation is statistically significant (i.e., not due to chance) using the "t-test for correlation coefficients" (it tells you if the relationship is statistically significant; it is discussed in chapter 16).
• Remember that the commonly used correlation coefficient (i.e., the Pearson correlation) only detects linear relationships.

3. • It is. we know that biological sex occurs before achievement on a math Page 107 of 179 . • And once you move on to these improved nonexperimental designs. talk about the design in terms of the research objective and the time dimension (which is discussed below. Here are the conditions (which have been stated in previous chapters) in a summary table: Applying the Three Necessary Conditions for Causation in Nonexperimental Research Nonexperimental research is much weaker than strong and quasi experimental research for making justified judgments about cause and effect. quite easy to establish condition 1 in nonexperimental research—just see if the variables are related For example. • It is much more difficult to establish conditions 2 and 3 (especially 3). You can improve on the simple cases by controlling for extraneous variables and designing longitudinal studies (discussed below). and summarized in Table 11. Are the variables correlated? or Is there a difference between the means?. you should drop the “correlational” and “causal-comparative” terminology and. • That's because "observing a relationship between two variables is not sufficient grounds for concluding that the relationship is a causal relationship. researchers use logic and theory (e." (Remember this important point!) 4.3) The Three Necessary Conditions for Cause-and-Effect Relationships It is essential that your remember that researchers must establish three conditions if they are to make a defensible conclusion that changes in variable A causechanges in variable B.. instead. • When attempting to establish condition 2. however.g. It is essential that you remember this point: Both of the simple cases of nonexperimental research are seriously flawed if you are interested in concluding that an observed relationship is a causal relationship.

e. you need to understand the idea of controlling for a variable. make a list of extraneous variables that you want to measure in your research study). it is a spurious relationship)...2 below..g. researchers use logic and theory (e.. In Figure 11. When attempting to establish condition 3. no relationship). control techniques (such as statistical control and matching).e. Page 108 of 179 .. Did you know that there is a correlation between the number of fire trucks responding to a fire and the amount of fire damage? Obviously this is not a causal relationship (i. Condition 3 is a serious problem in nonexperimental research because it is always possible that an observed relationship is "spurious" (i. To get things started. and design approaches (such as using a longitudinal design rather than a crosssectional design). the original positive correlation between the number of fire trucks responding and the amount of fire damage becomes a zero correlation (i.g. longitudinal research is a strong design for establishing proper time order). Here is an example: first.• • • • test) and design approaches that are covered later in this chapter (e.e. you can see that after we control for the size of fire. The rest of the chapter will be explaining these points. due to some confounding extraneous variable or "third variable").
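To make the fire truck example concrete, here is a minimal sketch of what "controlling for" the size of the fire does to the correlation. The data are simulated, not real: the sample size, the coefficients, and the names fire_size, trucks, and damage are all assumptions made for illustration. The sketch uses the standard first-order partial correlation formula, which is one way (not necessarily the one used for Figure 11.2) to remove the influence of a third variable.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_r(x, y, z):
    """Correlation between x and y after statistically controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated (hypothetical) fires: the size of the fire drives BOTH the number
# of trucks sent and the damage done; trucks have no effect on damage here.
rng = random.Random(0)
fire_size = [rng.uniform(1, 10) for _ in range(2000)]
trucks = [size + rng.gauss(0, 1) for size in fire_size]
damage = [10 * size + rng.gauss(0, 10) for size in fire_size]

zero_order = pearson_r(trucks, damage)             # strong positive correlation
controlled = partial_r(trucks, damage, fire_size)  # close to zero

print(f"trucks vs. damage, no control:            {zero_order:.2f}")
print(f"trucks vs. damage, controlling for size:  {controlled:.2f}")
```

In this simulation the uncontrolled correlation is large and positive, while the correlation after controlling for fire size is close to zero, which is the pattern a spurious relationship produces.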

• Here is one more example of controlling for a variable: There is a relationship between gender and income in the United States. In particular, men earn more money than women. Perhaps this relationship would disappear if we controlled for the amount of education people had. What do you think? To test this alternative explanation (i.e., it is due not to gender but to education) you could examine the average income levels of males and females at each of the levels of education (i.e., to see if males and females who have equal amounts of education differ in income levels). If gender and income are still related (i.e., if men earn more money than women at each level of education) then you would make this conclusion: "After controlling for education, there is still a relationship between gender and income." And, by the way, that is exactly what happens if you examine the real data (actually the relationship becomes a little smaller but there is still a relationship). Can you think of any additional variables you would like to control for? That is, are there any other variables that you think will eliminate the relationship between gender and income?

Techniques of Control in Nonexperimental Research
We discuss three ways to control for extraneous variables in nonexperimental research.
1. Matching.
• A "matching variable" is an extraneous variable you wish to control for (e.g., gender, income, intelligence) and you are going to use it in the technique called matching.
• If you have two groups (i.e., your IV is categorical), you could attempt to find someone like each person in group one on the matching variable and place these individuals into group two. In other words, you could in effect construct a control group.
• If your IV is quantitative, such as level of motivation, and you want to see if motivation is related to test performance, you might decide to use GPA as your matching variable. To do this, you would have to find individuals with low, medium, and high GPAs at the different levels of motivation. You could do this by finding people for each of the cells of the following table:

              Low Motivation   Medium Motivation   High Motivation
Low GPA       15 people        15 people           15 people
Medium GPA    15 people        15 people           15 people
High GPA      15 people        15 people           15 people

• Technically speaking, matching makes your independent variable and the matching variable uncorrelated and unconfounded. What this means is that if you still see a relationship between your IV and your DV you can conclude that it is not because of the matching variable because you have controlled for that variable.
2. Holding the extraneous variable constant.
• If you use this strategy, you will include in your study participants that are all at the same constant level on the variable that you want to control for. For example, if you want to control for gender using this strategy, you would only include females in your research study (or you would only include males in your study). If there is still a relationship between your IV and DV (e.g., motivation and test grades) you will be able to conclude that the relationship is not due to gender because you have made it a constant (by only including one gender in your study).
3. Statistical control (it's based on the following logic: examine the relationship between the IV and the DV at each level of the control/extraneous variable; actually, the computer will do it for you, but that's what it does).
• One type of statistical control is called partial correlation. This technique shows the correlation between two quantitative variables after statistically controlling for one or more quantitative control/extraneous variables. Again, the computer program (such as SPSS) does this for you.
• A second type of statistical control is called ANCOVA (or analysis of covariance). This technique shows the relationship between a categorical IV and a quantitative DV after statistically controlling for one or more quantitative control/extraneous variables. Again, the computer will actually do the ANCOVA for you; you just have to figure out what you want to control for and collect the data.

Now I am going to talk about the two key dimensions that should be used in constructing a nonexperimental research design: the time dimension and the research objective dimension. (Note that these dimensions eliminate the need for the terms correlational and causal-comparative in nonexperimental research.)

The Time Dimension in Research
Nonexperimental research can be classified according to the time dimension. In particular, in cross-sectional research the data are collected at a single point in time, in longitudinal or prospective research data are collected at two or more time points moving forward, and in retrospective research the researcher looks backward in time to obtain the desired data. Figure 11.3 shows and summarizes the three key ways that nonexperimental research data can vary along the time dimension.

Classifying Nonexperimental Research by Research Objective
The idea here is that nonexperimental research can be conducted for many reasons. The three most common objectives are description, prediction, and explanation.
• Descriptive nonexperimental research is used to provide a picture of the status or characteristics of a situation or phenomenon (e.g., what kind of personality do teachers tend to have based on the Myers-Briggs test?).
• Predictive nonexperimental research is used to predict the future status of one or more dependent variables (e.g., What variables predict who will drop out of high school?).
• Explanatory nonexperimental research is used to explain how and why a phenomenon operates as it does. Interest is in cause-and-effect relationships.

One type of explanatory research that I want to mention in this lecture is called theoretical modeling or causal modeling or structural equation modeling (those are all synonyms).
• Causal modeling (i.e., constructing theoretical models and then checking their fit with the data) is commonly used in nonexperimental research.
• Causal modeling is used to study direct effects (effect of one variable on another). Here is a way to depict a direct effect:
X -----> Y
• It is also used to study indirect effects (effect of one variable on another through an intervening or mediator variable). Here is a way to depict an indirect effect of X on Y:
X -----> I -----> Y
• A strength of causal modeling in nonexperimental research is that researchers develop detailed theories to test.
• A weakness of causal modeling in nonexperimental research is that the causal models are tested with nonexperimental data, which means there is no manipulation, and you will recall that experimental research is stronger for studying cause and effect than nonexperimental research. Also, causal models with longitudinal data are generally better than causal models with cross-sectional data.

Classifying Nonexperimental Research by Time and Research Objective
So we talked about two key dimensions for classifying nonexperimental research: the time dimension and the research objective dimension. Notice that these two dimensions can be crossed, which forms a 3-by-3 table, which results in 9 types of nonexperimental research. Here is the resulting Classification Table:

If the above table seems complicated, then note that all you really have to do is to remember to answer these two questions:
1. How are your data collected in relation to time (i.e., are the data retrospective, cross-sectional, or longitudinal)?
2. What is the primary research objective (i.e., description, prediction, or explanation)?
Your answers to these two questions will lead you to one of the nine cells shown in the above table.
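The crossing of the two dimensions can also be sketched in a few lines of code. This is only an illustration: the function name classify_design and the wording of the labels it returns are assumptions, not terminology from the textbook; only the three time-dimension categories and the three research objectives come from the discussion above.

```python
from itertools import product

# The two classification dimensions discussed in the text.
TIME_DIMENSIONS = ("retrospective", "cross-sectional", "longitudinal")
RESEARCH_OBJECTIVES = ("descriptive", "predictive", "explanatory")

def classify_design(time_dimension, research_objective):
    """Return a label for one cell of the 3-by-3 classification table.

    The label wording is a hypothetical convention chosen for this sketch.
    """
    if time_dimension not in TIME_DIMENSIONS:
        raise ValueError(f"unknown time dimension: {time_dimension!r}")
    if research_objective not in RESEARCH_OBJECTIVES:
        raise ValueError(f"unknown research objective: {research_objective!r}")
    return f"{time_dimension}, {research_objective} nonexperimental research"

# Crossing the two dimensions yields every cell of the table.
ALL_DESIGNS = [
    classify_design(time_dimension, objective)
    for time_dimension, objective in product(TIME_DIMENSIONS, RESEARCH_OBJECTIVES)
]

print(len(ALL_DESIGNS))  # 9
print(classify_design("longitudinal", "explanatory"))
```

Answering the two questions for a given study corresponds to choosing one value from each tuple.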

Chapter 12 Qualitative Research
(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)

Qualitative research relies primarily on the collection of qualitative data (i.e., nonnumeric data such as words and pictures).
• I suggest that, to put things in perspective, you start by reviewing the table showing the common differences between qualitative, quantitative, and mixed research. That is, take a quick look at Table 2.1 on page 31 (or go to lecture two because it is also included in the lecture).
• Next, to further understand what qualitative research is all about, please carefully examine Patton's excellent summary of the twelve major characteristics of qualitative research, which is shown in Table 12.1 (page 362) and below:

Now you should understand what qualitative research is. In the rest of the chapter, we discuss the four major types of qualitative research:
• Phenomenology.
• Ethnography.
• Case study.
• Grounded theory.
To get things started, note the key characteristics (i.e., purpose, origin, data-collection methods, data analysis, and report focus) of these four approaches as shown in Table 12.2 on page 363 and below:

Phenomenology
The first major approach to qualitative research is phenomenology (i.e., the descriptive study of how individuals experience a phenomenon).
• Here is the foundational question in phenomenology: What is the meaning, structure, and essence of the lived experience of this phenomenon by an individual or by many individuals?
• The researcher tries to gain access to individuals' life-worlds, which is their world of experience; it is where consciousness exists.
• Conducting in-depth interviews is a common method for gaining access to individuals' life-worlds.
• The researcher, next, searches for the invariant structures of individuals' experiences (also called the essences of their experience).
• Phenomenological researchers often search for commonalities across individuals (rather than only focusing on what is unique to a single individual). For example, what are the essences of peoples' experience of the death of a loved one? Here is another example: What are the essences of peoples' experiences of an uncaring nurse?
• After analyzing your phenomenological research data, you should write a report that provides rich description and a "vicarious experience" of being there for the reader of the report. Shown next are two good examples. See if you get the feeling the patients had when they described caring and noncaring nurses.

Here is a description of a caring nurse (from Exhibit 12.2) based on a phenomenological research study:
In a caring interaction, the nurse's existential presence is perceived by the client as more than just a physical presence. There is the aspect of the nurse giving of oneself to the client. This giving of oneself may be in response to the client's request, but it is more often a voluntary effort and is unsolicited by the client. The nurse's willingness to give of oneself is primarily perceived by the client as an attitude and behavior of sitting down and really listening and responding to the unique concerns of the individual as a person of value. The relaxation, comfort, and security that the client expresses both physically and mentally are an immediate and direct result of the client's stated and unstated needs being heard and responded to by the nurse (From Creswell, 1998, p.289).

From the same study of nurses, a description also was provided of a noncaring nurse. Here it is:
The nurse's presence with the client is perceived by the client as a minimal presence of the nurse being physically present only. The nurse is viewed as being there only because it is a job and not to assist the client or answer his or her needs. Any response by the nurse is done with a minimal amount of energy expenditure and bound by the rules. The client perceives the nurse who does not respond to this request for assistance as being noncaring. Therefore, an interaction that never happened is labeled as a noncaring interaction. The nurse is too busy and hurried to spend time with the client and therefore does not sit down and really listen to the client's individual concerns. The client is further devalued as a unique person because he or she is scolded, treated as a child, or treated as a nonhuman being or an object. Because of the devaluing and lack of concern, the client's needs are not met and the client has negative feelings, that is, frustrated, scared, depressed, angry, afraid, and upset (From Creswell, 1998, p.289).

Ethnography
The second major approach to qualitative research is ethnography (i.e., the discovery and description of the culture of a group of people).
• Here is the foundational question in ethnography: What are the cultural characteristics of this group of people or of this cultural scene?
• Because ethnography originates in the discipline of Anthropology, the concept of culture is of central importance.
• Culture is the system of shared beliefs, values, practices, language, norms, rituals, and material things that group members use to understand their world.
• One can study micro cultures (e.g., the culture in a classroom) as well as macro cultures (e.g., the United States of America culture).

There are two additional or specialized types of ethnography.
1. Ethnology (the comparative study of cultural groups).

2. Ethnohistory (the study of the cultural past of a group of people). An ethnohistory is often done in the early stages of a standard ethnography in order to get a sense of the group's cultural history.

Here are some more concepts that are commonly used by ethnographers:
• Ethnocentrism (i.e., judging others based on your cultural standards). You must avoid this problem if you are to be a successful ethnographer!
• Emic perspective (i.e., the insider's perspective) and emic terms (i.e., specialized words used by people in a group).
• Etic perspective (i.e., the external, social scientific view) and etic terms (i.e., outsider's words or specialized words used by social scientists).
• Going native (i.e., identifying so completely with the group being studied that you are unable to be objective).
• Holism (i.e., the idea that the whole is greater than the sum of its parts; it involves describing the group as a whole unit, in addition to its parts and their interrelationships).
The final ethnography (i.e., the report) should provide a rich and holistic description of the culture of the group under study.

Case Study Research
The third major approach to qualitative research is case study research (i.e., the detailed account and analysis of one or more cases).
• Here is the foundational question in case study research: What are the characteristics of this single case or of these comparison cases?
• A case is a bounded system (e.g., a person, a group, an activity, a process).
• Because the roots of case study are interdisciplinary, many different concepts and theories can be used to describe and explain the case.

Robert Stake classifies case study research into three types:
1. Intrinsic case study (where the interest is only in understanding the particulars of the case).
2. Instrumental case study (where the interest is in understanding something more general than the case).
3. Collective case study (where interest is in studying and comparing multiple cases in a single research study).

Multiple methods of data collection are often used in case study research (e.g., interviews, observation, documents, questionnaires). The case study final report should provide a rich (i.e., vivid and detailed) and holistic (i.e., describes the whole and its parts) description of the case and its context.

Grounded Theory

The fourth major approach to qualitative research is grounded theory (i.e., the development of inductive, "bottom-up" theory that is "grounded" directly in the empirical data).
• Here is the foundational question in grounded theory: What theory or explanation emerges from an analysis of the data collected about this phenomenon?
• It is usually used to generate theory (remember from earlier chapters that theories tell you "How" and "Why" something operates as it does; theories provide explanations).
• Grounded theory can also be used to test or elaborate upon previously grounded theories, as long as the approach continues to be one of constantly grounding any changes in the new data.

Four important characteristics of a grounded theory are
• Fit (i.e., Does the theory correspond to real-world data?).
• Understanding (i.e., Is the theory clear and understandable?).
• Generality (i.e., Is the theory abstract enough to move beyond the specifics in the original research study?).
• Control (i.e., Can the theory be applied to produce real-world results?).

Data collection and analysis continue throughout the study. When collecting and analyzing data, the researcher needs theoretical sensitivity (i.e., being sensitive about what data are important in developing the grounded theory).

Data analysis often follows three steps:
1. Open coding (i.e., reading transcripts line-by-line and identifying and coding the concepts found in the data).
2. Axial coding (i.e., organizing the concepts and making them more abstract).
3. Selective coding (i.e., focusing on the main ideas, developing the story, and finalizing the grounded theory).

The grounded theory process is "complete" when theoretical saturation occurs (i.e., when no new concepts are emerging from the data and the theory is well validated). The final report should include a detailed and clear description of the grounded theory.

Final note: The chapter includes many examples of each of the four types of qualitative research to help in your understanding (i.e., phenomenology, ethnography, case study, and grounded theory). In addition, reading new examples in the published literature will help to further your understanding of these four important approaches to qualitative research.

Chapter 13 Historical Research
(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)

What is Historical Research?
Historical research is the process of systematically examining past events to give an account of what has happened in the past.
• It is not a mere accumulation of facts and dates or even a description of past events.
• It is a flowing, dynamic account of past events which involves an interpretation of these events in an attempt to recapture the nuances, personalities, and ideas that influenced these events.
• One of the goals of historical research is to communicate an understanding of past events.

Significance of Historical Research
The following gives five important reasons for conducting historical research (based on Berg, 1998):
1. To uncover the unknown (i.e., some historical events are not recorded).
2. To answer questions (i.e., there are many questions about our past that we not only want to know but can profit from knowing).
3. To identify the relationship that the past has to the present (i.e., knowing about the past can frequently give a better perspective of current events).
4. To record and evaluate the accomplishments of individuals, agencies, or institutions.
5. To assist in understanding the culture in which we live (e.g., education is a part of our history and our culture).

Historical Research Methodology
There is no one approach that is used in conducting historical research, although there is a general set of steps that are typically followed. These include the following steps, although there is some overlap and movement back and forth between the steps:
1. Identification of the research topic and formulation of the research problem or question.
2. Data collection or literature review.
3. Evaluation of materials.
4. Data synthesis.
5. Report preparation or preparation of the narrative exposition.
Each of these steps is discussed briefly below.

Identification of the Research Topic and Formulation of the Research Problem or Question
This is the first step in any type of educational research including historical research.
• Ideas for historical research topics can come from many different sources such as current issues in education, the accomplishments of an individual, an educational policy, or the relationship between events.

Data Collection or Literature Review
This step involves identifying, locating, and collecting information pertaining to the research topic.
• The information sources are often contained in documents such as diaries or newspapers, records, photographs, relics, and interviews with individuals who have had experience with or have knowledge of the research topic.
• Interviews with individuals who have knowledge of the research topic are called oral histories.
• The documents, records, oral histories, and other information sources can be primary or secondary sources.
• A primary source is a source that has a direct involvement with the event being investigated, like a diary, an original map, or an interview with a person who experienced the event.
• A secondary source is a source that was created from a primary source, such as books written about the event. Secondary sources are considered less useful than primary sources.

Evaluation of Materials
Every information source must be evaluated for its authenticity and accuracy because any source can be affected by a variety of factors such as prejudice, economic conditions, and political climate. There are two types of evaluation every source must pass.
1. External Criticism: this is the process of determining the validity, trustworthiness, or authenticity of the source. Sometimes this is difficult to do but other times it can easily be done by handwriting analysis or determining the age of the paper on which something was written.
2. Internal Criticism: this is the process of determining the reliability or accuracy of the information contained in the sources collected. This is done by positive and negative criticism.
• Positive criticism refers to assuring that the statements made or the meaning conveyed in the sources are understood. This is frequently difficult because of the problems of vagueness and presentism.
• Vagueness refers to uncertainty in the meaning of the words and phrases used in the source.
• Presentism refers to the assumption that the present-day connotations of terms also existed in the past.

• Negative criticism refers to establishing the reliability or authenticity and accuracy of the content of the sources used. This is the more difficult part because it requires a judgment about the accuracy and authenticity of what is contained in the source.
• Firsthand accounts by witnesses to an event are typically assumed to be reliable and accurate.

Historians often use three heuristics in handling evidence. These are corroboration, sourcing, and contextualization.
• Corroboration, or comparing documents to each other to determine if they provide the same information, is often used to obtain information about accuracy and authenticity.
• Sourcing, or identifying the author, date of creation of a document, and the place it was created, is another technique that is used to establish the authenticity or accuracy of information.
• Contextualization, or identifying when and where an event took place, is another technique used to establish authenticity and accuracy of information.

Data Synthesis and Report Preparation
This refers to synthesizing, or putting the material collected into a narrative account of the topic selected.
• Synthesis refers to selecting, organizing, and analyzing the materials collected into topical themes and central ideas or concepts. These themes are then pulled together to form a contiguous and meaningful whole.

Be sure to watch out for these four problems that might be encountered when you attempt to synthesize the material collected and prepare the narrative account.
1. Trying to infer causation from correlated events is the first problem. Just because two events occurred together does not necessarily mean that one event was the cause of the other.
2. A second problem is defining and interpreting key words so as to avoid ambiguity and to ensure that they have the correct connotation.
3. A third problem is differentiating between evidence indicating how people should behave and how they in fact did behave.
4. A fourth problem is maintaining a distinction between intent and consequences. In other words, educational historians must make sure that the consequences that were observed from some activity or policy were the intended consequences.


Chapter 14 Mixed Research: Mixed Method and Mixed Model Research
(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)

This chapter is about mixed research. Mixed research is research in which quantitative and qualitative techniques are mixed in a single study. It is the third major research paradigm, adding an attractive alternative (when it is appropriate) to quantitative and qualitative research.

Proponents of mixed research typically adhere to the compatibility thesis as well as to the philosophy of pragmatism.
• The compatibility thesis is the idea that quantitative and qualitative methods are compatible, that is, they can both be used in a single research study.
• The philosophy of pragmatism says that researchers should use the approach or mixture of approaches that works the best in a real world situation. In short, what works is what is useful and should be used, regardless of any philosophical assumptions, paradigmatic assumptions, or any other type of assumptions. (Pragmatism was started by the great American philosophers Charles Sanders Peirce, William James, and John Dewey.)

Today, proponents of mixed research attempt to use what is called the fundamental principle of mixed research.
• According to this fundamental principle, the researcher should use a mixture or combination of methods that has complementary strengths and nonoverlapping weaknesses.
• To aid you in applying this fundamental principle, we have provided tables that show the strengths and weaknesses of quantitative research and qualitative research. Here they are for your convenience:


Here is a list of the strengths and weaknesses of mixed research. Looking at the strengths, you will see where you want to go in planning a mixed research study.

The Research Continuum
Research can be viewed as falling along a research continuum with "monomethod" research placed on the far left side, "fully mixed" research placed on the far right side, and "partially mixed" located in the center. You should be able to take any given research study and place it somewhere on the continuum.

Types of Mixed Research Methods
There are two major types of mixed research: they are mixed model research and mixed method research.

Mixed Model Research
In mixed model research quantitative and qualitative approaches are mixed within or across the stages of the research process. Here are the two mixed model research subtypes: within-stage and across-stage mixed model research.
1. In within-stage mixed model research, quantitative and qualitative approaches are mixed within one or more of the stages of research.
• An example of within-stage mixed model research would be where you used a questionnaire during data collection that included both open-ended (i.e., qualitative) questions and closed-ended (i.e., quantitative) questions.
2. In across-stage mixed model research, quantitative and qualitative approaches are mixed across at least two of the stages of research. Across-stage mixed model research designs are easily seen by examining designs 2 through 7 in Figure 14.2 (shown below):

• Here is an example of across-stage mixed model research: A researcher wants to explore (qualitative objective) why people take on-line college courses. The researcher conducts open-ended interviews (qualitative data collection) asking them why they take on-line courses, and then the researcher quantifies the results by counting the number of times each type of response occurs (quantitative data analysis); the researcher also reports the responses as percentages and examines the relationships between sets of categories or variables through the use of contingency tables. Note that this is design 2 shown above in Figure 14.2.

Mixed Method Research
In mixed method research, a qualitative phase and a quantitative phase are included in the overall research study. It's like including a quantitative mini-study and a qualitative mini-study in one overall research study.

Mixed method research designs are classified according to two major dimensions:
1. Time order (i.e., concurrent versus sequential) and
2. Paradigm emphasis (i.e., equal status versus dominant status).

Below, in Figure 14.3, you can see the specific mixed method designs that result from crossing time order and paradigm emphasis. It is a 2-by-2 matrix, and it includes nine specific mixed method designs.
• In order to understand the designs, you need to first understand the notation that is used.
• QUAL and qual both stand for qualitative research.
• QUAN and quan both stand for quantitative research.
• Capital letters denote priority or increased weight.
• Lowercase letters denote lower priority or weight.
• A plus sign (+) indicates the concurrent collection of data.
• An arrow (→) represents a sequential collection of data.
• For example: qual→QUAN is a dominant status, sequential design where the overall study is primarily quantitative but it is preceded by a qualitative phase. Perhaps a researcher does an open-ended survey to find some important categories or variables that students say are important reasons for dropping out of on-line courses. Then in the quantitative phase the researcher does a quantitative study of predictors of dropping out, using quantitative statistical methods. In other words the quantitative phase was primary and the qualitative phase was supportive (and occurred first).
• In order to use Figure 14.3, you need to ask yourself two questions:
1. Do you want to operate largely within one dominant paradigm or not (i.e., do you want to use a dominant status design or an equal status design)? and
2. Do you want to conduct the phases concurrently (i.e., at roughly the same time) or sequentially (i.e., one before the other)?

Your answers to these two questions will lead you to one of the designs in Figure 14.3.
• It is important to understand that you are not limited to the mixed method or mixed model designs provided in this chapter. Our designs are provided to get you started.
• You should feel free to mix and match the designs into a design that best fits your needs. This includes designing studies that are a mix of mixed model and mixed method designs.
• Your goal is to pragmatically design a study that fits your particular needs and circumstances. Your goal, always, is to answer your research question(s) and then to design a study that will help you to do that.

Stages of Mixed Research Process
There are eight stages in the mixed research process, as shown in Figure 14.4 (in the text, and here for your convenience).

It is important to note that although the steps in mixed research are numbered, researchers often follow these steps in different orders, depending on what particular needs and concerns arise or emerge during a particular research study.
• For example, interpretation and validation of the data should be done throughout the data collection process.
I will very briefly comment on each of the eight (nonlinear) steps:

(1) Determine whether a mixed design is appropriate
• Do you believe that you can best answer your research question(s) through the use of mixed research?
• Do you believe that mixed research will offer you the best design for the amount and kind of evidence that you hope to obtain as you conduct your research study?

(2) Determine the rationale for using a mixed design
• The five most important rationales or purposes for mixed research are shown below in Table 14.4:

• You can see in Table 14.4 that mixed research can help researchers to do a lot of important things as they attempt to understand the world.

(3) Select the mixed method or mixed model research design
• We have already shown you, in this lecture, the basic mixed model designs and the basic mixed method designs.
• Remember that you can also build more unique and/or more complex designs than the ones we have shown as you plan a study that will help you to answer your research question(s).

(4) Collect the data
• Keep in mind the six major methods of data collection that we discussed in chapter 6: tests, questionnaires, interviews, focus groups, observation, and secondary or already existing data (such as personal and official documents, physical data, and archived research data).

(5) Analyze the data
• You can use the quantitative data analysis techniques (Chapters 15 and 16) and qualitative data analysis techniques (Chapter 17).
• You might want to use the technique of quantitizing (i.e., converting qualitative data into quantitative data).
• You might want to use the technique of qualitizing (i.e., converting quantitative data into qualitative data).
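To make the idea of quantitizing a little more concrete, here is a minimal sketch in Python. It assumes that open-ended answers have already been coded into categories by the researcher; the category labels, the data, and the function name quantitize are all made up for illustration and do not come from the book.

```python
from collections import Counter

# Hypothetical codes assigned to open-ended interview answers (illustrative only).
coded_responses = [
    "flexibility", "cost", "flexibility",
    "location", "flexibility", "cost",
]


def quantitize(codes):
    """Turn a list of qualitative codes into {code: (count, percentage)}."""
    counts = Counter(codes)
    total = len(codes)
    return {code: (n, round(100 * n / total, 1)) for code, n in counts.items()}


summary = quantitize(coded_responses)

# List the codes from most to least frequent.
for code, (n, pct) in sorted(summary.items(), key=lambda item: -item[1][0]):
    print(f"{code:12s} n={n}  {pct}%")
```

The counts and percentages produced this way are the kind of quantitative summary that could then be carried into a contingency table or other quantitative analysis.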

• For more information on data analysis in mixed research, I highly recommend the following: Onwuegbuzie, A.J., & Teddlie, C. (2003). A framework for analyzing data in mixed methods research. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research (pp. 351-383). Thousand Oaks, CA: Sage.

(6) Validate the data
• Data validation is something that should be done throughout your research study because if your data are not trustworthy then your study is not trustworthy.
• You should consider using quantitative and qualitative validity strategies in your study. In Chapter 8 we discussed validity strategies used in quantitative research (pp. 228-248) and validity strategies used in qualitative research (pp. 249-256), and you should mix these in a way that best works for your mixed research study.

(7) Interpret the data
• Data interpretation begins as soon as you enter the field or collect the first datum (datum is the singular of data), and data interpretation continues throughout your research study.
• Remember that data interpretation and data validation go hand-in-hand; that is, you want to make sure that you continually use strategies that will provide valid data and help you to make defensible interpretations of your data.
• A couple of strategies to use during data interpretation are reflexivity (i.e., self-awareness and critical self-reflection by the researcher on his or her potential biases and predispositions as these may affect the research process and conclusions), and negative-case sampling (i.e., attempting to locate and examine cases that disconfirm your expectations and tentative explanations).

(8) Write the research report.
• Writing the report also can be started during data collection rather than waiting until the end.
• Remember that mixing MUST take place somewhere in mixed research if it is to truly be mixed research, and your report should also reflect mixing; that is, as you discuss your results you must relate the quantitative and qualitative parts of your research study to make sense of the overall study and to capitalize on the strengths of mixed research.

In conclusion, mixed research is the newest research paradigm in educational research. It offers much promise, and we expect to see much more methodological work and discussion about mixed research in the future as more researchers and book authors become aware of this important approach to empirical research.

Chapter 15 Descriptive Statistics
An overview of the field of statistics is shown in Figure 15.1 (also shown below). As you can see, the field of statistics can be divided into descriptive statistics and inferential statistics (and there are further subdivisions under inferential statistics which is the topic of the next chapter).
This chapter is about descriptive statistics (i.e., the use of statistics to describe, summarize, and explain or make sense of a given set of data).
• A data set (i.e., a set of data with the "cases" going down the rows and the "variables" going across the columns) is shown in Table 15.1.
• Once you put your data set (such as the one in Table 15.1) into a statistical program such as SPSS, you are ready to obtain all the descriptive statistics that you want (i.e., which will help you to make some sense out of your data).

Frequency Distributions
One useful way to view the data of a variable is to construct a frequency distribution (i.e., an arrangement in which the frequencies, and sometimes percentages, of the occurrence of each unique data value are shown).
• An example is shown in Table 15.2 in the book and here for your convenience.
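If you would like to see how a frequency distribution is built, here is a minimal sketch in Python. The salary values are hypothetical (they are not the data from Table 15.1), and the function name frequency_distribution is simply my own label. The cumulative percent column is an extra that many statistical programs print alongside the frequencies.

```python
from collections import Counter


def frequency_distribution(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    running_total = 0
    for value in sorted(counts):
        running_total += counts[value]
        rows.append((
            value,
            counts[value],
            100 * counts[value] / n,
            100 * running_total / n,
        ))
    return rows


# Hypothetical starting salaries in dollars (illustrative only).
salaries = [24000, 25000, 25000, 27000, 27000, 27000, 30000, 30000]

table = frequency_distribution(salaries)
print("  value freq    pct    cum")
for value, freq, pct, cum in table:
    print(f"{value:>7,d} {freq:>4d} {pct:6.1f} {cum:6.1f}")
```

Each unique data value gets one row, which is exactly the arrangement described in the definition above.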

• When a variable has a wide range of values, you may prefer using a grouped frequency distribution (i.e., where the data values are grouped into intervals and the frequencies of the intervals are shown).
• For the above frequency distribution, one possible set of grouped intervals would be 20,000-24,999, 25,000-29,999, 30,000-34,999, 35,000-39,999, 40,000-44,999.
• Note that the categories developed for a grouped frequency distribution must be mutually exclusive (the property that intervals do not overlap) and exhaustive (the property that a set of intervals or categories covers the complete range of data values).
• An example of a grouped frequency distribution is shown on page 437.

Graphic Representations of Data
Another excellent way to describe your data (especially for visually oriented learners) is to construct graphical representations of the data (i.e., pictorial representations of the data in two-dimensional space).
• Some common graphical representations are bar graphs, histograms, line graphs, and scatterplots.

Bar Graphs

A bar graph uses vertical bars to represent the data.
• The height of the bars usually represents the frequencies for the categories that sit on the X axis.
• Note that, by tradition, the X axis is the horizontal axis and the Y axis is the vertical axis.
• Bar graphs are typically used for categorical variables.
• Here is a bar graph of one of the categorical variables included in the data set for this chapter (i.e., the data set shown on page 435).

Histograms
A histogram is a graphic that shows the frequencies and shape that characterize a quantitative variable.
• In statistics, we often want to see the shape of the distribution of quantitative variables; having your computer program provide you with a histogram is a simple way to do this.
• Here is a histogram for a quantitative variable included in the data set for this chapter:

Line Graphs
A line graph uses one or more lines to depict information about one or more variables.
• A simple line graph might be used to show a trend over time (e.g., with the years on the X axis and the population sizes on the Y axis).
• Here is an example of a line graph (Figure 15.4):
• Line graphs are used for many different purposes in research. For example, if you will turn to page 290, you will see a line graph (e.g., Figure 9.15) used in factorial experimental designs to depict the relationship between two categorical independent variables and the dependent variable.
• Yet another line graph is shown on page 468 in the next chapter. This line graph shows that the "sampling distribution of the mean" is normally distributed.
• As you can see in the Figures just listed, line graphs have in common their use of one or more lines within the graph (to depict the levels or characteristics of a variable or to depict the relationships among variables).

Scatterplots
A scatterplot is used to depict the relationship between two quantitative variables.
• Typically, the independent or predictor variable is represented by the X axis (i.e., on the horizontal axis) and the dependent variable is represented by the Y axis (i.e., on the vertical axis).

• Here is an example of a scatterplot showing the relationship between two of the quantitative variables from the data set for this chapter:

Measures of Central Tendency
Measures of central tendency provide descriptive information about the single numerical value that is considered to be the most typical of the values of a quantitative variable.
• Three common measures of central tendency are the mode, the median, and the mean.
The mode is simply the most frequently occurring number.
The median is the center point in a set of numbers; it is also the fiftieth percentile.
• To get the median by hand, you first put your numbers in ascending or descending order.
• Then you check to see which of the following two rules applies:
• Rule One. If you have an odd number of numbers, the median is the center number (e.g., three is the median for the numbers 1, 1, 3, 4, 9).

• Rule Two. If you have an even number of numbers, the median is the average of the two innermost numbers (e.g., 2.5 is the median for the numbers 1, 2, 3, 7).
The mean is the arithmetic average (e.g., the average of the numbers 2, 3, 3, and 4, is equal to 3).

A Comparison of the Mean, Median, and Mode
The mean, median, and mode are affected by what is called skewness (i.e., lack of symmetry) in the data.
• Here is Figure 15.6, which shows a normal curve, a negatively skewed curve, and a positively skewed curve:
• Look at the above figure and note that when a variable is normally distributed, the mean, median, and mode are the same number.
• When the variable is skewed to the left (i.e., negatively skewed), the mean shifts to the left the most, the median shifts to the left the second most, and the mode is the least affected by the presence of skew in the data.
• Therefore, when the data are negatively skewed, this happens: mean < median < mode.
• When the variable is skewed to the right (i.e., positively skewed), the mean is shifted to the right the most, the median is shifted to the right the second most, and the mode is the least affected.
• Therefore, when the data are positively skewed, this happens: mean > median > mode.
• If you go to the end of the curve, to where it is pulled out the most, you will see that the order goes mean, median, and mode as you “walk up the curve” for negatively and positively skewed curves.
You can use the following two rules to provide some information about skewness even when you cannot see a line graph of the data (i.e., all you need is the mean and the median):
1. Rule One. If the mean is less than the median, the data are skewed to the left.

2. Rule Two. If the mean is greater than the median, the data are skewed to the right.

Measures of Variability
Measures of variability tell you how "spread out" or how much variability is present in a set of numbers. They tell you how different your numbers tend to be. Note that measures of variability should be reported along with measures of central tendency because they provide very different but complementary and important information. To fully interpret one (e.g., a mean), it is helpful to know about the other (e.g., a standard deviation).
An easy way to get the idea of variability is to look at two sets of data, one that is highly variable and one that is not very variable. For example, which of these two sets of numbers appears to be the most spread out, Set A or Set B?
• Set A. 93, 96, 98, 99, 99, 99, 100
• Set B. 10, 29, 52, 69, 87, 92, 100
If you said Set B is more spread out, then you are right! The numbers in Set B are more "spread out"; that is, they are more variable.
All of the measures of variability should give us an indication of the amount of variability in a set of data. We will discuss three indices of variability: the range, the variance, and the standard deviation.

Range
A relatively crude indicator of variability is the range (i.e., the difference between the highest and lowest numbers).
• For example the range in Set A shown above is 7, and the range in Set B shown above is 90.

Variance and Standard Deviation
Two commonly used indicators of variability are the variance and the standard deviation.
• Higher values for both of these indicators indicate a larger amount of variability than do lower numbers.
• Zero stands for no variability at all (e.g., for the data 3, 3, 3, 3, 3, 3, the variance and standard deviation will equal zero).
• When you have no variability, the numbers are a constant (i.e., the same number).
• The variance tells you (exactly) the average deviation from the mean, in "squared units."
• The standard deviation is just the square root of the variance (i.e., it brings the "squared units" back to regular units).
Table 15.4 shows you how to easily calculate, by hand, the variance and standard deviation. (Basically, you set up the three columns shown, get the sum of the third column, and then plug the relevant numbers into the variance formula.)

The standard deviation tells you (approximately) how far the numbers tend to vary from the mean. (If the standard deviation is 7, then the numbers tend to be about 7 units from the mean. If the standard deviation is 1500, then the numbers tend to be about 1500 units from the mean.)
Virtually everyone in education is already familiar with the normal curve (a picture of one is shown in Figure 15.7 on page 449). If data are normally distributed, then an easy rule to apply to the data is what we call “the 68, 95, 99.7 percent rule." That is . . .
• Approximately 68% of the cases will fall within one standard deviation of the mean.
• Approximately 95% of the cases will fall within two standard deviations of the mean.
• Approximately 99.7% of the cases will fall within three standard deviations of the mean.

Measures of Relative Standing
Measures of relative standing are used to provide information about where a particular score falls in relation to the other scores in a distribution of data. Two commonly used measures of relative standing are percentile ranks and Z-scores.
Here is Figure 15.8 which shows these and some additional types of standard scores. You can determine the mean of the type of standard scores below by simply looking under Mean. You can determine the standard deviation by looking at how much the scores increase as you move from the mean to 1 SD.
• Z-Scores: have a mean of 0 and a standard deviation of 1. Therefore, if you converted any set of scores (e.g., the set of student grades on a test) to z-scores, then that new set WILL have a mean of zero and a standard deviation of one.
• IQ has a mean of 100 and a standard deviation of 15.
• SAT has a mean of 500 and a standard deviation of 100.
• Note: percentile ranks are a different type of score; because they only have ordinal measurement properties, the concept of standard deviation is not relevant.

Percentile Ranks
A percentile rank tells you the percentage of scores in a reference group (i.e., in the norming group) that fall below a particular raw score.
• For example, if your percentile rank is 93 then you know that 93 percent of the scores in the reference group fall below your score.

Z-Scores
A z-score tells you how many standard deviations (SD) a raw score falls from the mean.
• A z-score of 2 says a score falls two standard deviations above the mean.
• A z-score of -3.5 says the score falls three and a half standard deviations below the mean.
To transform a raw score into z-score units, just use the following formula:

Z-score = (Raw score - Mean) / Standard Deviation
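Here is a minimal sketch of the z-score formula in Python. It uses the IQ scale (mean of 100, standard deviation of 15) mentioned above, plus a small set of hypothetical test grades to illustrate that a whole set of scores converted to z-scores ends up with a mean of zero and a standard deviation of one. The function names are my own, and I assume the standard deviation is computed by dividing by the number of cases.

```python
import statistics


def z_score(raw, mean, sd):
    """Number of standard deviations a raw score falls from the mean."""
    return (raw - mean) / sd


def to_z_scores(scores):
    """Convert every score in a set to a z-score, using that set's own mean and SD."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # assumption: divide by N, not n - 1
    return [z_score(x, mean, sd) for x in scores]


# An IQ of 115 on a scale with mean 100 and SD 15.
print(z_score(115, 100, 15))

# Hypothetical test grades (illustrative only).
grades = [55, 70, 70, 85, 90]
z = to_z_scores(grades)
print([round(value, 2) for value in z])
print(round(statistics.mean(z), 6), round(statistics.pstdev(z), 6))
```

The last line of output is the check on the claim about z-scores: apart from tiny rounding error, the mean is zero and the standard deviation is one.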

For example, you know that the mean for IQ scores is 100 and the standard deviation for IQ scores is 15 (because we told you this in the book and because you can see it by examining Figure 15.8). Therefore, if your IQ is 115, you can get your z-score:

Z-score = (115 - 100) / 15 = 15 / 15 = 1

An IQ of 115 falls one standard deviation above the mean.
Note that once you have a set of z-scores, you can convert to any other scale by using this formula: New score = Z-score(SD of new scale) + mean of the new scale.
• For example, let’s convert a z-score of three to an IQ score.
• New score = 3(15) + 100 (remember, the mean of IQ scores is 100 and the standard deviation of IQ scores is 15).
• Therefore, the new score (i.e., the IQ score converted from the z-score of 3 using the formula I just provided) is equal to 145 (3 times 15 is 45, and when 100 is added you get 145).

Examining Relationships Among Variables
We have been talking about relationships among variables throughout your textbook. For example, we have already talked about correlation (e.g., see Figure 2.2 on page 44), partial correlation (e.g., see page 341), analysis of variance which is used for factorial designs (e.g., see pages 286-291), and analysis of covariance (e.g., see pages 274-275 and pages 341-342). At this point in this chapter on descriptive statistics, I introduce two additional techniques that you also can use for examining relationships among variables: contingency tables and regression analysis.

Contingency Tables
When all of your variables are categorical, you can use contingency tables to see if your variables are related.
• A contingency table is a table displaying information in cells formed by the intersection of two or more categorical variables.
• An example is shown in Table 15.6.
When interpreting a contingency table, remember to use the following two rules:
• Rule One. If the percentages are calculated down the columns, compare across the rows.
• Rule Two. If the percentages are calculated across the rows, compare down the columns.
When you follow these rules you will be comparing the appropriate rates (a rate is the percentage of people in a group who have a specific characteristic).
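Here is a minimal sketch of the second rule in Python. The data are hypothetical (they are not the data from Table 15.6), and the variable names and the function name row_percentages are my own. The percentages are calculated across the rows (each row adds up to 100 percent), so the appropriate comparison is down a column.

```python
from collections import Counter

# Hypothetical (gender, took an on-line course) observations; illustrative only.
observations = (
    [("female", "yes")] * 30
    + [("female", "no")] * 20
    + [("male", "yes")] * 15
    + [("male", "no")] * 35
)


def row_percentages(pairs):
    """Percentage of each row's cases that fall in each column."""
    cell_counts = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    return {
        (row, col): 100 * n / row_totals[row]
        for (row, col), n in cell_counts.items()
    }


pct = row_percentages(observations)

# Percentages were calculated across the rows, so compare down the "yes" column.
for row in ("female", "male"):
    print(f"{row:7s} yes: {pct[(row, 'yes')]:.1f}%   no: {pct[(row, 'no')]:.1f}%")
```

Comparing down the "yes" column gives the two rates for these made-up data: the percentage of females and the percentage of males who took an on-line course.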

When you listen to the local and national news, you will often hear the announcers compare rates. The failure of some researchers to follow the two rules just provided has resulted in misleading statements about how categorical variables are related; so be careful.

Regression Analysis
Regression analysis is a set of statistical procedures used to explain or predict the values of a quantitative dependent variable based on the values of one or more independent variables.
• In simple regression, there is one quantitative dependent variable and one independent variable.
• In multiple regression, there is one quantitative dependent variable and two or more independent variables.
On pages 455-459, I show you the components of the regression equations (e.g., the Y-intercept and the regression coefficients). Here are the important definitions:
• Regression equation-The equation that defines the regression line (see Figure 15.9 in book and below).
• Here is the simple regression equation showing the relationship between starting salary (Y or your dependent variable) and GPA (X or your independent variable) (two of the variables in the data set included with this chapter on page 435).

Ŷ = 9,234.56 + 7,638.85 (X)

• The 9,234.56 is the Y intercept (look at the above regression line; it crosses the Y axis a little below $10,000; specifically, it crosses the Y axis at $9,234.56).

• The 7,638.85 is the simple regression coefficient, which tells you the average amount of increase in starting salary that occurs when GPA increases by one unit. (It is also the slope or the rise over the run).
• Now, you can plug in a value for X (i.e., GPA) and easily get the predicted starting salary.
• If you put in a 3.00 for GPA in the above equation and solve it, you will see that the predicted starting salary is $32,151.11.
• Now plug in another number within the range of the data (how about a 3.5) and see what the predicted starting salary is. (Check on your work: it is $35,970.54)
On pages 458-459, I show a multiple regression equation with two independent variables.
• The main difference is that in multiple regression, the regression coefficient is now called a partial regression coefficient, and this coefficient provides the predicted change in the dependent variable given a one unit change in the independent variable, controlling for the other independent variables in the equation. In other words, you can use multiple regression to control for other variables (i.e., for what we called in earlier chapters statistical control).
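If you would like to check the predicted starting salaries without a calculator, here is a minimal sketch in Python that plugs GPA values into the simple regression equation above. The intercept and slope are the ones given in the equation; the function name predicted_salary is my own, and I use the decimal module (with GPA passed as text) only so that the dollars-and-cents rounding is exact.

```python
from decimal import Decimal, ROUND_HALF_UP

INTERCEPT = Decimal("9234.56")  # Y intercept from the equation above
SLOPE = Decimal("7638.85")      # simple regression coefficient from the equation above


def predicted_salary(gpa):
    """Predicted starting salary for a GPA given as text, e.g. "3.5"."""
    y_hat = INTERCEPT + SLOPE * Decimal(gpa)
    # Round to the nearest cent.
    return y_hat.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(predicted_salary("3.00"))  # 32151.11
print(predicted_salary("3.5"))   # 35970.54

# A one-unit increase in GPA changes the prediction by the regression coefficient.
print(predicted_salary("4.00") - predicted_salary("3.00"))  # 7638.85
```

The last line shows what the regression coefficient means: moving GPA up by one unit raises the predicted starting salary by $7,638.85.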

Chapter 16 Inferential Statistics
(REMINDER: as you read the lectures, it’s a good idea to also look at the concept map for each chapter. The concept maps help to give you the big picture and see how the concepts are related. Here is the link to all of the concept maps; just select the one for this chapter: http://www.southalabama.edu/coe/bset/johnson/dr_johnson/2conceptmaps.htm)
This is probably the most challenging chapter in your book. However, you can understand it. It just takes attention and effort. After you carefully study the material, it will become clear to you. I will also be available to answer any questions you have.
Please start this chapter by taking a look (again) at the divisions in the field of statistics that were shown in Figure 15.1 (p. 434) and also shown in the previous lecture.
• This shows the "big picture."
• As you can see, inferential statistics is divided into estimation and hypothesis testing, and estimation is further divided into point and interval estimation.
Inferential statistics is defined as the branch of statistics that is used to make inferences about the characteristics of a population based on sample data.
• The goal is to go beyond the data at hand and make inferences about population parameters.
• In order to use inferential statistics, it is assumed that either random selection or random assignment was carried out (i.e., some form of randomization is assumed).
Looking at Table 16.1 (p. 464 and shown below) you can see that statisticians use Greek letters to symbolize population parameters (i.e., numerical characteristics of populations, such as means and correlations) and English letters to symbolize sample statistics (i.e., numerical characteristics of samples, such as means and correlations). For example, we use the Greek letter mu (i.e., µ) to symbolize the population mean and the Roman/English letter X with a bar over it (called X bar) to symbolize the sample mean.

Sampling Distributions

One of the most important concepts in inferential statistics is that of the sampling distribution. That's because the use of sampling distributions is what allows us to make "probability" statements in inferential statistics.
• A sampling distribution is defined as "The theoretical probability distribution of the values of a statistic that results when all possible random samples of a particular size are drawn from a population." (For simplicity you can view the idea of "all possible samples" as taking a million random samples. That is, just view it as taking a whole lot of samples!)
• One specific type of sampling distribution is called the sampling distribution of the mean. If you wanted to generate this distribution through the laborious process of doing it by hand (which you would NOT need to do in practice), you would randomly select a sample, calculate the mean, randomly select another sample, calculate the mean, and continue this process until you have calculated the means for all possible samples. This process will give you a lot of means, and you can construct a line graph to depict your sampling distribution of the mean (e.g., see Figure 16.1 on page 468).
• The sampling distribution of the mean is normally distributed (as long as your sample size is about 30 or more for your sampling).
• Also, note that the mean of the sampling distribution of the mean is equal to the population mean! That tells you that repeated sampling will, over the long run, produce the correct mean. The spread or variance shows you that sample means will tend to be somewhat different from the true population mean in most particular samples.
Although I just described the sampling distribution of the mean, it is important to remember that a sampling distribution can be obtained for any statistic. For example, you could also obtain the following sampling distributions:
• Sampling distribution of the percentage (or proportion).
• Sampling distribution of the variance.
• Sampling distribution of the correlation.
• Sampling distribution of the regression coefficient.
• Sampling distribution of the difference between two means.
The standard deviation of a sampling distribution is called the standard error. In other words, the standard error is just a special kind of standard deviation and you learned what a standard deviation was in the last chapter.
• The smaller the standard error, the less the amount of variability present in a sampling distribution.
It is important to understand that researchers do not actually empirically construct sampling distributions! When conducting research, researchers typically select only one sample from the population of interest; they do not collect all possible samples.
• The computer program that a researcher uses (e.g., SPSS and SAS) uses the appropriate sampling distribution for you.
• The computer program will look at the type of statistical analysis you select (and also consider certain additional information that you have provided, such as the sample size in your study), and then the statistical program selects the appropriate sampling distribution.
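Although you would never build a sampling distribution by hand, a computer can imitate the "whole lot of samples" idea quickly. Here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the population is a made-up set of 10,000 incomes, the sample size is 30, and I draw 1,000 samples rather than all possible samples.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same result each time

# Hypothetical population of 10,000 incomes (illustrative values only).
population = [random.uniform(20_000, 70_000) for _ in range(10_000)]
population_mean = statistics.mean(population)


def sample_means(pop, sample_size, n_samples):
    """Draw many random samples and return the mean of each one."""
    return [
        statistics.mean(random.sample(pop, sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(population, sample_size=30, n_samples=1_000)

mean_of_means = statistics.mean(means)   # should sit close to the population mean
standard_error = statistics.stdev(means)  # SD of the sampling distribution

print(f"population mean:      {population_mean:,.0f}")
print(f"mean of sample means: {mean_of_means:,.0f}")
print(f"standard error:       {standard_error:,.0f}")
```

The mean of the sample means lands very close to the population mean, and the standard deviation of those sample means is an estimate of the standard error.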

(It's kind of like the Greyhound Bus analogy: Leave the driving to us. In other words, SPSS will take care of generating the appropriate sampling distribution for you if you give it the information it needs.)
So please remember that the idea of sampling distributions (i.e., the idea of probability distributions obtained from repeated sampling) underlies our ability to make probability statements in inferential statistics.
Now, I'm going to cover the two branches of inferential statistics that were shown in Figure 15.1: estimation and hypothesis testing.

Estimation
The key estimation question is "Based on my random sample, what is my estimate of the population parameter?"
• The basic idea is that you are going to use your sample data to provide information about the population.
There are actually two types of estimation.
• They can be first understood through the following analogy: Let's say that you take your car to your local car dealer's service department and you ask the service manager how much it will cost to repair your car. If the manager says it will cost you $500 then she is providing a point estimate. If the manager says it will cost somewhere between $400 and $600 then she is providing an interval estimate.
In other words, a point estimate is a single number, and an interval estimate is a range of numbers.
• A point estimate is the value of your sample statistic (e.g., your sample mean or sample correlation), and it is used to estimate the population parameter (e.g., the population mean or the population correlation).
• For example, if you take a random sample from adults living in the United States and you find that the average income for the people in your sample is $45,000, then your best guess or your point estimate for the population of adults in the U.S. will be $45,000.
• In the above example, you used the value of the sample mean as the estimate of the population mean.
• Again, whenever you engage in point estimation, all you need to do is to use the value of your sample statistic as your "best guess" (i.e., as your estimate) of the (unknown) population parameter.
Oftentimes, we like to put an interval around our point estimates so that we realize that the actual population value is somewhat different from our point estimate because sampling error is always present in sampling.

• An interval estimate (also called a confidence interval) is a range of numbers inferred from the sample that has a known probability of capturing the population parameter over the long run (i.e., over repeated sampling).
• (See Figure 16.2, p. 471, for a picture of twenty different confidence intervals randomly jumping around the population mean from sample to sample.) Here it is for your convenience:
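To complement the picture of confidence intervals jumping around the population mean, here is a minimal simulation sketch in Python of what "over the long run" means. All of the numbers are assumptions made for illustration: a hypothetical population mean of $45,000 and standard deviation of $12,000, samples of 50 people, and the large-sample formula for a 95% interval (sample mean plus or minus 1.96 standard errors, where the standard error is the sample standard deviation divided by the square root of the sample size).

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same result each time

TRUE_MEAN = 45_000  # hypothetical population mean
TRUE_SD = 12_000    # hypothetical population standard deviation
Z_95 = 1.96         # normal critical value for 95% (large-sample assumption)


def confidence_interval(sample, z=Z_95):
    """Approximate 95% confidence interval for the mean of one sample."""
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / len(sample) ** 0.5
    return mean - z * standard_error, mean + z * standard_error


def coverage(n_intervals, sample_size):
    """Proportion of intervals, over repeated sampling, that capture the true mean."""
    hits = 0
    for _ in range(n_intervals):
        sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(sample_size)]
        low, high = confidence_interval(sample)
        if low <= TRUE_MEAN <= high:
            hits += 1
    return hits / n_intervals


long_run = coverage(n_intervals=1_000, sample_size=50)
print(f"Proportion of intervals capturing the true mean: {long_run:.3f}")
```

Any single interval either captures the population mean or it does not; it is the proportion over many repeated samples that comes out close to 95 percent.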

e. Cook. several additional procedures for this special case are discussed in Shadish.. in other words. you can use the estimation approach described in this chapter.000 is wider than the interval $43.000. 2002. (Note: if you expect the null to be true. Hypothesis Testing Hypothesis testing is the branch of inferential statistics that is concerned with how well the sample data support a null hypothesis and when the null hypothesis can be rejected in favor of the alternative hypothesis. • 95% confidence intervals are popular with many researchers. In this case. then you will be able to be "95% confident" that it will include the population parameter. You might ask: So why don’t we just use 99% confidence intervals rather than 95% intervals. You might find that the confidence interval is $43.e. If you have the computer program give you a 99% confidence interval.000. it will capture the true parameter 99% of the time in the long run). 90% confidence intervals or 99% confidence intervals).g. since you will make fewer mistakes? • The answer is that for a given sample size. you can be "95% confident" that the average income is somewhere between $43..) Specifically.000 (used earlier as a point estimate) and surround it by a 95% confidence interval.000 and $47.S.000 to 50. For example. then you can be "99% confident" that the confidence interval provided will include the population parameter (i. if you have the computer provide you with a 95 percent confidence interval (based on your data). the interval $40. we usually hope to “nullify” the null hypothesis and tentatively accept the alternative hypothesis.000 to $47. • The alternative hypothesis is the logical opposite of the null hypothesis and says there is a relationship in the population. your “level of confidence” is 95%. and Campbell’s book Experimental and QuasiExperimental Designs.000. 
52-53) • Here is the key question that is answered in hypothesis testing: "Is the value of my sample statistic unlikely enough (assuming that the null hypothesis is true) for me to reject the null hypothesis and tentatively accept the alternative hypothesis?" • Note that it is the null hypothesis that is directly tested in hypothesis testing (not the alternative hypothesis). you might take the point estimate of annual income of U. adults of $45.000 to $47. want to use other confidence intervals (e. That is. you may. • First note that the null hypothesis is usually the prediction that there is no relationship in the population. Page 148 of 179 . However.. For example. pp. less precise) than a 95% confidence interval. (You can't do this with a point estimate. the 99% confidence interval will be wider (i.• • • • The "beauty" of confidence intervals is that we know their probability (over the long run) of including the true population parameter. at times. • We use hypothesis testing when we expect a relationship to be present.
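Returning for a moment to the trade-off between 95% and 99% confidence intervals discussed earlier in this section, here is a minimal Python sketch of it. It is only an illustration under stated assumptions: the income figures are invented, the function name is hypothetical, and the interval uses the normal (z) approximation rather than whatever your statistical program (e.g., SPSS) computes.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level):
    """Return (lower, upper) for the population mean at the given confidence level.

    Assumption: uses the normal approximation (z critical value). Statistical
    packages usually use the t distribution, which gives slightly wider
    intervals for small samples.
    """
    n = len(sample)
    center = mean(sample)                      # the point estimate
    standard_error = stdev(sample) / sqrt(n)   # estimated standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return center - z * standard_error, center + z * standard_error


# Hypothetical annual incomes in dollars, invented for illustration only.
incomes = [38000, 41000, 43500, 44000, 45000, 45500, 46000, 47500, 49000, 50500]

ci_95 = mean_confidence_interval(incomes, 0.95)
ci_99 = mean_confidence_interval(incomes, 0.99)
width_95 = ci_95[1] - ci_95[0]
width_99 = ci_99[1] - ci_99[0]

print(f"95% interval: {ci_95[0]:,.0f} to {ci_95[1]:,.0f}")
print(f"99% interval: {ci_99[0]:,.0f} to {ci_99[1]:,.0f}")
print("99% interval is wider:", width_99 > width_95)
```

Both intervals are centered on the same point estimate; the higher confidence level simply uses a larger critical value, which is why the 99% interval comes out wider for the same sample.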

To get the idea of null hypothesis testing in your head, reread Exhibit 16.1 (p. 473 and shown below).

Exhibit 16.1 An Analogy From Jurisprudence
The United States criminal justice system operates on the assumption that the defendant is innocent until proven guilty beyond a reasonable doubt. In hypothesis testing, this assumption is called the null hypothesis. That is, researchers assume that the null hypothesis is true until the evidence suggests that it is not likely to be true. The researcher's null hypothesis might be that a technique of counseling does not work any better than no counseling. The researcher is kind of like a prosecuting attorney. The prosecuting attorney brings someone to trial when he or she believes there is some evidence against the accused, and the researcher brings a null hypothesis to "trial" when he or she believes there is some evidence against the null hypothesis (i.e., the researcher actually believes that the counseling technique does work better than no counseling). In the courtroom, the jury decides what constitutes reasonable doubt, and they make a decision about guilt or innocence. The researcher uses inferential statistics to determine the probability of the evidence under the assumption that the null hypothesis is true. If this probability is low, the researcher is able to reject the null hypothesis and accept the alternative hypothesis. If this probability is not low, the researcher is not able to reject the null hypothesis.
No matter what decision is made, things are still not completely settled because a mistake could have been made. In the courtroom, decisions of guilt or innocence are sometimes overturned or found to be incorrect. Similarly, in research, the decision to reject or not reject the null hypothesis is based on probability, so researchers sometimes make a mistake. However, inferential statistics gives researchers the probability of their making a mistake.

• Here is the main point: In the United States System of Jurisprudence, a defendant is "presumed innocent" until evidence calls this assumption into question. That is, the jury is told to assume that a person is innocent until they have heard all of the evidence and can make a decision. Likewise, in hypothesis testing, the null hypothesis is assumed to be true (i.e., it is assumed that there is no relationship) until evidence clearly calls this assumption into question.
• In jurisprudence, the jury rejects the claim of innocence (rejects the null) in the face of strong evidence to the contrary and makes the opposite conclusion that the defendant is guilty. Likewise, in hypothesis testing, the researcher rejects the null hypothesis in the face of strong evidence to the contrary. When the researcher rejects the null hypothesis (i.e., rejects the assumption of no relationship), he or she tentatively accepts the alternative hypothesis (i.e., which says there is a relationship in the population).
• In hypothesis testing, "strong evidence to the contrary" is found in a small probability value, which says the research result is unlikely if the null hypothesis is true.
• In short, in the procedure called hypothesis testing the researcher states the null and alternative hypotheses. Then if the probability value is small, the researcher rejects the null hypothesis and goes with the alternative hypothesis and makes the claim that statistical significance has been found.

Now take a look at the research questions and the null and alternative hypotheses shown below and in Table 16.2 (p. 474).
• When you look at the table be sure to notice that the null hypothesis has the equality sign in it and the alternative hypothesis has the "not equals" sign in it.
• You can also see in the table that hypotheses can be tested for many different kinds of research questions such as questions about means, correlations, and regression coefficients.

You may be wondering, when do you actually reject the null hypothesis and make the decision to tentatively accept the alternative hypothesis?
• Earlier I mentioned that you reject the null hypothesis when the probability of your result assuming a true null is very small. That is, you reject the null when the evidence would be unlikely under the assumption of the null.
• In particular, you set a significance level (also called the alpha level) to use in your research study, which is the point at which you would consider a result to be very unlikely. Then, if your probability value is less than or equal to your significance level, you reject the null hypothesis.
• It is essential that you understand the difference between the probability value (also called the p-value) and the significance level (also called the alpha level).
• The probability value is a number that is obtained from the SPSS computer printout. It is based on your empirical data, and it tells you the probability of your result or a more extreme result when it is assumed that there is no relationship in the population (i.e., when you are assuming that the null hypothesis is true, which is what we do in hypothesis testing and in jurisprudence).
• The significance level is just that point at which you would consider a result to be "rare." You are the one who decides on the significance level to use in your research study. A significance level is not an empirical result; it is the level that you set so that you will know what probability value will be small enough for you to reject the null hypothesis. The significance level that is usually used in education is .05.
• It boils down to this: if your probability value is less than or equal to the significance level (e.g., .05) then you will reject the null hypothesis and tentatively accept the alternative hypothesis. If not (i.e., if it is > .05) then you will fail to reject the null. You just compare your probability value with your significance level.
• You must memorize the definitions of probability value and significance level right away because they are at the heart of hypothesis testing.

At the most simple level, the process just boils down to seeing whether your probability value is less than (or equal to) your significance level. If it is, you are happy because you can reject the null hypothesis and make the claim of statistical significance. (Still, don't forget the last step of determining practical significance.)

This full process of hypothesis testing is summarized in Table 16.3 (p. 480) and shown below. Here is Table 16.3, in case you don't have your book handy.
• Be sure to note the final step shown in the table, because after conducting a hypothesis test, you must interpret your results, make a substantive, real-world decision, and determine the practical significance of your result.
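The comparison of the probability value with the significance level described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the probability value itself would still come from your statistical program.

```python
def hypothesis_decision(p_value, alpha=0.05):
    """Apply the decision rule: reject the null when p_value <= alpha.

    p_value comes from the data (the empirical result); alpha is the
    significance level the researcher sets before looking at the data.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a probability value must be between 0 and 1")
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(hypothesis_decision(0.03))   # below the .05 significance level
print(hypothesis_decision(0.05))   # equal to the significance level still rejects
print(hypothesis_decision(0.20))   # above the significance level
```

Note that the function only settles statistical significance. The final step, judging practical significance, is a decision the researcher still has to make.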

At the end of step four you will know whether your result is statistically significant. Step 5 shows that you must decide what the results of your research study actually mean.
• Statistical significance does not tell you whether you have practical significance.

• If a finding is statistically significant then you can claim that the evidence suggests that the observed result (e.g., your observed correlation or your observed difference between two means) was probably not just due to chance. That is, there probably is some nonzero relation present in the population.
• An effect size indicator can aid in your determination of practical significance and should always be examined to help interpret the strength of a statistically significant relationship. An effect size indicator is defined as a measure of the strength of a relationship.
• A finding is practically significant when the difference between the means or the size of the correlation is big enough, in your opinion, to be of practical use. For example, a correlation of .15 would probably not be practically significant, even if it was statistically significant. On the other hand, a correlation of .85 would probably be practically significant.
• Practical significance requires you to make a non-quantitative decision and to think about many different factors such as the size of the relationship, whether an intervention would transfer well to the real world, the costs of using a statistically significant intervention in the real world, etc. It is a decision that YOU make.

The next idea is for you to realize that you will either make a correct decision about statistical significance or you will make an error whenever you conduct a hypothesis test.
• This idea is shown in Table 16.5 (p. 482) and here for your convenience.
• Looking at the top of the table (i.e., above the two columns) you will see that the null hypothesis is either true or not true in the empirical world.
• If you look at the side of the table (i.e., beside the two rows) you will see that you must make a decision to either fail to reject or to reject the null hypothesis.
• The four logical possibilities of hypothesis testing are shown in the table. When the null is false you want to reject it, but when it is true you do not want to reject it.
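The four logical possibilities can also be written out as a small Python sketch. The function and argument names are hypothetical, and the two error labels it returns (Type I and Type II) are the terms defined in the bullets that follow. Keep in mind that in real research you never know for certain whether the null is true, so this is a way of laying out the logic rather than something you could run on a real study.

```python
def classify_outcome(null_is_true, rejected_null):
    """Name the cell of the 2 x 2 decision table.

    null_is_true  : the (unknown) state of the empirical world
    rejected_null : the decision the researcher made
    """
    if null_is_true and rejected_null:
        return "Type I error (false positive)"
    if null_is_true and not rejected_null:
        return "correct decision (true null not rejected)"
    if not null_is_true and rejected_null:
        return "correct decision (false null rejected)"
    return "Type II error (false negative)"


# Walk through all four combinations of world state and decision.
for null_is_true in (True, False):
    for rejected_null in (False, True):
        print(null_is_true, rejected_null, "->", classify_outcome(null_is_true, rejected_null))
```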

• When the null hypothesis is true you can make the correct decision (i.e., fail to reject the null) or you can make the incorrect decision (rejecting the true null). The incorrect decision is called a Type I error or a "false positive" because you have erroneously concluded that there is an effect or relationship in the population.
• When the null hypothesis is false you can also make the correct decision (i.e., rejecting the false null) or you can make the incorrect decision (failure to reject the false null). The incorrect decision is called a Type II error or a "false negative" because you have erroneously concluded that there is no effect or relationship in the population.
• You need to memorize the definitions of Type I and Type II errors, and after working with many examples of hypothesis testing they will become easier to ponder.
• Exercise: In law, a person is presumed to be innocent (i.e., that is the null hypothesis). Explain the idea of Type I and Type II errors here. Which error has occurred when an innocent person is found guilty? Which error has occurred when a guilty person is found innocent by the jury? (The answers are below.)

Hypothesis Testing in Practice
In this last section of the chapter, I apply the process of hypothesis testing (which is also called "significance testing") to the data set given in Table 15.1 (p. 435) and shown again here (below).

• Note that there are three quantitative variables and two categorical variables (can you list them?).
• Since we are now using this data set for inferential statistics, we will assume that the 25 people were randomly selected.
• Also note that I will use the significance level of .05 for all of my statistical tests below.

(The answers to the earlier questions about the two types of errors are: in the first case a Type I error was made, and in the second case a Type II error was made.)
• Before I test some hypotheses, I want to point out the reason WHY we use hypothesis or significance testing: We do it because researchers do not want to interpret findings that are not statistically significant because these findings are probably nothing but a reflection of chance fluctuations.

Note that in all of the following examples I will be doing the same thing. I will get the p-value and compare it to my preset significance level of .05 to see if the relationship is statistically significant. And then I will also interpret the results by looking at the data, looking at an effect size indicator, and by thinking about the practical importance of the result. Again, after practice, significance testing becomes very easy because you do the same procedure every single time. Determining the practical significance is probably the hardest part.

t-Test for Independent Samples
One frequently used statistical test is called the t-test for independent samples. We do this when we want to determine if the difference between two groups is statistically significant. Here is an example of the t-test for independent samples using our recent college graduate data set:
• Research Question: Is the difference between the average starting salary for males and the average starting salary for females statistically significant?
Here are the hypotheses (note that they are stated in terms of population parameters):
• Null Hypothesis H0: µM = µF (i.e., the population mean for males equals the population mean for females)
• Alternative Hypothesis H1: µM ≠ µF (i.e., the population mean for males does not equal the population mean for females)
• The probability value was .048 (I got this off of my SPSS printout).
• Since my probability value is less than my significance level of .05, I reject the null hypothesis and accept the alternative.
• I conclude that the difference between the two means is statistically significant.
• Now I would need to look at the actual means and interpret them for substantive and practical significance.
• The males' mean is $34,333.33 and the females' mean is $31,076.92.
• I can simply look at these means and see how different they are.
• To help in judging how different the means are, I also calculated an effect size indicator called eta-squared which was equal to .16. This tells me that gender explains 16% of the variance in starting salary in my data set.
• I conclude that males earn more than females, and because this is an important issue in society, I also conclude that this difference is practically significant.

One-Way Analysis of Variance
One-way analysis of variance is used to compare two or more group means for statistical significance. Here is an example using our "recent college graduate" data set:

• Research Question: Is there a statistically significant difference in the starting salaries of education majors, arts and sciences majors, and business majors?
Here are the hypotheses (note that they are stated in terms of population parameters):
• Null Hypothesis. H0: µE = µA&S = µB (i.e., the population means for education students, arts and sciences students, and business students are all the same)
• Alternative Hypothesis. H1: Not all equal (i.e., the population means are not all the same)

• The probability value was .001 (I got this off of my SPSS printout).
• Since .001 is less than .05, I reject the null hypothesis and accept the alternative. I conclude that at least two of the means are significantly different.
• The effect size indicator, eta-squared, was equal to .467, which says that almost 47 percent of the variance in starting salary was explained or accounted for by differences in college major.
• Now I need to find out which of the three means are different.
• In order to decide which of these three means are significantly different, I must follow the "post hoc testing" procedure explained in the next section. Notice that if I had done an ANOVA with an independent variable that was composed of only two groups, I would not need follow-up tests (which are only needed when there are three or more groups).

Post Hoc Tests in Analysis of Variance
Here are the three average starting salaries for the three groups examined in the previous analysis of variance (i.e., these are the three sample means):
• Education: $29,500
• Arts and Sciences: $32,300
• Business: $36,714.29
The question in post hoc testing is "Which pairs of means are significantly different?" In this case that results in three post hoc tests that need to be conducted:

1. First, is the difference between education and arts and sciences statistically significant?
• Here are the null and alternative hypotheses for this first post hoc test:
• Null Hypothesis H0: µE = µA&S (i.e., the population mean for education majors equals the population mean for arts and sciences majors)
• Alternative Hypothesis H1: µE ≠ µA&S (i.e., the population mean for education majors does not equal the population mean for arts and sciences majors)
• The Bonferroni "adjusted" p-value, which I got off the SPSS printout, was .233.
• Since .233 is > .05, I fail to reject the null that the population means for education and arts and sciences are equal.
• In short, this difference was not statistically significant.

2. Second, is the difference between education and business statistically significant?
• Here are the null and alternative hypotheses for this second post hoc test:
• Null Hypothesis H0: µE = µB (i.e., the population mean for education majors equals the population mean for business majors)
• Alternative Hypothesis H1: µE ≠ µB (i.e., the population mean for education majors does not equal the population mean for business majors)
• The adjusted p-value was .001.
• Since .001 is < .05, I reject the null that the two population means are equal.
• I make the claim that the difference between the means is statistically significant.
• I also claim that the salaries are higher for business than for education students in the populations from which they were randomly selected.
• Because this finding could affect many students' choices about majors and because it may also reflect the nature of salary setting by the private versus public sectors, I also conclude that this difference is practically significant.

3. Third, is the difference between arts and sciences and business statistically significant?
• Here are the null and alternative hypotheses for this third post hoc test:
• Null Hypothesis H0: µB = µA&S (i.e., the population mean for business majors equals the population mean for arts and sciences majors)
• Alternative Hypothesis H1: µB ≠ µA&S (i.e., the population mean for business majors does not equal the population mean for arts and sciences majors)
• The adjusted p-value was .031.
• Since .031 is < .05, I reject the null hypothesis that the two population means are equal.
• I make the claim that this difference between the means is statistically significant.
• I also claim that the salaries are higher for business than for arts and sciences students in the populations from which they were randomly selected.
• Because this finding could affect students' choices about majoring in business versus arts and sciences, I believe that this finding is practically significant.

In short, based on my post hoc tests, I have found that two of the differences in starting salary were statistically significant, and, in my view, these differences were also practically significant.

The t-Test for Correlation Coefficients
This test is used to determine whether an observed correlation coefficient is statistically significant. Here is an example using our "recent college graduate" data set:
• Research Question: Is there a statistically significant correlation between GPA (X) and starting salary (Y)?
• Here are the hypotheses:

• Null Hypothesis. H0: ρXY = 0 (i.e., there is no correlation in the population)
• Alternative Hypothesis. H1: ρXY ≠ 0 (i.e., there is a correlation in the population)
• The observed correlation in the sample was .63.
• The probability value was .001.
• Since .001 is < .05, I reject the null hypothesis.
• The observed correlation was statistically significant. I conclude that GPA and starting salary are correlated in the population.
• If you square the correlation coefficient you obtain a "variance accounted for" effect size indicator: .63 squared is .397, which means that almost 40 percent of the variance in starting salary is explained or accounted for by GPA.
• Because the effect size is large and because GPA is something that students can control through studying, I conclude that this statistically significant correlation is also practically significant.
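The arithmetic in the correlation example above can be checked with a short Python sketch. It uses the reported correlation of .63 and the 25 people in the data set. The function names are hypothetical, and the t formula shown is the standard textbook one for testing a correlation against zero; the probability value itself would still come from your statistical program.

```python
from math import sqrt


def variance_accounted_for(r):
    """Square a correlation coefficient to get the 'variance accounted for' effect size."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must be between -1 and 1")
    return r * r


def t_for_correlation(r, n):
    """Standard t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


r = 0.63   # observed correlation between GPA and starting salary
n = 25     # number of recent college graduates in the data set

print("variance accounted for:", round(variance_accounted_for(r), 3))
print("t statistic:", round(t_for_correlation(r, n), 2), "with", n - 2, "degrees of freedom")
```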

The t-Test for Regression Coefficients
This test is used to determine whether a regression coefficient is statistically significant. The multiple regression equation analyzed in the last chapter is shown here again, but this time we will test each of the two regression coefficients for statistical significance.

Ŷ = 3,890.05 + 4,675.41(X1) + 26.13(X2)

where,
Ŷ is predicted starting salary
3,890.05 is the Y intercept (or predicted starting salary when GPA and GRE Verbal are zero)
4,675.41 is the regression coefficient for grade point average
26.13 is the regression coefficient for GRE Verbal
X1 is grade point average (GPA)
X2 is GRE Verbal

Research Question One: Is there a statistically significant relationship between starting salary (Y) and GPA (X1) controlling for GRE Verbal (X2)? That is, is the first regression coefficient statistically significant?
• Here are the hypotheses:
• Null Hypothesis. H0: βYX1.X2 = 0 (i.e., the population regression coefficient expressing the relationship between starting salary and GPA, controlling for GRE Verbal, is equal to zero; that is, there is no relationship)
• Alternative Hypothesis. H1: βYX1.X2 ≠ 0 (i.e., the population regression coefficient expressing the relationship between starting salary and GPA, controlling for GRE Verbal, is NOT equal to zero; that is, there IS a relationship)
• The observed regression coefficient was 4,496.45.
• The probability value was .035.
• Since .035 is < .05, I conclude that the relationship expressed by this regression coefficient is statistically significant.
• A good measure of effect size for regression coefficients is the semi-partial correlation squared (sr2). In this case it is equal to .10, which means that 10% of the variance in starting salary is uniquely explained by GPA.
• Because GPA is something we can control and because the effect explains a good amount of variance in starting salary, I conclude that the relationship expressed by this regression coefficient is practically significant.

Research Question Two: Is there a statistically significant relationship between starting salary (Y) and GRE Verbal (X2), controlling for GPA (X1)? That is, is the second regression coefficient statistically significant?
• Here are the hypotheses:
• Null Hypothesis. H0: βYX2.X1 = 0 (i.e., the population regression coefficient expressing the relationship between starting salary and GRE Verbal, controlling for GPA, is equal to zero; that is, there is no relationship)
• Alternative Hypothesis. H1: βYX2.X1 ≠ 0 (i.e., the population regression coefficient expressing the relationship between starting salary and GRE Verbal, controlling for GPA, is NOT equal to zero; that is, there IS a relationship)
• The observed regression coefficient was 26.13.
• The probability value was .014.
• Since .014 is < .05, I conclude that the relationship expressed by this regression coefficient is statistically significant.
• A good measure of effect size for regression coefficients is the semi-partial correlation squared (sr2). In this case it is equal to .15, which means that 15% of the variance in starting salary is uniquely explained by GRE Verbal.
• Because GRE Verbal is also something we can work at (as well as take preparation programs for) and because the effect explains 15% of the variance in starting salary, I conclude that the relationship expressed by this regression coefficient is practically significant.

The Chi-Square Test for Contingency Tables
This test is used to determine whether a relationship observed in a contingency table is statistically significant.
• Research Question: Is the observed relationship between college major and gender statistically significant?
• The probability value was .046.
• Since .046 is < .05, I conclude that the observed relationship in the contingency table shown in Table 16.6 (p. 492) is statistically significant.
• The effect size indicator used for this contingency table is Cramer's V. It was equal to .496, which tells us that the relationship is moderately large.

• Because the effect size indicator suggested a moderately large relationship and because of the importance of these variables in real world politics, I would also conclude that this relationship is practically significant.

Believe it or not, we are done. You have now come a long way toward understanding the logic of significance testing. My goal in this last section was to show that every single time we do one of these tests, you do the same thing. You get your probability value, compare it to your significance level, and, finally, you make a decision. Remember, when reading journal articles look out for those probability values (to see if they are less than .05), and also look for effect sizes and statements about whether a finding is practically significant.

Congratulations!
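To see the "same thing every time" point at a glance, here is a short Python sketch that runs the probability values reported in this section through one and the same comparison. The values are the ones quoted in the examples above; the labels and variable names are my own shorthand, and practical significance still has to be judged separately for each result.

```python
ALPHA = 0.05  # significance level used for every test in this section

# (test, probability value) pairs as reported in the examples above.
reported_results = [
    ("t-test: male vs. female starting salary", 0.048),
    ("One-way ANOVA: salary by college major", 0.001),
    ("Post hoc: education vs. arts and sciences", 0.233),
    ("Post hoc: education vs. business", 0.001),
    ("Post hoc: business vs. arts and sciences", 0.031),
    ("Correlation: GPA and starting salary", 0.001),
    ("Regression coefficient: GPA", 0.035),
    ("Regression coefficient: GRE Verbal", 0.014),
    ("Chi-square: college major by gender", 0.046),
]

significant = []
not_significant = []
for name, p_value in reported_results:
    # The identical decision rule is applied to every test.
    if p_value <= ALPHA:
        significant.append(name)
        verdict = "reject the null"
    else:
        not_significant.append(name)
        verdict = "fail to reject the null"
    print(f"{name:<45} p = {p_value:.3f}  {verdict}")
```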

Chapter 17 Qualitative Data Analysis
(Reminder: Don't forget to utilize the concept maps and study questions as you study this and the other chapters.)

The purposes of this chapter are to help you to grasp the language and terminology of qualitative data analysis and to help you understand the process of qualitative data analysis.

Interim Analysis
Data analysis tends to be an ongoing and iterative (nonlinear) process in qualitative research.
• The term we use to describe this process is interim analysis (i.e., the cyclical process of collecting and analyzing data during a single research study).
• Interim analysis continues until the process or topic the researcher is interested in is understood (or until you run out of time and resources!).

Memoing
Throughout the entire process of qualitative data analysis it is a good idea to engage in memoing (i.e., recording reflective notes about what you are learning from your data).
• The idea is to write memos to yourself when you have ideas and insights and to include those memos as additional data to be analyzed.

Data Entry and Storage
Qualitative researchers usually transcribe their data; that is, they type the text (from interviews, observational notes, memos, etc.) into word processing documents.
• It is these transcriptions that are later analyzed, typically using one of the qualitative data analysis computer programs discussed later in this chapter.

Coding and Developing Category Systems
This is the next major stage of qualitative data analysis.
• It is here that you carefully read your transcribed data, line by line, and divide the data into meaningful analytical units (i.e., segmenting the data).
• When you locate meaningful segments, you code them.
• Coding is defined as marking the segments of data with symbols, descriptive words, or category names.
• Again, whenever you find a meaningful segment of text in a transcript, you assign a code or category name to signify that particular segment. You continue this process until you have segmented all of your data and have completed the initial coding.
• During coding, you must keep a master list (i.e., a list of all the codes that are developed and used in the research study). Then, the codes are reapplied to new segments of data each time an appropriate segment is encountered.

To experience the process of coding, look at Table 17.2 and then try to segment and code the data. After you are finished, compare your results with the results shown in Table 17.3. These are shown here for your convenience.

• Don't be surprised if your results are different from mine. As you can see, qualitative research is very much an interpretative process!

Now look at how I coded the above data.


Qualitative research is more defensible when multiple coders are used and when high inter- and intra-coder reliability are obtained.
• Intercoder reliability refers to consistency among different coders.
• Intracoder reliability refers to consistency within a single coder.

Inductive and a Priori Codes
There are many different types of codes that are commonly used in qualitative data analysis.
• You may decide to use a set of already existing codes with your data. These are called a priori codes.
• A priori codes are codes that are developed before examining the current data.
• Many qualitative researchers like to develop the codes as they code the data. These codes are called inductive codes.
• Inductive codes are codes that are developed by the researcher by directly examining the data.

Co-Occurring and Facesheet Codes
As you code your data, you may find that the same segment of data gets coded with more than one code. That's fine, and it commonly occurs. These sets of codes are called co-occurring codes.
• Co-occurring codes are codes that partially or completely overlap. In other words, the same lines or segments of text may have more than one code attached to them.

Oftentimes you may have an interest in the characteristics of the individuals you are studying. Therefore, you may use codes that apply to the overall protocol or transcript you are coding.
• For example, in looking at language development in children you might be interested in age or gender.
• These codes that apply to the entire document or case are called facesheet codes.

After you finish the initial coding of your data, you will attempt to summarize and organize your data. You will also continue to refine and revise your codes. This next major step of summarizing your results includes such processes as enumeration and searching for relationships in the data.

Enumeration
Enumeration is the process of quantifying data, and yes, it is often done in "qualitative" research.
• For example, you might count the number of times a word appears in a document or you might count the number of times a code is applied to the data.
• Enumeration is very helpful in clarifying words that you will want to use in your report such as "many," "some," "a few," "almost all," and so on. The numbers will help clarify what you mean by frequency.
• When reading "numbers" in qualitative research, you should always check the basis of the numbers. For example, if one word occurs many times and the basis is the total number of words in all the text documents, then the reason could be that many people used the word or it could be that only one person used the word many times.
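The warning about checking the basis of the numbers can be made concrete with a small Python sketch. The transcripts below are invented for illustration, and the variable names are hypothetical. The sketch counts each word two ways: how many times it occurs across all of the text, and how many participants used it at least once.

```python
import re
from collections import Counter

# Hypothetical interview excerpts, invented for illustration only.
transcripts = {
    "participant_1": "The homework was stressful. Tests were stressful and the pace was stressful.",
    "participant_2": "I liked the group projects.",
    "participant_3": "Group work helped, but the tests felt rushed.",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


total_counts = Counter()        # basis: all words in all documents
participants_using = Counter()  # basis: number of participants

for person, text in transcripts.items():
    words = tokenize(text)
    total_counts.update(words)
    participants_using.update(set(words))  # count each person at most once per word

for word in ("stressful", "tests", "group"):
    print(f"{word}: {total_counts[word]} occurrences, "
          f"used by {participants_using[word]} of {len(transcripts)} participants")
```

In this made-up data "stressful" is the most frequent of the three words, yet only one person used it, which is exactly the situation the last bullet above warns about.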

Creating Hierarchical Category Systems
Sometimes codes or categories can be organized into different levels or hierarchies.
• For example, the category of fruit has many types falling under it (e.g., oranges, grapefruit, kiwi, etc.). The idea is that some ideas or themes are more general than others, and thus the codes are related vertically.
• One interesting example (shown in Figure 17.2 on page 512) is Frontman and Kunkel's hierarchical classification showing the categorization of counselors' construal of success in the initial counseling session (i.e., what factors do counselors view as being related to success). Their classification system has four levels and many categories.
• Here is a part of their hierarchical category system:

Showing Relationships Among Categories
Qualitative researchers have a broad view of what constitutes a relationship. The hierarchical system just shown is one type of relationship (a hierarchy or strict inclusion type).

• Several other possible types of relationships that you should be on the lookout for are shown in Table 17.6 (p. 514) and shown below for your convenience.
• For practice, see if you can think of an example of each of Spradley's types of relationships. Also, see if you can think of some types of relationships that Spradley did not mention.

In Figure 17.3 you can see a typology, developed by Patton, of teacher roles in dealing with high school dropouts. Typologies (also called taxonomies) are an example of Spradley's "strict inclusion" type of relationship.

" "Complainer"). Patton provided very descriptive labels of the nine roles shown in the matrix (e.7 (p.g. Then Patton used the strategy of crossing two one-dimensional typologies to form a two dimensional matrix.7: Page 169 of 179 .517 and here for your convenience).. In Table 17. Patton first developed two separate dimensions or continuums or typologies in his data: (1) teachers' beliefs about how much responsibility they should take and (2) teachers' views about effective intervention strategies. you can see another set of categories developed from a developmental psychology qualitative research study. This is an example of Spradley's "sequence" type of relationship.Patton's example is interesting because it demonstrates a strategy that you can use to relate separate dimensions found in your data. "Ostrich." "Counselor/friend. • These categories are ordered by time and show the characteristics (subcategories) that are associated with five stages of development in old age that were identified in this study. resulting in a new typology that relates the two dimensions. Here is Table 17. • As you can see.

Drawing Diagrams

In the next section of the chapter, we discuss another tool for organizing and summarizing your qualitative research data. In particular, we discuss the process of diagramming. Diagramming is the process of making a sketch, drawing, or outline to show how something works or clarify the relationship between the parts of a whole.
• The use of diagrams is especially helpful for visually oriented learners.
• There are many types of diagrams that can be used in qualitative research. For some examples, look again at Figure 17.2 on page 512 and Figure 17.3 on page 516.
• One type of diagram used in qualitative research that is similar to the diagrams used in causal modeling (e.g., Figure 11.5 on page 352) is called a network diagram.
• A network diagram is a diagram showing the direct links between categories, variables, or events over time.
• An example of a network diagram based on qualitative research is shown in Figure 17.4 and below for your convenience.

It is also helpful to develop matrices to depict your data.
• A matrix is a rectangular array formed into rows and columns.
• Patton’s typology of teacher roles shown above is an example of a matrix.
• Developing a matrix is an excellent way to both find and show a relationship in your qualitative data.
• You can see examples of many different types of matrices (classifications usually based on two or more dimensions) and diagrams in Miles and Huberman's (1994) helpful book titled "Qualitative Data Analysis: An Expanded Sourcebook."
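The strategy of crossing two one-dimensional typologies to form a matrix can be sketched in a few lines of Python. This is only an illustration: the level names ("low", "strategy A", and so on), the segment label, and the function name are placeholders I made up. They are not Patton's actual categories or role labels.

```python
from itertools import product

# Hypothetical levels for two one-dimensional typologies (placeholders only).
responsibility = ["low", "medium", "high"]
strategy = ["strategy A", "strategy B", "strategy C"]

def cross_typologies(rows, cols):
    """Build a matrix (dict of dicts) with one empty cell per row/column pair."""
    return {r: {c: [] for c in cols} for r in rows}

matrix = cross_typologies(responsibility, strategy)

# File a (made-up) data segment in the cell where the two dimensions meet.
matrix["low"]["strategy A"].append("segment 12")

# Every cell of the matrix is one combination of the two dimensions.
cells = list(product(responsibility, strategy))
print(len(cells), "cells in the crossed typology")
```

Once every segment has been filed in a cell, you can look at each cell and decide whether it deserves a descriptive label of its own, much as Patton did for his teacher roles.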

especially in Table 8.) Corroborating and Validating Results As shown in the depiction of data analysis in qualitative research in Figure 17.1. (More information about writing the qualitative report is given in the next chapter.As you can see. Page 171 of 179 . there are many interesting kinds of relationships to look for in qualitative research and there are many different ways to find. corroborating and validating the results is an essential component of data analysis and the qualitative research process. • Corroborating and validating should be done throughout the qualitative data collection.2 which is reproduced here for your convenience. analysis. • Many strategies are provided in Chapter 8. depict. Otherwise. and write-up process. and present the results in your qualitative research report. there is no reason to conduct a research study. • This is essential because you want to present trustworthy results to your readers.

Computer Programs for Qualitative Data Analysis

In this final section of the chapter, we discuss the use of computer programs in qualitative data analysis.
• Traditionally, qualitative data were analyzed "by hand" using some form of filing system.
• The availability of computer packages (that are specifically designed for qualitative data and analysis) has significantly reduced the need for the traditional filing technique.

• The most popular qualitative data analysis packages, currently, are NUDIST, ATLAS, and Ethnograph.
• Most of these companies will provide you, free of charge, with demonstration copies of these packages.

Here is a table not included in your book that provides the links to the major qualitative software programs.

Bonus Table: Websites for Qualitative Data Analysis Programs

Program name        Website address
AnSWR (freeware)    http://www.cdc.gov/hiv/software/answr.htm
ATLAS               http://atlasti.de/
Ethnograph          http://qualisresearch.com
HyperResearch       http://researchware.com
Nvivo               http://www.qsrinternational.com
NUD-IST             http://www.qsrinternational.com

• Qualitative data analysis programs can facilitate most of the techniques we have discussed in this chapter (e.g., storing and coding, creating classification systems, enumeration, attaching memos, finding relationships, and producing graphics).
• One highly useful tool available in computer packages is Boolean operators, which can be used in performing complex searches that would be very time consuming if done manually.
• Boolean operators are words that are used to create logical combinations such as AND, OR, NOT, IF, THEN, and EXCEPT. For example, you can search for the co-occurrence of codes, which is one way to begin identifying relationships among your codes.

I concluded the chapter by listing several advantages and disadvantages of computer packages for qualitative data analysis.

You now know the basics of qualitative data analysis!
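If you are curious what a Boolean search for co-occurring codes looks like under the hood, here is a minimal sketch in Python. It does not reproduce how any of the packages listed above work. The segment numbers, code names, and the search function with its parameters are all hypothetical. It covers only AND, OR, and NOT.

```python
# Hypothetical data: segment number -> set of codes attached to that segment.
segments = {
    1: {"stress", "workload"},
    2: {"stress", "support"},
    3: {"support"},
    4: {"stress", "workload", "support"},
}

def search(coded, all_of=(), any_of=(), none_of=()):
    """Return segment numbers matching a simple Boolean query.

    all_of  -> AND: every listed code must be attached to the segment
    any_of  -> OR:  at least one listed code must be attached (if any are given)
    none_of -> NOT: no listed code may be attached
    """
    hits = []
    for seg_id, codes in sorted(coded.items()):
        if not set(all_of) <= codes:
            continue
        if any_of and not (set(any_of) & codes):
            continue
        if set(none_of) & codes:
            continue
        hits.append(seg_id)
    return hits

# Co-occurrence of two codes: stress AND workload.
print(search(segments, all_of=["stress", "workload"]))
# stress AND NOT support.
print(search(segments, all_of=["stress"], none_of=["support"]))
```

The first query returns the segments where the two codes co-occur, which is the starting point for asking whether those codes are related.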

Chapter 18 Writing the Research Report

(Reminder: Don’t forget to utilize the concept maps and study questions as you study this and the other chapters.)

The purpose of this final chapter is to provide useful advice on how to organize and write a research paper that has the potential for publication. There are four main sections in this chapter:
1. General Principles Related to Writing the Research Report.
2. Writing Quantitative Research Reports Using the APA Style.
3. Writing Qualitative Research Reports.
4. Writing Mixed Research Reports.

General Principles Related to Writing the Research Report

We begin this section with some general writing tips and by listing some sources on writing.
• Simple, clear, and direct communication should be your most important goal when you write a research report.

Language

The following three guidelines will help you select appropriate language in your report:
1. Choose accurate and clear words that are free from bias. One way to do this is to be very specific rather than less specific.
2. Avoid labeling people whenever possible.
3. Write about your research participants in a way that acknowledges their participation.
• For example, avoid the impersonal term "subject" or "subjects"; words such as “research participants” or children or adults are preferable.

Keeping in mind the above guidelines, you should give special attention to the following issues which are explained more fully in our chapter and, especially, in the APA Publication Manual:
• Gender. The bottom line is to avoid sexist language.
• Sexual Orientation. Terms such as homosexual should be replaced with terms such as lesbians, gay men, and bisexual women or men. Specific instances of sexual behavior should be referred to with terms such as same gender, male-male, female-female, and male-female.
• Racial and Ethnic Identity. Ask participants about their preferred designations and use them. When writing this term, capitalize it (e.g., African American).
• Disabilities. Do not equate people with their disability. For example, refer to a participant as a person who has cancer rather than as a cancer victim.

i.). Physical Measurements Page 175 of 179 . you can now use italics directly rather than using underlines to signal what is to be italicized.e.f. Acceptable terms are boy and girl. and ending with a period. Call people eighteen and older men and women. and place the second heading on the left side in upper. italicized. Numbers • Use words for numbers that begin a sentence and for numbers that are below ten.• Age. male adolescent and female adolescent.. Quotations of 40 or more words should be displayed in a free standing block of lines without quotation marks. do the first two levels as just shown for two levels.and lowercase letters and in italics. young man and young woman.. The author. Headings • The APA Manual and our chapter specifies five different levels of headings and the combinations in which they are to be used in your report. c. indented. year.. use italics infrequently. center the first level and use upper. and try to use conventional abbreviations (such as IQ. Abbreviations • Use abbreviations sparingly. • As a general rule. Editorial Style Italics. Here is an example of how to use three levels of headings: Method Procedure Instruments. and specific page from which the quote is taken should always be included. • See the APA Publication Manual for exceptions to this rule. do not use all caps). Older person is preferred to elderly.and lowercase letters.g. etc. The third level should be in upper. (Start the text on this same line) Quotations • Quotations of fewer than 40 words should be inserted into the text and enclosed in double quotation marks. Here is an example: Method Procedure • • If you are using three levels of headings. If you are submitting a paper for publication.and lowercase letters (i. e..e. • If you are using two levels of headings.

• APA recommends using metric units for all physical measurements. You can also use other units, as long as you include the metric equivalent in parentheses.

Presentation of Statistical Results
• Provide enough information to allow the reader to corroborate the results. See your book and the APA manual for specifics (e.g., an analysis of variance significance test of four group means would be presented like this: F(3, 32) = 8.79, p = .03).
• Note that the use of an equal sign is preferred when reporting probability values.
• If a probability value is less than .001, then use p < .001 rather than p = .000.

Reference Citations in the Text
• APA format is an author-date citation method. The text shows the specifics.
• Here is one example: "Smith (1999) found that . . . ."
• Frequently you will put references at the end of sentences. Here is an example: “Mastery motivation has been found to affect achievement with very young children (Turner & Johnson, 2003).”

Reference List
• All citations in the text must appear in the reference list. See page 456 of the text or the APA Manual for the specific format to follow.
• Here are two examples:

American Psychological Association. (1994). Publication manual of the American Psychological Association (4th ed.). Washington, DC: Author.

Turner, L.A., & Johnson, R.B. (2003). A model of mastery motivation for at-risk preschoolers. Journal of Educational Psychology, 95(3), 495-505.

Typing
• Double space all material.
• Use 1-inch margins.
• Use only one space between the end of a sentence and the beginning of the next sentence.

Writing Quantitative Research Reports Using the APA Style

There are seven major parts to the research report:
1. Title page
2. Abstract
3. Introduction
4. Method
5. Results
6. Discussion
7. References

I will make a few brief comments on each of these below.

1. Title Page
• Your paper title should summarize the main topic of the paper in about 10 to 12 words.

2. Abstract
• This should be a comprehensive summary which is about 120 words. For a manuscript submitted for review, it is typed on a separate page.

3. Introduction
• This section is not labeled. It should present the research problem and place it in the context of other research literature in the area.

4. Method
• This section does not start on a separate page in a manuscript being submitted for review.
• The most common subsections are Participants (e.g., list the number of participants, their characteristics, and how they were selected), Apparatus or Materials or Instruments (e.g., list materials used and how they can be obtained), and Procedure (e.g., provide a step-by-step account of what the researcher and participants did during the study so that someone could replicate it).

5. Results
• This does not start on a separate page in your manuscript.
• It is where you report on the results of your data analysis and statistical significance testing.
• Be sure to report the significance level that you are using (e.g., "An alpha level of .05 was used in this study") and report your observed effect sizes along with the tests of statistical significance.
• Tables and figures are expensive but can be used when they effectively illustrate your ideas.

6. Discussion
• This is where you interpret and evaluate your results presented in the previous section.
• Be sure to state whether your hypotheses were supported.
• Also, answer the following questions:
  1. What does the study contribute?
  2. How has it helped solve the study problem?
  3. What conclusion and theoretical implications can be drawn from the study?
  4. What are the limitations of the study?
  5. What are some suggestions for future research in this area?

7. References
• Center the word References at the top of the page and double-space all entries.
• Discussion of author notes, footnotes, tables, figure captions, and figures is only in the textbook (and, of course, in the APA Publication Manual).
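The reporting conventions for statistical results described earlier in this chapter (an equal sign for exact probability values, and p < .001 rather than p = .000) are easy to get wrong by hand, so here is a small Python sketch that applies them. The function names are my own, and the choice of two decimals for p values of .01 or more and three decimals below that is an assumption of this sketch. Check the APA Publication Manual for the full rules.

```python
def format_p(p):
    """Format a probability value following the rule described in this chapter.

    Assumption of this sketch: two decimals for p >= .01, three decimals below.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must be between 0 and 1")
    if p < 0.001:
        return "p < .001"          # never report p = .000
    decimals = 2 if p >= 0.01 else 3
    text = f"{p:.{decimals}f}"
    return "p = " + text.lstrip("0")   # drop the leading zero: .03, not 0.03

def format_f(df_between, df_within, f_value, p):
    """Format an analysis of variance result, e.g. F(3, 32) = 8.79, p = .03."""
    return f"F({df_between}, {df_within}) = {f_value:.2f}, {format_p(p)}"

print(format_f(3, 32, 8.79, 0.03))
print(format_p(0.0004))
```

The first call reproduces the analysis of variance example given above, and the second shows what happens to a very small probability value.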

" "explore a process. why the study was designed as it was. providing quotes. etc. Even if your research is exploratory. • Title Page and Abstract. following interpretative statements with examples. data from multiple sources. -You will need to find an appropriate balance between description and interpretation in order to write a useful and convincing results section. -Effective ways to organize the results section are organizing the content around the research questions. and so forth) that back up your assertions. You should provide a clear and descriptive title. descriptions. You may also make suggestions for future research here.. matrices. you must always provide data (i. -Several specific strategies are discussed in the chapter (e. with whom it was done. The overriding concern when writing the results section is to provide sufficient and convincing evidence.. to help communicate your ideas in a qualitative research report.g.e. how the data were collected and analyzed. Remember that assertions must be backed up with empirical data. Clearly explain the purpose of your study and situate it in any research literature that is relevant to your study. • Discussion. • Results. and the most important findings.). figures. In qualitative research. a typology created in the study. Writing Mixed Research Reports • First. The bottom line is this: It's about evidence. etc." • Method. Page 178 of 179 . the key themes. research questions will typically be stated in more open-ended and general forms such as the researcher hopes to "discover. It is important that qualitative researchers always include this section in their reports. where it was done." or "describe the experiences. -It can also be very helpful to use diagrams. The goals are exactly the same as before. This section includes information telling how the study was done. know your audience and write in a manner that clearly communicates. its key methodological features. tables." 
"explain or understand.Writing Qualitative Research Reports We recommend that qualitative researchers also follow the guidelines given above when writing manuscripts for publication. The abstract should describe the key focus of the study. quotes. -We state that regardless of the specific format of your results section. We recommend that qualitative researchers use the same seven major parts that were discussed for the quantitative research report. • Introduction. and what procedures were carried out to ensure the validity of the arguments and conclusions made in the report. it is important to fit your findings back into the relevant research literature. or around a conceptual scheme used in the study. You should state your overall conclusions and offer additional interpretations in this section of the report.

• The suggestions already discussed in this chapter for quantitative and qualitative also apply for mixed research.
• In general, try to use the same seven headings discussed above.
• Here are a few organization options:
  1. Organize the introduction, method, and results by research question.
  2. Organize some sections (e.g., method and results) by research paradigm (quantitative and qualitative).
  3. Write essentially two separate subreports (one for the qualitative part and one for the quantitative part).

NOTE: in all cases, if you are writing a mixed research report, mixing must take place somewhere (e.g., at a minimum the findings must be related and “mixed” in the discussion section).
