Common flaws Running head: COMMON FLAWS


Scientific Research in Education: Common Flaws

John Koetsier

University of British Columbia

Abstract


Research studies are difficult to do right and easy to do wrong. There are many pitfalls to avoid, and many factors can affect a study’s validity and reliability. To find and understand some of the common problems, I examine three different types of studies: what the researchers did, how they did it, and what problems they encountered. The studies are Beck and Fetherston’s The effects of incorporating a word processor into a year three writing program (2003), Miller, Schweingruber, and Brandenburg’s Middle School Students’ Technology Practices and Preferences: Re-examining Gender Differences (2001), and Hayse’s A comparison of fifth graders’ frequency using web-based activities versus traditional activities for self-directed enrichment (2003).

In the first study, Natalie Beck and Tony Fetherston studied the effects of teaching writing with a word processor in primary grades. For six weeks, they studied both how students felt about using word processing technology versus paper and pencil, and what effects the technology had on the quality of their writing. They concluded that students who used word processors wrote significantly better than students using pencil and paper.

Unfortunately, the quality of the study was severely undermined by several design and procedural decisions. Together, those flaws cause it to fall short of the standard for research that is generalizable to other settings and can be relied upon when creating programs and curricula.

In brief, the problems with the study include a very small sample size (only seven students), which basically eliminates any opportunity for external validity. A sample that small cannot be representative of the wider population. And, not that it matters much with such a small sample, the researchers used convenience sampling rather than random sampling.
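To give a rough sense of why seven students is so few, the following sketch computes the approximate 95% margin of error for a proportion at several sample sizes. It is an illustration only, not part of Beck and Fetherston's analysis: it assumes a simple random sample (which the study did not have), uses the normal approximation (which is itself unreliable at very small sample sizes), and the function name and the comparison sample sizes are my own.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample of size n and the normal
    approximation; p=0.5 is the most conservative (widest) case.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# Compare the study's sample size (7) with some larger, hypothetical ones.
for n in (7, 30, 100, 400):
    print(f"n = {n:>3}: +/- {margin_of_error(n):.1%}")
```

Under these assumptions, a sample of seven carries a margin of error of roughly 37 percentage points, against about 5 points for a sample of 400. Convenience sampling adds bias on top of this imprecision, which no formula of this kind can capture.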

In addition, the short six-week study ensured that the researchers could not compensate for the effects of novelty: any new technique employed in an educational setting might result in a temporary bump in performance as the sheer newness galvanizes student attention and effort. Oddly, in what must be a rare problem for a study with a novelty issue, maturation was also a problem, since the students had apparently used the word processing software before the study began.


Finally, and perhaps most importantly, the design was pre-experimental. There was no control group receiving a placebo or an equal but different treatment. The sample group essentially was its own control group.
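As a minimal sketch of what a design with a control group would need at the outset, the following randomly assigns a roster of participants to a treatment group and a control group. The roster, group labels, and function name are hypothetical and are not drawn from the study; a real design would also need enough participants for the comparison to be meaningful.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into treatment and control groups.

    The input order is ignored: a shuffled copy is cut in half, so every
    participant has the same chance of landing in either group.
    """
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical roster of 20 students.
students = [f"student_{i}" for i in range(1, 21)]
groups = assign_groups(students, seed=42)
print(len(groups["treatment"]), len(groups["control"]))
```

Random assignment of this kind is what separates a true experimental design from a pre-experimental one: differences between the groups at the end can then be attributed to the treatment rather than to how the groups were formed.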

In the second study, Miller, Schweingruber, and Brandenburg looked at middle school students’ use of technology in America, specifically at male/female differences. They administered a 512-question survey to students in Texas middle schools, and used the results to argue that historical differences are disappearing as technology (particularly the web) becomes more prevalent.

The conclusions are valid and supported by subsequent research, but the methodology (particularly the sampling) could have been significantly improved. Therefore, the study is not as generalizable as it could have been, and follow-up research was required.

Problems included a significantly skewed urban/suburban mix that was heavily weighted in favor of urban students, and rural students were excluded entirely. In addition, ethnicity was not addressed at all in the study, even though the schools from which students were drawn were in a city and state that over-represented certain racial groups. A final sampling complication was that the schools from which subjects were drawn were significantly undersized compared to the average middle school.


In addition to sampling concerns, the authors’ assertion that the web is predominantly responsible for male/female technology preferences becoming more similar is problematic, as there are many potentially confounding variables. Finally, high mortality in the course of the study due to data collection problems adds yet another question mark.

In the third study, teacher Karen Hayse engaged in action research to guide a school district’s recommended practice with regard to using web-based versus traditional enrichment resources. Working with a single class of fifth graders over a period of 10 weeks, Hayse introduced 15 web resources and 15 traditional resources as activities that students could explore and use during non-graded personal enrichment time every third school day. Students self-reported which resources they used.

Hayse discovered that students preferred web resources to traditional resources most of the time, with web resources being the clear leader initially, trailing off halfway through the 10 weeks, and then regaining popularity in the final few weeks. Hayse also noticed, anecdotally, that giving students a choice between different types of enrichment activities seemed to result in students choosing to engage in enrichment more often, regardless of which type they chose.

There are a number of concerns with this study, starting with sampling. Specifically, Hayse has apparently used convenience sampling, probably with her own class. Clearly, there are no guarantees of representativeness. Another concern is the lack of pre-testing. It would be important to know whether, before the study started, there was already a student preference for technology and web-based resources, and a simple survey could have provided helpful insight when interpreting the study data.

A further concern is that there may have been a novelty effect: students who had previously been exposed only to traditional enrichment activities in the classroom may have chosen web activities simply because they were new. A longer study would have reduced any novelty effects that might be operating.

Hayse mentions that she has controlled for a number of factors, including types of activities and opportunities to work with peers, but I wonder if the web resources were as potentially social, or perceived as potentially social by the students, as the traditional resources. The spike in traditional resource usage came after a student asked friends to play a trivia game; was a similar thing possible with the web resources? It’s difficult to say without being able to examine the actual websites.

A number of other questions suggest themselves. While “neither” and “both” were options, as students could choose to use any combination of resources, including no resources, they do not show up in the data in Table 1. It seems unlikely that over 10 weeks these options were never chosen. Also, students self-reported their use of activities at the end of the day, which may not be the most accurate method of collecting data. And finally, while not necessarily to be expected in action research, it would still be ideal to have a stronger study design than pre-experimental.

Looking over these three studies, it seems clear that sampling is an enormous challenge and a frequent source of external validity concerns. Each of the studies had, to varying degrees, sampling problems. This probably shouldn’t be too much of a surprise, as finding subjects and convincing them to participate in studies is difficult, time-consuming, and potentially expensive. However, it is worth researchers’ while to invest considerable time and effort in this specific facet of their studies, since without a good random or appropriately stratified sample, results are not generalizable anyway. In other words, garbage in, garbage out.
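For illustration, a proportionally stratified random sample can be sketched as follows. The population, the urban/suburban/rural strata, the 20% sampling fraction, and the function name are all assumptions made for the example, not data from any of the three studies.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum.

    population: iterable of records
    stratum_of: function mapping a record to its stratum label
    fraction:   share of each stratum to draw (at least one per stratum)
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in population:
        strata[stratum_of(record)].append(record)

    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))  # random draw within the stratum
    return sample


# Hypothetical population: 50 urban, 30 suburban, and 20 rural students.
population = (
    [(f"u{i}", "urban") for i in range(50)]
    + [(f"s{i}", "suburban") for i in range(30)]
    + [(f"r{i}", "rural") for i in range(20)]
)
sample = stratified_sample(population, stratum_of=lambda p: p[1], fraction=0.2, seed=1)
print(len(sample))
```

Because each stratum contributes in proportion to its size, no group (rural students, for example) can be left out by accident, which is the kind of gap noted in the second study.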

Secondly, an appropriate degree of control over the variables in the study is critical. Knowing that students had already used the particular word processing software being tested should have impelled Beck and Fetherston to find other subjects. And while I can’t prove it without access to the resources that Hayse used, I suspect that the web and traditional resources, while equivalent on the surface and in terms of topic, may not have been equivalent in terms of presentation and use by students.

In conclusion, academic research is a difficult process to do well, and pitfalls exist at every stage. These three studies illuminate some of the common issues, and provide insight for researchers about what to avoid and minimize in study design in order to maximize internal and external validity.

References


Beck, N., & Fetherston, T. (2003). The effects of incorporating a word processor into a year three writing program. Information Technology in Childhood Education Annual, 2003, 139-161.

Hayse, K. (2003). A comparison of fifth graders’ frequency using web-based activities versus traditional activities for self-directed enrichment. Retrieved from tm on March 5, 2008.

Miller, L. M., Schweingruber, H., & Brandenburg, C. L. (2001). Middle School Students’ Technology Practices and Preferences: Re-examining Gender Differences. Journal of Educational Multimedia & Hypermedia, 10(2), 125-140.
