Study                         | No. of studies | No. of students | Effect size | Percentile gain | Population
------------------------------|----------------|-----------------|-------------|-----------------|----------------
Falchikov & Boud (1989)       | 57             | 5,332           | 0.47        | 33%             | College
Ross (1998)                   | 11             | —               | 1.63        | 115%            | Second language
Falchikov & Goldfinch (2000)  | 48             | 4,271           | 1.91        | 135%            | College
Kuncel, Crede & Thomas (2005) | 29             | 56,265          | 3.1         | 219%            | GPA
Kuncel, Crede & Thomas (2005) | 29             | —               | 0.6         | 42%             | Self-evaluation
Dr. Kristen Dicerbo looked at each of these studies. A summary of what she found:
'Mabe & West (1982) This is a review of 55 studies. It is framed around the idea of
understanding the validity of self-evaluation by correlating self-given grades to other grades.
Are the self-given grades accurate? Are they related to more objective measures of
achievement? They did NOT investigate whether changing the self-evaluation influences
achievement. The average correlation between self-evaluation and achievement was .29,
although across studies it ranged from -.26 to .80. The authors identify a number of ways to
make the self-evaluations more accurate.
Falchikov & Boud (1989) This review examined 57 studies. They employed the commonly-
used effect size measure. However, again, there weren’t control and experimental groups or
studies about the effect of changing someone’s self-grade/self-expectation. Rather, the self-
grade was coded as the experimental group and teacher grade as the control. The mean effect
size is .47, with a range from -.62 to 1.42 across studies. They also report the mean correlation
between self-graded and other-graded was .39.
Ross (1998) This study examines self-assessment in the context of whether a self-assessment
can be used for placement in language classes as opposed to giving placement tests. They report
the correlation between self-report and objective scores across 60 studies reviewed as .63. They
then report an effect size, but it is an effect size for the correlation coefficient, not the
traditional meta-analysis effect size that compares a control and experimental group. This effect
size (g) is 1.63. Again, they don’t compare doing it versus not or any effect on achievement.
Falchikov & Goldfinch (2000) This study is actually about the relationship between peer
grades and teacher grades. The overall correlation was .69, with a range from .14 to .99.
Regardless, this study does not seem to fall into the category of self-assessment.
Kuncel, Crede, & Thomas (2005) This paper again looked at the reliability and validity of
self-assessment BUT just by looking at whether the GPA and SAT scores that students were
reporting were their real scores. In other words, this isn’t even really a judgment of their own
expectations, but whether they remember and accurately report known scores. They compare
reported to actual results from 37 different samples. So, sure the effect size for reported versus
actual GPA was 1.38, but that just means college students can pretty accurately report their
already known GPAs. Interestingly, they were quite poor at reporting SAT scores, with effect
sizes of .33 for Verbal and .12 for Math.'
I've read each of the meta-analyses and confirm Dr. Dicerbo's analysis.
Kuncel, Crede & Thomas (2005) measured students' memory of their GPA from a year or so
previously, which is a measure of memory or honesty, not of students predicting their future scores.
'Since it is often difficult to get results transcripts of student PREVIOUS GPA’s from High
School or College, the aim of this study is to see whether self-reported grades can be used as a
substitute. This obviously has time-saving administration advantages.'
'We conceive of the present study as an investigation of the validity of peer marking.'
(Falchikov & Goldfinch, 2000)
'The intent of this review is to develop general conclusions about the validity of self-evaluation
of ability.' (Mabe & West, 1982)
Note: Mabe & West measured over 20 different categories of achievement, from scholastic, athletic and
managerial to practical skills.
Falchikov and Boud (1989, p396) compared staff marking with student self-marking, treating the
self-marks as the experimental group. They conclude (p420),
'most studies found positive effect sizes, indicating overrating on the part of student markers.'
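Dicerbo's description of this effect-size coding can be sketched in code. The marks below are made up for illustration; the point is that a positive d here measures student overrating relative to the teacher, not a gain from any intervention:

```python
import statistics

def cohens_d(experimental, control):
    """Cohen's d: standardized mean difference using the pooled SD."""
    n1, n2 = len(experimental), len(control)
    m1, m2 = statistics.mean(experimental), statistics.mean(control)
    v1, v2 = statistics.variance(experimental), statistics.variance(control)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Illustrative (made-up) marks: students rate themselves a little above
# what the teacher awarded. The resulting positive d indicates
# overrating by student markers, not an achievement gain.
self_marks    = [72, 68, 75, 80, 66, 74]   # "experimental" group
teacher_marks = [70, 65, 71, 78, 64, 70]   # "control" group
print(round(cohens_d(self_marks, teacher_marks), 2))
```

Coding one group's marks as "experimental" and the other's as "control" is how a comparison of two markers of the *same work* gets expressed in the same d units that intervention studies use, which is exactly the conflation Dicerbo objects to.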
Dicerbo concludes,
'It is clear that these studies only show that there is a correlation between students’ expected
grades and their actual grades. That’s it (and sometimes not even that). They just say kids are
pretty good judges of their current levels. They do not say anything about how to improve
achievement. These studies are not intervention studies. In fact, if I were looking at all the
studies about “influences” on achievement, I would not include this line of research. It is not
about influencing achievement ...
In later work, Hattie is calling this finding self-expectation. The self-grading research seems to
have gotten turned into the idea that these studies imply we should help kids prove those
expectations wrong or that raising kids’ expectations will raise their achievement ...
That is not what the studies that produced the 1.44 effect size studied. They looked at the
correlation of self-report to actual grades, often in the context of whether self-report could be
substituted for other kinds of assessment. None of them studied the effect of changing those
self-reports. As we all know, correlation does not imply causation. This research does not imply
that self-expectations cause grades.'
'Precise coding and combination of data are critical for the production of a meta-analysis. If
data examining fundamentally different samples or variables are unintentionally combined, it
may jeopardise the findings. The result would be a mixing of potentially different studies that
could yield an uninterpretable blend. Stated simply, this is the old debate about comparing
oranges versus apples.'
I contacted Professor Kuncel to make sure I had interpreted his study correctly; he replied that the
conclusion of the study was as quoted above.
Another major issue is that the Ross (1998) study uses English-as-a-second-language students, a small
and atypical subset of the total student population. The page on Effect Size details how atypical
populations give rise to larger effect sizes, so inferences should NOT be made about the general
student population.
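Why an atypical population can inflate an effect size is a standard restriction-of-range point, not a calculation from Ross's data. Cohen's d divides a raw difference by the standard deviation, so a homogeneous subgroup with a smaller spread yields a larger d for the same raw difference. The numbers below are made up purely to show the arithmetic:

```python
# Same raw difference, different population spreads (hypothetical numbers).
raw_difference = 5.0     # points on some test

sd_general = 15.0        # broad, general student population
sd_subgroup = 5.0        # narrow, homogeneous subgroup (e.g. one ESL cohort)

d_general = raw_difference / sd_general    # smaller d
d_subgroup = raw_difference / sd_subgroup  # three times larger d, same raw gap

print(round(d_general, 2), round(d_subgroup, 2))
```

The identical 5-point difference looks three times as impressive in d units when measured in the narrower population, which is why effect sizes from atypical samples should not be averaged with, or generalised to, the broad student population.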
Professor Pierre-Jérôme Bergeron (2017) insightfully identifies the overriding problem here:
'in addition to mixing multiple and incompatible dimensions, Hattie confounds two distinct
populations ...'
The studies about self-report are clearly NOT about influencing academic success.
'It is also with correlations that he obtains the so-called effect of self-reported grades, the
strongest effect in the original version of Visible Learning. However, this turns out to be a set of
correlations between reported grades and actual grades, a set which does not measure
whatsoever the increase of academic success between groups who use self-reported grades and
groups who do not conduct this type of self-examination.'
Professor Bergeron also warns of the conversion of correlation to an effect size - see Effect Size.
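Bergeron's warning concerns the standard conversion d = 2r / √(1 − r²). Applied to Ross's reported correlation of .63, it yields roughly the g = 1.63 quoted above. Whether Ross used exactly this formula is an assumption; the arithmetic is only meant to show how a correlation becomes a large "effect size" without any group comparison:

```python
import math

def r_to_d(r):
    """Convert a correlation coefficient to a Cohen's-d-style effect size."""
    return 2 * r / math.sqrt(1 - r ** 2)

# Ross's reported r = .63 converts to about 1.62 -- essentially the
# g = 1.63 quoted, even though no control and experimental groups
# were ever compared.
print(round(r_to_d(0.63), 2))
```

This is why a modest-looking correlation of .63 can reappear as an "effect size" larger than almost any genuine intervention study produces.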
Schulmeister & Loviscach (2014), in 'Critical comments on the study "Making learning visible"
(Visible Learning)', confirm Bergeron's concerns over the use of correlations and their conversion
to effect sizes.
They also note that the value of d = 0.60 Hattie cites from the Kuncel, Crede & Thomas study can
neither be found in the study nor reconstructed from it (p6).
'The paper [Kuncel (2005)] should not have been included in the analysis. This example does
raise questions regarding the remaining average effect sizes.' (p220).
George Lilley, January 2016