All content following this page was uploaded by Holger Gaertner on 19 December 2014.
To cite this article: Holger Gaertner, Sebastian Wurster & Hans Anand Pant (2014) The
effect of school inspections on school improvement, School Effectiveness and School
Improvement: An International Journal of Research, Policy and Practice, 25:4, 489-508, DOI:
10.1080/09243453.2013.811089
Downloaded by [FU Berlin] at 23:26 06 August 2014
School Effectiveness and School Improvement, 2014
Vol. 25, No. 4, 489–508, http://dx.doi.org/10.1080/09243453.2013.811089
states of Berlin and Brandenburg (Germany), both inspected and uninspected schools
were surveyed with respect to school improvement activities over a 1-year period.
The main finding is that principals’ and teachers’ perceptions of school quality were
highly stable, irrespective of the introduction of school inspections. The results show
that school inspections had a comparatively low impact on the aspects of school quality
measured here.
Keywords: school inspection; impact; school improvement
Introduction
“All around the world schools are inspected and the assumption is that this in a positive way
contributes to the quality of schools and education systems” (Ehren & Visscher, 2006, p. 53)
the areas of instruction and management (Matthews & Sammons, 2004). However,
independent research has not only addressed the positive effects but also the undesirable
side effects of inspections, such as the extreme pressure placed on staff and school
management in the lead-up to an inspection or the stress and anxiety of teachers (Cuckle &
Broadhead, 1999; Gaertner et al., 2009; Gray & Gardner, 1999; in summary, De Wolf &
Janssens, 2007).
Most existing studies investigate the effect of school inspections retrospectively within
a one-shot design in which principals and teachers of inspected schools are asked about
their perceptions of the process and future planned measures based on inspection results
(Cuckle & Broadhead, 1999; Cullingford & Daniels, 1999; Gray & Gardner, 1999).
Evaluations using a comparison-group design with repeated measurement points comparing,
for example, school improvement in inspected and uninspected schools are rare
(Luginbuhl, Webbink, & De Wolf, 2007).
Attempts to demonstrate the effect of school inspections on the performance of
students have so far revealed either a negative effect or none at all. Both Cullingford
and Daniels (1999) and Rosenthal (2004) show that student performance, as reflected in
Method
The following section outlines how school inspections are conducted in Berlin and
Brandenburg, followed by a description of the design of the study, the sample, and finally
the instruments and evaluation methods used.
from processes in other jurisdictions, such as England, and needs to be considered when
interpreting the following results.
Overall, inspections in both states, Berlin and Brandenburg, are carried out in a highly
uniform way. There are differences in procedural details and in individual evaluative
criteria. However, for the research questions considered here, there are no relevant
differences to be found in the procedure.
Initial empirical findings are available on the effect of the processes in Berlin and
Brandenburg (Gaertner & Wurster, 2009a, 2009b). These findings are based on surveys of
principals and teachers from inspected schools and addressed the following questions:
How are inspection results communicated within schools? How are inspection reports
evaluated within schools? Which measures to improve quality are planned and implemented,
and how? Statements from the principals and teachers surveyed indicate that
inspection results are communicated and evaluated in all schools. Also, in all schools
measures were implemented based on the inspection report. The vast majority of the
conclusions of the inspection were accepted.
Activities in schools sometimes take place in the lead-up to an inspection, for
example, preparing the documents required. Suggestions for future school improvement,
based on the inspection report, mainly come from staff, from school management, and
from parents. Negative and undesirable effects of school inspections, such as stress or
increased conflict, were rarely reported. However, additional stress was noted among
some of the surveyed principals. General acceptance of school inspections and their
intended goals is high (Gaertner & Wurster, 2009a, 2009b).
In the following sections, these statements about the effect of school inspections will
be compared to the results of a control-group design centring on the perception of change
in various aspects of school quality in uninspected, as against inspected, schools.
Design
This study used a control-group design. That is, the survey included principals and
teachers at schools that had not yet been inspected, so that developments at these schools
could be compared with developments at schools that already had been inspected. This
was achieved by surveying the schools a second time, 1 year after the first survey.
Measurement 1 (M1) took place at the start of the 2008/9 school year; Measurement 2
(M2) took place at the start of the 2009/10 school year (see Table 1). The school groups
involved were as follows:
● Schools that were inspected in the 2006/7 school year; that is, 2 years before the
first survey (S1).
● Schools that were inspected in the 2007/8 school year; that is, 1 year before the first
survey (S2).
● Schools that were inspected during the survey period (S3).
● Schools that had not yet been inspected (Control Group; CG).
Table 1. Design.
Sample
All state schools in Berlin and Brandenburg are inspected in a roughly five-year cycle.
Each year, a random selection of schools is chosen for inspection. This means that the
inspected schools in each year represent a random sample of all state schools in Berlin and
Brandenburg. A proportional school inspection approach, where schools are inspected
according to their previous performance, as, for instance, in The Netherlands (Inspectie
van het Onderwijs [The Dutch Inspectorate of Education], 2006), is not implemented in
Germany.
At M1, letters were sent out to all schools in Berlin and Brandenburg that had been
inspected in the 2006/7 and 2007/8 school years. Also, letters were sent to all schools that
were to be inspected in the 2008/9 school year. A random sample was drawn from the
as-yet-uninspected schools, so that the size of the sample matched the number of schools
inspected in previous years.
Survey respondents were of two kinds: (a) principals (1 per school) and (b) teachers
who were members of the school council. In both states, the school council is the highest
decision-making body within the school, made up of teachers, parents, and pupils.
Depending on the size of the school, between two and four teachers are appointed to
the school council. It is assumed that both groups (that is, teachers who were members of
the school council and the principals) have access to information that well qualifies them
to answer questions about school quality.
The mean response rate across both measurements was 21.7% (1,241 of the 5,706
teachers and principals approached took part at both measurement points). Comparing
the response rates among the four groups shows that the rates for groups S2 (25.9%) and
S3 (25.9%) are slightly higher, and for S1 (17.5%) and the control group (14.6%) slightly
lower. A chi² test for a uniform distribution of response rates, however, did not
reveal significant differences between the groups (chi²(3,95) = 6.89; ns). The question of
whether a differential selection bias may have influenced the results presented below will
be addressed in the discussion.
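The uniformity check can be reproduced as a Pearson chi-square test of homogeneity on a 2 × 4 table of responded versus non-responded counts per group. A minimal numpy-only sketch; the per-group counts below are hypothetical, since the article reports only the response rates, not the per-group Ns:

```python
import numpy as np

# Hypothetical responded / not-responded counts per group (S1, S2, S3, CG).
# The article reports only the response rates, so these Ns are illustrative.
obs = np.array([[ 35,  52,  52,  29],    # responded
                [165, 148, 148, 171]])   # did not respond

# Pearson chi-square test of homogeneity on the 2 x 4 contingency table.
row = obs.sum(axis=1, keepdims=True)
col = obs.sum(axis=0, keepdims=True)
expected = row @ col / obs.sum()
chi2 = ((obs - expected) ** 2 / expected).sum()
df = (obs.shape[0] - 1) * (obs.shape[1] - 1)
print(f"chi2({df}) = {chi2:.2f}")
```

With 3 degrees of freedom, a value below the 5% critical value of 7.81 would, as in the article, indicate no significant difference in response rates between the groups.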
In order to determine whether there are noteworthy differences between the participating
and nonparticipating schools, a missing-value analysis was carried out. For the
purpose of the missing-value analysis, the results of standardized achievement tests
were available for almost all the participating (n = 467) and nonparticipating schools
(n = 765). In addition, results from the school inspections were available where the
schools had been inspected.
Responses were available from 467 of the total 1,232 schools approached. Of these
1,232 schools, information about student achievement is available for 1,067. These data
were collected as follows: In the case of primary schools, information is available from
standardised achievement tests in mathematics and German in Grade 3 from the same
school year as the survey (2008/9). These results were aggregated at school level for
German and mathematics separately. These data are available to the same extent in Berlin
and Brandenburg. For secondary schools, the results of central school-leaving examinations
at the end of Grade 10 are available for the 2008/9 school year. These examinations
are carried out differently in Berlin and Brandenburg. In Berlin, the average pass rate for
the examination at each school is used, while in Brandenburg the examination grades are
again aggregated at the school level for the centrally examined subjects German and
mathematics.
Table 2 shows the results of a comparison between achievement results from participating
and nonparticipating schools. The t tests for independent samples carried out in
each case show no differences between schools for any of the student achievement data.
The effect sizes confirm that there are no differences between those schools participating
and those not participating in this study with respect to student achievement.1
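The eta² column in Table 2 follows directly from each t statistic and its degrees of freedom via eta² = t² / (t² + df); a quick check against the Berlin Grade-10 row:

```python
# Effect size eta² for an independent-samples t test: eta² = t² / (t² + df).
def eta_squared(t: float, df: int) -> float:
    return t ** 2 / (t ** 2 + df)

# Berlin Grade-10 row of Table 2: t = -1.66, df = 227.
print(round(eta_squared(-1.66, 227), 2))  # 0.01, a small effect (Cohen, 1988)
```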
For the schools already inspected, the results of school inspection reports were also
available. The results of school inspections were averaged in order to obtain an overall
evaluation for each school. Table 2 shows that the participating schools do not differ from
the nonparticipating schools with respect to their evaluation by the school inspectorate.
These results show that inspected schools were included in our study independent of their
inspection result.
In summary, the analysis of nonparticipants showed that, with respect to both student
achievement and inspection results, there was no difference between the schools taking
part in the survey and those not taking part.
The following analysis will focus on comparing the control group and inspected
school groups. Thus, data from Berlin and Brandenburg will be collapsed.
Additionally, responses from teachers and principals will be treated as comparable
indicators at the school level and will therefore be aggregated. To check whether there
was sufficiently large interrater agreement within a school, the interrater agreement
index awg(J) was calculated (Brown & Hauenstein, 2005). The mean interrater agreement
across all constructs and all schools was awg(J) = .72. According to the categorisation
of Brown and Hauenstein (2005), this represents an acceptable interrater agreement.
After aggregation, the following sample sizes were realized at the school level: nS1 = 94,
nS2 = 141, nS3 = 144, nCG = 88. Although earlier findings on the perceptions of
changes initiated by school inspection showed that these changes can be perceived
differently by school management and teachers (Gaertner & Wurster, 2009a, 2009b),
the agreement indices obtained suggest that perceptions regarding the aspects of
quality surveyed here coincide to a high degree.
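The agreement index can be sketched as follows. This follows the logic of Brown and Hauenstein (2005), relating the observed variance of the J ratings to the maximum variance possible for their mean on the bounded response scale; the exact published formula should be checked against the original before reuse:

```python
import statistics

def awg(ratings, low=1.0, high=4.0):
    """Within-group agreement index in the spirit of Brown & Hauenstein (2005).

    a_wg = 1 - 2 * s^2 / s2_max, where s2_max is the maximum possible sample
    variance of J ratings on [low, high] with the observed mean. Verify the
    exact formula against the original article before reuse.
    """
    j = len(ratings)
    m = statistics.mean(ratings)
    s2 = statistics.variance(ratings)  # sample variance (J - 1 denominator)
    s2_max = ((high + low) * m - m ** 2 - high * low) * j / (j - 1)
    return 1 - 2 * s2 / s2_max

# Three raters at one school on the 1-4 scale used here: close agreement.
print(round(awg([3.0, 3.0, 3.5]), 2))
```

An index of 1 indicates perfect agreement; a mean value around .72, as found here, counts as acceptable in Brown and Hauenstein's categorisation.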
Table 2. Comparison of achievement results between participating and nonparticipating schools.

Measure | t | df | p | eta²
Final central examination result Berlin (Grade 10) | –1.66 | 227 | .10 | .01
Central examination result Brandenburg (Grade 10) – mathematics | .11 | 139 | .92 | < .001
Central examination result Brandenburg (Grade 10) – German | 1.17 | 139 | .25 | .01
Standardized statewide assessment in elementary school in Berlin and Brandenburg (Grade 3) – mathematics | .27 | 696 | .79 | < .001
Standardized statewide assessment in elementary school in Berlin and Brandenburg (Grade 3) – German | .39 | 697 | .70 | < .001

Note: Following Cohen (1988), eta² = .01 represents a small, .06 a medium, and .14 a large effect.
Instruments
The 16 dependent variables relate to aspects of school quality that match the
quality framework used by both states. This quality framework serves as a basis for both
school inspections themselves and quality improvement. Although the quality framework
itself is a purely normative guideline for school quality issued by the school authority, its
content closely follows that of school effectiveness research (Scheerens & Bosker, 1997). In
particular, the following general effectiveness-enhancing factors from Scheerens and Bosker
(1997) can be found in the education department’s guidelines: achievement orientation,
educational leadership, consensus among staff, curriculum quality, school climate, parental
involvement, and classroom management (MBJS, 2008; SenBJW, 2013).
Here, validated instruments were mapped to dimensions in the quality framework.
These quality dimensions are: (a) School results, (b) School culture, (c) School management,
(d) Professionalism and professional development, and (e) Quality improvement.
For example, the dimension “School results” was mapped to the scales Student
satisfaction, Teacher satisfaction, and Strain on teachers (see Table 3). Some of the scales
deployed were used in the Programme for International Student Assessment (PISA)
2003, where an attempt was made to operationalize quality attributes at the organisational
level using the perceptions of principals (Organisation for Economic Co-operation and
Development, 2005). Only aspects relating to school quality were operationalized and
none relating to lesson quality. Hence, the potential impact of school inspections on
instruction improvement cannot be addressed in this study.
In order to check whether the constructs used here in fact relate to the assessments
used by the school inspectorate, correlations were calculated for the inspected schools
between the assessment outcomes from school inspection and the respective instruments
used for this study.
Analyses were feasible for 311 Brandenburg schools. The low participation rates of
inspected schools in Berlin left too few cases to carry out this analysis for the Berlin
subsample. The correlation matrix compares scale results with the results of the school
inspection in Brandenburg (see Appendix 1). The correlations are largely as expected.
However, one should note that the aspects surveyed cannot be mapped one to one, since
school inspections combine many aspects into a single score (Gaertner & Pant, 2011).
Also, some assessments demonstrate a very low variance, limiting the size of possible
correlations. Therefore, Appendix 1 shows (in brackets) the correlation after correction for
attenuation based on the known reliability of the scales used. One can assume that the
inspectorate’s assessments also show some unreliability, and that doubly corrected corre-
lations are likely to be higher (Gaertner & Pant, 2011). In sum, one can assume that the
questionnaire instruments used cover similar content to the inspections and that the
perception of the surveyed principals and teachers is positively correlated with the
assessment made by the school inspectorate.
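The single correction for attenuation divides the observed correlation by the square root of the scale's reliability; a double correction additionally divides by the square root of an (assumed) reliability of the inspection scores. A small sketch with illustrative values, not taken from the article:

```python
from math import sqrt

def correct_attenuation(r, rel_x, rel_y=1.0):
    """Spearman's correction for attenuation: r_true = r / sqrt(rel_x * rel_y).

    With rel_y = 1.0 this is the single correction (only the questionnaire
    scales' reliabilities are known); passing an assumed reliability for the
    inspectorate's assessments gives the double correction.
    """
    return r / sqrt(rel_x * rel_y)

# Illustrative values: an observed r of .30 with a scale reliability of .72.
print(round(correct_attenuation(0.30, 0.72), 2))              # single correction
print(round(correct_attenuation(0.30, 0.72, rel_y=0.80), 2))  # double correction
```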
Statistical analysis
In the following, independent repeated-measures analyses of variance are conducted for
every dependent variable. Whereas a statistically significant main effect suggests that
perceptions are changing in the same direction in all groups, a significant group-by-time
interaction suggests that perceptions in the groups are shifting in different ways. Thus, if
there is a systematic difference in the changes between the control and inspected school
groups, this would be manifest in a significant interaction. Finally, individual comparisons
can show which groups specifically differ from one another.
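With only two measurement points, the group-by-time interaction tested here is equivalent to a one-way analysis of variance on the change scores (M2 − M1) across the four school groups. A numpy-only sketch on simulated school-level data; the group sizes mirror the realized sample, while all score values are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated school-level scores at M1 and M2 for the four groups;
# group sizes follow the article (94, 141, 144, 88), the values do not.
sizes = {"S1": 94, "S2": 141, "S3": 144, "CG": 88}
m1 = {g: rng.normal(3.2, 0.4, n) for g, n in sizes.items()}
m2 = {g: m1[g] + rng.normal(0.0, 0.2, n) for g, n in sizes.items()}

# With two time points, the time-by-group interaction reduces to a
# one-way ANOVA on the change scores (M2 - M1) across groups.
diffs = [m2[g] - m1[g] for g in sizes]
grand = np.concatenate(diffs)
ss_between = sum(len(d) * (d.mean() - grand.mean()) ** 2 for d in diffs)
ss_within = sum(((d - d.mean()) ** 2).sum() for d in diffs)
df_b, df_w = len(diffs) - 1, len(grand) - len(diffs)
F = (ss_between / df_b) / (ss_within / df_w)
print(f"F({df_b}, {df_w}) = {F:.2f}")
```

Note that the error degrees of freedom, 467 − 4 = 463, match those reported for most variables in Table 5.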
Table 3. Instruments.

Scale | Sample item | Items | Reliability
School results
Student commitment and satisfaction | Students like attending this school | 7 | .88
Teacher commitment and satisfaction | Teachers are committed to their work | 4 | .78
Strain on teachers1 | Work-related stress negatively impacts on my private life | 7 | .81
School culture
Active involvement of parents | Parents are involved in the supervision of homework | 7 | .83
Staff involvement | Staff make important decisions collectively | 3 | .81
Co-operation with external partners | Associations/Clubs | 10 | .56*
School management
Instructional leadership2 | I can provide good advice to teachers who are having difficulties with their lessons | 6 | .80
Time management and use of time | I try to improve scheduling to ensure work time is used optimally | 5 | .57
Management of substitutions | Study periods prepared for several classes | 5 | .75
Classroom management | We think it is important to create as much active working time as possible | 3 | .67
Professionalism and professional development
Professional development | Staff regularly participate in professional development programmes as a matter of course | 10 | .84
Co-operation within school | Co-ordination within teaching departments is effective | 9 | .83
Quality improvement
Quality improvement as part of school charter | Our school charter comprises mid-term improvement goals for school | 7 | .84
Monitoring of teaching practice | We realize lesson observation by principals or departmental heads | 4 | .50*
School-wide use of self-evaluation | We use self-evaluation techniques at our school | 5 | .81
Documentation of student achievement | Systematic information on student achievement is gathered and made available | 4 | .86

Notes: *Dichotomous items – response coding Yes = 1/No = 0; otherwise response coding 1 = Totally disagree, or Totally untrue; 2 = Disagree slightly, or Somewhat untrue; 3 = Agree slightly, or Somewhat true; 4 = Totally agree, or Totally true. 1Answered only by teachers; 2answered only by principals.
Results
Changes in perceptions of school quality
Table 4 shows the descriptive results for measurements at M1 and M2. Effect sizes for
checking group differences at M1 are also given (univariate analyses of variance). As can
Table 4. Descriptives (mean, SD) for each dependent variable for every school group at M1 and M2 and effect sizes for differences between the groups at M1
(eta²).
Dependent variable | M1: S1, S2, S3, CG | eta² | M2: S1, S2, S3, CG
Student commitment and satisfaction 3.19 (.36) 3.24 (.40) 3.20 (.30) 3.14 (.37) .01 3.18 (.38) 3.20 (.41) 3.22 (.32) 3.15 (.37)
Teacher commitment and satisfaction 3.54 (.31) 3.52 (.37) 3.55 (.31) 3.55 (.33) .003 3.54 (.34) 3.48 (.43) 3.56 (.34) 3.59 (.31)
Strain on teachers1 1.84 (.36) 1.79 (.33) 1.88 (.35) 1.83 (.41) .008 1.78 (.54) 1.82 (.48) 1.80 (.34) 1.88 (.48)
Active involvement of parents 2.05 (.50) 2.12 (.49) 2.21 (.44) 2.12 (.51) .01 2.10 (.51) 2.15 (.52) 2.22 (.45) 2.17 (.51)
Staff involvement 3.35 (.45) 3.37 (.42) 3.46 (.39) 3.42 (.46) .01 3.39 (.50) 3.38 (.48) 3.42 (.41) 3.50 (.40)
Co-operation with external partners .77 (.16) .79 (.15) .78 (.13) .75 (.16) .01 .77 (.16) .79 (.16) .79 (.14) .76 (.16)
Instructional leadership2 3.23 (.48) 3.27 (.43) 3.29 (.42) 3.22 (.38) .002 3.26 (.48) 3.27 (.43) 3.29 (.42) 3.22 (.38)
Time management and use of time 3.53 (.27) 3.61 (.24) 3.63 (.24) 3.53 (.27) .03** 3.53 (.30) 3.56 (.30) 3.62 (.23) 3.53 (.28)
Management of substitutions 2.65 (.37) 2.65 (.35) 2.68 (.37) 2.71 (.38) .006 2.66 (.42) 2.66 (.41) 2.71 (.34) 2.66 (.44)
Classroom management 3.34 (.40) 3.43 (.35) 3.44 (.36) 3.31 (.40) .02* 3.38 (.38) 3.42 (.40) 3.46 (.37) 3.37 (.40)
Professional development 2.87 (.31) 2.90 (.33) 2.94 (.31) 2.90 (.31) .005 2.83 (.36) 2.89 (.33) 2.91 (.29) 2.87 (.33)
Co-operation within school 2.77 (.42) 2.82 (.40) 2.87 (.37) 2.84 (.44) .01 2.85 (.43) 2.81 (.45) 2.89 (.41) 2.81 (.46)
Quality improvement as part of school charter 3.12 (.36) 3.08 (.40) 3.08 (.39) 3.01 (.41) .007 3.12 (.42) 3.17 (.39) 3.12 (.36) 3.03 (.47)
Monitoring of teaching practice .74 (.18) .75 (.17) .68 (.19) .66 (.23) .04** .71 (.21) .72 (.21) .77 (.19) .69 (.20)
School-wide use of self-evaluation 2.81 (.44) 2.75 (.48) 2.76 (.43) 2.68 (.49) .01 2.84 (.53) 2.84 (.47) 2.86 (.46) 2.77 (.51)
Documentation of student achievement 2.84 (.54) 2.92 (.53) 3.00 (.51) 2.94 (.61) .01 2.82 (.60) 2.97 (.59) 3.09 (.45) 3.01 (.58)
Notes: 1Answered only by teachers; 2answered only by principals; following Cohen (1988), eta² = .01 represents a small, .06 a medium, and .14 a large effect.
be seen in the descriptive results for M1 and the effect sizes, the school groups barely
differ at M1. Pretest differences exist only on three scales (time management and use of
time, classroom management, monitoring of teaching practice). Even for these three
scales, however, the effect sizes point only to small differences between the groups
(Cohen, 1988). In sum, these results confirm that initially the school groups do not differ
significantly with regard to principals’ and teachers’ perceptions of school quality
indicators. In the following, we examine which significant changes occurred at M2.
Table 5 summarizes the results of the repeated-measures analysis of variance. As can
be seen, there is little perceived change overall over the survey period. Significant main
effects are found for only 4 out of 16 variables (Active involvement of parents, Quality
improvement as part of school charter, School-wide use of self-evaluation, Documentation
of student achievement). A significant interaction between the repeated-measures factor
and the school-group factor can be found for only one variable (Monitoring of teaching
practice).
Table 5. Results of the repeated-measures analyses of variance for every dependent variable.

Dependent variable | Source of variance | F | df | df(error) | p | partial eta²
Student commitment and satisfaction time .03 1 463 .95 < .001
time*school group .91 3 463 .44 .006
Teacher commitment and satisfaction time .03 1 463 .86 < .001
time*school group 1.17 3 463 .32 .008
Strain on teachers1 time .02 1 381 .96 < .001
time*school group 1.04 3 381 .38 .008
Active involvement of parents time 4.09 1 460 .04* .01
time*school group .36 3 460 .78 .002
Staff involvement time 1.11 1 461 .29 .002
time*school group 1.39 3 461 .24 .01
Co-operation with external partners time 1.59 1 461 .21 .003
time*school group .60 3 461 .61 .004
Instructional leadership2 time .90 1 287 .34 .003
time*school group .15 3 287 .93 .002
Time management and use of time time 1.56 1 461 .21 .003
time*school group .61 3 461 .61 .004
Management of substitutions time .01 1 461 .93 < .001
time*school group .65 3 461 .58 .004
Classroom management time 2.14 1 462 .14 .005
time*school group .67 3 462 .57 .004
Professional development time 2.76 1 461 .10 .006
time*school group .34 3 461 .79 .002
Co-operation within school time .59 1 460 .44 .001
time*school group 1.52 3 460 .21 .01
Quality improvement as part of school charter time 4.60 1 456 .03* .01
time*school group 1.08 3 456 .36 .007
Monitoring of teaching practice time 1.04 1 456 .31 .002
time*school group 9.58 3 456 < .001** .06
School-wide use of self-evaluation time 10.59 1 458 .001** .02
time*school group .69 3 458 .56 .005
Documentation of student achievement time 4.29 1 458 .04* .01
time*school group 1.03 3 458 .38 .007
Notes: 1Answered only by teachers; 2answered only by principals; following Cohen (1988), eta² = .01 represents
a small, .06 a medium, and .14 a large effect.
The significant main effects on the four variables mentioned are all characterized by
an increased perception of the aspect of school life in question either in all four groups or
in three out of four. In all four cases, the effect sizes are small (eta² ≤ .02). Overall, these
results indicate a high degree of stability within the perception of principals and teachers
for the aspects examined in this study.
The only significant interaction is found for Monitoring of teaching practice. The
descriptive results in Table 4 show that this effect is due to opposite changes within the
school groups. While the perceived extent of Monitoring of teaching practice increases
before and during the year of the inspection, it decreases afterwards. The effect size for
this interaction is of medium size (eta² = .06).
When looking at the underlying items of this scale, it becomes apparent that this effect
is primarily attributable to changes in only one item, namely: Has your school used the
following practices to capture the teaching practices of teachers? Classroom visits by
school authorities or other external persons (see Table 6). During the year of the
inspection, a significant increase is perceived; in the years after the inspection, a decrease.
The effect size for this interaction is large (eta² = .23).
Discussion
The aim of this study was to complement existing research on the effects of school
inspections (Cuckle & Broadhead, 1999; De Wolf & Janssens, 2007; Gray & Gardner,
1999) by examining perceptions of changes to relevant aspects of school quality, depending
on school inspections, using a repeated-measures control-group design. To our knowledge,
no such analysis of the effect of school inspections has as yet been published.
Earlier results on the effect of school inspections in Berlin and Brandenburg suggest
that schools engage with the inspection reports and are motivated to make use of them.
That is, statements from school representatives indicate that schools in fact process the
evaluative knowledge about their school’s strengths and weaknesses and derive concrete
development measures, as hypothesised in Landwehr’s model (2011).
However, the results emerging from this study suggest that principals and teachers
tend to judge the aspects of school quality as highly stable over the years. In most cases,
there is no change at all within the survey period.
The four significant main effects found suggest that, in the period investigated, there
were some general efforts to develop school quality as operationalized here: the perception
of (a) parental involvement in school processes, (b) work on the school charter, (c) the use
of self-evaluation, and (d) the documentation of student achievement all increased. This
general trend, however, bears no measurable relation to when a school
inspection takes place, that is, this development occurs either before, during, or after an
inspection. Thus, these developments would appear to be influenced by other factors not
covered in the present design. Reasons for this could be that, within the period investigated,
certain reforms were implemented, such as increased autonomy for schools, the
establishment of resources for self-evaluation, or the introduction of centralised student
achievement tests and school-leaving examinations.
Based on our research hypothesis, this study looked in particular for significant
group-by-time interactions, in order to detect differential changes in perception effected
by a school inspection. The results yield only one significant interaction, and this is probably an
artefact. When considering this particular item, it seems likely that the surveyed principals
and teachers were thinking of the inspectors themselves, who visited their classrooms, when
responding to the item. So, there is probably no real change in the practice of monitoring
Table 6. Descriptives (mean, SD) for Monitoring of teaching practice for every school group at M1 and M2.
Items: Has your school used the following practices to capture the teaching practices of teachers? | M1: S1, S2, S3, CG | M2: S1, S2, S3, CG
Performance assessment through tests and examinations .90 (.20) .89 (.23) .91 (.17) .83 (.28) .92 (.22) .88 (.25) .91 (.22) .88 (.28)
Peer review of lesson plans, assessment tools and lessons .58 (.34) .57 (.33) .55 (.37) .53 (.37) .56 (.42) .55 (.38) .62 (.37) .53 (.39)
Classroom visits by the school principal or head of department .84 (.24) .82 (.28) .83 (.27) .81 (.32) .87 (.29) .89 (.23) .87 (.26) .87 (.27)
Classroom visits by school authorities or other external persons .59 (.35) .63 (.29) .34 (.32) .33 (.34) .31 (.37) .43 (.38) .64 (.37) .34 (.38)
teaching practice within schools. If this assumption is true, there are no differential
developments at all between the control group and the inspected school groups across all 16
dependent variables. Altogether, the results suggest that school inspections have relatively
little impact on the perception of the school quality indicators measured here.
Interpretation
What do these results mean for assessing the effect of school inspections? The design used
here is founded on the hypothesis that feedback about a school’s strengths and weaknesses
can stimulate school improvement (as indicated in the model of Landwehr, 2011, or Ehren
and Visscher, 2008). These changes should have an effect on the perceptions of principals
and of the teachers on the school council, since these groups have a comprehensive
view of day-to-day school events. Yet, the high degree of stability in the results appears to
suggest that inspections have a relatively small impact on the variables measured here.
Reasons for this are discussed below.
Target agreements
A further reason for the high level of stability could be that the system of target agreements
was still in development during the period investigated. Only half of the inspected schools
had agreed on specific targets with school authorities. Results from a previous study on
inspections in Berlin and Brandenburg showed that the results of the inspection report were
certainly relevant here in cases where targets had been agreed on (Gaertner & Wurster,
2009a, 2009b). The present findings therefore call for future research on the implementation
and the effects of target agreements. At present, it does not appear that this process changes
the organisational attributes of school quality on a broad scale.
Wrong hypothesis
Another reason for the relatively high stability of results could be that the original
hypothesis for this study must, at least in part, be rejected. The hypothesis presupposed
that the effects of a school inspection would largely come about after the inspection, that
is, after the school receives feedback on the inspection results and starts working with this
information. Other studies indicate that greater effects can be found in preparation for the
school inspection (Gaertner & Wurster, 2009a, 2009b; Gray & Gardner, 1999; Plowright,
2007). Because this study provides no evidence that schools actually develop measures
based on the inspection report, this alternative hypothesis should be addressed in future
studies. If the main effects of school inspection are primarily preparation effects, this would have implications for the current design of school inspection processes. For example, the German state of Baden-Württemberg already notifies schools of inspections well in advance (1 to 2 years), explicitly so that they can prepare (with support) for the inspection. This approach systematically exploits the “momentum” of preparation to stimulate school improvement ahead of an inspection. It would better serve the improvement function of inspections and place less emphasis on the accountability function (in the sense of measuring an “objective” school reality). In terms of Landwehr (2011), it is in accordance with the fourth function of his model, namely the enforcement of standards, because even before the inspection is carried out, schools work systematically to ensure they meet the criteria that will be measured.
Conclusion
In sum, we can conclude from these results that, within the political and institutional context described, no changes in the perception of school quality can be found that are attributable to school inspections. In relation to Landwehr’s model (2011), this means: (a) the inspection process described generates knowledge about the quality of schools; (b) this knowledge is, however, only rarely used for autonomous school improvement, at least with respect to the quality attributes at the organisational level; (c) school authorities do use the newly generated knowledge, although it is unclear at present whether the process of target agreements has been fully implemented and produces the expected effects; (d) it is plausible that the main effects of an inspection arise in the lead-up to an inspection rather than in the follow-up, corresponding to the assumed fourth function of school inspection, namely enforcing norms. This was not investigated here, but results from the previous investigation of the inspection process in Berlin and Brandenburg support this view (Gaertner & Wurster, 2009a, 2009b).
Thus, the results confirm Landwehr’s (2011) point that inspections do not derive their
legitimacy directly from their contribution to school improvement, but from their con-
tribution to accountability. As other authors have noted, school improvement seems to be
driven by internal evaluations (Nevo, 2001; Vanhoof & Van Petegem, 2007). Robust accountability regarding the quality of a school is, however, only obtained through a complementary external view, such as an inspection.
However, inspection results in the two states involved do not (yet) lead to serious consequences for the schools, as is the case in other countries, and the inspection reports are not made public. This means that the results should not be generalised to contexts where the inspectorate operates under very different conditions.
Note
1. The available achievement data are not appropriate for a longitudinal design because the tests vary substantially between years and are not linked across years. The achievement data therefore cannot be used to test the effect of inspections on student achievement.
Notes on contributors
Dr. Holger Gaertner is a postdoctoral research scientist at the Institute for School Quality Improvement (ISQ) at the Freie Universität Berlin, Germany. His research focuses on the quality and impact of school inspections, as well as on school self-evaluation.
Sebastian Wurster is a research associate at the Institute for Quality Development in Education
(IQB) at the Humboldt-University of Berlin, Germany. His research interests are educational
governance and internal/external evaluation of schools.
Prof. Dr. Hans Anand Pant is Managing Director of the Institute for Quality Development in
Education (IQB) at the Humboldt-University of Berlin, Germany. His main research interests
include educational accountability systems, as well as the implementation of innovations in schools.
References
Böttcher, W., & Kotthoff, H.-G. (2007). Schulinspektion zwischen Rechenschaftslegung und
schulischer Qualitätsentwicklung: Internationale Erfahrungen [School inspections between
accountability and school-quality improvement: International experiences]. In W. Böttcher &
H.-G. Kotthoff (Eds.), Schulinspektion: Evaluation, Rechenschaftslegung und
Qualitätsentwicklung (pp. 9–20). Münster, Germany: Waxmann.
Brown, R. D., & Hauenstein, N. M. A. (2005). Interrater agreement reconsidered: An alternative to
the rwg indices. Organizational Research Methods, 8, 165–184.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Creemers, B. P. M., & Kyriakides, L. (2008). The dynamics of educational effectiveness. New York,
NY: Routledge.
Cuckle, P., & Broadhead, P. (1999). Effects of Ofsted inspection on school development. In C.
Cullingford (Ed.), An inspector calls. Ofsted and its effect on school standards (pp. 176–187).
London, UK: Kogan Page.
Cullingford, C., & Daniels, S. (1999). Effects of Ofsted inspections on school performance. In C.
Cullingford (Ed.), An inspector calls. Ofsted and its effect on school standards (pp. 59–69).
London, UK: Kogan Page.
De Wolf, I. F., & Janssens, F. J. G. (2007). Effects and side effects of inspections and accountability in education: An overview of empirical studies. Oxford Review of Education, 33, 379–396.
Dedering, K., & Mueller, S. (2011). School improvement through inspections? First empirical
insights from Germany. Journal of Educational Change, 12, 301–322.
Ehren, M. C., & Visscher, A. J. (2006). Towards a theory on the impact of school inspections.
British Journal of Educational Studies, 54, 51–72.
Ehren, M. C., & Visscher, A. J. (2008). The relationship between school inspections, school
characteristics and school improvement. British Journal of Educational Studies, 56, 205–227.
Gaertner, H. (2009). Jahresauswertung der Schulvisitationen des Schuljahres 2008/9 in
Brandenburg [Annual evaluation of school inspection for 2008/9 school year in
Brandenburg]. Berlin, Germany: ISQ. Retrieved from http://www.isq-bb.de/uploads/media/Jahresauswertung_Schulvisitation_2008_9.pdf
Gaertner, H. (2010). Jahresauswertung der Schulvisitationen des Schuljahres 2009/10 in
Brandenburg [Annual evaluation of school inspection for 2009/10 school year in
Brandenburg]. Berlin, Germany: ISQ. Retrieved from http://www.isq-bb.de/uploads/media/
Jahresauswertung_Schulvisitation_2009_10.pdf
Gaertner, H., Hüsemann, D., & Pant, H. A. (2009). Wirkungen von Schulinspektion aus Sicht
betroffener Schulleitungen. Die Brandenburger Schulleiterbefragung [Effects of school inspec-
tions from the perspective of principals]. Empirische Pädagogik, 23, 1–18.
Gaertner, H., & Pant, H. A. (2011). How valid are school inspections? Problems and strategies for
validating processes and results. Studies in Educational Evaluation, 37, 85–93.
Gaertner, H., & Wurster, S. (2009a). Befragung zur Wirkung von Schulinspektion in Berlin.
Ergebnisbericht [Survey on the impact of school inspection in Berlin]. Berlin, Germany:
Institut für Schulqualität der Länder Berlin und Brandenburg.
Gaertner, H., & Wurster, S. (2009b). Befragung zur Wirkung von Schulvisitation in Brandenburg.
Ergebnisbericht [Survey on the impact of school inspection in Brandenburg]. Berlin, Germany:
Institut für Schulqualität der Länder Berlin und Brandenburg.
Gray, C., & Gardner, J. (1999). The impact of school inspections. Oxford Review of Education, 25,
455–468.
Grek, S., Lawn, M., Lingard, B., & Varjo, J. (2009). North by northwest: Quality assurance and
evaluation processes in European education. Journal of Education Policy, 24, 121–133.
Hofman, R. H., Dijkstra, N. J., & Hofman, W. H. A. (2009). School self-evaluation and student
achievement. School Effectiveness and School Improvement, 20, 47–68.
Inspectie van het Onderwijs. (2006). Proportional supervision and school improvement from an
international perspective. A study into the (side) effects of utilizing school self-evaluations for
inspection purposes in Europe. Utrecht, The Netherlands: OPHIS/New Impuls.
Kyriakides, L., & Campbell, R. J. (2004). School self-evaluation and school improvement: A
critique of values and procedures. Studies in Educational Evaluation, 30, 23–36.
Landwehr, N. (2011). Thesen zur Wirkung und Wirksamkeit der externen Schulevaluation [Theses
on the impact and effectiveness of the external evaluation of schools]. In C. Quesel, V. Husfeldt,
N. Landwehr, & N. Steiner (Eds.), Wirkungen und Wirksamkeit der externen Schulevaluation
(pp. 35–70). Bern, Switzerland: h.e.p.
Luginbuhl, R., Webbink, D., & De Wolf, I. (2007). Do school inspections improve primary school
performance? The Hague, The Netherlands: CPB.
MacBeath, J. (1999). Schools must speak for themselves. London, UK: Routledge.
MacBeath, J., & Mortimore, P. (Eds.). (2001). Improving school effectiveness. Maidenhead, UK:
Open University Press.
Matthews, P., & Sammons, P. (2004). Improvement through inspection: An evaluation of the impact
of Ofsted’s work (HMI 2244). London, UK: Office for Standards in Education.
Ministerium für Bildung, Jugend und Sport. (2008). Schulvisitation im Land Brandenburg.
Handbuch zur Schulvisitation [School inspections in Brandenburg. Guide for school inspec-
tions] (2nd ed.). Potsdam, Germany: Author. Retrieved from http://www.isq-bb.de/uploads/
media/Handbuch_Schulvisitation_2.0_2008.pdf
Nevo, D. (2001). School evaluation: Internal or external? Studies in Educational Evaluation, 27,
95–106.
Organisation for Economic Co-operation and Development. (2005). PISA 2003 technical report.
Paris, France: Author.
Ozga, J. (2011, September). Key concepts in the governing by inspection project. Paper presented at
the European Conference on Educational Research (ECER), Berlin, Germany.
Plowright, D. (2007). Self-evaluation and Ofsted inspection. Developing an integrative model of
school improvement. Educational Management Administration & Leadership, 35, 373–393.
Rosenthal, L. (2004). Do school inspections improve school quality? Ofsted inspections and school
examination results in the UK. Economics of Education Review, 23, 143–151.
Scheerens, J., & Bosker, R. J. (1997). The foundations of educational effectiveness. Oxford, UK:
Pergamon.
Senatsverwaltung für Bildung, Jugend und Wissenschaft. (2013). School inspection in Berlin. Berlin, Germany: Author. Retrieved from http://www.berlin.de/imperia/md/content/sen-bildung/schulqualitaet/schulinspektion/handbuch_schulinspektion_english.pdf?start&ts=1368429527&file=handbuch_schulinspektion_english.pdf
Shaw, I., Newton, D. P., Aitkin, M., & Darnell, R. (2003). Do OFSTED inspections of secondary
schools make a difference to GCSE results? British Educational Research Journal, 29, 63–75.
Vanhoof, J., & Van Petegem, P. (2007). Matching internal and external evaluation in an era of
accountability and school development: Lessons from a Flemish perspective. Studies in
Educational Evaluation, 33, 101–119.
Appendix 1. Correlation between results of school inspection and perceptions of principals and teachers

[Table: correlation coefficients between school inspection results and the perceptions of principals and teachers, reported for the scales Satisfaction (quality area School results); Active participation and Co-operation with partners (School culture); Managerial responsibility of the school management, Quality management, and Instructional organisation (School management); Professionalization (Professional development); and School charter and Quality improvement (Quality improvement).]

Notes: In brackets: correlations corrected for attenuation based on the reliability of the scales used. Underlined: correlations where the underlying variables match most in terms of content; *p < .05, **p < .01.
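The bracketed values in the appendix are obtained with the classical (Spearman) correction for attenuation, which divides an observed correlation by the square root of the product of the two scale reliabilities. A minimal sketch in Python, using hypothetical reliability values for illustration (the actual reliabilities of the scales used in the study are not reproduced here):

```python
import math

def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation: estimate the correlation
    between two constructs free of measurement error, given the observed
    correlation and the reliabilities of the two measurement scales."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Hypothetical example: observed r = .38 between two scales with
# reliabilities of .85 and .90 (illustrative values only)
print(round(disattenuate(0.38, 0.85, 0.90), 2))  # prints 0.43
```

Because scale reliabilities are at most 1, the corrected coefficient is always at least as large in magnitude as the observed one, which matches the pattern of the bracketed values in the table.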