
AUQA Occasional Publication

Proceedings of the Australian Universities Quality Forum 2003

Student Surveys and Quality Assurance

Professor Paul Ramsden, Pro Vice-Chancellor (Teaching and Learning)
The University of Sydney, NSW 2006, Australia

In this presentation, I consider some issues to do with the use of survey information in reporting on and enhancing quality in Australian higher education. Although my principal focus will be on examples from the Course Experience Questionnaire, most of what I have to say is applicable to the Graduate Destination Survey and, by extension, to internal surveys of student experience and student satisfaction as well. My main proposition is that survey results on their own are of limited use in quality assurance. This is partly because the data are often misrepresented and used without a research-based understanding of the problems of interpreting this kind of information. On the other hand, the way in which a university articulates survey data with information from other sources, and the robustness of the links between these results and a university's quality management system, are important indicators of quality.

1. Satisfaction or the Quality of Learning?

My remarks in this section refer to the Course Experience Questionnaire (CEQ). The CEQ, in its various forms, is a survey instrument for measuring the quality of students' experiences of degree programs. It is the only properly validated instrument in the world that is derived from an articulated theory of the relations between student experiences of teaching and the quality of their learning outcomes. Its focus is not on what teachers or universities do ("Does the teacher make the goals of the course clear?") but on whether students experience a teaching and learning environment that provides the conditions for effective learning ("Do the students think that the goals are clear?"). This is very far from being a trivial distinction. It exemplifies the difference between a learner-focused and a teacher-focused perspective on improving and auditing the quality of university teaching.

Richardson (2003) has recently prepared an invaluable review of the expansive CEQ literature. He mentions three issues that are particularly relevant to this symposium. First, the CEQ is quite different in its origins and focus from surveys of student experience that concentrate on physical, administrative and social support services, including facilities such as computing and library resources. These are certainly material aspects of the quality of the student experience, but they play a different part in it from the dimensions of the CEQ, which are more central to the process of learning and the quality of learning outcomes. Second, the new dimensions developed by McInnis, Griffin, James and Coates (2001) converge on these support aspects of the student experience. They are different in another way: they derive not from theory but from empirical methods such as focus groups with stakeholders and analysis of open responses. Their origin is pragmatic and they do not profess to be functionally associated with the quality of learning outcomes.

The third issue is the most important. The CEQ is not, repeat not, primarily a survey of student satisfaction. Evidence of satisfaction is provided by a single item, but this is mainly a check on the validity of the other dimensions. If there were no association between experiences of good teaching and course satisfaction, for example, we would wonder whether we were measuring the right things in our good teaching scale.


Several instruments exist to measure student satisfaction, but the satisfaction approach, as Richardson puts it,
privileges satisfaction as a notion that is coherent, homogenous and unproblematic. The limited amount of research on this topic suggests that student satisfaction is a complex yet poorly articulated notion that is influenced by a wide variety of factors which are not intrinsically linked to the quality of teaching (p. 16).

These views are borne out in the evidence of other investigations of the student experience, such as the University of Sydney's Academic Board Reviews of faculties. The quality of teaching and learning is ingrained in students' experiences. It dominates their oral descriptions and evaluations; the students consider that the administrative and support aspect of the faculty's provision plays a background role. They also view satisfaction as a dependent rather than an independent variable in the equation. Perhaps the most fundamental difficulty, also identified in Richardson's review, is the question of whether it makes sense to see satisfaction as an important goal of higher education in its own right. Certainly, few academics or employers would think so. Unquestionably, we would not regard academic satisfaction, though it is a tangible correlate of research performance, as a criterion for measuring research outcomes. This is not to say that surveys of student or graduate satisfaction are irrelevant, but rather that they do not tell us enough about the quality of the core business of a university.

2. Misrepresenting CEQ and GDS Results

Stephen Potter coined the word "Gamesmanship" to describe the process of winning without actually cheating. John Brignell extended the concept to data presentation, defining "Chartsmanship" as the art of using graphs to mislead without actually cheating (see http://www.numberwatch.co.uk/chartmanship.htm). There are some notable examples of chartsmanship and its cousins in the Graduate Destination Survey (GDS) and CEQ data presented in Australian Universities Quality Agency performance portfolios.

I want to give a few examples of these activities first, before going on to say something about the underlying reasons for them and the implications that flow from them for quality assurance (QA). The examples come from portfolios that are freely available on university web sites. They are for illustrative purposes only and are in no way an assessment of these universities' QA arrangements.

2.1 Data for Their Own Sake

Portfolios sometimes present CEQ or GDS data for no obvious reason. Table 1 shows some CEQ results appearing in one university's draft performance portfolio. The results are inexplicably broken down by gender (but not field of study, which has a much stronger effect). They show almost no change over four years. What is the university trying to tell us?
Table 1 Average score on key CEQ scales (on 1–5 range), all respondents

CEQ scale              Gender         1997   1998   1999   2000
Good teaching          Females        3.3    3.4    3.3    3.4
                       Males          3.3    3.3    3.3    3.3
                       All students   3.3    3.4    3.3    3.3
Generic skills         Females        3.7    3.7    3.7    3.7
                       Males          3.7    3.6    3.6    3.7
                       All students   3.7    3.7    3.7    3.7
Overall satisfaction   Females        3.8    3.7    3.8    3.7
                       Males          3.8    3.7    3.8    3.7
                       All students   3.7    3.8    3.8    3.8


2.2 Inappropriate Level of Aggregation

CEQ and GDS results are often presented at the level of the whole university. However, employment figures and CEQ scores are subject to large field-of-study effects, so aggregated results are of little value in making inter-institutional comparisons. The chart shown (Figure 1) simply illustrates that CEQ results aggregated to university level are hardly distinguishable between universities and change little from one year to the next. The portfolio did not attempt to justify, on benchmarking grounds, its choice of comparator institutions, nor did it explain why the chart showed only two years of data.
Figure 1 CEQ surveys in 1999 and 2000: Undergraduate
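The field-of-study point can be illustrated with a minimal sketch using entirely invented numbers (these are not CEQ data): two hypothetical universities whose field-level Good Teaching means are identical nevertheless produce different university-level averages, purely because of their enrolment mixes.

```python
# Hypothetical illustration (invented numbers, not actual CEQ data): two
# universities with identical field-level Good Teaching means but different
# enrolment mixes across fields of study.
field_means = {"Science": 3.2, "Humanities": 3.9, "Engineering": 3.0}

# Share of survey respondents in each field at each invented university.
mix_uni_a = {"Science": 0.5, "Humanities": 0.2, "Engineering": 0.3}
mix_uni_b = {"Science": 0.2, "Humanities": 0.6, "Engineering": 0.2}

def aggregate(mix, means):
    """University-level mean implied by the field mix (enrolment-weighted)."""
    return sum(share * means[field] for field, share in mix.items())

print(f"Uni A aggregate: {aggregate(mix_uni_a, field_means):.2f}")  # 3.28
print(f"Uni B aggregate: {aggregate(mix_uni_b, field_means):.2f}")  # 3.58
# The 0.3 gap reflects discipline mix alone, not any difference in teaching quality.
```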

2.3 Careful Selection of First Point in Time Series

When trying to demonstrate an improvement or increase in some measure, it is expedient to select as the first data point a time when the results were unusually low. Then, by definition, you can show an upward trend. Why choose 1997 as the first point on the diagram shown below? The survey began in 1993, and on the same page of the university's portfolio, equivalent data from an internal survey begin with 1998.
Figure 2 CEQ results


The items are all from the Good Teaching Scale. Using 1997 as the first year disguises decline in some cases and, in other cases, distracts us from realising that many of the changes are trivially small and almost certainly due to normal variation. The chart has been further improved by a little work on the vertical axis, although the units in which it is measured are not marked. The designer has chosen a range as close as practicable to the variation in the data, producing the interesting 2.6 to 3.4 scale.
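To see how easily a flat series can be dressed up as improvement, the sketch below (simulated data, not any university's results) generates ten years of scale means that vary only through sampling noise, then picks the weakest year as the starting point. Starting from the minimum guarantees a non-negative "gain" to any later year.

```python
import random

random.seed(1)

# Hypothetical illustration: ten years of scale means that differ only by
# sampling noise around a flat "true" mean of 3.5 on the 1-5 scale.
years = list(range(1993, 2003))
scores = [round(3.5 + random.gauss(0, 0.05), 2) for _ in years]

# Chartsmanship move: start the time series at the weakest year, then report
# the rise from that baseline as evidence of improvement.
baseline = min(range(len(scores)), key=scores.__getitem__)
for year, score in zip(years, scores):
    print(year, score)
print(f"Chosen first point: {years[baseline]} ({scores[baseline]})")
print(f"Apparent 'improvement' by {years[-1]}: {scores[-1] - scores[baseline]:+.2f}")
```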

2.4 Assertions Inconsistent with the Evidence

It is common to see universities making assertions that are unsupported by the CEQ evidence. The two portfolios in which Tables 2 and 3 appear make much of their universities' emphasis on generic graduate attributes, and they use the tables to demonstrate their excellent performance in this area. I may be missing something, but I cannot see any differences of practical significance, while some of the differences are in favour of the national figures anyway.

2.4.1 CEQ and CSQ Results of Core Skills, 1997–2001
Note: CEQ responses were measured using a 5-point scale with only 2 anchors, 1 (strongly disagree) and 5 (strongly agree); CSQ responses were measured using a similar 5-point scale with all 5 anchors (1=strongly disagree, 2=disagree, 3=neutral, 4=agree, 5=strongly agree).

Table 2 CEQ Generic Skills Scale: University X compared with the national average

           1997          1998          1999          2000          2001
Item       Uni X   Nat   Uni X   Nat   Uni X   Nat   Uni X   Nat   Uni X   Nat
Item 2     3.9     3.8   3.9     3.8   3.9     3.8   3.9     3.8   3.9     –
Item 5     3.8     3.8   3.9     3.8   3.9     3.8   3.8     3.8   3.8     –
Item 9     3.3     3.2   3.3     3.3   3.5     3.3   3.4     3.3   3.5     –
Item 10    3.5     3.5   3.6     3.5   3.6     3.6   3.6     3.6   3.6     –
Item 11    3.8     3.8   3.8     3.8   3.9     3.8   3.8     3.8   3.9     –
Item 25    3.6     3.7   3.7     3.7   3.7     3.7   3.7     3.8   3.7     –

Table 3 Generic Skills Scale

        Percentage agreement                         Percentage broad agreement
        University X              National           University X              National
Year    GCCA CEQ   Internal CEQ   GCCA CEQ           GCCA CEQ   Internal CEQ   GCCA CEQ
1997    62         65             –                  86         90             –
1998    63         62             –                  87         87             –
1999    62         64             –                  87         89             –
2000    62         62             –                  86         87             –
2001    61         62             –                  85         86             –


2.5 Evidence Contradicting Interpretation

Perhaps the most daring form of chartsmanship is to present information that contradicts your interpretation, while saying that it supports it. If you go this far, it is prudent to present the results in an appendix and to hope that the auditors don't bother to read it. Figure 3 shows results from an internal student survey that used the same items as the CEQ Good Teaching Scale. The portfolio asserts that there has been improvement over time. You don't need fancy statistics to see that the scores on three items have gone up a bit, two have gone down a bit, and one has stayed more or less the same. This is a generous definition of improvement. Universities should either heed the advice of the Australian Vice-Chancellors' Committee/Graduate Careers Council of Australia (GCCA) code of practice, applying the rule that "differences in CEQ scores which can be considered worthy of note are those that exceed one-third of the relevant standard deviation", or qualify their claims.
Figure 3 CSQ results
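The one-third-of-a-standard-deviation rule quoted above is easy to apply as a screen before claiming improvement. The sketch below is a minimal illustration with invented numbers; in practice the relevant standard deviation would presumably be taken from the published national data for the scale in question.

```python
def worthy_of_note(score_now, score_then, scale_sd, threshold=1 / 3):
    """Flag a change in a CEQ scale mean only if it exceeds the stated
    fraction of the relevant standard deviation (the code-of-practice
    rule quoted above)."""
    return abs(score_now - score_then) > threshold * scale_sd

# Invented Good Teaching scale means and an invented standard deviation.
print(worthy_of_note(3.45, 3.38, scale_sd=0.60))  # 0.07 < 0.20 -> False (not worth noting)
print(worthy_of_note(3.80, 3.45, scale_sd=0.60))  # 0.35 > 0.20 -> True (worth noting)
```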

2.6 Incorrect Use of GDS Statistics

Misuses of the GDS employment statistics are to be found in The Good Universities Guide, which specialises in the artful use of rank ordering. Rank ordering allows minuscule differences in employment rates by field of study to have a massive impact on the overall rating. Another typical error, which appears both in The Good Universities Guide and in at least one portfolio, is to treat all graduates as if they had been full-time students. The effect is to make courses with many part-time students appear more successful in obtaining work for their graduates (the graduates simply continue in their existing jobs, which says nothing about the quality of the course). Table 4 also uses the wrong level of aggregation: the employability of graduates depends on the mix of fields of study in a university.
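The rank-ordering effect can be illustrated with invented figures: five hypothetical universities whose full-time employment rates for a single field differ by a single percentage point, well within sampling error, are transformed into an apparently decisive league table once the rates are presented as ranks.

```python
# Hypothetical illustration: full-time employment rates (%) for graduates of
# one field of study at five invented universities.
rates = {"Uni A": 85.9, "Uni B": 85.6, "Uni C": 85.4, "Uni D": 85.2, "Uni E": 84.9}

# Converting near-identical rates into a rank order makes a one-point spread,
# well within sampling error, look like a decisive ordering from best to worst.
ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
for rank, (uni, rate) in enumerate(ranked, start=1):
    print(f"{rank}. {uni}  {rate:.1f}%")
```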


Table 4 Main destination of all graduates: Australian residents

                                            2001            2000            1999            1998
                                         Uni X   Nat     Uni X   Nat     Uni X   Nat     Uni X   Nat
In full-time employment (%)              86.6    85.5    87.6    86.0    87.1    83.4    82.3    82.2
Seeking full-time, working part-time (%)  8.3     8.5     7.6     8.4     8.0     9.8    10.6    10.2
Not working (%)                           5.1     5.9     4.8     5.6     4.9     6.8     7.2     7.6

3. Avoiding the Errors and Using the Results Sensibly

It is easy to find examples of incorrect and doubtful uses of CEQ and GDS statistics. Rather than censure the universities for trying to pull the wool over the auditors' eyes, we should realise that the oversights stem from an understandable desire to please the examiners by focusing only on good results, assisted by inadequate knowledge of the measurement properties of surveys. It is not enough merely to avoid the blunders of overinterpretation and incorrect comparators, or to abstain from the finagling of statistics and egregious chartsmanship. What is required is the energetic application of understanding based on good technical knowledge of the measurement issues, plus intelligent use of the principles of QA in teaching and learning. This in turn needs a certain attitude of mind founded on openness to evidence and the confidence to admit that you might have been wrong. In short, an evidence-based approach, making use of the best available research, rather than one erected on the shifting sands of prejudice, hunches, opinion and guesswork, is required.

Formal training in this field devotes plentiful attention to overcoming impulsive tendencies to treat survey data as if they had the same properties of validity and accuracy as, say, a concrete measure such as a metre rule. The interpreter must realise that the results are subject to variation from many sources other than the quality of teaching and learning, and that this variation must be accounted for before any changes over time or differences between universities are considered to be real. It is generally meaningless to make comparisons except within fields of study. Comparing universities except within fields is a most hazardous occupation. The differences between universities are generally small and probably unimportant. Similarly, much of the divergence over time is simply random variation. CEQ and GDS scores go up and down. Temperatures go up and down. Rainfall patterns vary. Death rates change. Confirming a real direction of change even with physical data is difficult, and the conclusions are often contentious, even among experts, especially when people try to derive linear trends. With survey data the sources of error are more numerous and circumspection must be the rule.

There are other issues, some of them unresolved, that restrict the CEQ's applicability. For example, it is true that larger academic units tend to get lower scores, and it may be true that students in larger universities are more likely to experience a more impersonal teaching environment. Thus size and quality may be confused, and there is no easy answer to how they might be disentangled. While there is no evidence at all that brighter students have more negative experiences than duller ones (the evidence in fact points in the opposite direction), it is also probable, though not proven, that first-generation higher education students (those whose parents have not attended a university) have lower expectations, and that they are therefore more likely to experience teaching and learning as better. There also appears to be some influence of age: younger students, including those straight from school, seem to be more critical than older ones. Thus expectations, age and quality may be confused. The results are also heavily lagged: the impact of changes in teaching and the curriculum cannot be expected to become apparent for several years (this is a compelling reason for using the CEQ in internal student surveys; see below).

These difficulties do not mean that we cannot rely on the CEQ, as some people have tirelessly argued. They do mean that its results must be applied within their proper sphere. The next step is to focus on this appropriate domain. If three-year-old Jane tells me that the cat has fallen down the well, I may believe her; but if I hear mewing and see a pair of shiny eyes down there, I am much more likely to think it's true.

The first rule is that we should never consider CEQ and GDS results in isolation but always in association with other sources of information. For example, what do student (not graduate) surveys say? What do the open comments in the CEQ say? What do employers say about why they employ our graduates? What did accreditation bodies say? What is the feedback from overseas institutions about the standard of our students who go there to pursue further study? Do these sources tell a similar story and, if not, why not?

The second rule is to make appropriate comparisons. Universities should choose their benchmark partners carefully. National comparisons are generally less useful than comparisons with direct competitors or other universities of similar standing. It is folly to maintain the fiction that all universities have similar students and provide similar kinds of experiences for them, and that the CEQ is adroit enough to measure a transcendent aspect of quality that is unrelated to a university's mission. The benchmarks will be more valuable if international universities can be included as well as Australian ones. If the CEQ is used as the basis of one university's student surveys, the results from one's own students may be compared with the benchmark institutions by adjusting the results mathematically to produce estimates of student survey results at the other universities.

The third rule is only to report time-series differences that are substantial and can be explained with reference to specific interventions. Why has the change in the CEQ scores occurred? How do you know it is the result of something you did? Answers to these questions matter just as much as the improved scores. An example may help to explain this. Figure 4 is a chart of some results from a subject taught by two of our award-winning university teachers. They used the CEQ to evaluate the effects of the major changes in assessment and teaching which they introduced progressively over six years. All the changes were derived from research evidence about the links between student learning outcomes and students' experiences. The first year's data are not arbitrarily chosen, but are from the reference year, the year before the teaching interventions. The results show the impact of an evidence-based approach to improvement. Effect sizes are medium to large; changes have been sustained over six years; the chart shows no signs of attempts to mislead; and the results were accompanied by a methodical and reflective explanation of relations between the modifications and the observed changes. In my view, the claim for improvements in quality is persuasive.

The fourth rule is perhaps the most important.
Whether writing an account of a QA process or assessing its effectiveness, focus more attention on the use of the results than on the results themselves. It is surprising to see how little this advice has been implemented in universities' performance portfolios.


Figure 4 Changes in students' experiences of animal science, 1996–2001

[Line chart: mean scores on a 1–5 vertical scale for Appropriate workload, Appropriate assessment, Generic skills, Good teaching and Clear goals, plotted for each year from 1996 to 2001.]

I am grateful to Associate Professor Rosanne Taylor and Dr Michelle Hyde for the data on which Figure 4 is based.
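The changes behind Figure 4 are described above as medium-to-large effect sizes. The paper does not say which effect-size measure was used, so the sketch below is only an illustration of one common choice, Cohen's d with a pooled standard deviation, applied to invented item responses for a single scale in a reference year and a later year.

```python
from statistics import mean, stdev

def cohens_d(before, after):
    """Standardised mean difference (Cohen's d) using a pooled standard
    deviation. One common effect-size measure; the measure actually used
    for the Figure 4 analysis is not specified in the paper."""
    n1, n2 = len(before), len(after)
    pooled_var = ((n1 - 1) * stdev(before) ** 2 + (n2 - 1) * stdev(after) ** 2) / (n1 + n2 - 2)
    return (mean(after) - mean(before)) / pooled_var ** 0.5

# Invented individual responses (1-5 scale) for a Clear Goals style item:
# reference year versus the latest year of the intervention.
baseline = [3, 2, 4, 3, 3, 2, 4, 3, 2, 3]
latest = [4, 3, 4, 3, 4, 3, 5, 3, 4, 3]
print(f"Cohen's d = {cohens_d(baseline, latest):.2f}")
# About 0.97 with these invented responses; by the usual conventions,
# roughly 0.5 is a "medium" effect and 0.8 a "large" one.
```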

In the domain of QA for teaching and learning, narratives that describe the steps a faculty or institution has taken in response to less-than-satisfactory student feedback are often more convincing than claims about CEQ or GDS performance. Naturally, the most satisfactory stories are those that use CEQ or similar data as the springboard for action, then articulate the changes, then provide evidence of their effectiveness (using data from sources other than the CEQ), and then locate the entire process in some kind of coherent quality system (I shall say more about this in a moment). A mature academic QA process has an atmosphere of effortless invisibility about it.

The fifth rule is a derivative of the fourth. It is best to concentrate on one or two areas related to the core mission of the university and describe what is being done to address them (if the results are unsatisfactory) or to explain why they are perceived to be good (if they are satisfactory or better). In my view, a leading-edge report using CEQ data would be reflective and relational: not a list of CEQ strengths and weaknesses, but a reasoned case which identifies specific areas for improvement in relation to appropriate benchmark data.

4. Coherent Quality Systems

As an illustration of how an institution might align CEQ and GDS results within a quality management framework for teaching and learning, I use my own university's approach to articulating the results with both QA mechanisms and other sources of information. While I am sure that it remains an imperfect arrangement, I think that it presents one model appropriate especially to a research-intensive environment. It has the advantage of being a living system (one which is actually in operation rather than a theoretical ideal) constituted from the best available evidence. The system is designed explicitly around the theoretical model of student learning that underlies the original CEQ. The model forms the basis for a series of performance indicators and QA processes. The idea behind the system can be expressed in a series of principles:


• Apply a student-focused perspective to auditing and enhancing the quality of teaching and learning.
• Strive for alignment between collegial and managerial QA processes.
• Aim for union between the QA system and fundamental academic values.
• Use an evidence-based approach to QA in teaching and learning.
• Employ multiple sources of evidence.
• Use methods and indicators that are consistent with measures of research performance.
• Always seek to compare with appropriate benchmarks, preferably external benchmarks and international standards.
• Reward and recognise performance in several ways.

Here, of course, I will concentrate on the part of the framework that uses CEQ, GDS and similar survey data. Accounts of the whole system have appeared in Prosser and Barrie (2003) and Ramsden (2003).

The University of Sydney introduced an annual survey of coursework students, based on the CEQ but with the addition of questions about university and faculty services and facilities, in 1999. The Student Course Experience Questionnaire (SCEQ) also incorporates at least one item from three of the extended CEQ scales, and includes the entire learning community scale. The SCEQ results provide evidence of the impact of changes to teaching and curriculum much more quickly than the CEQ; they are also available by year of study. Results are made available on a web site to all students and staff of the university; the results and their implications are interrogated in the Academic Board's reviews of faculties and discussed in collegial forums such as the quality assurance working group and the first year experience working group, on both of which all faculties are represented.

Together with CEQ and GDS data, as well as student progress data (in all cases benchmarked against Group of Eight averages), the SCEQ results have a substantial effect on faculty funding for teaching. In 2003, about $4.3 million was distributed to recognise and reward performance in undergraduate teaching, with the SCEQ results forming the most important single factor. Additional dollar incentives are provided through the annual teaching improvement fund, a targeted fund available to faculties for addressing recommendations for improvement arising from the Academic Board's reviews, and the scholarship index, a performance-based fund that rewards contributions to the scholarship of teaching and the possession of qualifications in university teaching.

The SCEQ results for the University of Sydney are directly comparable with survey results from the Universities of Oxford and Queensland, which use similar instruments. These results at degree level form an important part of our quality enhancement benchmarking activities. A second level of the system, for evaluating units of study, uses the CEQ and SCEQ as the basis of its core questions. These questions directly correspond to the aspects of the learning experience measured by the SCEQ and CEQ, and this alignment encourages academics to focus teaching improvement and curriculum development efforts on relevant aspects of the student learning experience. A survey instrument for evaluating the quality of research higher degree students' experiences, based on fundamental work on the quality of research training deriving from the results of an Australian Research Council Discovery project, was introduced in 2002. Its results may provide an input to research training funds in later years. The surveys cohere not only with each other but also with the work of the University's Institute for Teaching and Learning.
The teaching evaluation system provides a common student-focused perspective as a basis for many of the academic development initiatives undertaken by the Institute. For example, the mandatory three-day program in university teaching for new academics draws extensively on the CEQ and SCEQ results as a way of providing a link between practical teaching skills and the underlying evidence-based perspective on student learning.


In these several ways, together with others that would require a more extended treatment than I can give here, the University of Sydney aspires to be a leader among research-intensive universities in the management and evaluation of teaching and learning. The implications of the approach bring us back to the appropriate use of CEQ and similar survey data in university QA. I believe that universities have some way to go in embedding student and graduate survey data in their internal QA processes. The information is freely given to us as a contribution to improving the educational process. We have a responsibility to learn how to use it wisely and truthfully in the best interests of our students.

References

McInnis, C., Griffin, P., James, R., & Coates, H. (2001). Development of the Course Experience Questionnaire. Retrieved from http://www.detya.gov.au/highered/eippubs/eip01_1/default.htm

Prosser, M., & Barrie, S. (2003). Using a student-focused learning perspective to align academic development with institutional quality assurance. In R. Blackwell & P. Blackmore (Eds.), Towards Strategic Staff Development. Milton Keynes: OU Press.

Ramsden, P. (2003). Learning to Teach in Higher Education (2nd ed.). London: Routledge.

Richardson, J. T. E. (2003). Instruments for obtaining student feedback: A review of the literature. Appendix 3 of Collecting and Using Student Feedback on Quality and Standards of Learning and Teaching in Higher Education: A report to the Higher Education Funding Council for England.

An earlier version of this paper was presented at the GCCA Symposium, Graduates: Outcomes, Quality and the Future (Canberra, 24–25 March 2003).