
JOURNAL OF RESEARCH IN SCIENCE TEACHING VOL. 46, NO. 8, PP. 865–883 (2009)

PISA 2006: An Assessment of Scientific Literacy


Rodger Bybee,1 Barry McCrae,2 Robert Laurie3

1Chair, PISA 2006 Science Expert Group, Golden, Colorado 80401
2Australian Council for Educational Research, Camberwell, Victoria, Australia
3New Brunswick Department of Education, Fredericton, New Brunswick, Canada

Received 1 July 2009; Accepted 21 July 2009

Abstract: This article introduces the essential features of the science component of the 2006 Program for International
Student Assessment (PISA). Administered every 3 years, PISA alternates emphasis on Reading, Mathematics, and Science
Literacy. In 2006, PISA emphasized science. This article discusses PISA’s definition of scientific literacy, the three
competencies that constitute scientific literacy, the contexts used for assessment units and items, the role of scientific
knowledge, and the importance placed on attitude toward science. PISA 2006 included a student test, a student
questionnaire, and a questionnaire for school administrators. The student test employed a balanced incomplete block
design involving thirteen 30-minute clusters of items, including nine science clusters. The 13 clusters were arranged into
thirteen 2-hour booklets and each sampled student was assigned one booklet at random. Mean literacy scores are presented
for all participating countries, and the percentages of OECD students at the six levels of proficiency are given for the
combined scale and for the competency scales. © 2009 Wiley Periodicals, Inc. J Res Sci Teach 46: 865–883, 2009
Keywords: PISA; Program for International Student Assessment; science assessment; scientific literacy; competencies;
contexts for curriculum and assessment; test design and development

In 1997, the Organization for Economic Cooperation and Development (OECD) created the Program for
International Student Assessment (PISA) as a collaborative effort by member countries and a number of non-
member countries.1 PISA represents a commitment by the governments of OECD member countries to
monitor the outcomes of their education systems in terms of student achievement, within a common
international framework. OECD countries recognize that students must develop skills and knowledge
associated with 21st century priorities (Bybee & Fuchs, 2006). PISA provides policy-oriented international
indicators of the skills and knowledge of 15-year-old students and sheds light on a range of factors that
contribute to successful students, schools, and educational systems (Bussière, Knighton, & Pennock, 2007).
PISA combines assessment of the science, mathematics and reading domains, with information about
students’ home background and their views of learning and learning environments.
PISA surveys have been administered every 3 years since 2000 to students in reading, mathematical, and
scientific literacy. Results from the assessments and accompanying questionnaires make it possible for
countries to regularly monitor and knowledgeably predict their progress in meeting key learning outcomes in
the three assessment domains. In each PISA student assessment, one of these areas is the "major" domain and
the other two are the "minor" domains. Reading was the major domain in 2000, mathematical literacy in 2003,
and scientific literacy in 2006. Reading is again the major domain in 2009. Turner (2009) provides a
comprehensive general introduction to PISA.

Scientific Literacy
Scientific literacy has become the term used to express the broad and encompassing purpose of science
education. The use of the term most likely began with James Bryant Conant in the 1940s (Holton, 1998) and

Correspondence to: R. Bybee; E-mail: rbybee@bscs.org


DOI 10.1002/tea.20333
Published online 26 August 2009 in Wiley InterScience (www.interscience.wiley.com).

© 2009 Wiley Periodicals, Inc.



was elaborated for educators in a 1958 article by Paul DeHart Hurd entitled "Science Literacy: Its Meaning
for American Schools." Hurd described scientific literacy as an understanding of science and its applications
to social experience. Science had such a prominent role in society, Hurd argued, that economic, political, and
personal decisions could not be made without some consideration of the science and technology involved
(Hurd, 1958).
Contemporary Clarification of Scientific Literacy
A number of authors, reports, and books have attempted to clarify the curricular orientation and
instructional emphasis of scientific literacy as a purpose of science education (e.g., Bybee, 1997; Koballa,
Kemp, & Evans, 1997; Mayer & Kumano, 2002). DeBoer (2000) has provided an excellent historical and
contemporary review of scientific literacy. In 2006, Robin Millar addressed historic and definitional issues of the
term before outlining the role of scientific literacy in the Twenty First Century Science Course. In discussing
the distinctive features of this curriculum, Millar points out the novel content (e.g., epidemiology, health),
broad qualitative understanding of whole explanations, and a strong emphasis on ideas about science (Millar,
2006). In PISA 2006 Science, the reader will recognize variations on these three features as: context,
competencies, and knowledge about science.
Two essays further clarify issues related to discussions of scientific literacy. In "Science Education for
the Twenty First Century," Osborne (2007) makes the case that contemporary science curricula and practices
are primarily "foundationalist" in that they emphasize educating future scientists rather than educating future
citizens. The perspective of scientific literacy in PISA 2006 emphasizes educating future citizens.
A second essay by Douglas Roberts published in Handbook of Research on Science Education (Abell &
Lederman, 2007) identifies continuing political and intellectual tensions in science education. The two
politically conflicting emphases can be clarified by a question—Should curriculum emphasize science
subject matter itself, or should it emphasize science in life situations in which science plays a key role?
Curriculum designed to answer the former, Roberts refers to as Vision I; the latter he refers to as Vision II.
Vision I looks within science, while Vision II uses external contexts that students are likely to encounter as
citizens. PISA 2006 Science represents an assessment emphasizing Vision II.

Scientific Literacy as Defined in PISA 2006


In PISA 2006 Science, the essential qualities of scientific literacy include the ability to apply scientific
understandings to life situations involving science. The central point of the PISA 2006 Science assessment
can be summarized as follows: the assessment focused on scientific competencies that clarify what 15-year-
old students should know and be able to do within appropriate personal, social, and global contexts.
In PISA 2006 Science, scientific literacy referred to four interrelated features that involve an
individual’s:

• Scientific knowledge and use of that knowledge to identify questions, to acquire new knowledge, to
explain scientific phenomena, and to draw evidence-based conclusions about science-related issues.
• Understanding of the characteristic features of science as a form of human knowledge and enquiry.
• Awareness of how science and technology shape our material, intellectual, and cultural environments.
• Willingness to engage in science-related issues, and with the ideas of science, as a constructive,
concerned, and reflective citizen (OECD, 2006).

The PISA 2006 Framework for Assessing Scientific Literacy


Translating a definition of scientific literacy into an international assessment required a way of
organizing the domain of scientific literacy so a test could be designed, administered, and the results analyzed
and reported.
PISA 2006 Science situated its definition of scientific literacy and its science assessment questions
within a framework that used the following components: scientific contexts (i.e., life situations involving
science and technology); the scientific competencies (i.e., identifying scientific issues, explaining phenomena
scientifically, and using scientific evidence); the domains of scientific knowledge (i.e., students’
understanding of scientific concepts as well as their understanding of the nature of science); and student
attitudes toward science (i.e., interest in science, support for scientific inquiry, and responsibility toward
resources and environments). These four aspects of the PISA 2006 conception of scientific literacy are
illustrated in Figure 1.
Scientific Contexts
The scientific contexts align with various issues citizens confront. PISA 2006 Science items were framed
within a wide variety of life situations involving science and technology, primarily "health," "natural
resources," "environment," "hazards," and "frontiers of science and technology." Students confront life
situations in these contexts at personal, social, and global levels. Table 1 describes the contexts for PISA 2006.
Scientific Competencies
The PISA 2006 science competencies required students to identify scientific issues, explain phenomena
scientifically, and use scientific evidence. These three key scientific competencies were selected because of
their relationship to the practice of science and their connection to key abilities such as inductive and
deductive reasoning, systems-based thinking, critical decision making, transformation of data to tables and
graphs, construction of arguments and explanations based on data, thinking in terms of models, and use of
mathematics. Table 2 describes the several features of the three competencies.
The scientific competencies can be illustrated with a contemporary example. Global climate change has
become one of the most talked about and controversial global issues. As people read or hear about climate
change, they must separate the scientific reasons for change from economic, political, and social issues.
Scientists explain, for example, the origins and material consequences of releasing carbon dioxide into the
Earth’s atmosphere. This scientific perspective has been countered with an economic argument against

Figure 1. Framework for PISA 2006 science assessment.


Table 1
Contexts for the PISA 2006 science assessment

Health
  Personal (self, family and peer groups): maintenance of health, accidents, nutrition
  Social (the community): control of disease, social transmission, food choices, community health
  Global (life across the world): epidemics, spread of infectious diseases

Natural resources
  Personal: personal consumption of materials and energy
  Social: maintenance of human populations, quality of life, security, production and distribution of food, energy supply
  Global: renewable and non-renewable, natural systems, population growth, sustainable use of species

Environment
  Personal: environmentally friendly behavior, use and disposal of materials
  Social: population distribution, disposal of waste, environmental impact, local weather
  Global: biodiversity, ecological sustainability, control of pollution, production and loss of soil

Hazards
  Personal: natural and human-induced, decisions about housing
  Social: rapid changes (earthquakes, severe weather), slow and progressive changes (coastal erosion, sedimentation), risk assessment
  Global: climate change, impact of modern warfare

Frontiers of science and technology
  Personal: interest in science's explanations of natural phenomena, science-based hobbies, sport and leisure, music and personal technology
  Social: new materials, devices and processes, genetic modification, weapons technology, transport
  Global: extinction of species, exploration of space, origin and structure of the universe

Source: Assessing scientific, reading and mathematical literacy: A framework for PISA 2006 (OECD, 2006).

reduction of greenhouse gases. Citizens should recognize the difference between scientific and economic
positions. Further, as people are presented with more, and sometimes conflicting, information about
phenomena, such as climate change, they need to be able to access scientific knowledge and understand, for
example, the scientific assessments of bodies such as the Intergovernmental Panel on Climate Change
(IPCC). Finally, citizens should be able to use the results of scientific studies about climate change to
formulate an informed opinion about its personal, social, and global consequences.

Scientific Knowledge
In PISA 2006 Science, scientific literacy also encompassed both knowledge of science and knowledge
about science itself. The former includes understanding fundamental scientific concepts; the latter includes
understanding inquiry and the nature of scientific explanations.

Table 2
PISA 2006 scientific competencies
Identifying scientific issues
Recognizing issues that are possible to investigate scientifically
Identifying keywords to search for scientific information
Recognizing the key features of a scientific investigation
Explaining phenomena scientifically
Applying knowledge of science in a given situation
Describing or interpreting phenomena scientifically and predicting changes
Identifying appropriate descriptions, explanations, and predictions
Using scientific evidence
Interpreting scientific evidence and making and communicating conclusions
Identifying the assumptions, evidence and reasoning behind conclusions
Reflecting on the societal implications of science and technological developments


Given that only a sample of students’ knowledge of science can be assessed in a large-scale assessment
such as PISA, it is important that clear criteria guide the selection of knowledge to be assessed. Moreover, the
objective of PISA is to describe the extent to which students can apply their knowledge in relevant contexts.
Accordingly, the assessed knowledge of science is from the major fields of physics, chemistry, biology,
Earth and space science, and technology. In choosing which knowledge of science to assess in the
PISA test and questionnaires, the following criteria were used: its relevance to life situations; the importance
of the scientific concepts and their enduring utility; and, its appropriateness to the developmental level of
15-year-old students. Table 3 displays the PISA 2006 knowledge of science categories. The examples listed
in Table 3 convey the meanings of the categories and respect the selection criteria. No attempt is made to
list comprehensively all the knowledge that could be related to each of the categories.
Table 4 displays the categories and content examples of knowledge about science. The first category,
"scientific inquiry," centers on inquiry as the central process of science and various components of that
process. The second category, "scientific explanations," represents the results of scientific inquiry. These two
categories are thus closely related. The relationship can be thought of as one where the means of science (how
scientists get data or evidence) leads to knowledge claims and finally scientific explanations (how scientists
use data or evidence). The examples listed in Table 4 convey the general meanings of the categories. Here too,
no attempt is made to list comprehensively all the processes or knowledge in each category.
Attitudes toward Science
Attitudes toward science play an important role in scientific literacy. They underlie an individual’s
interest in, attention to, and response to science and technology. Many education systems around the world
include attitudes as an important outcome of science education but few, if any, assess such outcomes. An
important goal of science education is for students to develop interest in and support for scientific inquiry as
well as to acquire and to subsequently apply scientific and technological knowledge for personal, social, and
global benefit. That is, a person’s scientific literacy includes certain attitudes, beliefs, and motivational
orientations that influence personal actions. The inclusion of attitudes and the specific areas selected for PISA

Table 3
PISA 2006 knowledge of science categories
Physical systems
Structure of matter (e.g., particle models, bonds)
Properties of matter (e.g., changes of state, thermal and electrical conductivity)
Chemical changes of matter (e.g., reactions, energy transfer, acids/bases)
Motions and forces (e.g., velocity, friction)
Energy and its transformation (e.g., conservation, dissipation, chemical reactions)
Interactions of energy and matter (e.g., light and radio waves, sound and seismic waves)
Living systems
Cells (e.g., structures and function, DNA, plant and animal)
Humans (e.g., health, nutrition, disease, reproduction, subsystems [such as digestion, respiration, circulation,
excretion, and their relationship])
Populations (e.g., species, evolution, biodiversity, genetic variation)
Ecosystems (e.g., food chains, matter and energy flow)
Biosphere (e.g., ecosystem services, sustainability)
Earth and space systems
Structures of the earth systems (e.g., lithosphere, atmosphere, hydrosphere)
Energy in the earth systems (e.g., sources, global climate)
Change in earth systems (e.g., plate tectonics, geochemical cycles, constructive and destructive forces)
Earth’s history (e.g., fossils, origin and evolution)
Earth in space (e.g., gravity, solar systems)
Technology systems
Role of science-based technology (e.g., solve problems, help humans meet needs and wants, design and conduct
investigations)
Relationships between science and technology (e.g., technologies contribute to scientific advancement)
Concepts (e.g., optimization, trade-offs, cost, risk, benefit)
Important principles (e.g., criteria, constraints, cost, innovation, invention, problem solving)


Table 4
PISA 2006 knowledge about science categories
Scientific inquiry
Origin (e.g., curiosity, scientific questions)
Purpose (e.g., to produce evidence that helps answer scientific questions; current ideas, models, and
theories guide inquiries)
Experiments (e.g., different questions suggest different scientific investigations, design)
Data (e.g., quantitative [measurements], qualitative [observations])
Measurement (e.g., inherent uncertainty, replicability, variation, accuracy/precision in equipment and procedures)
Characteristics of results (e.g., empirical, tentative, testable, falsifiable, self-correcting)
Scientific explanations
Types (e.g., hypothesis, theory, model, law)
Formation (e.g., existing knowledge and new evidence, creativity and imagination, logic)
Rules (e.g., logically consistent, based on evidence, based on historical and current knowledge)
Outcomes (e.g., new knowledge, new methods, new technologies, new investigations)

2006 is supported by and builds upon Klopfer’s (1976) structure for the affective domain in science education,
a survey by Piper and Moore (1977), as well as more recent reviews of attitudinal research (e.g., Blosser,
1984; Gardner, 1975, 1984; Gauld & Hukins, 1980; Koballa & Glynn, 2007; LaForgia, 1988; Schibeci, 1984;
Simpson, Koballa, Oliver, & Crawley, 1994). The PISA 2006 Science assessment evaluated students’
attitudes in three areas: interest in science, support for scientific inquiry, and responsibility toward resources
and environment (see Table 5).
Interest in science was selected because of its established relationships with achievement, course
selection, career choice, and lifelong learning. The relationship between (individual) interest in science and
achievement has been the subject of research for more than 40 years although there is still debate about the
causal link (see, e.g., Baumert & Köller, 1998; Osborne, Simon, & Collins, 2003). The PISA 2006 Science
assessment addressed students’ interest in science through knowledge about their engagement in science-
related social issues, their willingness to acquire scientific knowledge and skills, and their consideration of
science-related careers.
Support for scientific enquiry is widely regarded as a fundamental objective of science education and as
such warrants assessing. It is a similar construct to "adoption of scientific attitudes" as identified by Klopfer
(1971). Appreciation of and support for scientific enquiry implies that students value scientific ways of
gathering evidence, thinking creatively, reasoning rationally, responding critically, and communicating
conclusions, as they confront life situations related to science. Aspects of this area in PISA 2006 included
the use of evidence (knowledge) in making decisions, and the appreciation for logic and rationality in
formulating conclusions.
Responsibility towards resources and environments is of international concern, as well as being
of economic relevance. Attitudes in this area have been the subject of extensive research since the 1970s

Table 5
PISA 2006 areas for assessment of attitudes toward science
Interest in science
Indicate curiosity in science and science-related issues and endeavors
Demonstrate willingness to acquire additional scientific knowledge and skills, using a variety of resources and methods
Demonstrate willingness to seek information and have an ongoing interest in science, including consideration of
science-related careers
Support for scientific enquiry
Acknowledge the importance of considering different scientific perspectives and arguments
Support the use of factual information and rational explanations
Express the need for logical and careful processes in drawing conclusions
Responsibility towards resources and environments
Show a sense of personal responsibility for maintaining a sustainable environment
Demonstrate awareness of the environmental consequences of individual actions
Demonstrate willingness to take action to maintain natural resources


(see, e.g., Bogner & Wiseman, 1999; Eagles & Demare, 1999; Rickinson, 2001; Weaver, 2002). In December
2002, the United Nations approved resolution 57/254 declaring the 10-year period beginning on January 1,
2005 to be the United Nations Decade of Education for Sustainable Development (UNESCO, 2003). The
International Implementation Scheme (UNESCO, 2005) identifies environment as one of the three spheres of
sustainability (along with society, including culture, and economy) that should be included in all education
for sustainable development programs.
PISA 2006 Science gathered data about student attitudes both by posing non-contextualized questions in
the student questionnaire, and by posing contextualized questions in the achievement test. The student
questionnaire contained questions in each of the three areas—interest in science, support for scientific
inquiry, responsibility toward resources and environments. Contextualized items were used to gather data
only in relation to interest in learning about science and student support for scientific inquiry. These
embedded items pertained to the context of the unit and were presented immediately after the other items in
the unit. As such, they were an integral part of the unit.
Including attitudes toward science provided an international portrait of students’ general appreciation
of science, their specific scientific attitudes and values, their environmental stewardship, and their
responsibility toward selected science-related issues that have national and international ramifications.
PISA 2006 Science student attitude data were not considered in the calculation of students’ scientific
literacy scores.

The PISA 2006 Survey: Test Design and Development


PISA 2006 included three compulsory instruments: a student test, a student questionnaire, and a school
questionnaire. Participating students completed the test and the student questionnaire, and school principals
completed the school questionnaire. Each of these questionnaires required approximately 30 minutes to
complete.
The student questionnaire was designed to obtain data on contextual indicators that relate student
achievement to important variables. These variables include family backgrounds, aspects of students’ lives,
strategies of self-regulated learning, and aspects of learning and instruction in science.
The school questionnaire obtained data related to aspects of schools, such as the quality of the schools’
human and material resources, public and private control and funding, decision-making processes, staffing
practices, context of instruction including institutional structures and types and class size, and the level of
parental involvement (OECD, 2006).
In addition to these three compulsory PISA 2006 instruments, there were two optional international
questionnaires. One was a 10-minute questionnaire on information and communications technology. This
questionnaire gathered information on the availability and use of information and communications technology
(ICT), including the location where ICTs are mostly used, as well as the type of use; ICT confidence and
attitudes, including self-efficacy and attitudes toward computers and their uses; and students' background in
learning ICT, focusing on where students learned to use computers and the Internet. The other international option was
a parent questionnaire. It focused on a number of topics including the student's past science activities;
parents’ views on the student’s school; parents’ views on science in the student’s intended career and the need
for scientific knowledge and skills in the job market; parents’ views on science and the environment; the cost
of education services; and, parents’ education and occupation (OECD, 2006).
The following discussion elaborates details concerning design and development of the PISA 2006
student test. Additional information is available in McCrae (2009) and in chapter 2 of the PISA 2006
Technical Report (OECD, 2009). The PISA 2006 science framework provides a foundation for the test, in
particular identification of the scientific competencies which are central to the PISA definition of scientific
literacy. The PISA 2006 Science Expert Group (SEG), comprising internationally recognized experts drawn
from OECD countries, oversaw preparation of the framework, development of the test items, and formulation
of the questionnaire items concerned with science teaching and learning, in accordance with the policy
requirements of the PISA Governing Board (PGB). The PGB is an OECD committee comprising government
representatives from the participating countries. PISA is implemented internationally for the OECD by a
consortium led by the Australian Council for Educational Research (ACER).

Test Items
PISA items are arranged in units—groups of independently scored items (questions) based on a common
stimulus. Many different types of stimulus are used including passages of text, photographs, tables, graphs,
and diagrams, often in combination.
This unit structure enables the employment of contexts that are as realistic as possible and that reflect the
complexity of life situations, while making efficient use of testing time. Using situations about which several
questions can be posed, rather than asking separate questions about a larger number of different situations,
reduces the overall time required for a student to become familiar with the material relating to each question.
A disadvantage of this approach is that it reduces the number of different assessment contexts—hence it is
important to ensure that there is an adequate range of contexts so that bias due to the choice of contexts is
minimized.
In addition to items assessing students’ scientific competencies and knowledge, just over 60% of the
PISA 2006 science units included one or two items designed to assess aspects of students’ attitudes toward
science. The terms "cognitive items" and "attitudinal items" are used to distinguish these two separate types
of items where necessary in this article, but in general cognitive items are the focus of this discussion.
PISA 2006 science units incorporated up to four cognitive items. Each item involved the predominant
use of one of the scientific competencies and required mainly knowledge of science or knowledge about
science. In most cases, more than one competency was assessed within a unit.
Four types of item formats were used in the PISA 2006 science assessment. About one-third of the items
were selected response (multiple-choice) items requiring the selecting of a single response from four options.
Another third were either closed constructed-response items, requiring only a number, word or short phrase as
the answer, or complex multiple-choice items.
In PISA complex multiple-choice items, students are required to choose one of two possible responses
(yes/no, true/false, correct/incorrect, etc.) to each of several sub-questions and must answer all of them
correctly to receive credit. Table 6 displays a complex multiple-choice item.
The difficulty of a complex multiple-choice item depends on the difficulty of each sub-question and
on the number of sub-questions. In addition, both of these aspects may be varied to produce an item of a
desired difficulty level. It is important, however, to analyze the performance of each sub-question separately
(as well as together) when the item is pilot-tested to check that each sub-question exhibits sound
psychometric properties, including good discrimination. In addition, if the item is to contribute to a
described scale then its sub-questions must have sufficient content in common to enable an appropriate
descriptor to be prepared, while keeping in mind that each sub-question by itself would be less difficult than
the complex item.
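The all-or-nothing scoring rule described above can be made concrete with a minimal Python sketch. The function name and the answer key shown for the Table 6 item are illustrative assumptions, not part of the operational PISA coding guides.

```python
def score_complex_mc(responses, key):
    """Award full credit (1) only when every sub-question is answered correctly;
    otherwise award no credit (0). A sketch of the all-or-nothing rule described above."""
    return int(len(responses) == len(key) and
               all(r == k for r, k in zip(responses, key)))


# Assumed key for the three sub-questions in Table 6: Yes, No, Yes.
key = ["Yes", "No", "Yes"]
print(score_complex_mc(["Yes", "No", "Yes"], key))   # 1: all sub-questions correct
print(score_complex_mc(["Yes", "Yes", "Yes"], key))  # 0: one sub-question wrong, no credit
```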
The remaining one-third of the science cognitive items were open constructed-response items that
required a relatively extended written response and frequently required some explanation or justification.
The use of multiple-choice items helps ensure an adequate coverage of the assessment domain and keeps
the cost of coding (marking) down. Open constructed-response items are expensive to code and introduce the
problem of marker reliability, but they increase the range of cognitive processes that can be assessed and so
add validity and authenticity to the assessment. Routitsky and Turner (2003) have shown the importance of
using a variety of test item formats to accommodate the full range of student abilities typically sampled in
PISA, since students at different ability levels from different countries performed differently in PISA 2003
according to the format of the items used.

Table 6
A complex multiple-choice item

What are the advantages of regular physical exercise? Circle "Yes" or "No" for each statement.

Is this an advantage of regular physical exercise?                 Yes or No?
Physical exercise helps prevent heart and circulation illnesses.   Yes / No
Physical exercise leads to a healthy diet.                         Yes / No
Physical exercise helps to avoid becoming overweight.              Yes / No


Item Development
An international comparative study should draw items from as many participating countries as
possible to ensure wide cultural and contextual diversity. Accordingly, a comprehensive set of guidelines,
accompanied by sample units, was distributed to participating countries to encourage and assist
national submission of science items. Twenty countries submitted a total of 150 units for consideration.
Most of the countries chose to submit their material in English, but submissions were received in five other
languages.
Some units submitted by participating countries had undergone development work, including small-
scale pilot-testing, prior to submission. Nevertheless, most of the units judged as suitable for use in PISA
2006 needed revision by one of the test development teams. Often the units needing revision were far too
open-ended to be coded (marked) with sufficient reliability in a survey the size of PISA. Various
criteria, including overall quality of the unit, amount of revision required, and framework coverage, were
used to select those units referred to test development teams for further development. High importance was
placed on including units from as wide a range of countries as possible.
Test development teams were established in five institutions experienced in item writing and
spread throughout the PISA community: ACER in Australia, Citogroep in the Netherlands, the Institute
of Teacher Education and Learning (ILS) at the University of Oslo in Norway, Leibniz-Institute for
Science Education (IPN) at the University of Kiel in Germany, and the National Institute for Educational
Research (NIER) in Japan. Each team was encouraged to do its initial development of items in the local
language.
Once a unit originating within a test development team was drafted, it was subject to extensive scrutiny
by members of the relevant test development team and, in many cases, was administered to small groups of
students. A combination of think-aloud methods, individual interviews, and group interviews was used to
ascertain the thought processes typically employed by students when attempting to answer the items in
the unit.
As the final step in this phase of item development, batches of units underwent small-scale pilot-testing
with several classes of 15-year-olds. In addition to providing statistical data on item functioning, including an
indication of the relative difficulty of items, the student responses obtained were used to illustrate and further
develop the coding (marking) guides that had been drafted by the item developers.
Items were subject to revision following each step in this process. When substantial revisions were
indicated, items were either discarded or the revised versions were recycled through the various steps. Units
that survived all steps relatively unscathed were then submitted to the international PISA center to undergo
their second phase of development, after being translated into English in cases where they had been prepared
in another language.
The second phase of item development began with a review of each unit by at least one test development
team that was not responsible for its initial development. The scrutiny of items by international colleagues
resulted in further improvements to items and sometimes in eliminating a unit. Along with items developed
from national submissions, surviving items were considered ready for inclusion in pilot studies conducted by
the international center, formal review by national centers, and translation into French (the other PISA source
language).
For each pilot study, test booklets were formed from a number of units developed at different test
development centers. These booklets were trialed with several whole classes of 15-year-old students in
several different schools. Data from the pilot studies were analyzed using standard item response techniques.
Once again, one of the most important outputs of this pilot testing was the generation of additional types of
student responses that were used to enhance coding guides.
Beginning in mid-January 2004, newly developed items (about 500 items in total) were distributed to
participating countries for review. National Project Managers (NPMs) were requested to arrange for national
experts to rate items according to various aspects including whether they related to material included in the
country’s curriculum, their relevance in preparing 15-year-olds for life, how interesting they would appear to
these students, and their authenticity as applications of science or technology. NPMs also were asked to
identify any cultural concerns or other problems with the items and to rate items for retention.


Adaptation and Translation of Items


PISA 2006 was conducted in 57 countries in a total of 42 different languages, requiring the production of
77 national versions of the survey instruments. To achieve international comparability of the data collected in
the survey, it was important to have procedures that ensured, as far as possible, the equivalence of all 77
national versions (Grisay, de Jong, Gebhardt, Berezner, & Halleux-Monseur, 2007). In particular, it was
important to ensure that the difficulty of items was not altered when they were translated from the English and
French source versions.
To comply with PISA translation standards, it was required that the national versions of all test and
questionnaire instruments be developed through a double-translation and reconciliation procedure. That is,
two independent translators should first translate the source material into the target language; then a third
person should reconcile these two translations into a single national version. Furthermore, it was
recommended that countries use the English source version for one of the translations, and the French source
version for the other. As a final assurance of quality, the national versions had to be submitted to the
PISA consortium for verification against the source versions by an independent translator fluent in English
and French and with native command of the target language.
Field Trial
During 2005, a field trial was conducted in the 57 countries participating in PISA 2006, involving over
95,000 students. The field trial obtained performance data on items being considered for inclusion in the main
survey in 2006, and evaluated operational procedures (translation, booklet production, test administration,
response coding, data handling, etc.) in each country. School sampling was less rigorous than for the main
survey, but each of the various school types in a country had to be represented, and student sampling was
expected to be as similar as possible to the main survey. The SEG selected 61 new units (222 cognitive items)
for inclusion in the field trial after taking into account a wide range of information, including pilot-testing
results and national item reviews and ratings.
Extensive analyses were conducted on student response data collected in the field trial. The standard
item statistics generated by ConQuest (Wu, Adams, Wilson, & Haldane, 2007), as well as various Rasch fit
statistics and diagnostic indicators, were used as primary tools in reviewing item performance. The statistics
included indices of discrimination and fit to the model, point-biserial correlations, the mean ability of students
by response category, a check of category ordering for partial credit items, the expected and observed score
curves by gender and by country, and the expected and observed item characteristic curves by response
category. Of particular importance was item-by-country interaction information that was used to expose any
items that behaved differently when presented in a particular language or culture. This was essential in
ensuring that the main survey test comprised items which would generate a consistent scale that could be
fairly applied for each participating country.
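One of the simpler statistics mentioned above, the point-biserial correlation for a dichotomously scored item, can be illustrated with the short sketch below. The operational analyses were run in ConQuest, so this stand-alone function is only an assumed approximation for exposition, with made-up response data.

```python
import numpy as np

def point_biserial(item_scores, total_scores) -> float:
    """Point-biserial correlation between a 0/1 item score and students' total scores.
    Low or negative values flag items that discriminate poorly."""
    item_scores = np.asarray(item_scores, dtype=float)
    total_scores = np.asarray(total_scores, dtype=float)
    p = item_scores.mean()                       # proportion of correct responses
    m1 = total_scores[item_scores == 1].mean()   # mean total score, correct group
    m0 = total_scores[item_scores == 0].mean()   # mean total score, incorrect group
    return (m1 - m0) * np.sqrt(p * (1 - p)) / total_scores.std()


# Example with invented data for one item and ten students.
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [62, 41, 55, 70, 38, 66, 45, 59, 73, 40]
print(round(point_biserial(item, totals), 2))  # about 0.91: a highly discriminating item
```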
The information from these analyses, together with country ratings of items, formed the basis of item
selection by the SEG for the main survey. Other factors considered included the relevance of each item to the
assessment of scientific literacy, the framework specifications, and information on translation problems
provided by the translation referee.

Main Survey Sampling and Test Design


PISA main surveys are administered to a random sample of students in each participating country. A
two-stage sampling process is used. In the first stage, schools containing 15-year-olds are sampled with
probability proportional to the number of eligible students. A minimum of 150 schools from each country are
chosen. In the second stage, for each sampled school about 30 students are randomly selected from those
eligible. Hence a minimum of 4,500 students per country are selected to participate in PISA. The schools and
students sampled are randomly selected by the international contractor, not by the countries or schools
themselves, and so are truly representative of the population of 15-year-old students in school in each
participating country.
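The two-stage design can be sketched as follows. This is a simplified illustration under assumed data structures; the operational PISA procedure uses systematic probability-proportional-to-size selection from an explicitly stratified school frame rather than the simple draws shown here.

```python
import numpy as np

rng = np.random.default_rng(2006)

def sample_schools_pps(enrolments, n_schools=150):
    """Stage 1: draw schools with probability proportional to their number of
    eligible 15-year-olds (simplified; operational PISA uses systematic PPS within strata)."""
    enrolments = np.asarray(enrolments, dtype=float)
    probs = enrolments / enrolments.sum()
    return rng.choice(len(enrolments), size=n_schools, replace=False, p=probs)

def sample_students(student_ids, n_students=30):
    """Stage 2: simple random sample of about 30 eligible students per sampled school."""
    n = min(n_students, len(student_ids))
    return rng.choice(student_ids, size=n, replace=False)


# Illustrative use with a fabricated frame of 1,000 schools.
frame = rng.integers(20, 400, size=1000)                      # eligible students per school
schools = sample_schools_pps(frame)                           # 150 sampled schools
sampled = {s: sample_students(np.arange(frame[s])) for s in schools}
print(sum(len(v) for v in sampled.values()))                  # roughly 150 x 30 = 4,500 students
```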
In the PISA 2006 main survey, each sampled student was randomly assigned one of thirteen 2-hour
booklets, with each booklet comprising four clusters of items according to the rotation design shown in

Table 7
Cluster rotation design used to form test booklets for PISA 2006
Booklet Block 1 Block 2 Block 3 Block 4
1 S1 S2 S4 S7
2 S2 S3 M3 R1
3 S3 S4 M4 M1
4 S4 M3 S5 M2
5 S5 S6 S7 S3
6 S6 R2 R1 S4
7 S7 R1 M2 M4
8 M1 M2 S2 S6
9 M2 S1 S3 R2
10 M3 M4 S6 S1
11 M4 S5 R2 S2
12 R1 M1 S1 S5
13 R2 S7 M1 M3

Table 7. S1 to S7 denote science clusters, R1 and R2 denote reading clusters, and M1 to M4 denote
mathematics clusters. The reading and mathematics clusters comprised items from the 2003 survey to allow
the estimation of trends in student performance across the two surveys. For the same reason, eight science link
units from 2003 (a total of 22 items) were distributed across the seven science clusters.
The fully linked design is a balanced incomplete block design with each pair of clusters appearing in one
(and only one) booklet. Furthermore, each cluster appears in each of the four possible positions within a
booklet exactly once and so each test item appeared in four of the test booklets. This ensured that any item
position effects were neutralized.
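The balance properties claimed above can be checked directly against the Table 7 layout; the short Python sketch below does this, where the booklet list simply transcribes Table 7.

```python
from itertools import combinations
from collections import Counter

# Cluster rotation design from Table 7 (S = science, M = mathematics, R = reading).
booklets = [
    ["S1", "S2", "S4", "S7"], ["S2", "S3", "M3", "R1"], ["S3", "S4", "M4", "M1"],
    ["S4", "M3", "S5", "M2"], ["S5", "S6", "S7", "S3"], ["S6", "R2", "R1", "S4"],
    ["S7", "R1", "M2", "M4"], ["M1", "M2", "S2", "S6"], ["M2", "S1", "S3", "R2"],
    ["M3", "M4", "S6", "S1"], ["M4", "S5", "R2", "S2"], ["R1", "M1", "S1", "S5"],
    ["R2", "S7", "M1", "M3"],
]

# Each unordered pair of clusters appears together in exactly one booklet,
# and all 13 * 12 / 2 = 78 possible pairs are covered.
pair_counts = Counter(frozenset(p) for b in booklets for p in combinations(b, 2))
assert all(count == 1 for count in pair_counts.values())
assert len(pair_counts) == 13 * 12 // 2

# Each cluster occupies each of the four block positions exactly once, so every
# item appears in four booklets and position effects are neutralized.
position_counts = Counter((cluster, pos) for b in booklets for pos, cluster in enumerate(b))
assert all(count == 1 for count in position_counts.values())

print("Balanced incomplete block design properties verified.")
```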
In common with similar surveys, the design allowed for more material (13 clusters) than any one student
could be assigned (4 clusters), thus enabling greater coverage of the assessment domain than would otherwise
be possible. Nevertheless, it can be seen that all students had to answer at least one cluster of science items,
with the majority doing two clusters and some students doing tests consisting only of science items.
The seven science clusters in the main survey contained a total of 37 units—eight link units from 2003
and 29 units new to PISA in 2006. The new units included a total of 86 cognitive items and 31 embedded
attitudinal items, so that, on average, each of the science clusters contained 15–16 cognitive items, taking into
account the 22 link items from the 2003 survey distributed across the clusters, and 4–5 attitudinal items. Link
units were retained exactly as they appeared in 2003, without embedded attitudinal items because their
inclusion may have compromised the calculation of trends.
The distribution of the 108 main survey cognitive items, according to their competency and knowledge
categories, are summarized in Table 8.

Table 8
Science main survey items (knowledge category by competency)

                              Identifying          Explaining Phenomena   Using Scientific
Knowledge Category            Scientific Issues    Scientifically         Evidence           Total
Knowledge of science                                                                         62 (57%)
  Physical systems                  -                      15                   2            17 (13%)
  Living systems                    -                      24                   1            25 (23%)
  Earth and space systems           -                      12                   0            12 (11%)
  Technology systems                -                       2                   6             8 (7%)
Knowledge about science                                                                      46 (43%)
  Scientific enquiry               24                       -                   1            25 (23%)
  Scientific explanations           0                       -                  21            21 (19%)
Total                          24 (22%)                 53 (49%)            31 (29%)         108


Note that the scientific competency and knowledge dimensions as defined in the framework do not give
rise to independent item classifications. In particular, by virtue of its definition, items classified as assessing
the competency ‘‘Explaining scientific phenomena’’ would also be classified as ‘‘Knowledge of science’’
items. This can be seen in Table 8, which also shows that all items classified as ‘‘Identifying scientific issues’’
are ‘‘Knowledge about science’’ items. This was a consequence of a decision taken during test development
to minimize the knowledge of science content in such items so that the ‘‘Identifying scientific issues’’ and
‘‘Explaining scientific phenomena’’ scales were kept as independent as possible, given that competency
subscales were to be used to report science achievement.
It follows from the classification dependencies that the relative weighting of the two knowledge
components in the item set will also largely determine the relative weighting of the three competencies. The
percentage of score points to be assigned to the ‘‘knowledge of science’’ component of the assessment was
determined by the PGB, prior to the field trial, to be about 60%. This decision had a far reaching consequence
in terms of overall gender differences in the PISA 2006 science results, as boys generally outperformed girls
on knowledge of science items and the situation was reversed for knowledge about science items. There were
no statistically significant gender differences in overall science performance in 37 of the participating
countries, including 22 of the 30 OECD countries. In six OECD countries there was a small advantage for
males, and there was a slightly larger advantage for females in two OECD countries (OECD, 2007).

PISA 2006 Science: Selected Results


The PISA 2006 Science survey provides a comparison of the scientific literacy of students in a country
with that of students in 56 other countries. In particular, PISA findings give insights about students’
competency to recognize scientific issues, to think scientifically, and to use scientific data in the
communication and support of a decision. In addition, the survey provides information about students’
scientific knowledge and the attitudes they hold about science.

Combined Scale for Scientific Literacy


The PISA 2006 combined scientific literacy scale was constructed to have a mean score of 500 points
among the OECD countries, with a standard deviation of 100 points. Table 9 shows the mean scores with
standard errors on the combined scientific literacy scale for the 57 participating countries in the PISA 2006
Science assessment. The predominance of countries scoring below the OECD mean of 500 is largely
explained by the inclusion of non-OECD countries on the scale. Countries scoring higher than Hungary had a
mean score significantly above the OECD average, while countries scoring lower than France had a mean
score significantly below the OECD average. The mean scores of the remaining five countries were not
significantly different from the OECD average.
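A rough sense of how such comparisons are made is given by the sketch below, which treats a country mean and the OECD average as independent estimates. The official PISA analyses also account for the country's contribution to the OECD average and for other error components, so this is only an assumed approximation; the example values come from Table 9.

```python
import math

OECD_MEAN, OECD_SE = 500.0, 0.5  # combined science scale: mean 500, SE from Table 9

def differs_from_oecd(mean: float, se: float, z: float = 1.96) -> bool:
    """Two-sided z-test of whether a country mean differs from the OECD average,
    naively treating the two estimates as independent."""
    se_diff = math.sqrt(se ** 2 + OECD_SE ** 2)
    return abs(mean - OECD_MEAN) > z * se_diff


print(differs_from_oecd(508, 3.2))  # Ireland: True (significantly above the average)
print(differs_from_oecd(504, 2.7))  # Hungary: False (not significantly different)
```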
Six Proficiency Levels for Scientific Literacy
In addition to reporting the PISA 2006 Science assessment results on a combined scale, the OECD also
reported results using six proficiency levels representing a comprehensive range of scientific literacy. Level 1
indicates students at the lowest or least proficient level and level 6 students at the highest or most proficient
level.
Based on their scores, students were assigned to the highest proficiency level for which they would be
expected to answer correctly a majority of assessment items spread across that level of difficulty (Turner,
2009). Each level covers a range of 74.6 points on the PISA 2006 Science scale. For example, level 1 covers
score points ranging from a low of 334.9 to 409.5. Students at level 1 who scored 335 points could be expected
to succeed on 50% of the items in this level. Students at level 1 whose scores were closer to the minimum
threshold for level 2, 410 points, were expected to succeed on 62% of the items in this level.
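Using the band width and lower threshold given above, a score can be mapped to a proficiency level as in the sketch below; the published cut-points carry further decimal places, so the boundaries here are an approximation taken from the text.

```python
def proficiency_level(score: float) -> str:
    """Map a PISA 2006 science score to one of the six proficiency levels.
    Level 1 starts at 334.9 points and each level spans 74.6 points (per the text)."""
    lower, width = 334.9, 74.6
    if score < lower:
        return "Below level 1"
    return f"Level {min(6, int((score - lower) // width) + 1)}"


print(proficiency_level(335))  # Level 1
print(proficiency_level(500))  # Level 3 (the OECD mean)
print(proficiency_level(563))  # Level 4 (e.g., the highest country mean in Table 9)
```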
Descriptions of what students can do at each level are based on the content of the items belonging to the
respective level. For instance, students at level 1 have such a limited knowledge of science that they could
only apply their knowledge to a few, familiar situations. At the other end of the scale, students performing at
proficiency level 6 could consistently identify, explain, and apply their knowledge of science, as well as their
knowledge about science, in a variety of complex life situations.

Table 9
PISA 2006 science literacy scores by country
Country Mean Science Score (Standard Error)
Finland 563 (2.0)
Hong Kong-China 542 (2.5)
Canada 534 (2.0)
Chinese Taipei 532 (3.6)
Estonia 531 (2.5)
Japan 531 (3.4)
New Zealand 530 (2.7)
Australia 527 (2.3)
Netherlands 525 (2.7)
South Korea 522 (3.4)
Liechtenstein 522 (4.1)
Slovenia 519 (1.1)
Germany 516 (3.8)
United Kingdom 515 (2.3)
Czech Republic 513 (3.5)
Switzerland 512 (3.2)
Austria 511 (3.9)
Macao-China 511 (1.1)
Belgium 510 (2.5)
Ireland 508 (3.2)
Hungary 504 (2.7)
Sweden 503 (2.4)
OECD average 500 (0.5)
Poland 498 (2.3)
Denmark 496 (3.1)
France 495 (3.4)
Croatia 493 (2.4)
Iceland 491 (1.6)
Latvia 490 (3.0)
United States 489 (4.2)
Lithuania 488 (2.8)
Slovak Republic 488 (2.6)
Spain 488 (2.6)
Norway 487 (3.1)
Luxembourg 486 (1.1)
Russian Federation 479 (3.7)
Italy 475 (2.0)
Portugal 474 (3.0)
Greece 473 (3.2)
Israel 454 (3.7)
Chile 438 (4.3)
Serbia 436 (3.0)
Bulgaria 434 (6.1)
Uruguay 428 (2.7)
Turkey 424 (3.8)
Jordan 422 (2.8)
Thailand 421 (2.1)
Romania 418 (4.2)
Montenegro 412 (1.1)
Mexico 410 (2.7)
Indonesia 393 (5.7)
Argentina 391 (6.1)
Brazil 390 (2.8)
Colombia 388 (3.4)
Tunisia 386 (3.0)
Azerbaijan 382 (2.8)
Qatar 349 (0.9)
Kyrgyzstan 322 (2.9)

Non-OECD countries are shown in italics.


The PISA 2006 SEG identified level 2 as the baseline proficiency level for scientific literacy. At this
level, students begin to demonstrate a sufficiently high level of scientific literacy to participate effectively and
productively in life situations related to science and technology. The OECD (2007, Figure 2.8) describes
students’ proficiency at level 2 in scientific literacy as having ‘‘adequate scientific knowledge to provide
possible explanations in familiar contexts or draw conclusions based on simple investigations. They are
capable of direct reasoning and making literal interpretations of the results of scientific inquiry or
technological problem solving.’’
Figure 2 shows the percentage of students from OECD countries able to perform tasks at each level or
above on the PISA 2006 Science scale. Among OECD countries, 80.8% of students performed at baseline
level 2 or above, but 5.2% of students were not yet performing at level 1 on the proficiency scale for scientific
literacy.
Scientific Competencies
Among the unique insights gained from PISA 2006 Science is information on student performance on
the three scientific competencies—identifying scientific issues, explaining phenomena scientifically, and
using scientific evidence. Examining country results for the scientific competencies individually suggests
possible areas to strengthen science education in participating countries. The science competencies can be
thought of as a sequence that individuals might go through as they encounter and solve science-related
problems. First, they must identify the scientific aspects of a problem, then apply appropriate scientific
knowledge about that problem, and finally, they have to interpret and make sense of their findings in support of
a decision or recommendation. Tables 10–12 present percentages for OECD students at the different
proficiency levels for each of the scientific competencies, including descriptions for each proficiency level for
each scientific competency.

Concluding Discussion
The PISA 2006 Science assessment emphasized the mastery of scientific competencies, the
understanding of concepts, and in particular the ability to apply those concepts and competencies in a
variety of life situations. PISA 2006 Science assessed students more in terms of the important knowledge and
skills they will need in their adult life and less in terms of mastering their school curricula. This feature is
important to note as it differentiates the PISA 2006 Science assessment from the Trends in International
Mathematics and Science Study (TIMSS). The knowledge and competencies assessed in PISA are considered prerequisites to
efficient learning in adulthood and for full participation in society (Bussière et al., 2007).
The science content was based on the assessment framework (OECD, 2006) developed by the SEG and
endorsed by the PGB. The framework defined the test domain of scientific literacy, outlined the scope of the
assessment, and specified the structure of the test—including its unit arrangement and the distribution of

Figure 2. Percentage of students in OECD countries able to perform tasks at each proficiency level or above.


Table 10
Identifying scientific issues: Summary descriptions of the six proficiency levels (percentages are of all students across the OECD who can perform tasks at each level)

Level 6 (1.3%)
Students at this level demonstrate an ability to understand and articulate the complex modeling inherent in the design of an investigation.

Level 5 (8.4%)
Students at this level understand the essential elements of a scientific investigation and thus can determine if scientific methods can be applied in a variety of quite complex and often abstract contexts. Alternatively, by analyzing a given experiment they can identify the question being investigated and explain how the methodology relates to that question.

Level 4 (28.4%)
Students at this level can identify the change and measured variables in an investigation and at least one variable that is being controlled. They can suggest appropriate ways of controlling that variable. The question being investigated in straightforward investigations can be articulated.

Level 3 (56.7%)
Students at this level are able to make judgments about whether an issue is open to scientific measurement and, consequently, to scientific investigation. Given a description of an investigation they can identify the change and measured variables.

Level 2 (81.3%)
Students at this level can determine if scientific measurement can be applied to a given variable in an investigation. They can recognize the variable being manipulated (changed) by the investigator. Students can appreciate the relationship between a simple model and the phenomenon it is modeling. In researching topics students can select appropriate key words for a search.

Level 1 (94.9%)
Students at this level can suggest appropriate sources of information on scientific topics. They can identify a quantity that is undergoing variation in an experiment. In specific contexts they can recognize whether that variable can be measured using familiar measuring tools or not.

Below level 1 (5.1% of students cannot perform tasks at level 1)

items according to important framework variables and formats. The SEG also oversaw test development
which was the responsibility of a consortium led by the ACER.
Item development was carried out at five centers across the world to help achieve conceptually rigorous
material with a high level of cross-cultural and cross-national diversity. The process also involved
participating countries in item development and review. In addition, translation standards and procedures,
including the provision of dual source versions of instruments (English and French), were established and
monitored to ensure that the 77 national versions were as equivalent in difficulty as possible.
The consortium conducted a field trial in all 57 participating countries during 2005. Analyses of field
trial data and feedback from national centers were used to identify items that were unsuitable for inclusion
because they did not function well overall or did not perform as expected in different languages or cultures.
In addition, information obtained about item performance ensured that the main survey test contained
items with an appropriate distribution of difficulties.
The PISA survey is age-based rather than grade-based and tests 15-year-old students. This age was chosen because in
the majority of countries 15-year-olds are near the end of their compulsory schooling (OECD, 2006). PISA is
typically administered to between 4,500 and 10,000 students in each participating country. The 2006 survey
included approximately 500,000 students.
Participation in PISA has continually increased since the first survey in 2000. In 2000 and 2001,
28 OECD countries and four non-OECD partner countries participated in the first PISA survey. Forty-one
countries participated in PISA 2003, 57 in PISA 2006, and 68 countries participated in PISA 2009.
The constantly increasing number of participants demonstrates confidence in PISA as an important survey
instrument that provides internationally comparative data on critical outcomes in educational systems.
Although some individuals criticize international assessments such as PISA (see, e.g., Bracey, 2009;
Rotberg, 2008), we argue that international assessments in general, and PISA Science 2006 in particular,

Table 11
Explaining phenomena scientifically: Summary descriptions for the six proficiency levels
(The percentage shown for each level is the proportion of all students across the OECD who can perform tasks at that level.)

Level 6 (1.8%): Students at this level can draw on a range of abstract scientific knowledge and concepts and the relationships between these in developing explanations of processes within systems.

Level 5 (9.8%): Students at this level can draw on knowledge of two or three scientific concepts and identify the relationship between them in developing an explanation of a contextual phenomenon.

Level 4 (29.4%): Students at this level have an understanding of scientific ideas, including scientific models, with a significant level of abstraction. They can apply a general, scientific concept containing such ideas in the development of an explanation of a phenomenon.

Level 3 (56.4%): Students at this level can apply one or more concrete or tangible scientific ideas/concepts in the development of an explanation of a phenomenon. This is enhanced when there are specific cues given or options available from which to choose. When developing an explanation, cause and effect relationships are recognized and simple, explicit scientific models may be drawn upon.

Level 2 (80.4%): Students at this level can recall an appropriate, tangible, scientific fact applicable in a simple and straightforward context and can use it to explain or predict an outcome.

Level 1 (94.6%): Students at this level can recognize simple cause and effect relationships given relevant cues. The knowledge drawn upon is a singular scientific fact that is drawn from experience or has widespread popular currency.

Below Level 1: 5.4% of students across the OECD perform below Level 1.

provide valuable information for the science education community. Beyond the provision of comparative data, the perspectives international assessments offer on content and on test design are two further examples of their value. Adams (2009) discusses what is done to ameliorate threats to the validity of PISA in areas such as sampling, item selection, translation, and implementation fidelity, areas often raised as sources of invalidity in international comparative studies.
Within the science education community, scientific literacy has become a "catch-all" phrase used to express a variety of goals. PISA 2006 Science presents a definition of scientific literacy, and assessment examples based on this view, that differentiates it from other, more general definitions. The view of scientific
literacy in PISA 2006 aligns very closely with one described as Vision II, a perspective that derives its
meaning from the character of situations with a scientific component, situations that students likely will
encounter as citizens (Roberts, 2007, p. 730). This view is consistent with those described by Millar (2006)
and Osborne (2007). One can contrast this view with the traditional primary emphasis on the orthodox canons
of physics, chemistry, biology, and Earth sciences.
With its emphasis on assessing scientific competencies in relevant life situations (i.e., contexts), the PISA 2006 Science assessment represents a novel approach to assessing scientific literacy on a worldwide scale. Assessing scientific literacy per se is not novel; some countries, for example Canada, have been doing so for more than a decade (CMEC, 1996, 1999, 2004, 2007). What is different with PISA, though, is the participation of a vast number of countries. This high level of participation is a clear sign that, around the world, scientific literacy is increasingly accepted as an educational outcome and as a progressive alternative to the traditionally favored assessment of school-based science knowledge.

Note
1. The following 30 countries are members of the Organization for Economic Cooperation and Development (OECD): Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Luxembourg, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Spain, Sweden, Switzerland, Turkey, United Kingdom, and the United States.


Table 12
Using scientific evidence: Summary descriptions of the six proficiency levels
(The percentage shown for each level is the proportion of all students across the OECD who can perform tasks at that level.)

Level 6 (2.4%): Students at this level demonstrate an ability to compare and differentiate among competing explanations by examining supporting evidence. They can formulate arguments by synthesizing evidence from multiple sources.

Level 5 (11.8%): Students at this level are able to interpret data from related datasets presented in various formats. They can identify and explain differences and similarities in the datasets and draw conclusions based on the combined evidence presented in those datasets.

Level 4 (31.6%): Students at this level can interpret a dataset expressed in a number of formats, such as tabular, graphic and diagrammatic, by summarizing the data and explaining relevant patterns. They can use the data to draw relevant conclusions. Students can also determine whether the data support assertions about a phenomenon.

Level 3 (56.3%): Students at this level are able to select a piece of relevant information from data in answering a question or in providing support for or against a given conclusion. They can draw a conclusion from an uncomplicated or simple pattern in a dataset. They can also determine, in simple cases, if enough information is present to support a given conclusion.

Level 2 (78.1%): Students at this level are able to recognize the general features of a graph if they are given appropriate cues, and can point to an obvious feature in a graph or simple table in support of a given statement. They are able to recognize if a set of given characteristics apply to the function of everyday artifacts in making choices about their use.

Level 1 (92.1%): In response to a question, students at this level can extract information from a fact sheet or diagram pertinent to a common context. They can extract information from bar graphs where the requirement is simple comparisons of bar heights. For common, experienced contexts, students at this level can attribute an effect to a cause.

Below Level 1: 7.9% of students across the OECD perform below Level 1.
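Because a student who can perform tasks at a given level can also perform tasks at all lower levels, the percentage columns in the proficiency-level tables above read cumulatively (note that Level 1 plus "below Level 1" sums to 100%). The short worked example below, using the Table 12 figures, recovers the share of students whose highest level is exactly each level by successive subtraction.

```python
# Worked example with the 'Using scientific evidence' figures from Table 12.
# Entries are read as cumulative: the percentage of students able to perform
# tasks at that level (and therefore at every lower level).
cumulative = {6: 2.4, 5: 11.8, 4: 31.6, 3: 56.3, 2: 78.1, 1: 92.1}
below_level_1 = 7.9

share_at_exactly = {}
previous = 0.0
for level in sorted(cumulative, reverse=True):        # levels 6 down to 1
    share_at_exactly[level] = round(cumulative[level] - previous, 1)
    previous = cumulative[level]

print(share_at_exactly)  # {6: 2.4, 5: 9.4, 4: 19.8, 3: 24.7, 2: 21.8, 1: 14.0}
print(round(sum(share_at_exactly.values()) + below_level_1, 1))  # 100.0
```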


References
Abell, S., & Lederman, N. (Eds.). (2007). Handbook of research on science education. Mahwah, NJ:
Lawrence Erlbaum Associates.
Adams, R.J. (2009). PISA: Frequently answered criticisms. In R. Bybee & B. McCrae (Eds.), PISA
science 2006. Implications for science teachers and teaching. Arlington, VA: NSTA Press.
Baumert, J., & Köller, O. (1998). Interest research in secondary level 1: An overview. In L. Hoffmann, A.
Krapp, K.A. Renniger, & J. Baumert (Eds.), Interest and learning (pp. 241–256). Kiel, Germany: Institute for
Science Education at the University of Kiel (IPN).
Blosser, P. (1984). Attitude research in science education. Columbus, OH: Eric Clearinghouse for
Science, Mathematics and Environmental Education.
Bogner, F., & Wiseman, M. (1999). Toward measuring adolescent environmental perception. European
Psychologist, 4(3), 139–151.
Bracey, G. (2009). U.S. school performance, through a glass darkly (again). Phi Delta Kappan, 90(5),
386–387.
Bussière, P., Knighton, T., & Pennock, D. (2007). Measuring up: Canadian results of the OECD PISA
study. The performance of Canada’s youth in science, reading and mathematics—2006 first results for
Canadians aged 15. Canada: Human Resources and Social Development Canada and Council of Ministers of
Education.
Bybee, R. (1997). Achieving scientific literacy: From purposes to practices. Portsmouth, NH:
Heinemann.
Bybee, R., & Fuchs, B. (2006). Preparing the 21st century workforce: A new reform in science and
technology education. Journal of Research in Science Teaching, 43(4), 349–352.
CMEC. (1996). Report on science assessment. Toronto, Ontario: Council of Ministers of Education,
Canada.
CMEC. (1999). SAIP Science II: Assessment report. Toronto, Ontario: Council of Ministers of
Education, Canada.
CMEC. (2004). SAIP Science III 2004: The public report. Toronto, Ontario: Council of Ministers of
Education, Canada.
CMEC. (2007). PCAP-13 2007 report on the assessment of 13-year-olds in reading, mathematics, and
science. Toronto, Ontario: Council of Ministers of Education, Canada.
DeBoer, G. (2000). Scientific literacy: Another look at historical and contemporary meanings and its
relationship to science education reform. Journal of Research in Science Teaching, 37(6), 582–601.
Eagles, P., & Demare, R. (1999). Factors influencing children’s environmental attitudes. The Journal of
Environmental Education, 30(4), 33–37.
Gardner, P.L. (1975). Attitudes to science: A review. Studies in Science Education, 2, 1–41.
Gardner, P.L. (1984). Students’ interest in science and technology: An international overview.
In M. Lehrke, L. Hoffman, & P.L. Gardner (Eds.), Interests in science and technology education (pp. 15–34).
Kiel: Institute for Science Education (IPN).
Gauld, C., & Hukins, A.A. (1980). Scientific attitudes: A review. Studies in Science Education, 7,
129–161.
Grisay, A., de Jong, J.H.A.L., Gebhardt, E., Berezner, A., & Halleux-Monseur, B. (2007). Translation
equivalence across PISA countries. Journal of Applied Measurement, 8(3), 249–266.
Holton, G. (1998). 1948: The new imperative for science literacy. Journal of College Science Teaching,
8, 181–185.
Hurd, P.D. (1958). Science literacy: Its meaning for American schools. Educational Leadership, 16,
13–16.
Klopfer, L.E. (1971). Evaluation of learning in science. In B. Bloom, J. Hastings, & G. Madaus (Eds.),
Handbook of summative and formative evaluation of student learning (pp. 559–641). New York:
McGraw-Hill.
Klopfer, L.E. (1976). A structure for the affective domain in relation to science education. Science
Education, 60, 299–312.
Koballa, T., & Glynn, S. (2007). Attitudinal and motivational constructs in science learning. In S. Abell
& N. Lederman (Eds.), Handbook of research on science education. Mahwah, NJ: Lawrence Erlbaum
Associates.
Koballa, T., Kemp, A., & Evans, R. (1997). The spectrum of scientific literacy. The Science Teacher,
64(7), 27–31.
LaForgia, J. (1988). The affective domain related to science education and its evaluation. Science
Education, 72(4), 407–421.
Mayer, V.J., & Kumano, Y. (2002). The philosophy of science and global science literacy. In V.J. Mayer
(Ed.), Global science literacy. Dordrecht, The Netherlands: Kluwer Academic Publishers.
McCrae, B. (2009). PISA 2006 test development and design. In R. Bybee & B. McCrae (Eds.), PISA
science 2006. Implications for science teachers and teaching. Arlington, VA: NSTA Press.
Millar, R. (2006). Twenty first century science: Insights from the design and implementation of a
scientific literacy approach in school science. International Journal of Science Education, 28(13), 1499–
1521.
OECD. (2006). Assessing scientific, reading and mathematical literacy: A framework for PISA 2006.
Paris: OECD.

OECD. (2007). PISA 2006. Science competencies for tomorrow’s world. Volume I: Analysis. Paris:
OECD.
OECD. (2009). PISA 2006 technical report. Paris: OECD.
Osborne, J. (2007). Science education for the twenty first century. Eurasia Journal of Mathematics,
Science & Technology Education, 3(3), 173–184.
Osborne, J., Simon, S., & Collins, S. (2003). Attitudes towards science: A review of the literature and its
implications. International Journal of Science Education, 25(9), 1049–1079.
Piper, M., & Moore, K. (1977). Attitudes toward science: Investigations. Columbus, Ohio: SMEAC
Information Reference Center, Ohio State University.
Rickinson, M. (2001). Learners and learning in environmental education: A critical review of the
evidence. Environmental Education Research, 7(3), 207–208.
Roberts, D. (2007). Scientific literacy/science literacy. In S.K. Abell & N.G. Lederman (Eds.),
Handbook of research on science education (pp. 729–780). Mahwah, NJ: Lawrence Erlbaum Associates.
Rotberg, I. (2008). Quick fixes, test scores, and the global economy. Education Week, June 11, 2008.
Routitsky, A., & Turner, R. (2003). Item format types and their influence on cross-national comparisons of student performance. Paper presented at the Annual Meeting of the American Educational Research Association (AERA), Chicago, IL.
Schibeci, R.A. (1984). Attitudes to science: An update. Studies in Science Education, 11, 26–59.
Simpson, R., Koballa, T., Oliver, S., & Crawley, F. (1994). Research on the affective dimension of
science learning. In D. Gabel (Ed.), Handbook on science teaching and learning. New York: Macmillan
Publishing Company.
Turner, R. (2009). PISA: An introduction and overview. In R. Bybee & B. McCrae (Eds.), PISA science
2006. Implications for science teachers and teaching. Arlington, VA: NSTA Press.
UNESCO. (2003). UNESCO and the international decade of education for sustainable development
(2005–2015). UNESCO International Science, Technology & Environmental Education Newsletter, XXVIII
(1–2).
UNESCO. (2005). International implementation scheme for the UN decade of education for sustainable
development. Paris: UNESCO.
Weaver, A. (2002). Determinants of environmental attitudes: A five-country comparison. International
Journal of Sociology, 32(1), 77–108.
Wu, M.L., Adams, R.J., Wilson, M.R., & Haldane, S.A. (2007). ACER ConQuest Version 2.0:
Generalised item response modelling software. Melbourne: Australian Council for Educational Research.
