
Computers & Education 95 (2016) 45-62


Science teachers' TPACK-Practical: Standard-setting using an evidence-based approach
Tsung-Hau Jen (a), Yi-Fen Yeh (a, *), Ying-Shao Hsu (b), Hsin-Kai Wu (b), Kuan-Ming Chen (a)
a Science Education Center, National Taiwan Normal University, Taiwan, ROC
b Graduate Institute of Science Education, National Taiwan Normal University, Taiwan, ROC

Article info

Article history:
Received 18 May 2015
Received in revised form 21 October 2015
Accepted 30 December 2015
Available online 2 January 2016

Keywords:
Evaluation methodologies
Improving classroom teaching

Abstract

Technological pedagogical content knowledge-practical (TPACK-P) refers to the knowledge construct that teachers in the digital era develop for and from their teaching practices with technology. This study explored a standard-setting method using item response theory to cross-validate ranks of proficiency levels and examine in-service and pre-service science teachers' knowledge about and application of TPACK-P in Taiwan. A sample of 99 participants (52 pre-service and 47 in-service science teachers) completed a 17-item TPACK-P questionnaire that described their typical responses, opinions, or actions in different instructional scenarios. Initial analysis of these responses revealed a correlation (r = 0.87) between the ranks of proficiency levels and those previously identified that validated the hierarchical structure of the four proficiency levels (1-lack of use, 2-simple adoption, 3-infusive application, and 4-reflective application). The second analysis located the thresholds of the four proficiency levels in the two dimensions of knowledge about and application of TPACK-P. It was found that there were no significant differences between in-service and pre-service teachers' TPACK-P and that most of the participants displayed knowledge about TPACK-P at Levels 2 and 3, but their application was at Level 1. The validated four proficiency levels coupled with typical performances can be viewed as a roadmap of science teachers' TPACK-P development. The gap between the knowledge about and application of TPACK-P suggests that further practical experiences in supportive environments are needed in science teacher education programs. Only when teachers gain and learn from practical usage of technology to support science education can their TPACK-P be further developed and strengthened.

© 2016 Elsevier Ltd. All rights reserved.

* Corresponding author.
E-mail address: yyf521@ntnu.edu.tw (Y.-F. Yeh).

http://dx.doi.org/10.1016/j.compedu.2015.12.009
0360-1315/© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Surveys regarding technology in classrooms have revealed that teachers' and students' use of technology has not been widespread, even when technology was readily available for instructional use (Gray, Thomas, & Lewis, 2010; Project Tomorrow, 2008). Lack of necessary knowledge and confidence could explain this infrequent and ineffective use of technology in classrooms, along with common concerns such as a lack of good technological tools (Afshari, Bakar, Luan, Samah, & Fooi, 2009; Mumtaz, 2000). Researchers have proposed technological pedagogical content knowledge (TPACK) as a body of knowledge that digital-age teachers should be expected to have in order to properly use technology in their teaching (Mishra & Koehler, 2006; Niess, 2005). Teachers who use such knowledge are believed to make their instruction comprehensible and assist their students through their thoughtful use of technology, such as information communication technology (ICT).
Researchers have tried different approaches to investigating teachers' TPACK. The resulting information has revealed not only how well
teachers may teach with technology, but also constructive directions for future teacher education. However, what and how to measure TPACK
has become even more complicated, especially when considering its situated and multi-faceted nature (Cox & Graham, 2009; Doering,
Veletsianos, Scharber, & Miller, 2009; Koehler & Mishra, 2008) and the scales of the various proficiency levels (Dwyer, Ringstaff, & Sandholtz,
1991; Sandholtz, Ringstaff, & Dwyer, 1997). Even more fundamental issues in TPACK measurement are how it is defined and exactly what it is,
which makes objective evaluation a difficult task (or at least in need of exploratory data to provide a basis) (Mishra & Henriksen, 2015). This
research and development study constructed, validated, and benchmarked a practice-based TPACK assessment tool.

1.1. Complexity of TPACK and teachers' proficiency level

Effective science teachers have complex stores of knowledge about their science domains, topics, students, teaching strategies, and
assessment techniques that they enact during classroom practice. Mishra and Koehler (2006) added technological knowledge (TK) to the
framework of pedagogical content knowledge (PCK; Shulman, 1986) to address technology-supported teaching/learning environments, making
TPACK distinct from PCK not only in its contemporary needs but also in its specificity to the profession and complexity of its nature. TPACK,
like PCK, is a craft knowledge that can be defined as the wisdom that teachers develop from their teaching practices, which guides their
instructional actions (van Driel, Verloop, & de Vos, 1998). Teachers' development begins with being equipped with professional knowledge about
the content to be taught, the pedagogy and assessment necessary to assist their students' content learning, and the technology required to further
accommodate that teaching and learning. These knowledge sets interact to produce a blended technological-pedagogical-content knowledge set
(Mishra & Koehler, 2006). TPACK for science teachers may include knowledge regarding representations, science curricula, students'
understanding of science, various educational contexts, affordances of ICT tools, etc. (Angeli & Valanides, 2009; Jimoyiannis, 2010; Magnusson,
Krajcik, & Borko, 1999). The borders of the component categories become vague and diminished with professional growth and experience as
pre-service or novice teachers become experienced in-service teachers (Archambault & Crippen, 2009; Cox & Graham, 2009; Gess-Newsome &
Lederman, 1993; Niess, 2011). Teaching experience can be viewed as a major contributor, explaining the transformation of TPACK or PCK from
theoretical knowledge to practical knowledge (van Driel, Beijaard, & Verloop, 2001; van Driel et al., 1998).

Quality teaching practices need to employ appropriate strategies, utilize comprehensible language, encourage student learning engagement,
and be responsive to student needs (Chen, Hendricks, & Archibald, 2011). In-service teachers tend to display more integrated PCK than
pre-service teachers (Lee, Brown, Luft, & Roehrig, 2007; Lee & Luft, 2008). Teachers' knowledge evolves as they struggle to make instructional
decisions and negotiate among diverse contextual elements (Koh, Chai, & Tay, 2014; Niess, 2011; Shulman, 1987) and different situations in
instruction (Yeh, Hsu, Wu, Hwang, & Lin, 2014). Reflection-in-action and reflection-on-action in classroom teaching practices further
individualize teachers' TPACK and keep it dynamic (Koh & Divaharan, 2011; Krajcik, Blumenfeld, Marx, & Soloway, 1994). Knowledge can be
attained for practice or elaborated in practice, but it can also achieve an ultimate level of knowledge of practice (Cochran-Smith & Lytle, 1999).
TPACK-P refers to the knowledge teachers develop from long-term experiences in planning and enacting instruction with flexible ICT uses to
support different instructional needs (Ay, Karadag, & Acat, 2015). Considering that both TPACK skills and personal teaching experiences
compose teachers' TPACK-P, it is expected that experienced teachers have stronger TPACK-P than novice teachers or pre-service teachers.

Teaching practices are the result of the complex and convoluted interactions of instructional, social, and physical factors; there are no one-
size-fits-all solutions for instructional tasks (Koehler & Mishra, 2008; Mishra & Koehler, 2006). Teachers encounter a wide variety of different
problems and solutions throughout their respective teaching experiences specific to their particular environment (Moersch, 1995; Niess et al.,
2009; Russell, 1995). Progression and evolution of teachers' TPACK stem from their instructional practices in response to different scenarios and
challenges; therefore, varying proficiency levels are expected amongst teachers. Apple Computers funded a project called Apple Classrooms of
Tomorrow (ACOT) in the mid-1980s where media was used to support teaching and learning in classrooms. Dwyer et al. (1991) concluded that
teachers' instructional evolution in technology-implemented classrooms of the ACOT project went through a series of phases, including entry,
adoption, adaption, appropriation, and innovation. They considered the teachers' mastery of technology and their level of technology infusion
when determining the success of the teachers' classrooms. A similar learning trajectory was also found for mathematics teachers learning to use
spreadsheets as a means of facilitating students' learning in mathematics classrooms (i.e., recognizing, accepting, adapting, exploring, and
advancing) (Niess, 2007; Niess et al., 2009). These performance-based proficiency levels depict the typical features of the main stages of teacher
development, while statistical scrutiny of the cutting points between levels can offer another means of confirming the levels' validity.

1.2. Measurement of teachers' TPACK

Considering that TPACK is an internal and dynamic construct, it is difficult to measure accurately (Kagan, 1990). However, the
collection of different types of data and use of different methods of analysis should enable a better understanding of TPACK.
Self-report surveys and performance assessments are two measurement types that TPACK researchers and teacher educators have used (Abbitt,
2011; Koehler, Shin, & Mishra, 2012). Self-report surveys are preferred by researchers hoping to rapidly collect large amounts of teachers' self-
ratings regarding their technology understanding and use, but the TPACK models embedded in the data should also be explored and estimated.
Schmidt et al. (2009) collected 77 pre-service teachers' self-rating scores regarding ability descriptions developed from the seven TPACK
knowledge domains (TK, PK, CK, TPK, TCK, PCK, TPACK). These data were later analyzed by principal component factor analyses and
correlation analyses, from which modifications to certain items and the relationships within component knowledge were suggested. Lee and Tsai
(2010) collected data from 558 pre-service teachers and used exploratory and confirmatory factor analyses to examine the validity of the items,
identify the main factors affecting the construct, and determine the data fit within the proposed framework of TPACK-Web. Although the
teachers' composite scores from their self-rating can be quantitatively compared, a more valid and complete understanding about teachers'
abilities can be achieved by linking these self-report scores and the indicators from observed teachers' practical mastery in instructional artifacts
and classroom applications of TPACK-P.

As for proficiency levels, Harris, Grandgenett, and Hofer (2011) constructed an assessment rubric to evaluate teachers' lesson plans; scores 1
to 4 were offered to four aspects of the appropriateness of technology use toward instruction. Expert teachers and researchers tested the validity
and reliability of the assessment rubrics through judging the content and scoring the lesson plans. Besides the instructional performance that
teachers display, their negotiations among different constraints in instructional contexts as well as their reflections and experiences from prior
teaching practices can also display the depth of their instructional knowledge. To investigate teachers' instructional dispositions, the authors first
identified the framework of TPACK-P in which TPACK transforms from and during teaching practices (Yeh et al., 2014) and then interviewed
40 teachers about their design thinking and teaching experiences with ICTs in instructional tasks of assessment, planning and designing, and
enactment (Yeh, Lin, Hsu, Wu, & Hwang, 2015). A spectrum of design thinking, actions, and reflections were identified, which elicited more
features of teachers' TPACK-P at four levels.

1.3. Standard-setting for the proficiency levels of teachers' TPACK-P

Standard setting is a methodology utilized to provide cut scores for measurement scales. These cut scores are used to separate performance
categories by classifying test takers with lower scores into one level and higher scores into another (Çetin & Gelbal, 2013; Shepard, 2008).
Various approaches to standard setting have been proposed to achieve different purposes, including conducting mastery classifications of the
target population, finding norms and passing rates for test takers or the greater population, inducting standards through empirical means, and
validating already developed assessment frameworks based on current theories (Haertel & Lorie, 2004; Shepard, 1980). The technique used in
this study for standard setting was applied firstly to validate the proposed proficiency levels of science teachers' TPACK-P, and secondly to
examine the in-service and pre-service science teachers' knowledge about and application of TPACK in their teaching practices.
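
The role of cut scores described above can be illustrated with a minimal Python sketch. It assumes ability estimates on a logit scale and three ascending cut scores that separate four levels; the numeric values and the names `CUT_SCORES` and `classify` are hypothetical and are not taken from this study.

```python
from bisect import bisect_right

# Hypothetical cut scores on a logit scale (illustrative values only,
# not the thresholds estimated in this study).
CUT_SCORES = [-1.0, 0.0, 1.5]  # boundaries between Levels 1|2, 2|3, and 3|4


def classify(ability, cuts=CUT_SCORES, lowest_level=1):
    """Return the performance level for an ability estimate.

    `cuts` must be sorted in ascending order. An ability equal to a cut
    score is placed in the higher of the two adjacent levels.
    """
    return lowest_level + bisect_right(cuts, ability)


# Four hypothetical test takers, one falling in each level.
levels = [classify(theta) for theta in (-2.3, -0.4, 0.0, 2.1)]
print(levels)
```

Placing a score that equals a cut score into the higher level is a design choice made for this sketch; a standard-setting study would state its own convention.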

Once the hierarchical proficiency levels of teachers' TPACK-P identified in previous studies were qualitatively generated (Yeh et al., 2014,
2015), the typical performance features identified from these levels could be examined further and confirmed through statistical analyses. The
identified benchmarks and their features will be informative for teacher educators, helping them to have a better awareness of what features
science teachers with different TPACK proficiency levels may display and develop in their teacher education programs. Utilizing ICTs can be
especially critical to teaching and learning science, since representation in different forms facilitates learners' visualization of micro-level or
macro-level phenomena (Ainsworth, 2006; Mayer, 1999; Treagust, Chittleborough, & Mamiala, 2003; Wu, Krajcik, & Soloway, 2001) and their
development of conceptual understanding and inquiry abilities (de Jong & van Joolingen, 1998; Perkins et al., 2006). Science teachers display
their pedagogical uses of ICTs along a spectrum of emergent to advanced levels. Their learning progress along the road to TPACK-P
development, though content specific, can offer some insights into professional development in other subjects.

2. Method

A questionnaire was developed to rapidly collect science teachers' responses to different instructional scenarios. Those responses were then
used to validate the proposed proficiency levels in the TPACK-P framework and for investigating the respondents' TPACK-P proficiency in their
teaching practices. Two analyses were performed on the responses in order to validate the hierarchy of proficiency levels, determine the
thresholds of each performance level in the metric, and generally investigate these science teachers' TPACK-P.

2.1. Questionnaire item construction

A total of 17 items were designed with built-in response options in the questionnaire. Each item stem described an instructional scenario that
expert teachers deemed important in teachers' implementation of ICTs in their instruction (Yeh et al., 2014); the options described typical
responses that science teachers at four different TPACK-P proficiency levels might select as their response. Interview data (Yeh et al., 2015)
were used to create benchmarks and associated indicators of science teachers' TPACK-P at different proficiency levels, which were then used to
construct the option pool for the items. It was assumed that science teachers at similar TPACK-P proficiency levels shared similar ideas and,
therefore, tended to select similar response
options.

2.1.1. Benchmarks and indicators


The interview data collected from 40 science teachers provided the foundation for an investigation into how science teachers think, act, and
reflect throughout their teaching careers. Five levels were identified from these teachers' interview data, including reflective application (Level 4),
infusive application (Level 3), simple adoption (Level 2), lack of use (Level 1), and no idea (Level 0). The interview responses at each level
offered information for use in the construction of the benchmarks and indicators for TPACK-P. Each item was accompanied by four options that,
individually, represented typical performances that teachers at Levels 1 to 4 displayed. Teachers who did not choose one of these four options
were viewed as below Level 1. Table 1 lists the benchmarks and indicators for the four proficiency levels.

These knowledge benchmarks and their indicators were also used to illustrate the thresholds for knowledge performance at each proficiency
level. Level 4 (reflective application) was determined to be the highest proficiency level that science teachers could achieve; it indicated that they
were adept at using their experience-based TPACK to employ ICTs in assisting their students in learning about science. Teachers at Level 3
(infusive application) used ICTs to guide students to self-explore and independently construct their science knowledge, whereas teachers at Level
2 (simple adoption) used ICTs to help students learn about science but via more teacher-centered strategies or with less well-founded rationales.
Unlike the regular improvement of TPACK-P associated with Levels 2 through 4, Level 1 represented teachers with only a basic understanding
of technology resulting from their limited experience (or lack thereof), negative impressions regarding technology in the classroom, or a lack of
intention to implement ICTs in their classrooms. Level 0 indicated the situations in which the interviewed teachers did not know certain
technological applications. Finally, detailed benchmarks and indicators for each proficiency level were developed for use in assessments,
planning and design, and enactment.

2.1.2. Options for items


The interviews also served as a resource for constructing the pool of options for the questionnaire items. The ideas that most teachers at a
specific level were thought typically to share were determined through a frequency calculation of similar responses. The four options offered for
each of the 17 items were created based on these high-frequency responses collected from Level 1 to 4. Necessary rephrasing and other
modifications were made to both the options and the stems, making them coherent. It was assumed that science teachers at the same TPACK-P
proficiency levels would select options containing similar ideas. The four options for each item were designed to match the indicators of the four
proficiency levels (see Table 1). Fig. 1 shows an item sample (see the complete questionnaire in supplementary material).

2.1.3. Item tasks


Each item required teachers to evaluate the four options in two dimensions (knowledge about and application of TPACK-P) and select those
that best described their opinion. Taking the sample item (see Fig. 1) as an example, teachers were first required to list the criteria from A to D in
their perceived order of importance and circle the most important selection criterion according to their knowledge of how technology supports
science instruction (knowledge). The second task for each item was to recall what criteria they had actually considered before offering their
instruction (application). Briefly, these 17 items were designed to elicit perceptions and opinions by engaging participants in judging technology
implementations in different instructional scenarios. An instruction sheet was provided to guide respondents in how to answer each
questionnaire item.
Taking item #5 as an example, the teacher's response in the knowledge box included options A, B, and C, of which option C was
circled. This implied that the teacher thought the descriptions in options A, B, and C were important features of technology-supported
assessments as compared to conventional assessments, while engaging students to manipulate simulations or present thinking processes (option
C) in assessments was most difficult to achieve. His response of option B in the application box meant that he had previously used multimedia to
present dynamic content when using technology-supported assessments or assessing student learning with technology.

2.1.4. Item validity


Content validity of the questionnaire was established first through the exploratory data collected in previous studies and then through a review
by four in-service teachers. The item stems and options were drafted based on the indicators expert teachers selected and the highest-frequency
responses given by the science teachers (Yeh et al., 2014, 2015). Four in-service science teachers who had collaborated with the authors over
the long term on designing and implementing technology-supported curricula at the middle and high school levels were invited to be questionnaire
reviewers. Their job was to review the preliminary questionnaire and comment on how to make it more comprehensible to the population of
science teachers, as well as descriptive of teachers' knowledge and experience in teaching with technology. Necessary modifications to the items
and options were made before the questionnaire was used.

2.2. Data collection

A total of 52 pre-service and 47 in-service high school science teachers were recruited for this study (see Table 2). The pre-service teachers were senior college students (final year of program) who took either a physics or an earth science teaching practicum. The requirements of these two courses were the completion of at least three microteaching cycles that involved lesson plans and teaching with peers, in addition to teaching internships in their content area in collaborating high schools. The pre-service teachers completed the questionnaires after they finished the microteaching sessions and were working on their teaching internships. The participating in-service teachers were reasonably well educated (63.8% had master's degrees) and experienced (93.6% had more than 5 years of teaching experience). All of the participating in-service teachers were attendees and lecturers of e-learning workshops on teaching with technology, and they reported that they had experience in teaching with technology. Their knowledge was expected to show characteristics of TPACK-P and could be used to cross-validate the proficiency levels identified from the protocol analyses of interviews. The recruitment of pre-service and in-service teachers was expected to reveal variations in TPACK-P proficiency along an inexperienced-experienced continuum.

Table 1
Benchmarks and indicators of science teachers' TPACK-P.
(Columns in the original table: Proficiency level; Instructional scenario; Benchmark; Observed indicator.)

Level 4 - Reflective application

  Assessment
  Benchmark: Evaluate students' learning of and about science before and after science learning.
    1. Be able to use various representations or ICTs in instruction that enable teachers to identify students' learning styles and learning difficulties (e.g., cognitive, affective) for the preparation of adaptive instruction. [1C] [2C] [3C]
    2. Be able to construct technology-supported assessments through which students' knowledge of and about science can be evaluated. [4C] [5C] [6A]

  Planning & Designing
  Benchmark: Design technology-supported instruction that accommodates students' learning of and about science.
    3. Be able to utilize functions of technology to facilitate teachers' (their) and students' exploration of scientific phenomena and construction of their science knowledge. [7C] [8B]
    4. Consider and design technology-supported curricula for the purpose of enhancing students' learning of and about science with skillful use of technology. [9C] [10A]
    5. Be able to construct technology-supported curricula based on students' prior knowledge or for purposes of inquiry learning, with strategic uses of digital representations or ICT tools. [11B]
    6. Be able to use student-centered instructional strategies to accommodate students' learning of and about science from completing inquiry-based tasks in technology-supported environments. [12C] [13C]

  Enactment
  Benchmark: Use technology skillfully to assist instructional material creation and student independent learning.
    7. Be able to use a variety of technology flexibly and strategically to accommodate students' different learning needs, support their knowledge construction, and improve instructional effectiveness. [14C] [15C] [16C]
    8. Be able to customize instructional materials with skillful uses of technology or multimedia resources for different instructional purposes. [17A]

Level 3 - Infusive application

  Assessment
  Benchmark: Use technology to assess students before and after instruction.
    1. Be able to use appropriate technology or online platforms to observe students' learning styles and learning difficulties and to assist student learning. [1B] [2B] [3A]
    2. Be able to implement appropriate multimedia or ICT tools into instruction for the purposes of evaluation and learning. [4A] [5B] [6D]

  Planning & Designing
  Benchmark: Design technology-supported instruction from the student-centered perspective or with a focus on developing students' science learning.
    3. Be able to use appropriate representations or ICT tools to facilitate teachers' (their) and students' science learning through investigating scientific phenomena and making virtual experiments. [7A] [8C]
    4. Consider and design technology-supported instruction for enhancing instructional effectiveness and students' learning of science. [9A] [10B]
    5. Be able to implement appropriate digital representations or ICT tools that facilitate students' learning of abstract concepts and scientific investigations. [11C]
    6. Be able to use appropriate instructional strategies to facilitate teachers' instruction and student learning of science in technology-supported curricula (e.g., engaging students in collaborative learning). [12B] [13A]

  Enactment
  Benchmark: Use technology flexibly to assist students' learning and teachers' instruction management.
    7. Be able to use appropriate technology to improve the quality of content presentation, support communications, or build up students' learning profiles. [14A] [15B] [16A]
    8. Be able to use different technology to manage instructional resources or track student learning progress. [17C]

Level 2 - Simple adoption

  Assessment
  Benchmark: Evaluate students through presenting content with ICTs.
    1. Be able to use different representations or ICTs to present science content, from which they observe students' learning performance and student learning is made possible. [1A] [2A] [3B]
    2. Be able to use online assessments, digital representations, or ICT tools to evaluate students' learning. [4B] [5A] [6B]

  Planning & Designing
  Benchmark: Design technology-supported instruction with a focus on developing students' content comprehension or learning motivation.
    3. Be able to use representations or ICT tools for teachers (themselves) and students to learn abstract concepts. [7B] [8A]
    4. Consider technology uses in instruction according to external factors or students' learning motivations. [9B] [10C]
    5. Be able to present science content with digital representations that are available and good for enhancing students' learning motivations. [11A]
    6. Be able to teach science with technology using a couple of instructional strategies for the purpose of enhancing students' motivations and conceptual understanding. [12A] [13B]

  Enactment
  Benchmark: Use technology to make teaching more interesting and better supported.
    7. Be able to implement technology in class to impress students in science learning and make teachers' instruction easier. [14B] [15A] [16B]
    8. Be able to use word processors or online platforms to manage instructional resources. [17B]

Level 1 - Lack of use

  Assessment
  Benchmark: Think technology makes no specific contributions to student evaluation.
    1. Think technologies are not good tools to be used for knowing students' learning styles or learning difficulties. [1D] [2D] [3D]
    2. Think technology-supported assessments are no different from conventional assessments, or have concerns regarding implementing ICTs to assist their assessments. [4D] [5D] [6D]

  Planning & Designing
  Benchmark: Think technology makes no specific contributions to curriculum design over conventional teaching.
    3. View learning science content through technology as no better than learning from professional books or magazines. [7D] [8D]
    4. Consider teaching with technology to be an alternative instructional method to conventional instruction. [9D] [10D]
    5. Consider technology to be useful only in limited instructional occasions. [11D]
    6. View teaching with technology to be good enough for instructional purposes, in need of no other teaching strategies for support. [12D] [13D]

  Enactment
  Benchmark: Think technology makes no contributions to teaching practices.
    7. Believe teaching with technology brings similar contributions to student learning as conventional instruction. [14D] [15D] [16D]
    8. View current technology as not accommodating teachers' needs in instructional management. [17D]

Level 0
  In the current study, those who performed below Level 1 were categorized as Level 0.

Note: The information in brackets refers to the numbers of items and options in the questionnaire.

2.3. Data analysis

Two analyses were conducted by applying the partial credit model (PCM; Masters, 1982) in item response theory (IRT) to validate the
framework of TPACK-P and set up the cut scores of the proficiency levels for the scales of science teachers' knowledge about and application of
TPACK-P. IRT allows test-takers' ability and the difficulty of certain performances at different proficiency levels for all task items to be located
along the same scale. As a result, researchers are able to compare the test-takers' abilities and the difficulty of the proficiency levels for all task
items (Bond & Fox, 2015; Wright & Stone, 1999). Based on the developed metrics and proficiency levels, the in-service and pre-service science
teachers' knowledge about and application of TPACK-P were also examined. The person and item parameters in the current study were estimated
by using an open-source software package called Test Analysis Modules (TAM; Kiefer, Robitzsch, & Wu, 2015); the Wright maps were
generated by WrightMap (Irribarra & Freund, 2014). Both packages were run in R 3.1.3 (R Development Core Team, 2014).

Fig. 1. Sample questionnaire items with a teacher's responses.

2.3.1. Analysis 1
Before engaging in standard setting for the science teachers' TPACK-P, the response data from the 47 in-service teachers to the 17 items in
the first section (i.e., perceived as important criteria) were used to cross-validate the proficiency levels for the different indicators in the
TPACK-P assessment framework developed through protocol analyses of previous interviews. The four options for each question were treated as different
pseudo-items because multiple-response questions can be seen as a set of agree/disagree items. The partial credit model (PCM; Masters, 1982) in
IRT was applied to locate the thresholds for the pseudo-items (see Eq. (A.1) in Appendix A).

The option listed as the most important criterion of the corresponding pseudo-item was scored as 2, the option listed as an important criterion
was scored as 1, and the other options were scored as 0. A higher threshold (difficulty) of score 2 for a pseudo-item implied a smaller
likelihood that these teachers would choose that option as the most important. We then calculated Spearman's rank correlation coefficient
between the proficiency ranks estimated from the 47 responses and the ranks identified in the previous interview study.
The correlation could be viewed as a validity indicator: a high correlation would suggest that the proposed proficiency levels (identified
from interview results and embedded as typical responses in the item options) were cross-validated by the in-service teachers' selections of
options, which reflect the behavioral features typical of each rank.
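The rank comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold values are hypothetical, the helper names `average_ranks` and `spearman_rho` are introduced here, and the original analysis was carried out in R rather than with this code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical thresholds of score 2 for options A-D of one question.
thresholds = [0.4, -0.3, -1.1, 2.5]
# A lower threshold means the option is more readily chosen as most important,
# so it receives a higher level; ranks are therefore taken on negated thresholds.
estimated_level = average_ranks([-t for t in thresholds])
# Hypothetical levels assigned to the same options in the interview study.
interview_level = [2, 3, 4, 1]
rho = spearman_rho(estimated_level, interview_level)
```

In the study the correlation was computed over the options of all 17 questions; the four-option example here only shows the mechanics.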

2.3.2. Analysis 2
Knowledge differences between the pre-service and in-service teachers were explored by applying a multidimensional PCM to the response
data collected from the 99 participants. The first dimension referred to their knowledge about TPACK-P (i.e., the perceived importance of each
criterion) and the second dimension to their application of TPACK-P (i.e., the application of the criterion in teaching practice). All responses
on both parts of the 17 questions were scored according to the corresponding proficiency level (i.e., 1, 2, 3, 4). A blank response was scored as 0,
referring to proficiency Level 0 (i.e., the respondent had no idea how to use technology in a science class). Therefore, the scores for the 17
questions in both the knowledge and application dimensions ranged from 0 to 4. The highest level among the chosen options was
designated as the score for the specific question. Each teacher thus received 17 scores that could be used to estimate their TPACK-P
in the dimension of knowledge and another 17 scores for the dimension of application.
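The scoring rule can be illustrated with a short Python sketch. The option-to-level mapping below is the one reported for question #1 in Analysis 1 (C = Level 4, B = Level 3, A = Level 2, D = Level 1); the function name `score_question` and the list-of-letters response format are assumptions made for illustration.

```python
# Option-to-level mapping reported for question #1; other questions have
# their own mappings, so this dictionary is only an example.
QUESTION_1_LEVELS = {"A": 2, "B": 3, "C": 4, "D": 1}


def score_question(chosen_options, option_levels=QUESTION_1_LEVELS):
    """Return the highest level among the chosen options; a blank response scores 0."""
    if not chosen_options:
        return 0
    return max(option_levels[option] for option in chosen_options)


# A respondent who ticks options A and B is scored at Level 3.
example_score = score_question(["A", "B"])
```

Applying such a function to both parts of all 17 questions would give the two vectors of 17 scores per teacher that enter the multidimensional model.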

The variances of subgroups estimated from test-takers' raw scores, or from the MLE scores obtained by the joint maximum likelihood (JML)
method, are likely to be overestimated because of measurement error; therefore, the effect sizes of the differences between the subpopulations are
attenuated. In the current study, marginal maximum likelihood (MML) estimation, which takes the prior distributions of the subpopulations as
additional conditions, was used to obtain unbiased estimates of the subpopulation variances (i.e., for in-service and pre-service teachers) (Adams &
Wu, 2007). Thus, the correct effect size of the group difference can be calculated.
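The attenuation argument can be written out under classical test theory assumptions. This is an illustrative sketch rather than a formula taken from the original analysis; the symbols \Delta (true mean difference), \sigma^2 (true within-group variance), \sigma_e^2 (error variance), and \rho (reliability) are introduced here, and the error is assumed to be independent of the true ability.

```latex
\[
\operatorname{Var}(\hat{\theta}) = \sigma^{2} + \sigma_{e}^{2},
\qquad
\rho = \frac{\sigma^{2}}{\sigma^{2} + \sigma_{e}^{2}},
\qquad
d_{\mathrm{obs}} = \frac{\Delta}{\sqrt{\sigma^{2} + \sigma_{e}^{2}}}
                 = \frac{\Delta}{\sigma}\,\sqrt{\rho}
                 = d_{\mathrm{true}}\sqrt{\rho}.
\]
```

For instance, with a reliability of 0.85 the observed standardized difference would be about 0.92 times the true one, which is why variances estimated from error-laden individual scores understate group effects.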

Table 2
Professional background of the participating teachers (N = 99).

                            Pre-service teachers (n = 52)    In-service teachers (n = 47)
Gender
  Male                      39                               24
  Female                    13                               23
Academic degree
  Bachelor                  52                               17
  Master                    N/A                              30
Subject
  Biology                   -                                11
  Chemistry                 -                                6
  Earth Science             10                               4
  Physics                   42                               1
  Mathematics               -                                2
  Technology                -                                1
  Science and Technology    -                                7
  Hybrid                    -                                15

By applying MML estimation, the item thresholds and population parameters (i.e., means and variances for the two groups) could be
estimated simultaneously on the same scale in each of the two dimensions (Bock, 1981). Therefore, the regression coefficients and standard
errors of the predictor (i.e., the group variable) for the two latent abilities, knowledge about and application of TPACK-P, in Eq. (1) could be
estimated directly before estimating individual scores (Adams & Wu, 2007; Bock, 1981; de Gruijter & van der Kamp, 2008).

\[
\begin{aligned}
\theta_K &= R_K G + C_K + E_K \\
\theta_A &= R_A G + C_A + E_A
\end{aligned}
\tag{1}
\]

In Eq. (1), $\theta_K$ and $\theta_A$ refer to the estimated latent abilities of knowledge about and application of TPACK-P, respectively; $G$ is the dummy
variable and equals 1 for an experienced in-service teacher and 0 for an inexperienced pre-service teacher; $R_K$ and $R_A$ are the regression
coefficients of the predictor $G$; $C_K$ and $C_A$ are constants referring to pre-service teachers' average ability in knowledge about and application of
TPACK-P; and $E_K$ and $E_A$ are error terms distributed as $N(0, \sigma_K^2)$ and $N(0, \sigma_A^2)$.

3. Results and discussion

3.1. Analysis 1

Analysis 1 established the rank of thresholds of score 2 for the options in all 17 questions via the TAM and WrightMap packages. Fig. 2
provides questions #1 and #8 as examples to illustrate how the proficiency level for each option was identified. For question #1, option C had the
lowest threshold, implying that this option was likely to be the most important consideration in the context provided in the item scenario. For
most of the 17 questions (including question #1), option D was not selected as an important consideration by any respondent, so the threshold of
option D for these questions was automatically assigned the least important status among the four options. Therefore, option C of question #1
should be the highest level (i.e., Level 4), option B could be identified as Level 3, option A as Level 2, and option D as the lowest level (i.e.,
Level 1). Similarly, for question #8, we identified option C as Level 4, option B as Level 3, option A as Level 2, and option D as Level 1. Based
on the 47 experienced teachers' responses, the proficiency levels of the four options for the 17 questions in the TPACK-P Questionnaire were
identified. The Spearman's rank correlation between the identified levels and the levels previously specified from the interview data was 0.87. In
other words, the framework of the four proficiency levels identified from the interview data was quantitatively supported by the current study.

3.2. Analysis 2

Analysis 2 provided estimates of the 47 experienced in-service and 52 pre-service teachers' knowledge about and application of TPACK-P by
using the multidimensional PCM. To estimate the group differences on these two dimensions, the group (i.e., pre-service, in-service) was
used as a regression variable to predict the teachers' latent abilities in both dimensions of TPACK-P. The results indicated that the person
separation reliabilities of the survey for knowledge about and application of TPACK-P were 0.85 and 0.90, respectively. These person separation
reliabilities are good but not as high as expected for an instrument including 17 four-level questions (i.e., the reliability should be at the level of
an instrument with 68 dichotomous items). The lower-than-expected reliability could be explained by the small variance in proficiency across respondents.
In addition, the item separation reliabilities suggested by contemporary psychometrics (e.g., Bond & Fox, 2015; Krishnan & Idris, 2014; Linacre,
2012; Wright & Stone, 1999) were calculated to examine the stability of the item parameter estimates and the appropriateness of the sample
size. For the dimensions of knowledge about and application of TPACK-P, the item separation reliabilities were
0.95 and 0.96, demonstrating good replicability of item locations along the two dimensions if the same items were given to another 99 science
teachers with a background or experience similar to that of the sample in the current study.

Fig. 2. Item thresholds in Wright Map and corresponding locations of options for questions #1 and #8.

Fig. 3. Wright Maps and the thresholds of proficiency levels of the metrics in (a) knowledge about and (b) application of TPACK-P.

Various item- and model-level fit indices were used to examine the validity of using the multidimensional PCM. For all 34 questions (17 each
in knowledge and application), the information-weighted MNSQ values ranged from 0.79 to 1.24, and the t values were between -1.96
and 1.96 (Appendix B). The results suggested that the equal discrimination assumption was sustained (Bond & Fox, 2015; Linacre, 2012; Wright
& Stone, 1999). In addition, the absolute values of the elements in the Q3 matrix (Yen, 1984) ranged from 0.01 to 0.26, supporting the
assumption of local independence (Kim, de Ayala, Ferdous, & Nering, 2011; Yen, 1993). Finally, the standardized root-mean-square residual
(SRMR) was 0.08, suggesting adequate global model fit (Hu & Bentler, 1999; Maydeu-Olivares, 2013). Therefore, the
proposed multidimensional PCM was used to interpret the subjects' responses on the developed instrument.

Table 3
Thresholds of item steps for the questions in TPACK-P.

            Knowledge about TPACK-P                 Application of TPACK-P
Item no.    Level 1  Level 2  Level 3  Level 4      Level 1  Level 2  Level 3  Level 4
1           0.25     0.08     0.97     2.69         1.32     0.74     1.76     3.12
2           0.89     0.20     0.32     1.64         1.30     1.28     2.05     2.44
3           n/a*     0.62     0.68     1.53         2.19     1.19     1.98     2.61
4           0.28     0.30     1.07     3.26         0.87     0.69     1.85     3.55
5           1.24     0.50     0.37     1.59         1.10     1.12     1.72     2.22
6           1.34     0.50     0.80     3.31         1.35     0.91     1.80     2.99
7           0.88     0.49     0.61     1.74         1.52     0.48     1.47     1.84
8           0.84     0.71     0.33     3.20         2.16     0.08     0.45     1.33
9           0.12     0.35     1.06     2.91         0.54     0.35     0.98     2.03
10          0.28     0.05     1.52     3.92         1.03     0.18     0.99     3.86
11          0.42     0.21     0.89     2.31         1.53     0.54     1.27     2.16
12          0.54     0.21     0.71     2.23         1.45     0.70     1.30     1.90
13          0.33     0.14     0.68     2.62         1.03     0.70     1.04     1.97
14          0.50     0.19     0.26     2.23         1.69     0.00     0.67     1.43
15          0.04     0.23     0.91     2.99         0.88     0.76     1.33     1.98
16          0.91     1.13     1.92     3.58         1.05     0.26     1.13     2.73
17          0.41     0.01     0.50     2.40         1.53     0.51     1.23     1.92
Mean        0.45     0.08     0.80     2.60         1.33     0.62     1.35     2.36
(s.e.)      (0.14)   (0.11)   (0.11)   (0.18)       (0.11)   (0.09)   (0.11)   (0.17)
SD          0.54     0.45     0.44     0.73         0.44     0.38     0.46     0.70
(s.e.)      (0.10)   (0.08)   (0.08)   (0.12)       (0.07)   (0.06)   (0.08)   (0.12)

3.2.1. Locating the thresholds of proficiency levels on the scales


The PCM and the responses given by the 99 science teachers were used to locate the item thresholds and the respondents' latent abilities on the
same scale, as demonstrated in Fig. 3. The thresholds of levels for all 17 questions on each of the scales (knowledge about and application of
TPACK-P) are listed in Table 3. For each item, the threshold of each proficiency level was calculated as the point on the logit scale at which a
person had a 75% chance of obtaining a score equal to or higher than the corresponding level of response. In each of the two dimensions, the variation in
thresholds of the same proficiency level across the 17 items reflected the fact that task difficulty interacts with the context of educational
practice described in the scenarios. In education, 75% is usually a reasonable probability to certify that a person can reach a specific proficiency
level for tasks of average difficulty. Thus, by averaging the thresholds across the items (see Eq. (A.2) in Appendix A), the thresholds of the
proficiency levels were located at -0.45, -0.08, 0.80, and 2.60 (logit) for the dimension of knowledge about TPACK-P and at
-1.33, 0.62, 1.35, and 2.36 (logit) for the dimension of application of TPACK-P (Fig. 3).
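The averaging step can be checked against Table 3 with a short Python sketch. The values are the Level 4 thresholds of the 17 knowledge items as printed in Table 3; treating the reported SD as the sample standard deviation and the reported s.e. as SD divided by the square root of 17 is an assumption, although it agrees with the printed figures.

```python
import statistics

# Level 4 thresholds (logit) for the 17 knowledge items, copied from Table 3.
level4_knowledge = [2.69, 1.64, 1.53, 3.26, 1.59, 3.31, 1.74, 3.20, 2.91,
                    3.92, 2.31, 2.23, 2.62, 2.23, 2.99, 3.58, 2.40]

# Cut score for the level = mean of the item thresholds.
mean_threshold = statistics.mean(level4_knowledge)
# Sample standard deviation (n - 1 in the denominator), an assumed convention.
sd_threshold = statistics.stdev(level4_knowledge)
# Standard error of the mean under the same assumption.
se_mean = sd_threshold / len(level4_knowledge) ** 0.5

print(round(mean_threshold, 2), round(sd_threshold, 2), round(se_mean, 2))
```

Rounded to two decimals, the three quantities correspond to the 2.60, 0.73, and 0.18 listed for Level 4 of the knowledge dimension.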

3.2.2. Comparisons between pre-service and in-service science teachers


Table 4 shows the regression coefficients of the predictor (i.e., group variable G) for the latent abilities in the two dimensions.
The constants, 0.65 and 0.07 (logit), were the average abilities of pre-service teachers in knowledge about and application of TPACK-P,
respectively. The regression coefficients of the group variable G, 0.02 and 0.10 (logit), were the differences between in-service and pre-service
teachers in knowledge about and application of TPACK-P. Group differences can be tested by the ratios of the
coefficients to their standard errors (Lord, 1960). Both ratios (0.02/0.13 and 0.10/0.18) for the two dimensions were between -1.96 and 1.96,
indicating that there was no significant difference between the two groups in either knowledge about or application of
TPACK-P.
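The significance check amounts to comparing each coefficient-to-standard-error ratio with the two-sided 5% critical value of the standard normal distribution. The sketch below uses the magnitudes printed in Table 4; the function names are introduced here for illustration only.

```python
def wald_ratio(coefficient, standard_error):
    """Ratio of a regression coefficient to its standard error."""
    return coefficient / standard_error


def is_significant(ratio, critical=1.96):
    """Two-sided test at the 5% level using the normal critical value."""
    return abs(ratio) > critical


# (coefficient of G, standard error) as printed in Table 4.
estimates = {"knowledge": (0.02, 0.13), "application": (0.10, 0.18)}
ratios = {dim: wald_ratio(b, se) for dim, (b, se) in estimates.items()}
significant = {dim: is_significant(r) for dim, r in ratios.items()}
```

Both ratios (about 0.15 and 0.56) fall well inside the interval from -1.96 to 1.96, so neither group difference is flagged as significant.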

The TAM package also provided the covariance matrix of the two abilities in knowledge about and application of TPACK-P (Table 5). The
correlation between knowledge about and application of TPACK-P was 0.74, indicating good discriminant validity in differentiating knowledge
about and application of TPACK-P as two different latent abilities. In addition, the variances
of the two latent abilities were small (i.e., 0.35 and 0.64) in comparison with general abilities, such as reading or mathematics ability, in classroom
contexts. The distributions of the two groups across the proficiency levels were also almost the same (see Table 6).

Table 4
Regression coefficients (standard errors) and constant estimates of latent abilities in knowledge about and application of TPACK-P.

Regression variable    Knowledge      Application
G                      0.02 (0.13)    0.10 (0.18)
Constant               0.65 (0.20)    0.07 (0.28)

Note. G is the dummy variable that equals 1 for an in-service teacher and 0 for an inexperienced pre-service teacher.

Table 5
Covariance matrix of abilities in knowledge about and application of TPACK-P.

           θ_K     θ_A
θ_K        0.35    0.35
θ_A        0.74    0.64

Note. The value below the diagonal is the correlation, the value above is the covariance, and the diagonal values are the variances.
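The reported correlation can be recovered from the variances and covariance in Table 5, since a correlation is the covariance divided by the product of the two standard deviations. A minimal sketch (the function name is introduced here):

```python
import math


def correlation_from_covariance(cov_xy, var_x, var_y):
    """Pearson correlation implied by a covariance and two variances."""
    return cov_xy / math.sqrt(var_x * var_y)


# Variances 0.35 (knowledge) and 0.64 (application), covariance 0.35 (Table 5).
r_knowledge_application = correlation_from_covariance(0.35, 0.35, 0.64)
```

Rounded to two decimals, the result matches the 0.74 reported for the two latent abilities.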

The average abilities and the distributions across proficiency levels were similar for the two groups. In addition, most of the participants'
knowledge about TPACK-P was located at Level 3 and their application of TPACK-P at Level 1. The lack of obvious proficiency differences
between pre-service and in-service teachers suggests that even the in-service science teachers in this study, who reported having experience
teaching with technology, did not develop better TPACK-P than the pre-service teachers did.

4. Discussion

It is common to see teachers' TPACK-P evaluated through composite scores reflecting how well they understand and use technology, or
by proficiency ranks determined by their achievement of certain levels. Higher scores or higher ranks entail more advanced TPACK-P. Teachers
at the same developmental level are assumed to share knowledge or teaching performances of the same complexity. Qualitative data can
be informative in revealing typical features and identifying nuances between levels, but it is rare that the thresholds for the categorical ranks or
ordinal scores are statistically examined. In this study, we validated the five proficiency ranks of teachers' TPACK by examining the correlations
between the ranks located in Wright Maps constructed from teachers' questionnaire responses and the ranks identified in the
interviews. The thresholds of these ranks were also identified on a logit scale from the average threshold of each proficiency level across the
17 items. Science teachers in this study were found to possess higher proficiency levels in knowledge than in
application, though no significant differences were found between pre-service and in-service teachers.

Difficulty variations in these 17 items were expected because: (a) the different instructional scenarios demanded different abilities, and (b)
teacher evaluations had to be conducted as specifically as possible in terms of the scope of the objective evaluated (e.g., oral language use and
direction offered for communicating with students; Danielson Group, 2013, pp. 22-24). Therefore, in the current study, we located the threshold
of a proficiency level by averaging the thresholds across all the task items; the threshold of each item was defined as the presence of a 75%
chance of a participant performing at the same or a higher level in the item. However, most researchers in educational testing acknowledge that
the techniques used in standard setting are judgmental, due either to the arbitrary requirement of the likelihood of success or the arbitrary
selection of observed indicators (e.g., Block, 1978; Shepard, 1980). One could apply a stricter criterion to set the cut scores for proficiency levels
(for example, by requiring an 85% chance of a participant performing at the same or a higher level for 80% of the task items). A stricter or
more lenient criterion can be used for setting proficiency levels, depending on the purposes and the associated risks. Researchers can easily
apply the procedure demonstrated in the current study and change the required probability of success to meet their purposes and manage the
associated risks.
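How much a change in the required probability moves a cut score can be seen in the simplest special case. The sketch below assumes a dichotomous Rasch item (a single step), for which the ability needed to succeed with probability p lies ln(p / (1 - p)) logits above the item difficulty; this is a simplification of the partial credit model used in the study, and the function name is introduced here.

```python
import math


def threshold_offset(probability):
    """Logit distance above the item difficulty at which a dichotomous
    Rasch item is passed with the given probability."""
    return math.log(probability / (1 - probability))


offset_75 = threshold_offset(0.75)   # criterion used in the current study
offset_85 = threshold_offset(0.85)   # a stricter, hypothetical criterion
shift = offset_85 - offset_75        # how far the cut score would move upward
```

Under this simplification, moving from a 75% to an 85% criterion raises the cut score by roughly 0.64 logits, which is large relative to the ability variances reported in Table 5.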

The proficiency levels on an ability scale can be used to map learners' developmental trajectory. Studies that investigate learning progression
usually adopt a longitudinal design, following up with analyses of developmental trajectory categorization (Niess et al., 2009; Penuel, Confrey,
Maloney, & Rupp, 2014). This study intended to validate features that were typical of levels in a hierarchical structure on the scales of teachers'
knowledge about and application of technology for instructional use. Subsequently, the variance of population performance could be modeled
and explained. Although learning proficiency and learning progression can be different concepts in nature, the
proficiency levels validated on the two scales of teachers' knowledge and application in the current study were found to parallel the stages that
describe teachers' learning progression related to a piece of technology (spreadsheets) designed for students' math learning (Niess et al., 2009).
Similar features were found for teachers ranked at Levels 2 through 4. Teachers at Level 4 (reflective application) showed their reflective
thinking or innovative curriculum construction abilities through their potential ICT uses. Teachers at Level 3 (infusive application) displayed
their ability to select and use appropriate ICTs to support instruction, whereas teachers at Level 2 (simple adoption) tended to use ICTs to
facilitate students' learning of content knowledge. The degree of student-centeredness and appropriateness of the ICT engagement increased as
the level increased. On the other hand, features of teachers at Level 1 (lack of use) or below were quite different from those observed in teachers'
learning progression (i.e., recognizing, accepting). Sampling differences could be the main reason for this discrepancy. Teachers' TPACK
development usually begins with recognizing technology in instruction and forming an accepting attitude toward and belief in its value, especially
when they are motivated learners of technology-supported instruction or are properly guided in teaching with certain technology (Niess et al.,
2009). However, we should not ignore the fact that there are still some teachers who opt not to attend, or are too busy to take part in, long-term
learning programs related to technology. Attracting and assisting these two teacher groups in their TPACK and TPACK-P learning tracks is
no less important than assisting teachers in refining their teacher knowledge.

Table 6
Distribution of pre-service and in-service science teachers at different proficiency levels in knowledge about and application of TPACK-P.

                     Knowledge                       Application
Proficiency level    Pre-service    In-service       Pre-service    In-service
Level 0              2 (3.8%)       2 (4.3%)         4 (7.7%)       3 (6.4%)
Level 1              4 (7.7%)       4 (8.5%)         38 (73.1%)     38 (80.9%)
Level 2              22 (42.3%)     20 (42.6%)       9 (17.3%)      6 (12.8%)
Level 3              24 (46.2%)     21 (44.7%)       0 (0%)         0 (0%)
Level 4              0 (0%)         0 (0%)           0 (0%)         0 (0%)

Examination of the thresholds of the four proficiency levels, which were based on the average difficulty across the 17 items, indicated that it
was especially hard for these science teachers to reach Level 4 in knowledge of TPACK-P or Level 2 in application. The greater difficulty
faced by teachers hoping to master these two levels is consistent with the greater proportion of teachers at Levels 2 and 3 in knowledge and at
Level 1 in application. That is, the science teachers evaluated in this study might have developed knowledge of adopting and infusing technology
into their instruction, but it was rare that they applied such knowledge in their actual teaching practices (TPACK-P). Considering that TPACK-P
is developed based on experiences of teaching with technology and instructional reflection, neither the knowledge nor the application dimension
of TPACK-P can be effectively enhanced when technology is not implemented or experimented with in classrooms. The instructional environment
can be supportive in terms of devices but not in terms of curriculum or teacher support systems (Afshari et al., 2009; Ertmer, 1999; Mumtaz, 2000).
For example, some science teachers attributed their low technology implementation to tight curriculum schedules, which minimized the time
available for trying out new learning tools in class or digitalizing the curriculum. Increased knowledge and use of such devices can boost teachers'
confidence and self-efficacy, and then lead to more instructional implementation in classrooms (Koh & Frick, 2009; Mueller, Wood, Willoughby,
Ross, & Specht, 2008; Thomas & O'Bannon, 2015). Therefore, teachers' TPACK-P can be further developed through continuous and reflective
technology implementation as they witness its instructional effectiveness.

Experienced teachers displayed a greater repertoire of representations and flexible teaching strategies in their PCK (Clement, Borko, &
Krajcik, 1994). In-service teachers who had experience in teaching with technology were assumed to develop stronger TPACK-P than pre-service
teachers, since TPACK-P develops through actual technology implementation in instruction. However, results of the current study showed that
pre-service teachers may not know less about teaching with technology than in-service teachers. Previous large-scale survey results indicated that
age and teaching experience may not be reliable predictors of teachers' technology uses in classrooms, or even of their level of TPACK. Russell,
Bebell, O'Dwyer and O'Connor (2003) found that novice teachers (<5 years) possessed higher confidence in technology, but such confidence did
not translate to actual uses of technology or student-centered instruction. Veteran teachers (>15 years) seemed to possess the lowest confidence in
technology, but both veteran teachers and experienced teachers (6-15 years) showed significantly more teacher-directed student use of
technology during instruction. Another survey found that pre-service teachers (digital natives) recognized and used more features of smart
phones, but they were not enthusiastic about the use of smart phones in instruction (O'Bannon & Thomas, 2015; Thomas & O'Bannon, 2015).
These studies indicate the lack of an automatic connection between teachers' PCK and their technology knowledge or use, although most teachers are now
exposed to technology in daily life and are aware of the possible benefits technology can offer. Most teachers still need to be guided and
peer-supported in making their technology uses more effective or their technology-supported instruction more student-centered. Higher levels of PCK
and technological knowledge or confidence can offer good starting points, but learning about or actually using technology to support different
instructional purposes is required for their TPACK or TPACK-P development.

There are limitations to the current study. First, the statistical results should be interpreted with caution because the participants were
convenience samples from two target populations. Second, designing items in a format with multiple options is a first attempt among the
self-rated questionnaires used for TPACK-P evaluation. IRT results showed that the options generated from the interview data were sensitive to
teachers' TPACK-P proficiency; therefore, these options can be viewed as typical features of the teachers at specific ranks. However, it should be
noted that self-reported questionnaires may not reflect actual instructional performance, although this issue was partially addressed by requiring
questionnaire respondents to report both their knowledge about and their application of TPACK-P. Third, the questionnaire constructed in this study offers
only a quick examination of science teachers' TPACK in contexts of teaching practices. Future studies are needed if researchers would like to
examine whether teachers' learning progression in teaching with technology follows a linear or cyclical pathway.

5. Conclusions

Teacher evaluation is much more difficult than student evaluation because teachers' instructional knowledge is dynamic, contextualized, and
personal. In previous studies, the developmental trajectory of TPACK and TPACK-P has been qualitatively validated based on teachers' actual
performance. This study further examined and validated the hierarchy of proficiency levels and corresponding typical features of teachers'
TPACK-P through an IRT standard setting. These qualitatively and quantitatively validated features can be used as milestones along the
TPACK-P roadmap, serving as references for quick, one-time evaluations or longitudinal observations of teachers' TPACK development. Researchers and
teachers can quantify how mature a teacher's TPACK-P is and what is needed for it to evolve further. Finally, knowing is easier than doing.
Science teachers may know what ICT tools are available to them and how they might facilitate instruction, but a lack of actual experience (e.g.,
designing, enacting, and reflecting on technology-supported instruction) can decrease the possibility that they notice what they still need to learn,
and their TPACK-P is then unlikely to be further refined. The availability of technology in the classroom and teachers' professional development in
TPACK might not be the only issues; how teachers feel about their environment and the support available for their teaching with technology
should be considered and must be addressed if the development and elaboration of teachers' TPACK-P is to be pursued.

Acknowledgment

This research was partially supported by the Aim for the Top University Project - NTNU and International Research-Intensive Center of
Excellence Program of National Taiwan Normal University (Grant no. 104-2911-I-003-301) and the Database of Science Education Research
of Ministry of Science and Technology (Grant no. NSC102-2511-S-003-017-MY3).

Appendix A. Partial credit model

If item $i$ is a partial credit item with scores $0, 1, 2, \ldots, m_i$, the probability of person $n$ scoring $x$ on item $i$ is given by

\[
\Pr(X_{ni} = x) = \frac{\exp \sum_{k=0}^{x} (\theta_n - \delta_{ik})}{\sum_{h=0}^{m_i} \exp \sum_{k=0}^{h} (\theta_n - \delta_{ik})}
\tag{A.1}
\]

where we define $\exp \sum_{k=0}^{0} (\theta_n - \delta_{ik}) = 1$. In Eq. (A.1), $\theta_n$ refers to the person's latent ability and $\delta_{ik}$ to the item parameters. In the current study, the threshold
ability $\theta^{T}_{ix}$ of the $x$th step for item $i$ is defined as the value of latent ability satisfying

\[
\Pr(X_i \ge x \mid \theta = \theta^{T}_{ix}) = \frac{\sum_{h=x}^{m_i} \exp \sum_{k=0}^{h} (\theta^{T}_{ix} - \delta_{ik})}{\sum_{h=0}^{m_i} \exp \sum_{k=0}^{h} (\theta^{T}_{ix} - \delta_{ik})} = 0.75
\tag{A.2}
\]

Therefore, for a person whose latent ability is higher than $\theta^{T}_{ix}$, the probability of obtaining a score of $x$ or higher on item $i$ is larger than 0.75.
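Eqs. (A.1) and (A.2) can be made concrete with a small numerical sketch. The code below is an illustration only, not the TAM implementation used in the study: the function names are introduced here, the step parameters are hypothetical, and the list `deltas` holds the step parameters for steps 1 to m_i, with the score-0 term fixed at zero as in the convention stated after Eq. (A.1).

```python
import math


def pcm_probabilities(theta, deltas):
    """Category probabilities for scores 0..m under the partial credit model."""
    # Cumulative sums of (theta - delta_k); the score-0 exponent is fixed at 0.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + (theta - delta))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def prob_at_least(theta, deltas, x):
    """Probability of obtaining a score of x or higher."""
    return sum(pcm_probabilities(theta, deltas)[x:])


def threshold(deltas, x, target=0.75, lo=-20.0, hi=20.0, tol=1e-9):
    """Solve Eq. (A.2) by bisection; the probability of scoring x or higher
    increases with theta for x >= 1, so a single root lies in [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prob_at_least(mid, deltas, x) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical step parameters for a four-step item (scores 0 to 4).
example_deltas = [-1.0, 0.0, 1.5, 2.5]
example_thresholds = [threshold(example_deltas, x) for x in range(1, 5)]
```

Averaging such per-item thresholds for a given level across all items gives the level's cut score, and changing `target` shows how a stricter or more lenient criterion would shift it.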

Appendix B. Fit indices of partial credit model

Table B.1
Information-weighted fit (infit) indices for average item threshold.

Knowledge about TPACK-P Application of TPACK-P


Item no. MNSQ CI T Item no. MNSQ CI T
1 0.92 (0.71,1.29) 0.5 1 1.07 (0.72,1.28) 0.5
2 1.15 (0.68,1.32) 0.9 2 1.05 (0.64,1.36) 0.3
3 0.98 (0.76,1.24) 0.1 3 1.04 (0.64,1.36) 0.2
4 1.02 (0.72,1.28) 0.2 4 1.15 (0.73,1.27) 1.1
5 1.02 (0.71,1.29) 0.2 5 0.98 (0.67,1.33) 0.1
6 0.99 (0.70,1.30) 0.0 6 1.14 (0.70,1.30) 0.9
7 1.03 (0.71,1.29) 0.3 7 0.99 (0.71,1.29) 0.0
8 0.9 (0.62,1.38) 0.5 8 0.91 (0.75,1.25) 0.7
9 1.16 (0.72,1.28) 1.1 9 1.19 (0.74,1.26) 1.4
10 1.18 (0.70,1.30) 1.1 10 0.84 (0.75,1.25) 1.3
11 1.14 (0.71,1.29) 1.0 11 0.84 (0.73,1.27) 1.2
12 1.03 (0.70,1.30) 0.2 12 0.97 (0.71,1.29) 0.2
13 0.95 (0.70,1.30) 0.3 13 0.93 (0.72,1.28) 0.5
14 0.99 (0.66,1.34) 0.0 14 1.06 (0.75,1.25) 0.5
15 0.95 (0.71,1.29) 0.3 15 0.95 (0.72,1.28) 0.3
16 1.17 (0.76,1.24) 1.3 16 1.23 (0.74,1.26) 1.8
17 0.94 (0.69,1.31) 0.4 17 1.13 (0.73,1.27) 0.9

Table B.2
Information-weighted fit (infit) indices for the thresholds of item steps.

Knowledge about TPACK-P Application of TPACK-P


Item no Item step MNSQ CI t Item no Item step MNSQ CI t

1 0 1.04 (0.40,1.60) 0.2 1 0 0.81 (0.50,1.50) 0.7


1 1 0.93 (0.33,1.67) 0.1 1 1 1.09 (0.89,1.11) 1.5
1 2 1.01 (0.77,1.23) 0.1 1 2 1.02 (0.82,1.18) 0.2
1 3 1.00 (0.91,1.09) 0.1 1 3 1.07 (0.58,1.42) 0.4
1 4 1.00 (0.75,1.25) 0.0 1 4 1.00 (0.05,1.95) 0.2
2 0 1.51 (0.00,2.43) 0.8 2 0 1.03 (0.52,1.48) 0.2
2 1 1.01 (0.36,1.64) 0.1 2 1 0.90 (0.87,1.13) 1.6
2 2 0.97 (0.55,1.45) 0.1 2 2 0.97 (0.76,1.24) 0.2
2 3 0.96 (0.86,1.14) 0.5 2 3 1.05 (0.00,2.04) 0.3
2 4 0.90 (0.86,1.14) 1.4 2 4 1.06 (0.07,1.93) 0.1
3 0 n/a* (n/a,n/a) n/a 3 0 0.91 (0.18,1.82) 0.3
3 1 1.03 (0.21,1.79) 0.2 3 1 1.03 (0.83,1.17) 0.3
3 2 1.03 (0.76,1.24) 0.3 3 2 0.99 (0.77,1.23) 0.1
3 3 1.00 (0.77,1.23) 0.1 3 3 1.15 (0.24,1.76) 0.5
3 4 0.95 (0.85,1.15) 0.7 3 4 0.93 (0.07,1.93) 0.0
4 0 1.24 (0.40,1.60) 0.8 4 0 1.02 (0.62,1.38) 0.2
4 1 0.87 (0.56,1.44) 0.5 4 1 1.03 (0.89,1.11) 0.6
4 2 1.00 (0.75,1.25) 0.0 4 2 1.05 (0.83,1.17) 0.6
4 3 0.97 (0.91,1.09) 0.7 4 3 1.09 (0.61,1.39) 0.5
4 4 1.06 (0.62,1.38) 0.4 4 4 0.97 (0.00,2.20) 0.1
5 0 1.94 (0.00,3.59) 1.0 5 0 0.89 (0.58,1.42) 0.4
5 1 0.95 (0.16,1.84) 0.0 5 1 0.94 (0.88,1.12) 1.0
5 2 0.98 (0.65,1.35) 0.0 5 2 0.99 (0.72,1.28) 0.0
5 3 1.02 (0.84,1.16) 0.3 5 3 0.96 (0.24,1.76) 0.0
5 4 0.99 (0.86,1.14) 0.2 5 4 1.04 (0.31,1.69) 0.2
6 0 1.93 (0.00,3.58) 1.0 6 0 1.01 (0.49,1.51) 0.1
6 1 0.87 (0.25,1.75) 0.3 6 1 1.06 (0.89,1.11) 1.0
6 2 0.96 (0.77,1.23) 0.4 6 2 1.08 (0.79,1.21) 0.7
6 3 1.00 (0.92,1.08) 0.1 6 3 0.89 (0.53,1.47) 0.4
6 4 1.02 (0.64,1.36) 0.2 6 4 1.09 (0.05,1.95) 0.3
7 0 1.30 (0.00,2.44) 0.6 7 0 0.79 (0.41,1.59) 0.7
7 1 0.89 (0.00,2.01) 0.1 7 1 1.01 (0.87,1.13) 0.1
7 2 1.02 (0.74,1.26) 0.2 7 2 1.01 (0.88,1.12) 0.1
7 3 1.01 (0.83,1.17) 0.2 7 3 0.99 (0.30,1.70) 0.1
7 4 1.04 (0.85,1.15) 0.5 7 4 1.17 (0.55,1.45) 0.8
8 0 1.09 (0.00,2.46) 0.4 8 0 0.90 (0.07,1.93) 0.1
8 1 0.94 (0.00,2.87) 0.3 8 1 1.00 (0.81,1.19) 0.1
8 2 0.93 (0.68,1.32) 0.4 8 2 1.00 (0.63,1.37) 0.0
8 3 0.98 (0.89,1.11) 0.3 8 3 0.98 (0.72,1.28) 0.1
8 4 1.05 (0.69,1.31) 0.4 8 4 0.94 (0.75,1.25) 0.5
9 0 1.31 (0.60,1.40) 1.4 9 0 0.94 (0.70,1.30) 0.3
9 1 0.96 (0.32,1.68) 0.0 9 1 0.99 (0.82,1.18) 0.1
9 2 0.98 (0.75,1.25) 0.1 9 2 1.00 (0.75,1.25) 0.1
9 3 1.03 (0.90,1.10) 0.5 9 3 1.05 (0.69,1.31) 0.3
9 4 0.97 (0.70,1.30) 0.1 9 4 1.16 (0.60,1.40) 0.8
10 0 1.18 (0.40,1.60) 0.7 10 0 0.93 (0.56,1.44) 0.3
10 1 0.93 (0.32,1.68) 0.1 10 1 0.94 (0.83,1.17) 0.6
10 2 1.00 (0.89,1.11) 0.0 10 2 1.00 (0.76,1.24) 0.1
10 3 0.99 (0.89,1.11) 0.1 10 3 0.85 (0.81,1.19) 1.6
10 4 1.11 (0.38,1.62) 0.4 10 4 0.87 (0.01,1.99) 0.1
11 0 1.06 (0.25,1.75) 0.3 11 0 0.81 (0.41,1.59) 0.6
11 1 0.99 (0.00,2.02) 0.1 11 1 0.98 (0.87,1.13) 0.2
11 2 1.02 (0.79,1.21) 0.2 11 2 0.98 (0.79,1.21) 0.1
11 3 1.00 (0.89,1.11) 0.1 11 3 1.01 (0.60,1.40) 0.1
11 4 1.11 (0.81,1.19) 1.1 11 4 0.87 (0.52,1.48) 0.5
12 0 1.20 (0.13,1.87) 0.6 12 0 0.85 (0.46,1.54) 0.5
12 1 0.90 (0.15,1.85) 0.1 12 1 0.96 (0.87,1.13) 0.6
12 2 0.98 (0.74,1.26) 0.1 12 2 1.02 (0.76,1.24) 0.2
12 3 0.98 (0.90,1.10) 0.4 12 3 0.96 (0.48,1.52) 0.1
12 4 1.07 (0.82,1.18) 0.8 12 4 1.12 (0.55,1.45) 0.6
13 0 1.16 (0.34,1.66) 0.6 13 0 1.02 (0.59,1.41) 0.2
13 1 0.92 (0.46,1.54) 0.2 13 1 0.92 (0.87,1.13) 1.2
13 2 0.97 (0.64,1.36) 0.1 13 2 1.02 (0.58,1.42) 0.1
13 3 1.00 (0.93,1.07) 0.0 13 3 1.04 (0.63,1.37) 0.2
13 4 0.99 (0.78,1.22) 0.1 13 4 0.82 (0.58,1.42) 0.9
14 0 1.12 (0.14,1.86) 0.4 14 0 0.89 (0.30,1.70) 0.2
14 1 0.89 (0.16,1.84) 0.1 14 1 0.99 (0.81,1.19) 0.1

Table B.2 (continued)

Knowledge about TPACK-P Application of TPACK-P


Item no Item step MNSQ CI t Item no Item step MNSQ CI t
14 2 1.04 (0.51,1.49) 0.2 14 2 1.02 (0.78,1.22) 0.2
14 3 0.99 (0.94,1.06) 0.2 14 3 1.01 (0.67,1.33) 0.1
14 4 1.00 (0.84,1.16) 0.0 14 4 0.99 (0.73,1.27) 0.0
15 0 1.07 (0.53,1.47) 0.3 15 0 1.03 (0.63,1.37) 0.2
15 1 0.93 (0.32,1.68) 0.1 15 1 0.90 (0.89,1.11) 1.8
15 2 0.99 (0.72,1.28) 0.0 15 2 1.00 (0.73,1.27) 0.0
15 3 0.99 (0.92,1.08) 0.2 15 3 1.01 (0.48,1.52) 0.1
15 4 0.99 (0.70,1.30) 0.0 15 4 0.92 (0.52,1.48) 0.2
16 0 1.14 (0.79,1.21) 1.2 16 0 1.24 (0.56,1.44) 1.6
16 1 1.01 (0.49,1.51) 0.1 16 1 1.09 (0.84,1.16) 1.1
16 2 1.01 (0.78,1.22) 0.1 16 2 1.02 (0.80,1.20) 0.2
16 3 1.03 (0.74,1.26) 0.3 16 3 1.20 (0.75,1.25) 1.5
16 4 1.03 (0.34,1.66) 0.2 16 4 0.97 (0.42,1.58) 0.0
17 0 1.14 (0.26,1.74) 0.5 17 0 0.98 (0.41,1.59) 0.0
17 1 0.81 (0.34,1.66) 0.5 17 1 0.93 (0.87,1.13) 1.0
17 2 0.99 (0.59,1.41) 0.0 17 2 1.05 (0.80,1.20) 0.5
17 3 0.98 (0.94,1.06) 0.6 17 3 1.04 (0.54,1.46) 0.2
17 4 1.04 (0.82,1.18) 0.4 17 4 1.11 (0.58,1.42) 0.5
* For item 3, the 95% CI and t value are not available because no respondent scored 0 on the item.
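A table like B.2 can be screened mechanically. The following is a minimal sketch, not the authors' procedure: it assumes each printed row holds two panels (knowledge about TPACK-P, then application of TPACK-P) in the layout "item step MNSQ (low,high) t", and it assumes a common reading in which an item step is unremarkable when its MNSQ lies inside the reported interval. The names `FitRow`, `parse_rows`, and `within_interval` are hypothetical, and the two sample lines are copied from the table above.

```python
import re
from typing import List, NamedTuple


class FitRow(NamedTuple):
    """One item-step entry from one panel of the fit table."""
    item: int
    step: int
    mnsq: float
    ci_low: float
    ci_high: float


# Assumed row layout: "item step MNSQ (low,high)"; the trailing t value is skipped.
ROW_PATTERN = re.compile(r"(\d+) (\d+) (\d+\.\d+) \((\d+\.\d+),(\d+\.\d+)\)")


def parse_rows(line: str) -> List[FitRow]:
    """Extract every item-step entry found on one printed table line."""
    return [
        FitRow(int(m.group(1)), int(m.group(2)), float(m.group(3)),
               float(m.group(4)), float(m.group(5)))
        for m in ROW_PATTERN.finditer(line)
    ]


def within_interval(row: FitRow) -> bool:
    """Return True when the MNSQ lies inside its reported interval."""
    return row.ci_low <= row.mnsq <= row.ci_high


# Two lines copied from Table B.2 (items 9 and 16).
SAMPLE_LINES = [
    "9 0 1.31 (0.60,1.40) 1.4 9 0 0.94 (0.70,1.30) 0.3",
    "16 3 1.03 (0.74,1.26) 0.3 16 3 1.20 (0.75,1.25) 1.5",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        for row in parse_rows(line):
            status = "inside" if within_interval(row) else "outside"
            print(f"item {row.item}, step {row.step}: MNSQ {row.mnsq} is {status} the interval")
```

The first entry parsed from each line belongs to the knowledge panel and the second to the application panel. Whether an interval check is the right fit criterion depends on the software and conventions used, so treat the result as a screening aid only.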

Appendix C. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.compedu.2015.12.009.

References

Abbitt, J. T. (2011). Measuring technological pedagogical content knowledge in pre-service teacher education: a review of current methods and instruments. Journal of Research on Technology in Education, 43(4), 281–300.
Adams, R., & Wu, M. (2007). The mixed-coefficients multinomial logit model: a generalized form of the Rasch model. In C. Carstensen (Ed.), Multivariate and mixture distribution Rasch models (pp. 57–75). New York, NY: Springer.
Afshari, M., Bakar, K. A., Luan, W. S., Samah, B. A., & Fooi, F. S. (2009). Factors affecting teachers' use of information and communication technology. Online Submission, 2(1), 77–104.
Ainsworth, S. (2006). DeFT: a conceptual framework for considering learning with multiple representations. Learning and Instruction, 16(3), 183–198.
Angeli, C., & Valanides, N. (2009). Epistemological and methodological issues for the conceptualization, development, and assessment of ICT-TPCK: advances in technological pedagogical content knowledge (TPCK). Computers & Education, 52(1), 154–168.
Archambault, L., & Crippen, K. (2009). Examining TPACK among K-12 online distance educators in the United States. Contemporary Issues in Technology and Teacher Education, 9, 71–88. Retrieved from http://www.citejournal.org/vol9/iss1/general/article2.cfm.
Ay, Y., Karadag, E., & Acat, M. B. (2015). The Technological Pedagogical Content Knowledge-practical (TPACK-Practical) model: examination of its validity in the Turkish culture via structural equation modeling. Computers & Education, 88, 97–108.
Block, J. H. (1978). Standards and criteria: a response. Journal of Educational Measurement, 15, 291–295.
Bock, R. D. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 64(4), 443–459.
Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.
Çetin, S., & Gelbal, S. (2013). A comparison of bookmark and Angoff standard setting methods. Kuram Ve Uygulamada Egitim Bilimleri, 13(4), 2169–2175.
Chen, W., Hendricks, K., & Archibald, K. (2011). Assessing pre-service teachers' quality teaching practices. Educational Research and Evaluation, 17(1), 13–32.
Clement, C. P., Borko, H., & Krajcik, J. S. (1994). Comparative study of the pedagogical content knowledge of experienced and novice chemical demonstrators. Journal of Research in Science Teaching, 31(4), 419–441.
Cochran-Smith, M., & Lytle, S. L. (1999). Relationships of knowledge and practice: teacher learning in communities. Review of Research in Education, 249–305.
Cox, S., & Graham, C. R. (2009). Using an elaborated model of the TPACK framework to analyze and depict teacher knowledge. TechTrends, 53(5), 60–71.
Danielson Group. Danielson 2013 rubric: adapted to New York Department of Education framework for teaching components. Retrieved from http://usny.nysed.gov/rttt/teachers-leaders/practicerubrics/Docs/danielson-teacher-rubric-2013-instructionally-focused.pdf
Doering, A., Veletsianos, G., Scharber, C., & Miller, C. (2009). Using the technological, pedagogical, and content knowledge framework to design online learning environments and professional development. Journal of Educational Computing Research, 41(3), 319–346.
van Driel, J. H., Beijaard, D., & Verloop, N. (2001). Professional development and reform in science education: the role of teachers' practical knowledge. Journal of Research in Science Teaching, 38(2), 137–158.
van Driel, J. H., Verloop, N., & de Vos, W. (1998). Developing science teachers' pedagogical content knowledge. Journal of Research in Science Teaching, 35(6), 673–695.
Dwyer, D. C., Ringstaff, C., & Sandholtz, J. H. (1991). Changes in teachers' beliefs and practices in technology-rich classrooms. Educational Leadership, 48(8), 45–52.
Ertmer, P. A. (1999). Addressing first- and second-order barriers to change: strategies for technology integration. Educational Technology Research and Development, 47(4), 47–61.
Gess-Newsome, J., & Lederman, N. G. (1993). Pre-service biology teachers' knowledge structures as a function of professional teacher education: a year-long assessment. Science Education, 77(1), 25–45.
Gray, L., Thomas, N., & Lewis, L. (2010). Teachers' use of educational technology in U.S. public schools: 2009 (NCES 2010-040). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, US Department of Education.
de Gruijter, D. N. M., & van der Kamp, L. J. T. (2008). Statistical test theory for the behavioral sciences. Boca Raton, FL: Chapman & Hall/CRC.
Haertel, E. H., & Lorie, W. A. (2004). Validating standards-based test score interpretations. Measurement: Interdisciplinary Research and Perspectives, 2, 61–103.
Harris, J. B., Grandgenett, N., & Hofer, M. (2011). Testing a TPACK-based technology integration assessment rubric. In D. Gibson, & B. Dodge (Eds.), Proceedings of Society for Information Technology & Teacher Education International Conference 2010 (pp. 3833–3840). Chesapeake, VA: Association for the Advancement of Computing in Education (AACE). Retrieved from http://www.editlib.org/p/33978.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. http://dx.doi.org/10.1080/10705519909540118.
Irribarra, D. T., & Freund, R. (2014). WrightMap: IRT item-person map with ConQuest integration. Retrieved from http://github.com/david-ti/wrightmap.
Jimoyiannis, A. (2010). Designing and implementing an integrated technological pedagogical science knowledge framework for science teachers' professional development. Computers & Education, 55(3), 1259–1269.
de Jong, T., & van Joolingen, W. R. (1998). Scientific discovery learning with computer simulations of conceptual domains. Review of Educational Research, 68(2), 179–201.
Kagan, D. M. (1990). Ways of evaluating teacher cognition: Inferences concerning the Goldilocks Principle. Review of Educational Research, 60(3), 419–469.
Kiefer, T., Robitzsch, A., & Wu, M. (2015). TAM: Test analysis modules (R package version 1.6-0). Retrieved from http://CRAN.R-project.org/package=TAM.
Kim, D., de Ayala, R. J., Ferdous, A. A., & Nering, M. L. (2011). The comparative performance of conditional independence indices. Applied Psychological Measurement, 35(6), 447–471.
Koehler, M. J., & Mishra, P. (2008). Introducing TPCK. In American Association of Colleges for Teacher Education Committee on Innovation and Technology (Ed.), The handbook of technological pedagogical content knowledge (TPCK) for educators (pp. 3–29). New York, NY: Routledge.
Koehler, M. J., Shin, T. S., & Mishra, P. (2012). How do we measure TPACK? Let me count the ways. In R. N. Ronau, C. R. Rakes, & M. L. Niess (Eds.), Educational technology, teacher knowledge, and classroom impact: A research handbook on frameworks and approaches (pp. 16–31). Hershey, PA: Information Science Reference.
Koh, J. H. L., Chai, C. S., & Tay, L. Y. (2014). TPACK-in-Action: unpacking the contextual influences of teachers' construction of technological pedagogical content knowledge (TPACK). Computers & Education, 78, 20–29.
Koh, J. H. L., & Divaharan, S. (2011). Developing pre-service teachers' technology integration expertise through the TPACK-developing instructional model. Journal of Educational Computing Research, 44(1), 35–58.
Koh, J. H., & Frick, T. W. (2009). Instructor and student classroom interactions during technology skills instruction for facilitating preservice teachers' computer self-efficacy. Journal of Educational Computing Research, 40(2), 211–228.
Krajcik, J. S., Blumenfeld, P. C., Marx, R. W., & Soloway, E. (1994). A collaborative model for helping teachers learn project-based instruction. Elementary School Journal, 94(5), 483–497.
Krishnan, S., & Idris, N. (2014). Investigating reliability and validity for the construct of inferential statistics. International Journal of Learning, Teaching and Educational Research, 4(1), 51–60.
Lee, E., Brown, M. N., Luft, J. A., & Roehrig, G. H. (2007). Assessing beginning secondary science teachers' PCK: Pilot year results. School Science and Mathematics, 107(2), 52–60.
Lee, E., & Luft, J. A. (2008). Experienced secondary science teachers' representation of pedagogical content knowledge. International Journal of Science Education, 30(10), 1343–1363.
Lee, M. H., & Tsai, C. C. (2010). Exploring teachers' perceived self efficacy and technological pedagogical content knowledge with respect to educational use of the World Wide Web. Instructional Science, 38, 1–21.
Linacre, J. M. (2012). Winsteps (Version 3.75.1) [Computer Software]. Chicago, IL: Winsteps.com.
Lord, F. M. (1960). Large-sample covariance analysis when the control variable is fallible. Journal of the American Statistical Association, 55(290), 307–321.
Maydeu-Olivares, A. (2013). Goodness-of-fit assessment of item response theory models. Measurement: Interdisciplinary Research and Perspectives, 11(3), 71–101. http://dx.doi.org/10.1080/15366367.2013.831680.
Magnusson, S., Krajcik, J. S., & Borko, H. (1999). Nature, sources, and development of pedagogical content knowledge for science teaching. In J. Gess-Newsome, & N. G. Lederman (Eds.), Examining pedagogical content knowledge: The construct and its implications for science education (pp. 95–132). Dordrecht, The Netherlands: Kluwer Academic.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.
Mayer, R. E. (1999). The promise of educational psychology: Learning in the content areas. Upper Saddle River, NJ: Prentice Hall.
Mishra, P., & Henriksen, D. (2015). The end of the beginning: an epilogue. In Y.-S. Hsu (Ed.), Development of science teachers' TPACK: East Asia practices (pp. 133–142). Singapore: Springer.
Mishra, P., & Koehler, M. J. (2006). Technological pedagogical content knowledge: a framework for teacher knowledge. Teachers College Record, 108(6), 1017–1054.

Moersch, C. (1995). Levels of technology implementation (LoTi): a framework for measuring classroom technology use. Learning and Leading with Technology, 23(3), 40–42.
Mueller, J., Wood, E., Willoughby, T., Ross, C., & Specht, J. (2008). Identifying discriminating variables between teachers who fully integrate computers and teachers with limited integration. Computers & Education, 51(4), 1523–1537.
Mumtaz, S. (2000). Factors affecting teachers' use of information and communications technology: a review of the literature. Journal of Information Technology for Teacher Education, 9(3), 319–342.
Niess, M. L. (2005). Preparing teachers to teach science and mathematics with technology: developing a technology pedagogical content knowledge. Teaching and Teacher Education, 21(5), 509–523.
Niess, M. L. (2007, June). Mathematics teachers developing technological pedagogical content knowledge (TPCK). Paper presented at IMICT2007, Boston, MA.
Niess, M. L. (2011). Investigating TPACK: knowledge growth in teaching with technology. Journal of Educational Computing Research, 44(3), 299–317.
Niess, M. L., Ronau, R. N., Shafer, K. G., Driskell, S. O., Harper, S. R., Johnston, C., et al. (2009). Mathematics teacher TPACK standards and development model. Contemporary Issues in Technology and Teacher Education, 9(1), 4–24.
O'Bannon, B. W., & Thomas, K. M. (2015). Mobile phones in the classroom: preservice teachers answer the call. Computers & Education, 85, 110–122.
Penuel, W. R., Confrey, J., Maloney, A., & Rupp, A. A. (2014). Design decisions in developing learning trajectories–based assessments in mathematics: a case study. Journal of the Learning Sciences, 23(1), 47–95.
Perkins, K., Adams, W., Dubson, M., Finkelstein, N., Reid, S., Wieman, C., & LeMaster, R. (2006). PhET: Interactive simulations for teaching and learning physics. The Physics Teacher, 44(18), 18–23.
Project Tomorrow. (2008). 21st century learners deserve a 21st century education: Selected national findings of the speak up 2007 survey. Retrieved from http://www.tomorrow.org/docs/national%20findings%20speak%20up%202007.pdf.
R Development Core Team. (2014). R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org/.
Russell, A. L. (1995). Stages in learning new technology: naïve adult email users. Computers & Education, 25(4), 173–178.
Russell, M., Bebell, D., O'Dwyer, L., & O'Connor, K. (2003). Examining teacher technology use implications for preservice and inservice teacher preparation. Journal of Teacher Education, 54(4), 297–310.
Sandholtz, J. H., Ringstaff, C., & Dwyer, D. C. (1997). Teaching with technology: Creating student-centered classrooms. New York, NY: Teachers College Press.
Schmidt, D. A., Baran, E., Thompson, A. D., Mishra, P., Koehler, M. J., & Shin, T. S. (2009). Technological pedagogical content knowledge (TPACK): the development and validation of an assessment instrument for preservice teachers. Journal of Research on Technology in Education, 42(2), 123–149.
Shepard, L. (1980). Standard setting issues and methods. Applied Psychological Measurement, 4(4), 447–467.
Shepard, L. (2008). Commentary on the national mathematics advisory panel recommendations on assessment. Educational Researcher, 37(9), 602–609.
Shulman, L. S. (1986). Those who understand: knowledge growth in teaching. Educational Researcher, 15(2), 4–14.
Shulman, L. S. (1987). Knowledge and teaching: foundations of the new reform. Harvard Educational Review, 57(1), 1–22.
Thomas, K., & O'Bannon, B. (2015, March). Looking across the new digital divide: a comparison of inservice and preservice teacher perceptions of mobile phone integration. In Society for Information Technology & Teacher Education International Conference (Vol. 2015, No. 1, pp. 3460–3467).
Treagust, D., Chittleborough, G., & Mamiala, T. (2003). The role of submicroscopic and symbolic representations in chemical explanations. International Journal of Science Education, 25(11), 1353–1368.
Wright, B. D., & Stone, M. H. (1999). Measurement essentials. Wilmington, DE: Wide Range, Inc.
Wu, H.-K., Krajcik, J., & Soloway, E. (2001). Promoting understanding of chemical representations: students' use of a visualization tool in the classroom. Journal of Research in Science Teaching, 38(7), 821–842.
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125–145.
Yen, W. M. (1993). Scaling performance assessments: strategies for managing local item dependence. Journal of Educational Measurement, 30, 187–213.
Yeh, Y.-F., Hsu, Y.-S., Wu, H.-K., Hwang, F.-K., & Lin, T.-C. (2014). Developing and validating technological pedagogical content knowledge-practical (TPACK-Practical) through the Delphi Survey Technique. British Journal of Educational Technology, 45(4), 707–722.
Yeh, Y.-F., Lin, T.-C., Hsu, Y.-S., Wu, H.-K., & Hwang, F.-K. (2015). Science teachers' proficiency levels and patterns of TPACK in a practical context. Journal of Science Education and Technology, 24(1), 78–90.