Mccqe Part I (Pdfdrive)

Technical Report
on the Standard Setting

Exercise for the
Medical Council of Canada
Qualifying Examination
Part I
Psychometrics and Assessment Services

July 2015
TABLE OF CONTENTS
INTRODUCTION ..................................................................................................................................... 3
Pre-Session Activities ......................................................................................................................... 3
SELECTING A STANDARD SETTING METHOD ....................................................................... 3
SELECTING PARTICIPANTS AND ASSIGNING INTO PANELS ............................................... 4
SELECTING TEST QUESTIONS FOR THE STANDARD SETTING SESSION......................... 4
PRE-SESSION MATERIALS ....................................................................................................... 5
Activities During the Two-Day Session .............................................................................................. 5
ORIENTATION............................................................................................................................. 5
DEFINING THE BORDERLINE CANDIDATE ............................................................................. 5
THE PRACTICE TEST................................................................................................................. 6
THE PRACTICE BOOKMARK METHOD .................................................................................... 6
TWO ROUNDS OF BOOKMARKING .......................................................................................... 7
Recommendation from the Panelists.................................................................................................. 9
Evaluation of the Standard Setting Judgments .................................................................................. 9
Providing Feedback through an Online Survey ................................................................................ 10
Concluding Remarks ........................................................................................................................ 11
REFERENCES ...................................................................................................................................... 12
Table 1: Canadian and International Medical Graduate Pass/Fail Rates for the Years 2012-2014..... 13
Table 2: Standard Setting Results for Panels 1 and 2 for Rounds 1 and 2 .......................................... 13
Figure 1: Failure Rates for First-Time Takers (Panel 1) ....................................................................... 14
Figure 2: Failure Rates for First-Time Takers (Panel 2) ....................................................................... 14
Figure 3: Failure Rates for First-Time Takers (Combined Panels) ...................................................... 15
Figure 4: Failure Rates for all First-Time Takers (Round 2) ................................................................. 15
Figure 5: Failure Rates for all First-Time Takers and Hofstee Boundaries .......................................... 16
APPENDIX A: Demographic Information Sheet .................................................................................. 17

APPENDIX B: Demographic Summary of the Two Panels ................................................................. 20
APPENDIX C: Standard Setting Agenda ............................................................................................. 21
APPENDIX D: Defining Borderline Performance and the Minimally Competent Candidate ............... 23
APPENDIX E: Form to Document a Bookmark for Each Round ......................................................... 24
APPENDIX F: Form to Document a Bookmark for Each Round ......................................................... 25
APPENDIX G: Part I Standard Setting Fall 2014 – Post-Session Survey Summary .......................... 26

MCCQE Part I Standard Setting Report 2
INTRODUCTION
In the context of licensing and certification, standard setting has become a critical and essential
component of assessment programs. Standard setting is a process by which an acceptable level
of performance is defined (Kane, 1994, 1998). For the medical profession, standard setting is the
establishment of a qualitative statement of what minimum level of performance should be attained
to practice medicine safely and effectively. An integral part of the standard setting process is also
the establishment of a cut score on an assessment of interest that is congruent with the definition
of a minimum performance level. At the Medical Council of Canada (MCC), standard setting is an
essential part of every examination program, including the Medical Council of Canada Qualifying
Examination (MCCQE) Part I.
The MCCQE Part I is a computer-delivered examination which assesses basic medical

knowledge and skills expected to be mastered at the end of medical school. It is composed of a
three and a half hour multiple-choice (MCQ) component and a four-hour clinical decision-making
(CDM) component. Its MCQ component consists of seven sections of 28 questions in which
testlets of four questions for each of the six disciplines (Internal Medicine, Obstetrics/Gynecology,
Pediatrics, Population Health, Ethical, Legal, and Organizational aspects of Medicine, Psychiatry,
and Surgery) are presented to candidates. The CDM component is composed of 36 clinical cases
each including one to four questions. CDM questions can either be selected-response type or
constructed- response (CR) type items.
The purpose of the standard setting session for the MCCQE Part I that took place October 23-24,
2014, was to arrive at a recommended cut score for subsequent review and approval by the
Central Examination Committee (CEC). The most important aspect of standard setting is the
validity of the process and activities. In the sections that follow, we describe in detail the pre
standard setting session activities, as well as the activities that took place during the standard
setting session for the MCCQE Part I.
Pre-Session Activities
SELECTING A STANDARD SETTING METHOD
Standard setting methodologies abound but not all are well suited for the types of items that are
used in the MCCQE Part I. Several methodologies were considered but the Bookmark method
was chosen because of its simplicity and the ease with which both MCQs and CDM items can be
integrated in the cut score (Cizek, 2007). The Bookmark method is an item mapping procedure
where items are ordered from easiest to most difficult based on operational data and panelists
are asked to place a bookmark at the point at which they believe a minimally proficient candidate

would no longer correctly answer subsequent items presented in the ordered exam form. De
facto, this corresponds to the cut score for each panelist. A detailed description of participants’
task is outlined in a later section of this report.
SELECTING PARTICIPANTS AND ASSIGNING INTO PANELS
Since the panelists selected for a standard setting exercise represent a microcosm of all MCCQE
Part I examination stakeholders, it is critical to select participants that are representative with
respect to a number of key variables, including the region of Canada, ethnicity, medical specialty
and years of experience. Furthermore, to assess the reproducibility of the cut score across 2
groups of physicians, we decided to split our panelists into 2 matched subgroups. The latter
allows us to collect critical validity evidence in support of the recommended cut score.
The process of selecting participants started with an invitation which was forwarded to physicians
from across Canada, targeting Family Physicians as well as a broad range of other specialists. A
total of 22 physicians were retained based on several key criteria (see Appendix A for the
demographic information survey that was filled out by all potential participants). As previously
mentioned, we attempted to select panelists in both subgroups that were reflective of various
regions across the country (i.e., Western, Central, and Eastern Canada); medical specialty (family
medicine, internal medicine, surgery, obstetrics and gynecology, pediatrics, and psychiatry);
ethnicity (i.e., Asian, Black, Caucasian, First Nation, or Hispanic), sex, and years of experience
supervising residents. In Appendix B, we present a summary of the demographics of the two
panels. Some minor imbalance ensued when five participants bowed out a few days before the
session. Two of these people decided not to participate on account of the tragic incident that
occurred in Ottawa at the War Memorial and Parliament building center block the day before this
session.
SELECTING TEST QUESTIONS FOR THE STANDARD SETTING SESSION
All questions used for the standard setting session were taken from the most recent MCCQE
Part I, namely the spring 2014 administration. Dichotomously scored MCQs were calibrated
using the Rasch model (Rasch, 1960/1980) which, in turn, were used as anchors to calibrate the
CDM questions (Rasch model for dichotomous CDMs and the partial credit model (Masters,
1982) for polytomous CDMs). With the bookmark method, the basic question that panelists must
answer is the following: “Is it likely that the borderline candidate will be able to answer this
question correctly”. A typical probability level used with the bookmark method is the 67%
response probability or, 2/3 chance of answering correctly. Therefore, response probabilities
were calculated using a 2/3 probability criterion for each dichotomously scored MCQ and CDM
and for each step value for polytomously scored CDMs.

PRE-SESSION MATERIALS
To assist panelists to prepare for the standard setting session, we asked them to read an article
(De Champlain, 2004) and a book chapter (De Champlain, 2014) on the topic of standard setting
that we sent out prior to the exercise in October, 2014. Additionally, the agenda for the two-day
session was mailed out to participants a few weeks before the session (see Appendix C).
Activities During the Tw o -Da y Session

ORIENTATION
The success of any standard setting session relies heavily on the extensive training of
participating panelists. This helps to ensure that panelists have the same objective in mind and
the same basic premises and understanding of the standard setting process. To this end, we
spent half of the first day of the exercise training our panelists on a number of issues, including
the structure and content of the MCCQE Part I. Examples of questions for both components of
the examination were shown with the type of scoring rubrics that would be seen in the exercises
included in the session. This was followed by a tutorial on standard setting, including issues to
consider, methods and sources of evidence to support the reliability and validity of any cut-score.
Particular attention was provided to the method that was selected to arrive at a recommended
cut- score for the MCCQE Part I exam, namely, the Bookmark method . In addition, a second,
ancillary standard setting method was introduced, the Hofstee method , which was used as a
complement to the item-centered Bookmark approach. The Hofstee method is described in the
literature as a compromise method (Hofstee, 1983) in that it integrates both norm-referenced
(relative interpretations) and criterion-referenced (absolute interpretations) considerations in a
“gut estimate” that is used to further validate the cut-score obtained following the Bookmark
exercise.
DEFINING THE BORDERLINE CANDIDATE
Commonly, standard setting methodologies, including the Bookmark method, assume that a cut-
score is set for the minimally proficient or borderline candidate. This hypothetical candidate is
critical in setting the cut-score, i.e., a point on the continuum of professional competence that
separates those deemed as competent candidates from those deemed as incompetent. The
Bookmark method requires that panelists clearly define what constitutes a minimally proficient (or
borderline) candidate, with respect to what they may know and not know in the domains targeted
by the MCCQE Part I exam.
To assist panelists in this task, a basic definition was developed by the Vice-chair of the CEC and
offered to the panelists as a starting point. After much discussion, the participants agreed on

some modifications and enhancements by listing several attributes that they felt were reflective of
borderline candidate behaviours and attitudes. The definition that was agreed upon by all our
panelists is shown in Appendix D.
THE PRACTICE TEST
To better understand the type of questions that Part I candidates must answer during an
examination, a practice test was administered to the panelists prior to collecting their judgments.
It contained a representative sample of 50 multiple-choice questions and 26 clinical decision-
making questions selected from the spring 2014 MCCQE Part I examination. Panelists were
given 90 minutes to complete the practice test after which they were instructed to self-score their
test using an item map which provided correct answers for each question. The purpose of the
practice test was also to give participants a sense of the level of difficulty of the MCCQE Part I.
Participants were not asked to share their resulting score with other panelists. However, this
exercise did provide the basis for a discussion of their perceived level of difficulty of the questions
and the appropriateness of the content in relation to the purpose of the Part I examination and its
target population (i.e. candidates entering supervised training or residency).
THE PRACTICE BOOKMARK METHOD
A practice bookmark exercise was planned to train the panelists in this procedure before they
engaged in the actual full-scale activity. The same questions used in the practice test were used
for this exercise as well. However, the questions were now ordered by difficulty level, from
“easiest” to “most difficult”, based on actual spring, 2014 MCCQE Part I candidate performances.
The goal of this standard setting method was to allow panelists, in a practice round, to identify a
point on the scale that they believed reflected minimal competency in the domains measured by
the MCCQE Part I examination.
Each participant was presented with a booklet that contained examination questions (one per
page) that were ordered by difficulty from easiest to most difficult. Each participant was asked to
place their bookmark at the point at which they felt a minimally proficient (or borderline) candidate
would correctly answer all items up to that point and incorrectly answer all items beyond that
point. The basic question that panelists must answer in the Bookmark procedure is the following:
“Is it likely that a minimally proficient candidate will be able to correctly answer this test question?”
Of course, the “likeliness” must be defined more specifically. In the Bookmark method, it is
defined as having a 2/3 chance of answering correctly (or 2/3 chance of reaching a CR score or
higher – for polytomous items). The expression “RP67” is often used to capture the essence of a
.67 response probability; simply another way of expressing the 2/3 chance of answering correctly.

Panelists were instructed to read questions starting with the first question in their booklet and
proceed one item at a time sequentially until they arrived at a point where they felt that the
minimally proficient candidate would no longer have a 2/3 chance of correctly answering the item.
Panelists were not provided with the correct answers for this initial practice round. Following this
initial bookmark placement, panelists were then provided with an item map that contained
information on each question in the booklet including the correct answer as well as the associated
RP67 value. Following this practice round, panelists were invited to begin the actual two rounds
of the Bookmark standard setting exercise.
TWO ROUNDS OF BOOKMARKING
Round 1 (Preliminary round). Following the practice bookmark round, panelists were reminded of
some key points about the Bookmark method and were assigned to their respective panels. They
were then each provided with a booklet that contained 236 items (one form’s worth of items)
which were ordered by difficulty level (based on RP67 value) from easiest to most difficult. They
were then instructed to independently place a bookmark at the point at which they felt a
minimally proficient (or borderline) candidate would correctly answer all items up to that point and
incorrectly answer all items beyond that point. Forms were distributed for documenting each
panelist’s bookmark (see Appendix E). The panelists were given 3.5 hours to complete their
round 1 bookmark placement. Note that the judgments provided in round 1 were solely based on
the item text that was provided, i.e., no performance data were given.
Following round 1, panelists were asked to provide answers to the following four Hofstee method
questions: (1) What is the minimally acceptable cut-score (Cmin), even if all candidates attained
this score level; (2) What is the maximum acceptable cut-score (Cmax), even if no candidate tis
score level; (3) What is the minimum tolerable failure rate (Fmin) and; (4) What is the maximum
tolerable failure rate (Fmax). Again, this information is used to gauge the appropriateness of the
Bookmark method cut-score as per the panelists’ holistic views. Forms were distributed (see
Appendix F) to allow panelists to record the data for the Hofstee method. Forms were collected
and provided to Statistical Analysts who in turn entered the data in an application which allowed
us to view each panel’s bookmark overlaid with the Hofstee boundaries. Figures 1 and 2 illustrate
Bookmark and Hofstee data for round 1 for Panels 1 and 2, respectively. Figure 3 combines the
data for both panels. Panel 1 panelists are represented as blue letters on each graph. Panel 1
had 9 panelists: A, C, D, E, G, H, I, J, and K. Panel 2 panelists are represented as red letters.
Panel 2 had 8 panelists: A, B, E, F, G, H, I, and J. The placement of letters on the graphs have
significance only on the x-axis, namely the cut scores on the theta scale. The stacking of some of
the letters was done simply to distinguish panelists whose cut score was the same instead of

superimposing them. The placement of the letters has no significance on these graphs in terms of
failure rates for individual panelists.
Panelists from both panels were gathered in one room to provide them with impact data which
consisted of failure rates given their respective cut scores and combined cut score values. Pass
and failure rates for Canadian and International Medical Graduates for the years 2012-2014 were
presented to all panelists (See Table 1). Also, a cumulative distribution of examination results
was prepared from all first-time candidates who completed the spring 2014 MCCQE Part I. For
each score, a distribution of cumulative percentage of failures was established and a look-up
table was created to obtain a percentage failure for any given cut score obtained from each
panelist.
To translate bookmark placement into cut scores on the item response theory (IRT) ability (theta)
scale, an additional look-up table was created that listed: (1) item identification number for each
item used in the bookmarking exercise; (2) the corresponding booklet page number; (3) the
Rasch item difficulty measure and; (4) the RP67 value or IRT ability value needed to have a 2/3
chance of correctly answering any given item in the sample MCCQE Part I exam form that was
used in our standard setting exercise. Once we obtained all bookmark placement page numbers,
those were entered and a corresponding cut score was identified using the look-up table for each
panelist, panel and overall.
To obtain a panel-level cut score, the median cut score was calculated from the distribution of cut
scores by panel. The median was chosen instead of the mean since it mitigates the influence of
extreme values when they occur. The latter value corresponded to the preliminary or round 1 cut
score.
In Figure 1, we can observe that failure rates increase as cut scores increase and that the cut
score obtained by the Hofstee method (established by drawing a line down to the cut score at the
point where Fmax / Cmin and Fmin / Cmax lines traverse the cumulative failure rates curve) for Panel
1 falls between the lower and higher boundaries identified by the Hofstee method. This is a
desirable outcome. It is desirable because it indicates that the cut score (-0.39 on the theta scale)
identified by Panel 1 falls within what they expected in terms of maximum and minimum failure
rates and maximum and minimum cut scores.
In Figure 2, Panel 2 results for round 1 are presented. The results indicate that this panel had
incongruent outcomes between what they established as acceptable Hofstee boundaries and the
bookmark cut score (-0.78 on the theta scale). It would seem that 2 panelists (B and E) are

mostly responsible for this outcome. Figure 3 illustrates the results of the combined data for both
panels taken together resulting in a combined cut score of -0.58 which falls within the Hofstee
higher and lower boundaries. Panelists were provided with an opportunity to discuss the results
presented to them after this preliminary round. Much discussion ensued in terms of the impact
on medical graduates who would potentially fail given the cut score produced by round 1
bookmarking. Some panelists expressed the fact that, given the impact data, they felt that they
were too lenient in terms of what they expected the borderline candidate would be able to
master while others felt they were too harsh.
Round 2 (Final round). Panelists were then directed to their respective subgroup to engage in the
second and final round of bookmarking. Results from this second round constitute the
recommended cut score which was subsequently brought forward to the CEC for consideration
and adoption. Panelists were given two hours to complete this final standard setting round. As
was the case in the preliminary round (round 1), forms were gathered from panelists who
indicated their second bookmark placement as well as their responses to the four Hofstee
questions (post round 2). Graphical representations for round 2 bookmarking results are
presented in Figures 4 and 5. In Figure 4, round 2 individual and panel bookmark cut scores and
corresponding failure rates are presented. In Figure 5, the same data are provided with an
additional overlay of the Hofstee boundaries from round 2. The combined (i.e., both panels taken
together) cut score of -0.22 on the |IRT ability scale (theta) would fail 14% of all first-time
candidates using the spring 2014 examination results. This cut score would fail 5.1% of first-time
Canadian medical graduates from the spring, 2014 MCCQE Part I administration.
Recommendation from the Panelists

The above-mentioned figures were presented to all panelists concurrently and they were
provided with an opportunity to discuss the impact of using the resulting cut score. Several
panelists expressed their satisfaction with the method that they used to arrive at the final cut
score. They felt comfortable with the results of the exercises. Consistent with their mandate as
set at the beginning of the meeting, they recommended that the cut score of -0.22 on the IRT
ability scale be brought forward to the CEC for approval, at the spring, 2015 meeting.
Evaluation of the Standard Setting Judgments

Details of each panel’s recommended cut scores following Round 2 (final round) are presented in
Table 2. This table presents a summary of the 2 panels’ cut scores and their associated
descriptive statistics, namely the means, medians and standard deviations. The standard error of
judgment (SEJ) is also presented. This statistic captures the amount of variability associated with
each panel’s cut score. It provides a rough indication of the extent to which the same or a similar

cut score would be obtained if we were to gather physicians with the same demographics as the
ones that were chosen for this session, who would have gone through the same type of training
and using the same examination items. By building a confidence interval around the SEJ, we can
evaluate the extent to which the 2 panels arrived at comparable cut scores. Panel 1’s interval
extends from 0-.37 to -0.18 and Panel 2’s interval extends from -0.38 to 0.03. From this finding,
we confidently conclude that their cut scores were very comparable.
Providing Feedback through an Online Surve y

At the conclusion of the meeting, panelists were provided with an opportunity to provide feedback
on the activities in which they participated. An online survey tool was developed for this specific
purpose. Panelists were informed that the feedback provided would be treated anonymously. All
but one panelist completed the survey before they left the meeting. One of the panelists
completed the survey one day later.
Results of the survey are presented in Appendix G. All 17 participants thought that the
information regarding the overview of the MCCQE Part I was either good (18%), very good
(18%), or excellent (65%). They thought that the overview of standard setting was either good
(6%), very good (29%), or excellent (65%). Central to the exercises during this standard setting
session was the notion of the minimally competent (i.e., borderline) candidate. Participants were
asked to assess the clarity of the definition of that target population that they developed. All 17
participants thought that the definition was clear (76%) or very clear (24%).
A significant amount of time was devoted to training panelists to the task which was felt by staff
as extremely important to ensure a common understanding of what we expected of them before
they engaged in the actual bookmarking exercise. Ninety- four percent of panelists thought that
exercise was appropriate, 6% thought that it was somewhat appropriate, and none thought it was
not appropriate. All participants thought that the training provided for the bookmark method was
either good (12%), very good (18%) or excellent (71%).
Among the facto" that influenced participants the most when they engaged in the Bookmark
method were their perception of the level of difficulty of the items (94%), the description of the
minimally competent candidate (88%), the item statistics provided in round 2 (76%), and the
knowledge and skills measured by the items (76%). Among the factors that had the least
influence on their bookmarking exercise were the quality of the item distractors (12%) and the
number of answer choices per item (18%).
Participants were asked about their level of understanding of how to apply the bookmark and
Hofstee methods during round 1. For the bookmark method, 16 out of 17 participants said that

they either understood (29%) or understood very well (65%) this process while one participant
reported that they understood “somewhat”. For the Hofstee method, 1 participant (6%) said that
they understood somewhat, 5 participants said that they understood (29%), 11 participants (65%)
said that they understood very well, while none of the participants reported not understanding the
method “at all”.
Participants were also asked about their level of confidence regarding the consequential/
feedback data and the final discussion. Two participants (12%) felt somewhat confident, 6
participants (35%) felt confident, 9 participants (53%) felt very confident, whereas none of the
participants felt that they were not at all confident.
One of the significant outcomes desired following a standard setting exercise is a standard that
participants would recommend with a very high level of confidence. As part of the survey,
participants were asked about the level of confidence in the final recommended passing score.
One participant felt somewhat confident while the large majority reported being confident (18%)
or very confident (76%) about the recommended cut score value.
Finally, participants were surveyed on potential improvements to consider for further standard
setting exercises. Among the suggestions for improvement were comments about providing
impact data after the practice bookmark method as well as each panelist’s bookmark placement.
Also, one participant suggested providing failure rates for each panelist’s bookmark following the
practice bookmark method. A few participants felt that there were no improvements to be made.
Concluding Remarks
The main goal of this report was to outline the main activities that constituted the standard setting
exercise for the MCCQE Part I. In summary, two panels were gathered for the purpose of
establishing and recommending a cut score by participating in a 2- day session during which they
were trained in the Bookmark and Hofstee standard setting methods. A significant amount of time
was spent defining the target population and training of panelists on various critical aspects of the
exercise. Two panels established highly comparable cut scores as demonstrated by the overlap
of their respective confidence interval using the standard error of judgment. A high level of
confidence in the recommended cut score was expressed by a majority of participants. Several
staff from Psychometrics and Assessment Services and the Evaluation Bureau participated in
making this a successful session. Finally, a comprehensive description of all the activities and the
resulting cut score as well as impact data for both the spring 2014 and 2015 cohorts were
presented to the CEC on June 8, 2015 for their discussion and consideration. The CEC
unanimously accepted the recommended cut score of -0.22 (427 on the 3-digit MCCQE Part I
reporting scale) at this meeting.

REFERENCES
Cizek, G. J. and Bunch, M. B. (2007). Standard Setting: A Guide to Establishing and Evaluating
Performance Standards on Tests (55-189). Thousand Oaks, CA: Sage.
De Champlain, A. F. (2014). Standard setting methods in medical education. In T. Swanwick

(Ed.). Understanding Medical Education: Evidence, Theory and Practice. (305-316). Chichester,
West Sussex: John Wiley & Sons, Ltd.
De Champlain, A. F. (2004). Ensuring that the competent are truly competent: An overview of
common methods and procedures used to set standards on high-stakes examinations. Journal
of Veterinary Medical Education, 31, 61-5.
Hofstee, W. K. B. (1983). The case for compromise in educational selection and grading. In S. B.
Anderson and J. S. Helmick (Eds.). On educational testing (109-127). San Francisco: Jossey-
Bass.
Kane, M. (1994). Validating the Performance Standards Associated with Passing Scores. In
Review of Educational Research. Fall 1994 64 (3), 425-461.
Kane, M. (1998). Choosing Between Examinee-Centered and Test-Centered Standard-Setting

Methods, Educational Assessment, 5 (3), 129-145.
Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149- 174.
Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests.
(Copenhagen, Danish Institute for Educational Research), expanded edition (1980) with foreword
and afterword by B.D. Wright. Chicago: The University of Chicago Press
Wright, B. D. and Stone, M. H. (1979). Best Test Design: Rasch Measurement.

Table 1: Canadian and International Medical Graduate
Pass/Fail Rates for the Years 2012-2014
2012 2013 2014

Canadian Medical Graduates FAIL 1.5% 1.3% 2.3%
First- Time Takers PASS 98.5% 98.7% 97.7%
Canadian and International Medical FAIL 9.0% 8.6% 10.6%
Graduates First-Time Takers PASS 91.0% 91.4% 89.4%
Table 2: Standard Setting Results for Panels 1 and 2 for Rounds 1 and 2
Summary of Cut Scores by Panel for Rounds 1 and 2

Round 1 Round 2
Panel 1 Panel 2 Panel 1 Panel 2
Panelist 1 -0.07 -0.98 -0.26 -0.04
Panelist 2 -0.89 -1.74 -0.31 -0.02
Panelist 3 -0.46 -1.73 -0.46 -0.44
Panelist 4 -0.37 0.74 -0.19 -0.17
Panelist 5 -0.07 -1.08 -0.26 -0.37
Panelist 6 -0.95 -0.27 -0.59 -0.18
Panelist 7 -0.99 -0.58 -0.20 -0.53
Panelist 8 0.29 0.53 -0.27 0.38
Panelist 9 -0.39 -0.43
Mean -0.43 -0.64 -0.33 -0.17

Median -0.39 -0.78 -0.27 -0.17
Standard Deviation 0.44 0.93 0.13 0.29
Standard Error of
0.16 0.35 0.05 0.10
Judgment (SEJ)

Figure 1: Failure Rates for First-Time Takers (Panel 1)
Figure 2: Failure Rates for First-Time Takers (Panel 2)

Figure 3: Failure Rates for First-Time Takers (Combined Panels)
Figure 4: Failure Rates for all First-Time Takers (Round 2)

Figure 5: Failure Rates for all First-Time Takers and Hofstee Boundaries

APPENDIX A:
Demographic Information Sheet
The information requested below is being collected to help the MCC obtain a pan-Canadian
representative panel to recommend a passing score on the MCC Part I Examination. This
information will only be used to select the panel members so that we can represent the diversity
of physicians across the country. The information will not be linked in any way to the collection
of data for setting the passing score. A reminder that the meeting will take place on October
23-24, 2014 therefore we are asking panelists to be available on those 2 days.
Please provide your name and contact information, and check a box next to each of the
questions. The form can be sent by mail or electronically by 30 April 2014.
Name: __________________________________________________________________
Email: _____________________________________ Phone number: ________________
Mailing address: __________________________________________________________
________________________________________________________________________
________________________________________________________________________
1. Number of years in practice post residency:
1-5 years ☐
6-10 years ☐
11-20 years ☐
21-30 years ☐
More than 30 years ☐
2. Number of years’ experience supervising residents:
1-5 years ☐
6-10 years ☐
11-20 years ☐
21-30 years ☐
More than 30 years ☐

3. Do you have experience supervising Canadian Medical Graduates?
Yes ☐
No ☐
4. Have you ever been a member of a Medical Council test committee?
Yes ☐
No ☐
5. Country of medical training (post graduate training):
Canada ☐
Other ☐
6. Region of the country in which you live:
Alberta ☐
British Columbia ☐
Manitoba ☐
Maritimes ☐
Ontario ☐
Quebec ☐
Saskatchewan ☐
Territories ☐
7. First Language:
English ☐
French ☐
Other (________________________) ☐
8. Gender:
Male ☐
Female ☐

9. Ethnicity:
Asian ☐
Black ☐
Caucasian ☐
First Nations ☐
Hispanic ☐
10. Medical Specialty:
Pediatrics ☐
Internal Medicine ☐
Psychiatry ☐
Obstetrics and Gynecology ☐
Surgery ☐
Family Medicine ☐
Other ☐
11. Type of community in which you work:
Urban ☐
Rural ☐
12. Type of care setting:
Hospital-based ☐
Community-based ☐

APPENDIX B:
Demographic Summary of the Two Panels
Variable of
Group Panel A Panel B Total
Interest
Female 56% 50% 53%
Gender
Male 44% 50% 47%
West 22% 38% 29%
Geographic
Central 56% 38% 47%
Region
East 22% 25% 24%
Internal Medicine 33% 38% 35%
Surgery 22% 13% 18%
Medical
Specialty Obstetrics/Gynecology 11% 13% 12%
Pediatrics 22% 13% 18%
Psychiatry 0% 13% 6%
Family Medicine 11% 13% 12%
1-5 years 11% 38% 24%
Number
of Years 6-10 years 44% 13% 29%
Supervising 11-20 years 11% 25% 18%
Residents
21-30 years 33% 25% 29%
Country of Canada 89% 88% 88%
Medical Training Other 11% 12% 12%

Appendix C: Standard Setting Agenda
STANDARD SETTING FOR QUALIFYING EXAMINATION PART I

Location: MCC office, 2283 St. Laurent Blvd., Ottawa, ON
University Boardroom (3rd Floor)
OCTOBER 23-24, 2014 | 8:00 a.m. – 4:00 p.m.
AGENDA – DAY 1: Thursday, Oct. 23, 2014

CONTINENTAL BREAKFAST 08:00
1. Breakfast and Registration 08:00

1.1 Complete confidentiality and biographical information forms
1.2 Let panellists know to what table/room they belong
2. Welcome and Introduction by MCC
2.1 Introduction of Panellists
2.2 Overview of Agenda
2.3 Overview of Part I Examination
2.4 Overview of Standard Setting
2.5 Overview of Bookmark Method
3. Review Practice Test and Self-Score 09:30
3.1 Break as needed
3.2 Take Practice Test: 50 MCQs + 25 CDM questions
3.3 Self-score using Practice Test Item Map
3.4 Discuss knowledge and skills on test
LUNCH 11:45
4. Develop Target Student Description and Reach Consensus 12:30

4.1 Clear definition of minimally competent candidate starting residency
5. Training of Bookmark Method & Practice 13:15

5.1 Practice bookmark method 50 MCQs and 39 CDMs
P.M. BREAK 14:45
6. Practice Ordered Item Booklet (OIB) 15:00

6.1 Provide item map for Practice OIB
6.2 Discussion of ordered difficulty and placement of bookmark
6.3 Survey post-bookmark training
7. Additional Discussion/Clarification 16:30
END OF DAY 1 17:00

AGENDA – DAY 2: Friday, Oct. 24, 2014
CONTINENTAL BREAKFAST 08:00
8. Independently Mark Round 1 Bookmark Judgement/Hofstee by Panel 08:30

9. End of Round 1 11:30
9.1 Data entry
LUNCH / Data Entry 11:30
10. Round 1 – Data Feedback Whole Group 12:15

10.1 Provide Panel- and room-level data and impact data
10.2 Round 1 panel discussions with large group
11. Independently Make Round 2 Bookmark Judgements/Hofstee 13:00
P.M. BREAK 15:15
12. End of Round 2 15:15

12.1 Data entry
13. Round 2 Data Feedback 15:45
13.1 Provide panel- and room-level data and impact data
13.2 Presentation of Bookmark Recommendation
14. Complete Final Evaluation and Collection of Materials 16:15
END OF DAY 2 16:30

APPENDIX D:
Defining Borderline Performance and the Minimally
Competent Candidate
The “minimally competent” candidate entering residency is a candidate who possesses the
minimum level of knowledge, skills and attitudes required to safely practice medicine under
supervision. A “minimally competent” candidate’s performance is acceptable, despite gaps in their
knowledge and clinical decision-making skills.
The minimally competent candidate will:
• Have the right attributes

• Be able to reflect limits of their own
• Be able to recognize that a patient is sick, but doesn’t necessarily know why
• May not have the ability to adequately recognize life threatening situations
• Be able to gather information but not necessarily be able to integrate it
• Be reliable in identifying red flags (and sense of urgency) for patient safety
• Ask for help
• Improve over time
• Recognize his/her own weakness
• Have a willingness to learn and reflect on feedback
• Incorporate professionalism
• Be clinically, logically and culturally competent
• Build a rapport with the patient
• Synthesize information

APPENDIX E:
Form to Document a Bookmark for Each Round
Panel: ______________________________________________________________________
Panelist: _____________________________________________________________________
Standard Setting for the Qualifying Examination Part I

The Bookmark Method
Please indicate the page number of the item on which you placed your bookmark. It is the item for
which, in your judgment, a minimally proficient candidate’s chance of answering correctly falls
below a 2/3 probability.
Please initial after each round:
Round Bookmark Page Initials

1
2

APPENDIX F:
Form to Document a Bookmark for Each Round
Panel: ______________________________________________________________________
Panelist: _____________________________________________________________________
Standard Setting for the Qualifying Examination Part I

The Hofstee Method
Please answer the following 4 questions, once after eac h round:
1. What is the highest percent correct cut score that would be acceptable, even if every
candidate attains that score? This value represents your estimate of the maximum level of
knowledge that should be required of candidates.
Round 1: ______ Round 2: ______
2. What is the lowest percent correct cut score that would be acceptable, even if no candidate
attains that score? This value represents your judgment of the minimum acceptable
percentage of knowledge that should be tolerated.
Round 1: ______ Round 2: ______
3. What is the maximum acceptable failure rate? This value represents your judgment of the
highest percentage of failing candidates that could be tolerated.
Round 1: ______ Round 2: ______
4. What is the minimum acceptable failure rate? This value represents your judgment of the
lowest percentage of failing candidates that could be tolerated.
Round 1: ______ Round 2: ______

APPENDIX G:
Part I Standard Setting Fall 2014 – Post-Session Survey Summary
1. Which panel did you participate in? (Select ONE)
Response Chart Percentage Count

Panel 1 (University room) 53% 9
Panel 2 (Barr/Bérard room) 47% 8
Total responses 17
2. What was your impression of the clarity of the information regarding the overview of
the MCCQE Part I exam that was provided on the morning of Day 1? (Select ONE)

Excellent 65% 11
Very good 18% 3
Good 18% 3
Fair 0% 0
Poor 0% 0
Total responses 17
standard setting that was provided on the morning of Day 1? (Select ONE)

Excellent 65% 11
Very good 29% 5
Good 6% 1
Fair 0% 0
Poor 0% 0
Total responses 17
the Bookmark Method that was provided on the morning of Day 1? (Select ONE)

Excellent 53% 9
Very good 41% 7
Good 6% 1
Fair 0% 0
Poor 0% 0
Total Responses 17

5. How clear were you about the description of the “Minimally Competent” (or sometimes
called “Borderline”) candidate on the MCCQE Part I exam as you began the task of
setting a passing score following the training on the afternoon of Day 1? (Select ONE)

Very clear 24% 4
Clear 76% 13
Somewhat clear 0% 0
Not clear 0% 0
Total Responses 17
6. How clear were you about the description of the “Minimally Competent” (or sometimes
called “Borderline”) candidate on the MCCQE Part I exam as you began the task of
setting a passing score following the training on the afternoon of Day 1? (Select ONE)

Yes, very helpful 47% 8
Yes, helpful 47% 8
Yes, somewhat helpful 6% 1
Not helpful at all 0% 0
Total Responses 17
7. How would you judge the length of time spent (about 45 minutes on the agenda) on the
afternoon of Day 1 introducing, discussing and editing the definition of the “Minimally
Competent” or “Borderline” candidate? (Select ONE)

About right 82% 14
Too little time 6% 1
Too much time 12% 2
Total Responses 17
8. What is your impression of the practice session for applying the Bookmark Method to a
set of MCQs and CDM questions on the afternoon of Day 1? (Select ONE)

Appropriate 94% 16
Somewhat appropriate 6% 1
Not appropriate 0% 0
Total Responses 17

9. What is your overall evaluation of the training that was provided for setting a passing
score using the Bookmark Method? (Select ONE)

Excellent 71% 12
Very good 18% 3
Good 12% 2
Fair 0% 0
Poor 0% 0
Total Responses 17
10. What factors influenced your placement of your Bookmark on day 2? (Select ALL
choices that apply)

The description of the minimally
88% 15
competent or borderline candidate
My perception of the difficulty of the 94% 16
test items
The test item statistics 76% 13
Other panelists during the discussion 53% 9
My experience in the field 41% 7
Knowledge and skills measured by
76% 13
the test items
The quality of the distractors to the 12% 2
test items
The number of answer choices to 18% 3
the test items
Other (please specify) 0% 0
Total Responses 17
11. How did you feel about participating in the group discussions regarding the ordered
item booklet? (Select ONE)

Very comfortable 82% 14
Somewhat comfortable 18% 3
Unsure 0% 0
Somewhat uncomfortable 0% 0
Very uncomfortable 0% 0
Total Responses 17

12. How would you rate your understanding of how to apply the Bookmark Method during
the marking round 1 on Day 2? (Select ONE)

I understood very well 65% 11
I understood 29% 5
I understood somewhat 6% 1
I did not understand at all 0% 0
Total Responses 17
13. How comfortable were you in applying the Bookmark Method during marking round 1
on Day 2? (Select ONE)

Unsure 12% 2
Total Responses 17
14. How comfortable were you in applying the Bookmark Method during marking round 1
on Day 2? (Select ONE)

I understood very well 53% 9
I understood 35% 6
I understood somewhat 12% 2
I did not understand at all 0% 0
Total Responses 17
15. How comfortable were you in applying the Hofstee during marking round 1 on Day 2?
(Select ONE)

Unsure 6% 1
Total Responses 17

16. What level of confidence do you have that the consequential/feedback data and final
discussion this afternoon helped the panel arrive at a defensible passing score?
(Select ONE)

Very confident 53% 9
Confident 35% 6
Somewhat confident 12% 2
Not at all confident 0% 0
Total Responses 17
17. What level of confidence do you have in the final recommended passing score?
(Select ONE)

Very confident 76% 13
Confident 18% 3
Somewhat confident 6% 1
Not at all confident 0% 0
Total Responses 17
18. How could the method used for setting a passing score on the MCCQE Part I exam have
been improved?
# Response
1. The process as executed was excellent.
2. no
3. I think it took a little while to grasp the concept of minimally competent & hence the
book mark but became very clear after the initial exercise
4. I think that people are pushed to change their scores after the first session on day 2.
The bias was to increase the passing score on the second round because of the large
disparity in panels.
5. This is my first time doing this exercise, so I do not have previous experience for
comparison. Having said that, I don't feel there was nothing to improve.
6. it would have been valuable after the practice bookmark to provide the data including
the impact information and graphical spread, as we had done after round 1 on day 2.
7. I think the discussions were excellent!
8. no improvement needed - there was lots of time for discussion which I think was
important
9. Not sure; I thought the process went well as it is.
10. Develop the list of competencies from the onset of the exercise.

11. the teaching, preparation, and handling of questions were all excellent. there was some
confusion among participants as to whether they should discuss with others or not,
especially during round I. Given the discussion that ensued after the impact statistics
were shown, I wonder about including that on the practice day so desensitize people to
this aspect.
12. As suggested at the time, letting us know immediately what failure rate would result
from with our individual bookmarks would be helpful.


Mccqe Part I (Pdfdrive)

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Mccqe Part I (Pdfdrive)

Uploaded by

Copyright:

Available Formats

Technical Report

on the Standard Setting

Psychometrics and Assessment Services

APPENDIX A: Demographic Information Sheet .................................................................................. 17

Medical Council of Canada

The MCCQE Part I is a computer-delivered examination which assesses basic medical

Medical Council of Canada

SELECTING PARTICIPANTS AND ASSIGNING INTO PANELS

SELECTING TEST QUESTIONS FOR THE STANDARD SETTING SESSION

Medical Council of Canada

Activities During the Tw o -Da y Session

DEFINING THE BORDERLINE CANDIDATE

Medical Council of Canada

THE PRACTICE TEST

THE PRACTICE BOOKMARK METHOD

Medical Council of Canada

TWO ROUNDS OF BOOKMARKING

Medical Council of Canada

Medical Council of Canada

Recommendation from the Panelists

Evaluation of the Standard Setting Judgments

Medical Council of Canada

Providing Feedback through an Online Surve y

Medical Council of Canada

Medical Council of Canada

De Champlain, A. F. (2014). Standard setting methods in medical education. In T. Swanwick

Kane, M. (1998). Choosing Between Examinee-Centered and Test-Centered Standard-Setting

Wright, B. D. and Stone, M. H. (1979). Best Test Design: Rasch Measurement.

Medical Council of Canada

2012 2013 2014

Summary of Cut Scores by Panel for Rounds 1 and 2

Mean -0.43 -0.64 -0.33 -0.17

Medical Council of Canada

Figure 2: Failure Rates for First-Time Takers (Panel 2)

Medical Council of Canada

Figure 4: Failure Rates for all First-Time Takers (Round 2)

Medical Council of Canada

Medical Council of Canada

Email: _____________________________________ Phone number: ________________

Mailing address: __________________________________________________________

1. Number of years in practice post residency:

2. Number of years’ experience supervising residents:

Medical Council of Canada

4. Have you ever been a member of a Medical Council test committee?

5. Country of medical training (post graduate training):

6. Region of the country in which you live:

Medical Council of Canada

10. Medical Specialty:

11. Type of community in which you work:

12. Type of care setting:

Medical Council of Canada

Medical Council of Canada

STANDARD SETTING FOR QUALIFYING EXAMINATION PART I

AGENDA – DAY 1: Thursday, Oct. 23, 2014

1. Breakfast and Registration 08:00

4. Develop Target Student Description and Reach Consensus 12:30

5. Training of Bookmark Method & Practice 13:15

P.M. BREAK 14:45

6. Practice Ordered Item Booklet (OIB) 15:00

END OF DAY 1 17:00

Medical Council of Canada

8. Independently Mark Round 1 Bookmark Judgement/Hofstee by Panel 08:30

10. Round 1 – Data Feedback Whole Group 12:15

Email: _____________________ Phone number: