Assessing Healthcare Quality Using Standardized Patients

Using Standardized Patients to Assess Quality in the Field: An Overview
JISHNU DAS
WORLD BANK, WASHINGTON DC AND CENTRE FOR POLICY RESEARCH, NEW DELHI
Based on joint work with many authors including Jeff Hammer, Madhukar Pai, Abhijit Banerjee, Abhijit Chowdhury, Manoj Mohanan, Veena Das, R.K. Das, Lupe Bedoya, Benjamin Daniels, Reshmaan Hussam
#ISQua2017 @ISQua

The problem

Clinical records don’t work
Low income countries have almost no data on how doctors treat patients
Many private sector providers do not have patient records
Public sector providers sometimes have patient records and sometimes these are used for analysis
But they really shouldn’t be!
Figure shows the fate of 231 tracked Standardized Patients sent to public clinics in Madhya Pradesh, India
6 months later, we try to track them down in the official clinic registers…
[Figure: tracking 231 Standardized Patient visits through the clinic registers]
◦ Total Patient Visits: 231 (100%)
◦ Patient Listed in Register: Yes, 171 (74%); No, 60 (26%)
◦ Symptoms Listed: Yes, 75 (32%); No, 96 (42%)
◦ Symptoms Indicative of Presentation: Yes, 57 (25%); No, 18 (8%)
This does not require any diagnostic skills on the part of the doctor: it is a mechanical recording exercise

Structural measures don’t work
Researchers and policymakers have used other measures, most commonly the availability of medicines and the quality of infrastructure
But, there are small or no associations between structural measures and the most relevant metric of quality:
“When a patient comes to a doctor, does the patient get what they need to get better and not get what they don’t need?”
It is not hard to see why structural measures are not associated with quality:
◦ An economist sitting in the doctor’s chair is still an economist, not a doctor, even with the best infrastructure
◦ The availability of medicines also suffers from the “economist in a doctor’s chair” problem. In addition, if medicines are not available, does that reflect consumer demand or the doctors’ quality?
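
The percentages in the register-tracking figure for the 231 Standardized Patient visits can be re-derived from the counts. The sketch below is only an illustration: the counts come from the figure, while the function and variable names are made up here. Every share is expressed relative to all 231 visits, which is how the figure's percentages appear to be computed.

```python
# Counts taken from the register-tracking figure (231 SP visits, Madhya Pradesh).
TOTAL_VISITS = 231

# (stage, number answering "yes", number answering "no")
funnel = [
    ("Patient listed in register", 171, 60),
    ("Symptoms listed", 75, 96),
    ("Symptoms indicative of presentation", 57, 18),
]


def share(count, total=TOTAL_VISITS):
    """Return count as a whole-number percentage of all visits."""
    return round(100 * count / total)


def funnel_shares(rows):
    """Map each stage to its (yes %, no %) pair, both relative to all visits."""
    return {label: (share(yes), share(no)) for label, yes, no in rows}


if __name__ == "__main__":
    for label, (yes_pct, no_pct) in funnel_shares(funnel).items():
        print(f"{label}: yes {yes_pct}%, no {no_pct}%")
```

Each stage splits the "yes" group of the stage before it (75 + 96 = 171 and 57 + 18 = 75), so only about a quarter of all visits end with symptoms in the register that match the case presented.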

Knowledge measures hold more promise
A more promising measure is direct measurement of healthcare providers’ knowledge using vignettes
◦ Arguably provider knowledge is necessary for correct case management
◦ Using medical vignettes, knowledge can be measured in a case-controlled setting with dedicated surveyors
◦ Can tailor cases and sampling as required
◦ Allows us to understand whether doctors know to give patients what they need and also know not to give patients what they don’t need

What have medical vignettes shown?
ALTHOUGH PROVIDERS FREQUENTLY KNOW TO GIVE PATIENTS WHAT THEY NEED
THEY ALSO FREQUENTLY “KNOW” TO GIVE PATIENTS WHAT THEY DON’T NEED

But do vignettes predict clinical practice?
PEABODY ET AL. JAMA 2000
Objective was “to validate clinical vignettes as a method for measuring the competence of physicians and the quality of their actual practice”
Design sent standardized patients (SPs) to clinicians followed by medical vignettes for the same cases
Conclusion was that vignettes and SPs produced comparable measures of quality
RETHANS ET AL. BMJ 1991
Objective was “to study the differences and relation between what a doctor actually does in daily practice (performance) and what he or she is capable of doing (competence)”
Design sent SPs unknown to doctors followed by different SPs known to the doctor with the same cases
Providers did much better when SPs were known rather than unannounced. Authors conclude that “performance and competence should be considered as distinct constructs”

Critical difference with key policy implications
This is not just an issue of measuring performance
If Peabody et al. are right, then increasing competence will automatically increase performance: policy should largely be about training, not actual behavior
Rethans et al.’s results suggest a greater role for behavior
Note the parallel to Miller’s pyramid in medical education from “know” to “do”
[Figure: Performance (vertical axis) against Competence (horizontal axis). The 45° line is labelled Peabody et al.; a second line is labelled Rethans et al.]
◦ In Rethans’ world, the “behavior gap” is large
◦ In Rethans’ world, X improvement in competence yields lower improvements in practice
◦ Peabody view implies that moving from poor to good system performance implies increasing performance by X
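
The contrast between the two views can be made concrete with a toy calculation. This is a sketch under stated assumptions, not a model taken from either paper: the Peabody view is represented by the 45° line (performance equals competence), and the Rethans view by a line with a slope below one. The slope of 0.5 and all function names are hypothetical.

```python
def performance_peabody(competence):
    """45-degree line: every point of competence shows up in practice."""
    return competence


def performance_rethans(competence, slope=0.5):
    """Flatter line: only part of competence shows up in practice.

    The slope of 0.5 is an arbitrary illustrative value, not an estimate.
    """
    return slope * competence


def gain_in_practice(model, before, after):
    """Change in performance when competence rises from `before` to `after`."""
    return model(after) - model(before)


if __name__ == "__main__":
    x = (40, 60)  # competence rises by X = 20 points
    print("Peabody view:", gain_in_practice(performance_peabody, *x))
    print("Rethans view:", gain_in_practice(performance_rethans, *x))
```

Under the first view a training programme that raises competence by X raises performance by X; under the second, the same programme delivers less, which is why behavior matters for policy.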

Early Evidence on the know-do gap: 2007
Das and Hammer (2007) conduct the following study (Journal of Development Economics)
Among 203 public and private health care providers (HCPs) in Delhi, they first conducted medical vignettes for 5 cases
◦ It included fully qualified MBBS, AYUSH providers (BIMS, BAMS and BUMS) and Informal Providers. MBBS providers were 40%.
6 months later, they and their team sat in the providers’ clinics recording details of all interactions
This provided a basic description of the competence and practice of a random sample of providers in Delhi
For some frequent cases (cough, cold, unspecified fever), they were also able to link performance in the vignettes with performance in practice

Patterns of Competence in the data
[Figure: Likelihood of non-harmful treatment (0 to 100) for Diarrhea, TB and Pre-Eclampsia, with bars for the 20% Least Competent, Average Competence and 20% Most Competent providers. Bar labels as extracted, assignment to bars not recoverable: 91, 77, 74, 56, 55, 47, 41, 25, 18]

How doctors practice in the data
[Figure: the average interaction, shown by time, questions and exams]
◦ time: Less than 2 minutes
◦ questions: Just one question
◦ exams: Almost none!

Key Result and insight
Practice depends both on what you know (competence) and what you do (“effort” in short)
When you put the two together, seeming anomalies make sense (for example, why informal providers and AYUSH are so popular when there are free public providers in the same neighborhoods)
[Figure: Percentage of Essential Tasks Completed at low, medium and high effort, for Private MBBS, Public MBBS and Private, No MBBS providers. Annotation: 40% of essential questions asked]

Our best guess of the know-do gap
[Figure: Individual clinician's competence and performance. Panel titles: Delhi, India; Tanzania (Leonard and others). Horizontal axis: Competence (% of required items). Vertical axis: Performance (% of required items). Lines: predicted quadratic relationship of competence to performance (Public); predicted quadratic relationship of competence to performance (Non-Public); Performance = Competence]
Results replicated in different ways from Rethans’ work in Netherlands to India, Tanzania, Rwanda, Canada and the U.S.: Doctors performed better when watched or tested than with real patients

The urgent problem
We urgently need measures of quality that can be used to
◦ Assess current levels and correlates of quality at the population level and by subgroups
◦ Assess what explains the levels we find
◦ Experiment and innovate to improve, and measure that improvement
Remainder of Presentation
◦ Recap why SPs are important
◦ Show progress that we have made measuring quality using SPs
◦ Discuss results from 4 key research designs
◦ Open for feedback and discussion

Measuring quality appropriately is hard…
Recall Four problems
◦ Accounting for case and patient-mix
◦ Accounting for Hawthorne effects
◦ Allowing for case-specific inference (did the doctor do the right thing given what the patient has)
◦ Allowing for distinguishing under and over-treatment
Options
◦ Medical Vignettes measure clinical knowledge, but knowledge need not reflect practice
◦ Observations of doctor-patient interactions: Measures practice but suffers from all four problems (1 and 2 may be smaller than thought, 3 and 4 are very serious problems)
◦ Standardized patients: People recruited from local communities and extensively trained to depict the same case to multiple providers. Interaction details obtained through structured questionnaire within 1 hour of the interaction.

Why use the SP approach?
Measure of Quality | Measures Knowledge | Measures Practice | Accounts for Case-Mix | Accounts for Patient-Mix | Hawthorne Effects
Vignettes | Yes | No | Yes | Yes | By design: Vignettes measure the maximum a provider can do
Clinical Observation | No | Yes | No | No | Yes
Chart Abstraction | No | Yes | No | No | No
Standardized Patients | No | Yes | Yes | Yes | No
Illnesses Covered
◦ Vignettes: All
◦ Clinical Observation: Limited in two ways. First, “serious” illnesses like unstable angina will show up on a sporadic basis. Second, the observer never knows what the patient actually has, and doctors frequently make incorrect diagnoses.
◦ Chart Abstraction: Similar to clinical observation, but providers rarely keep patient charts. Even when they exist, charts tend to be incomplete and don’t accurately reflect patient-provider interactions.
◦ Standardized Patients: Limited to (A) adults only; (B) diseases that don’t have any obvious physiological symptoms (which cannot be mimicked) and (C) conditions that don’t require invasive exams, particularly in low-income countries.

Standardized patients
Extensively used in the U.S. and Canada in medical schools (and part of the examination system)
Large number of studies looking at various aspects of validation (more so for medical education)
Fascinating studies in small-N clinic samples varying aspects of the SP presentation (Contextual care)
Limited studies on viability in the field with large sample sizes (in high-income countries as well)
Here: Document early learning from SP studies, focusing on methods and emerging substantive issues
Solicit Feedback: How can we improve?
◦ What are we getting right?
◦ What are we missing?

Studies
Cross-section study from population-based sample of providers in rural India (Madhya Pradesh) and urban India (Delhi) for three tracer conditions: unstable angina in a 45 y.o. male, asthma in a 25 y.o. female/male, dysentery for a 2 y.o. child who is sleeping at home (Health Affairs)
Cross-section study among sample of public providers with dual practices in rural India (MP) for angina, asthma and dysentery: SPs sent to both public and private practices of the same provider (American Economic Review)
Randomized Controlled Trial of extended training (4 hours a week for 9 months) for informal sector providers in Birbhum, West Bengal: SPs used to evaluate impact on angina, asthma and dysentery. NOTE: Evaluation completely firewalled from implementation, so that training foundation did not know the cases that would be tested (Science)
Cross-section pilot of 4 variants of Tuberculosis cases among 100 providers in Delhi (Lancet Infectious Diseases)
Cross-section pilot of angina, asthma, dysentery and one TB case among 42 clinics in Nairobi, Kenya, accompanied by drug testing after collection of medicines (BMJ Global Health)

Process and Timeline
Case Development: 2-3 months. Uses experts, anthropologists and panel that understands both the local context and the medical details of the case
SP recruitment and further case development: Several exercises and interviews to arrive at the SPs who will be trained (typically 50% of people recruited make the cut): 1 week
SP training and script development: 3-4 weeks, eventually decreasing the SP pool by another 50%
Survey and data: 2 weeks to 3 months; data virtually immediate
IRB: First pilot survey done with informed consent from providers. If required, larger survey done with waiver of consent after first proving public benefit and no harm to either participating providers or SPs

Validation
SP detection: In cases with consent, go back and ask providers whether they have seen an SP (and if yes, what they presented with and their age).
◦ Detection rates ranging from 0% (Kenya) to 3% (TB, with highly compressed schedule)
Harm to SPs: In pilot, 3 cases arose across all 5 studies where an SP was exposed to an injection or a finger prick (with sterile needles in all cases, <.02%): Protocols revised accordingly (For instance: don’t leave hands on table)
Harm to providers: No self-reported harm (TB), time taken is 3-5 minutes. Providers themselves say that there were no adverse implications from participating in an SP study
Inter-rater agreement: This is an inappropriate validation technique, since it assumes that there is little variation in performance for the same provider. Better to use SP fixed-effects and test for joint significance of SP fixed-effects (typically small)
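
The fixed-effects check suggested in place of inter-rater agreement can be sketched as follows. This is a deliberately simplified illustration, not the procedure used in the studies: it computes the one-way F-statistic for differences in mean checklist score across SPs, whereas a full analysis would typically also condition on case and provider characteristics and compare the statistic against an F distribution. The function name and the scores are hypothetical.

```python
from statistics import mean


def sp_fixed_effects_f(scores_by_sp):
    """One-way F-statistic for the joint significance of SP fixed effects.

    scores_by_sp maps an SP identifier to the checklist scores recorded
    across the providers that SP visited. A small value suggests that
    scores do not depend much on which SP presented the case.
    """
    groups = list(scores_by_sp.values())
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)

    # Variation explained by SP identity (between-SP sum of squares).
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation left over within each SP's set of visits.
    within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    # Raises ZeroDivisionError if there is no within-SP variation at all.
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    # Hypothetical checklist scores for three SPs presenting the same case.
    example = {"sp_a": [1, 2, 3], "sp_b": [2, 3, 4], "sp_c": [3, 4, 5]}
    print(sp_fixed_effects_f(example))
```

The design choice here is to ask whether SP identity explains variation in outcomes, which is the question of interest, instead of asking whether two raters agree, which presumes a provider behaves identically on every visit.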

Validation
Do providers treat patients “as if” they had the real case they were presenting with, or does the patient lead them to suspect that nothing is wrong?
If latter, more history taking and examinations would increasingly lead the provider to not do anything
Typically, we find the opposite: The less the provider did, the more likely they were to get it wrong. Those who did the most were also more likely to think that the SP had the case that they were presenting with
◦ Extreme examples include providers trying to immediately take the angina case to hospital

Some Results: Process measures in MI
An average interaction
◦ 3.89 minutes
◦ 2.89 questions
◦ 1.46 exams
◦ 2.34 medicines
◦ Rs. 31
Das and others, 2012

Diagnosis given to patient in four SP conditions: Nairobi, Kenya
[Figure not preserved in extraction]

Example: TB, leading to know-do gap
Patient with 3 week history of cough and fever. Took medicines from chemist but is not feeling better
Note the large know-do gaps that emerge between vignettes and standardized patients

In fact, large know-do gaps found in all studies
[Figure not preserved in extraction; marked “Unpublished data”]

This is financially costly to patients
Table 2: Necessary and avoidable costs of treatment (All values are in US dollars)
Column | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9)
Setting | India Rural, All | India Rural, Birbhum | India Rural, Madhya Pradesh | India Urban, All | India Urban, Delhi | India Urban, Mumbai | India Urban, Patna | China | Kenya
Total cost | 1.592 | 0.676 | 2.586 | 8.737 | 2.094 | 8.834 | 10.216 | 3.763 | 4.330
Consultation cost | 0.410 | 0.280 | 0.551 | 2.998 | 1.616 | 3.420 | 2.674 | 0.224 | 1.553
Pharmaceutical cost | 1.182 | 0.396 | 2.035 | 2.220 | 0.478 | 2.521 | 2.557 | 2.855 | 3.846
Avoidable costs | 1.296 | 0.487 | 2.175 | 3.251 | 1.479 | 3.022 | 4.041 | 3.515 | 3.335
Fraction of costs that is avoidable (unweighted, average per case) | 0.810 | 0.777 | 0.846 | 0.640 | 0.815 | 0.634 | 0.609 | 0.895 | 0.746
Fraction of costs that is avoidable (weighted, average by study) | 0.814 | 0.721 | 0.841 | 0.372 | 0.706 | 0.342 | 0.396 | 0.934 | 0.786
Number of cases | 1,651 | 861 | 790 | 2,852 | 250 | 1,583 | 1,019 | 299 | 166
Notes: We assume that all costs are necessary when a provider recommends the correct treatment and only the correct treatment. When a provider over-treats, we assume that the cost of consultation, and cost of indicated medicines are necessary, while the costs of unnecessary medicines are avoidable. Finally, when a provider recommends an incorrect treatment, we assume all costs (consultation and pharmaceutical) are unnecessary.
Based on ongoing research
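
The classification rule in the notes to Table 2 can be written out as a small function. This is a sketch of the stated rule only: the outcome labels, argument names and the example visit are hypothetical and not part of the underlying studies. The second helper divides avoidable by total cost; for the columns checked in the test data below, that ratio reproduces the "weighted" row of the table.

```python
def split_costs(consultation, indicated_meds, unnecessary_meds, outcome):
    """Return (necessary, avoidable) cost for one visit, per the table notes.

    outcome is one of (labels are hypothetical):
      "correct"      - the correct treatment and only the correct treatment
      "over_treated" - correct treatment plus unnecessary medicines
      "incorrect"    - an incorrect treatment
    """
    total = consultation + indicated_meds + unnecessary_meds
    if outcome == "correct":
        return total, 0.0
    if outcome == "over_treated":
        # Consultation and indicated medicines count as necessary.
        return consultation + indicated_meds, unnecessary_meds
    if outcome == "incorrect":
        # Everything paid, including the consultation, is avoidable.
        return 0.0, total
    raise ValueError(f"unknown outcome: {outcome!r}")


def avoidable_fraction(avoidable, total):
    """Share of total cost that is avoidable, rounded to three decimals."""
    return round(avoidable / total, 3)


if __name__ == "__main__":
    # Hypothetical over-treated visit, in US dollars.
    print(split_costs(1.0, 0.5, 0.25, "over_treated"))
    # Avoidable and total cost from Table 2, column (1) India Rural, All.
    print(avoidable_fraction(1.296, 1.592))
```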

Correlates of Quality: Potential research
With basic technology in place, validated, and used for multiple conditions, several interesting research designs and studies can be initiated
Some are fully fleshed out; some are in “A/B” testing phase, with the idea yet to be fully developed
These research designs are broadly structured in the following manner
◦ Research Design 1: Use SPs sent to multiple clinics to study differences across clinics and providers
◦ Research Design 2: Vary the presentation of the SP to study how differences across SPs alter provider behavior
◦ Research Design 3: Use variation in the clinics over time to study the effect of different aspects of the clinic
◦ Research Design 4: Use SPs together with a quality improvement RCT to understand impacts
I provide examples of each for further discussion

Design 1: Send same SP to different clinics
Das and others (2016) examine the quality of care in public and private clinics. They undertake two audit studies
Audit 1: In rural Madhya Pradesh, they send the same SPs to randomly selected public and private clinics
◦ This tells us the average quality of public and private clinics, but does not tell us where this difference comes from. This is particularly important because 70% of private clinics are run by providers with no formal medical training
Audit 2: First identify public health care providers who have both a public and a private practice. Next, send the same standardized patient to both practices of the same doctor and compare performance
This isolates the impact of public/private differences in care, since the provider is the same
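
The logic of the second audit, comparing the same provider across their two practices, amounts to a within-provider difference. The sketch below shows that calculation under simplifying assumptions: one score per provider per sector, invented data, and hypothetical names throughout. It is not the estimation used in the paper.

```python
from statistics import mean


def within_provider_gap(visits):
    """Average (private minus public) score across dual-practice providers.

    visits is a list of (provider_id, sector, score) tuples, with sector
    equal to "public" or "private". Providers observed in only one sector
    are dropped, because the comparison needs both practices.
    """
    by_provider = {}
    for provider, sector, score in visits:
        by_provider.setdefault(provider, {})[sector] = score

    gaps = [
        scores["private"] - scores["public"]
        for scores in by_provider.values()
        if "private" in scores and "public" in scores
    ]
    return mean(gaps)


if __name__ == "__main__":
    # Hypothetical checklist completion (%) for the same SP case.
    visits = [
        ("d1", "public", 20), ("d1", "private", 30),
        ("d2", "public", 10), ("d2", "private", 25),
        ("d3", "public", 15),  # no private visit, so excluded
    ]
    print(within_provider_gap(visits))
```

Because each gap is taken within one provider, differences in training or knowledge across providers cancel out of the comparison.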

Doctors perform better in private practice
Experiment 1 shows that quality of care is virtually identical in public and private clinics
Experiment 2 shows that the same doctor was the best doctor in the system in their private clinic and the worst in their public clinic, in terms of necessary care
No difference in extent or type of unnecessary care across public and private clinics
[Figure: The same provider has the lowest checklist adherence in their public sector clinic and the highest in their private sector clinic. Left axis: % Checklist Completion (0 to 30). Right axis: Standardized Checklist Score (Standard Deviations, -0.5 to 0.5). Groups: Private Sector (ALL); Public Sector (ALL); Public Sector Doctors in their Public Clinics; Public Sector Doctors in their Private Clinics. The last two groups are marked “Same individual providers”]

Research Design 2: Change aspects of SPs
This is the most frequently used research design in SP studies in the U.S. and Canada, usually around “contextual care”
Example from LMIC Field Experiment
Experiment in China, Currie et al. 2014: When China had a health budget crisis, one “reform” they instituted was to pay a portion of doctors’ salaries from the profits of drug sales in the pharmacy
Currie et al. send two types of standardized patients with viral pharyngitis to doctors
◦ The first will buy medicines from the pharmacy of the hospital (nothing special said to doctor)
◦ The second indicates that he/she will buy medicines from a pharmacy outside. Therefore, the doctor has no financial gain from prescribing additional drugs
Antibiotic use declined from 63.3% to 11.7% across the two SPs
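
To put the antibiotic result from the Currie et al. experiment in perspective, the drop from 63.3% to 11.7% can be expressed both in percentage points and relative to the starting level. The helper below is a plain arithmetic sketch; only the two percentages come from the slide, and the function name is made up.

```python
def decline(before_pct, after_pct):
    """Return (percentage-point drop, drop relative to the starting level in %)."""
    points = round(before_pct - after_pct, 1)
    relative = round(100 * (before_pct - after_pct) / before_pct, 1)
    return points, relative


if __name__ == "__main__":
    points, relative = decline(63.3, 11.7)
    print(f"{points} percentage points, {relative}% of the initial rate")
```

With these inputs the drop is 51.6 percentage points, or roughly four fifths of the antibiotic prescribing seen when the doctor stood to gain from the sale.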

Research Design 2: Change aspects of SPs (A/B testing phase)
Second example relates to sex-specific treatment by doctors
The prevalence of TB is much higher among males than females
Is this because doctors are less likely to treat and diagnose TB among women?
Design: Randomly allocate female and male SPs to doctors
Finding: No difference in overall care, although differences in types of questions asked (women more about family, men more about smoking and alcohol)
(Unpublished)

Research Design 3: Natural Variation in Clinics
Example 1: Public clinics in Nairobi are very busy on Mondays/Tuesdays and less so on Wednesday and Thursday (and busy again on Friday)
When they are busy, patients can easily wait 4-6 hours to see the provider
To look at the impact of wait times, send SPs on M/T and compare them to the ones sent on Thursday
When we do this, we find that clinics that are on average busier have lower quality care, but the same clinic does not have worse care on more busy compared to less busy days
Perhaps clinics that are on average busy are in areas where care is worse (urban slums)
(Unpublished)
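
The Nairobi finding rests on separating two comparisons: across clinics (are busier clinics worse on average?) and within a clinic (is the same clinic worse on its busy days?). The sketch below illustrates that separation with invented numbers; the data layout, names and values are all hypothetical and chosen only to mimic the pattern described.

```python
from statistics import mean


def between_and_within(visits):
    """Split quality differences into across-clinic and within-clinic parts.

    visits is a list of (clinic_id, day_type, quality) tuples, where
    day_type is "busy" or "quiet". Returns (clinic_means, within_gap):
    clinic_means gives each clinic's average quality, and within_gap is
    the average busy-minus-quiet difference inside the same clinic.
    """
    by_clinic = {}
    for clinic, day_type, quality in visits:
        days = by_clinic.setdefault(clinic, {"busy": [], "quiet": []})
        days[day_type].append(quality)

    clinic_means = {
        clinic: mean(days["busy"] + days["quiet"])
        for clinic, days in by_clinic.items()
    }
    within_gap = mean(
        [mean(days["busy"]) - mean(days["quiet"]) for days in by_clinic.values()]
    )
    return clinic_means, within_gap


if __name__ == "__main__":
    # Hypothetical: clinic "c1" is the busier, lower-quality clinic overall,
    # but neither clinic does worse on its own busy days.
    visits = [
        ("c1", "busy", 40), ("c1", "quiet", 40),
        ("c2", "busy", 70), ("c2", "quiet", 70),
    ]
    print(between_and_within(visits))
```

A gap that shows up across clinics but not within them points to something fixed about the clinic or its location, not to the day's patient load.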

Research Design 3: Natural Variation in Clinics (A/B testing phase)
Example 2: Private clinics in Indian cities operate in 2 shifts
They operate from 10am to 1pm and then again from 5pm to 9pm
In Mumbai, India, they can stay open till 1am
Experiment: Randomize SPs for TB to visit clinics in the morning and in the evening shifts. Is there a difference in care?
Yes, with significant declines in correct case management
[Figure: Correct Case Management for 4 SP scenarios for Tuberculosis]
(Unpublished)

Research Design 4: SP + RCT
RCT in Birbhum, West Bengal
9 month training course for informal providers
Trainers did not know SP case scenarios; SPs were blinded from treatment assignment
This eliminates teaching to the test
Training improved checklist adherence for all cases
Large improvements in correct treatments from very low base
No significant change in incorrect treatments at initially very high levels
◦ Incorrect antibiotic use
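
In the SP + RCT design, the simplest summary of impact is the difference in mean SP-measured outcomes between the training and control arms. The sketch below shows that comparison with invented scores; it is an illustration of the general idea, not the estimator or the data from the Birbhum trial, and all names are hypothetical.

```python
from statistics import mean


def training_effect(records):
    """Difference in mean checklist score, training arm minus control arm.

    records is a list of (provider_id, arm, score) tuples with arm equal
    to "training" or "control". Scores come from SP visits, so neither
    arm knows which patient is being used for the evaluation.
    """
    treated = [score for _, arm, score in records if arm == "training"]
    control = [score for _, arm, score in records if arm == "control"]
    return mean(treated) - mean(control)


if __name__ == "__main__":
    # Hypothetical checklist adherence (%) for four informal providers.
    records = [
        ("p1", "training", 60), ("p2", "training", 70),
        ("p3", "control", 50), ("p4", "control", 60),
    ]
    print(training_effect(records))
```

Because assignment to training is random and the evaluation is firewalled from the trainers, this difference can be read as the effect of the training and not of teaching to the test.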

But these differences are small relative to differences across countries
[Figure not preserved in extraction]

Early Learnings
SPs are a viable tool for understanding a broad system of care in population based samples
◦ Provide case-specific information (correct diagnosis rates for instance) that cannot be obtained easily by any other means, particularly for rare cases
◦ Allow for valid measurement of improvement
◦ Distinguish care that is needed from care that is not necessary
Equally importantly, the limitations are becoming clearer
◦ Full cycle of care has been difficult to evaluate (Tuberculosis)
◦ Difficult to evaluate care across different systems when these include referrals or lab tests
Nevertheless, new research designs can begin to answer fairly complex questions
Further studies that look at the same cases in multiple countries or start to examine variation within countries can shed much-needed light on the quality of care

Selected papers
Das, Jishnu, Alaka Holla, Aakash Mohpal and Karthik Muralidharan. 2016. “Quality and Accountability in Healthcare Delivery: Audit-Study Evidence from Primary Care in India.” American Economic Review, Vol. 106(12): 3765-3799.
Das, Jishnu, Guadalupe Bedoya, Amy Dolinger, Khama Rogo, Njeri Mwaura, Francis Wafula and Bernard Olayo. 2016. “Examining the quality of medicines at Kenyan healthcare facilities: a validation of an alternate post-market surveillance model that uses standardized patients.” Drugs - Real World Outcomes, November 2016. Advance online access: doi:10.1007/s40801-016-0100-7
Das, Jishnu, Abhijit Chowdhury, Reshmaan Hussam and Abhijit V. Banerjee. 2016. “The Impact of Training Informal Healthcare Providers in India: A Randomized Controlled Trial.” Science, Vol. 354, Issue 6308.
Das, Jishnu, Ada Kwan, Ben Daniels, Srinath Satyanarayana, Ramnath Subbaraman, Sofi Bergkvist, Ranendra K. Das, Veena Das and Madhukar Pai. 2016. “Use of standardized patients to assess antibiotic dispensing for tuberculosis by pharmacies in urban India: A cross-sectional study.” Lancet Infectious Diseases, Vol. 16(11): 1261-1268.
Das, Jishnu, Ada Kwan, Ben Daniels, Srinath Satyanarayana, Ramnath Subbaraman, Sofi Bergkvist, Ranendra K. Das, Veena Das and Madhukar Pai. 2015. “First use and validation of the standardized patient methodology to assess quality of tuberculosis care: a pilot, cross-sectional study.” Lancet Infectious Diseases, Vol. 15(11): 1305-1313.
Das, Jishnu and Jeffrey Hammer. 2014. “Quality of Primary Care in Low-Income Countries: Facts and Economics.” Annual Review of Economics, Vol. 6: 525-555.
Das, Jishnu, Alaka Holla, Veena Das, Manoj Mohanan, Diana Tabak and Brian Chan. 2012. “The Quality of Medical Care in Clinics: Evidence from a Standardized Patient Study in a Low-Income Setting.” Health Affairs, Vol. 31(12): 2774-2784.
