
U.S. DEPARTMENT OF EDUCATION

Evaluating Online Learning
Challenges and Strategies for Success

Innovations in Education

Prepared by WestEd
With Edvance Research, Inc.

For

U.S. Department of Education


Office of Innovation and Improvement

2008
This report was produced under U.S. Department of Education Contract No. ED-01-CO-0012, Task Order D010, with WestEd. Sharon Kinney Horn served as the contracting officer's representative. The content of this report does not necessarily reflect the views or policies of the U.S. Department of Education, nor does the mention of trade names, commercial products, or organizations imply endorsement by the U.S. government. This publication also contains URLs for information created and maintained by private organizations. This information is provided for the reader's convenience. The U.S. Department of Education is not responsible for controlling or guaranteeing the accuracy, relevance, timeliness, or completeness of this outside information. Further, the inclusion of information or a URL does not reflect the importance of the organization, nor is it intended to endorse any views expressed, or products or services offered.

U.S. Department of Education


Margaret Spellings
Secretary

Office of Innovation and Improvement


Doug Mesecar
Assistant Deputy Secretary

Technology in Education Programs


Cheryl Garnette
Director

July 2008

This report is in the public domain. Authorization to reproduce it in whole or in part is granted. While
permission to reprint this publication is not necessary, the citation should be: U.S. Department of
Education, Office of Innovation and Improvement, Evaluating Online Learning: Challenges and Strategies for Success, Washington, D.C., 2008.

To order copies of this report (order number ED004344P),


write to: ED Pubs, Education Publications Center, U.S. Department of Education, P.O. Box 1398, Jessup, MD 20794-1398;
or fax your request to: 301-470-1244;
or e-mail your request to: edpubs@inet.ed.gov;
or call in your request toll-free: 1-877-433-7827 (1-877-4-ED-PUBS). Those who use a telecommuni-
cations device for the deaf (TDD) or a teletypewriter (TTY), should call 1-877-576-7734. If 877 ser-
vice is not yet available in your area, call 1-800-872-5327 (1-800-USA-LEARN; TTY: 1-800-437-0833);
or order online at: http://www.edpubs.ed.gov.

This report is also available on the Department’s Web site at:


http://www.ed.gov/admins/lead/academic/evalonline

On request, this publication is available in alternate formats, such as Braille, large print, or computer
diskette. For more information, please contact the Department’s Alternate Format Center at 202-260-0852
or 202-260-0818.
Contents
List of Illustrations iv

Foreword v

Acknowledgments vii

Introduction 1
Common Challenges of Evaluating Online Learning (1)
The Featured Evaluations (2)
What’s in This Guide (3)

Part I: Challenges of Evaluating Online Learning 7


Meeting the Needs of Multiple Stakeholders (7)
Building on the Existing Base of Knowledge (12)
Evaluating Multifaceted Online Resources (20)
Finding Appropriate Comparison Groups (26)
Solving Data Collection Problems (33)
Interpreting the Impact of Program Maturity (40)
Translating Evaluation Findings Into Action (43)

Part II: Recommendations for Evaluating Online Learning 49


General Recommendations (49)
Recommendations for Meeting the Needs of Multiple Stakeholders (50)
Recommendations for Utilizing and Building on the Existing Base of
Knowledge (50)
Recommendations for Evaluating Multifaceted Resources (52)
Recommendations for Programs Considering a Comparison Study (52)
Recommendations for Gathering Valid Evaluation Data (53)
Recommendations for Taking Program Maturity Into Account (54)
Recommendations for Translating Evaluation Findings Into Action (54)

Featured Online Learning Evaluations 55

Appendix A: Resources 59
Distance Learning (59)
Evaluation Methods and Tools (60)
Resources From Higher Education (61)

Appendix B: Research Methodology 63

Glossary of Common Evaluation Terms 65

Notes 67

Illustrations
FIGURES

1. Excerpt From Digital Learning Commons’ Meeting 21st Century Learning Challenges
in Washington State 24

2. Example of Tabulated Results From an Arizona Virtual Academy Online Parent Survey
About the Quality of a Recent Student Workshop 38

TABLES

1. Selected Variables of Profiled Online Learning Evaluations 4

2. Excerpt From Appleton eSchool’s Online Program Perceiver Instrument 17

3. Evaluation Design Characteristics 28

4. Excerpts From AZVA's Next Steps Plan, In Response to Recommendations From the K12 Quality Assurance Audit 47

Foreword
Education in this country has evolved dramatically from the days of one teacher in a one-room
schoolhouse. Today, student learning is no longer confined to a physical space. Computers and the
Internet have broken through school walls, giving students greater opportunities to personalize their
education, access distant resources, receive extra help or more-challenging assignments, and engage
in learning in new and unique ways.

Although online learning is a relatively new enterprise in the K–12 arena, it is expanding rapidly,
with increasing numbers of providers offering services and more students choosing to participate. As
with any education program, online learning initiatives must be held accountable for results. Thus,
it is critical for students and their parents—as well as administrators, policymakers, and funders—to
have data informing them about program and student outcomes and, if relevant, about how well a
particular program compares to traditional education models. To this end, rigorous evaluations are
essential. They can identify whether programs and online resources are performing as promised, and,
equally important, they can point to areas for improvement.

The evaluations highlighted in this guide represent a broad spectrum of online options, from pro-
grams that provide online courses to Web sites that feature education resources. The evaluations
themselves range from internal assessments to external, scientific research studies. All demonstrate
how program leaders and evaluators have been able to implement strong evaluation practices despite
some challenges inherent to examining learning in an online environment.

This guide complements another publication, Connecting Students to Advanced Courses Online, pub-
lished last year by the U.S. Department of Education. Both are part of the Innovations in Education
series, which identifies examples of innovative practices from across the country that are helping
students achieve.

My hope is that this guide will assist evaluators and program leaders who seek to use data to guide
program improvement aimed at achieving positive outcomes for our nation’s students.

Margaret Spellings, Secretary


U.S. Department of Education

Acknowledgments

This guide was developed under the auspices of the U.S. Department of Education's Office of Innovation and Improvement. Sharon Horn was project director.

An external advisory group provided feedback to refine the study scope, define the selection criteria, and clarify the text. Members included Robert Blomeyer, president, Blomeyer & Clemente Consulting Services; Tom Clark, president, TA Consulting; Trina Davis, president, International Society for Technology in Education, and assistant professor and director of eEducation, Texas A&M University; Helene Jennings, vice president, Macro International, Inc.; Liz Pape, president and chief executive officer of Virtual High School Global Consortium; and Lisa Evans, director, Center for Evaluation Capacity Building, Wexford, Inc.

Staff in the Department of Education who provided input and reviewed drafts include Sue Betka, Cynthia Cabell, Tom Corwin, Kate Devine, David Dunn, Lorenzo Esters, Meredith Farace, Steve Fried, Cheryl Garnette, Virginia Gentles, Robin Gilchrist, Doug Herbert, Nettie Laster, Brian Lekander, Meghan Lerch, Tiffany Taber, Linda Wilson, and Jacquelyn Zimmermann. This guide would not be possible without the support of Tim Magner, director, Office of Educational Technology.

The seven online programs that participated in the development of this guide and the case studies on which it is based were generous with both their time and attention.

Alabama Connecting Classrooms, Educators, & Students Statewide Distance Learning
Alabama Department of Education
50 N. Ripley St.
P.O. Box 302101
Montgomery, AL 36104
http://accessdl.state.al.us

Algebra I Online
Louisiana Virtual School
Louisiana Department of Education
P.O. Box 94064
Baton Rouge, LA 70804-9064
http://www.louisianavirtualschool.net/?algebra

Appleton eSchool
2121 E. Emmers Drive
Appleton, WI 54915
http://www.aasd.k12.wi.us/Eschool

Arizona Virtual Academy
4595 S. Palo Verde Road
Suite 517/519
Tucson, AZ 85714
http://www.azva.org

Chicago Public Schools' Virtual High School
125 S. Clark St.
Chicago, IL 60603
http://clear.cps.k12.il.us/ohsp/distance_learning.html

Digital Learning Commons
Corbet Building
4507 University Way NE
Suite 204
Seattle, WA 98105
http://www.learningcommons.org

Thinkport
Maryland Public Television
11767 Owings Mills Blvd.
Owings Mills, MD 21117
http://www.thinkport.org
in partnership with:
Johns Hopkins Center for Technology in Education
6740 Alexander Bell Drive
Suite 302
Columbia, MD 21046
http://cte.jhu.edu/index.htm
Introduction
This guide is designed as a resource for leaders and evaluators of K–12 online learning programs. In
this guide the term “online learning” is used to refer to a range of education programs and resources
in the K–12 arena, including distance learning courses offered by universities, private providers, or
teachers at other schools; stand-alone “virtual schools” that provide students with a full array of online
courses and services; and educational Web sites that offer teachers, parents, and students a range of
resources.1 The guide features seven evaluations that represent variety in both the type of program or
resource being evaluated, and in the type of evaluation. These evaluations were selected because they
offer useful lessons to others who are planning to evaluate an online learning program or resource.

Of course, evaluating online learning is not altogether different from assessing any other type of education program, and, to some degree, evaluators may face the same kind of design and analysis issues in both instances. Still, online program evaluators may encounter some unanticipated challenges in the virtual arena owing, for example, to the distance between program sites and students, participants' unfamiliarity with the technology being used, and a lack of relevant evaluation tools. This guide examines a range of challenges that online program evaluators are likely to meet, some that are unique to online settings and others that are more general. It also describes how online environments can sometimes offer advantages to evaluators by presenting opportunities for streamlined data collection and analysis, for example.

The guide specifically focuses on challenges and response strategies. All of the evaluations described here illustrate strong assessment practices and robust findings, and they are models for demonstrating how program leaders and evaluators can handle the challenges of evaluating online learning.

Common Challenges of Evaluating Online Learning

Online learning is a relatively new development in K–12 education but is rapidly expanding in both number of programs and participants. According to a report by the North American Council for Online Learning (NACOL), "As of September 2007, 42 states [had] significant supplemental online learning programs (in which students enrolled in physical schools take one or two courses online), or significant full-time programs (in which students take most or all of their courses online), or both."2 In addition, the Internet houses an ever-expanding number of Web sites with a broad range of education resources for students, parents, and teachers. Given this expansion and a dearth of existing research on the topic, it is critical to conduct rigorous evaluations of online learning in K–12 settings to ensure that it does what people hope it will do: help improve student learning.
However, those undertaking such evaluations may well encounter a number of technical and methodological issues that can make this type of research difficult to execute. For example, the scant research literature on K–12 online learning evaluation provides few existing frameworks to help evaluators describe and analyze programs, or tools, such as surveys or rubrics, they can use to collect data or assess program quality. Another common challenge when students are studying online is the difficulty of examining what is happening in multiple, geographically distant learning sites. And multifaceted education resources—such as vast Web sites offering a wide range of features or virtual schools that offer courses from multiple vendors—are also hard to evaluate, as are programs that utilize technologies and instructional models that are new to users.

Furthermore, evaluations of online learning often occur in the context of a politically loaded debate about whether such programs are worth the investment and how much funding is needed to run a high-quality program; about whether online learning really provides students with high-quality learning opportunities; and about how to compare online and traditional approaches. Understandably, funders and policymakers—not to mention students and their parents—want data that show whether online learning can be as effective as traditional educational approaches and which online models are the best. These stakeholders may or may not think about evaluation in technical terms, but all of them are interested in how students perform in these new programs. At the same time, many online program leaders have multiple goals in mind, such as increased student engagement or increased student access to high-quality courses and teachers. They argue that test scores alone are an inadequate measure for capturing important differences between traditional and online learning settings. And, like educators in any setting—traditional or online—they may feel a natural trepidation about inviting evaluators to take a critical look at their program, fearing that it will hamper the progress of their program, rather than strengthen it.

This guide will discuss how evaluators have tried to compare traditional and online learning approaches, what challenges they have encountered, and what lessons they have learned.

The Featured Evaluations

This guide intentionally features a variety of online programs and resources, including virtual schools, programs that provide courses online, and Web sites with broad educational resources. Some serve an entire state, while others serve a particular district. This guide also includes distinct kinds of evaluations, from internally led formative evaluations (see Glossary of Common Evaluation Terms, p. 65) to scientific research studies by external experts. In some cases, program insiders initiated the evaluations; in others, there were external reasons for the evaluation. The featured evaluations include a wide range of data collection and analysis activities—from formative evaluations that rely primarily on survey, interview, and observation data, to scientific experiments that compare outcomes between online and traditional settings. In each instance, evaluators chose the methods carefully, based on the purpose of the evaluation and the specific set of research questions they sought to answer.
The goal in choosing a range of evaluations for this guide was to offer examples that could be instructive to program leaders and evaluators in diverse circumstances, including those in varying stages of maturity, with varying degrees of internal capacity and amounts of available funding. The featured evaluations are not without flaws, but they all illustrate reasonable strategies for tackling common challenges of evaluating online learning.

To select the featured evaluations, researchers for this guide compiled an initial list of candidates by searching for K–12 online learning evaluations on the Web and in published documents, then expanded the list through referrals from a six-member advisory group (see list of members in the Acknowledgments section, p. vii) and other knowledgeable experts in the field. Forty organizations were on the final list for consideration.

A matrix of selection criteria was drafted and revised based on feedback from the advisory group. The three quality criteria were:

• The evaluation considered multiple outcome measures, including student achievement.
• The evaluation findings were widely communicated to key stakeholders of the program or resource being studied.
• Program leaders acted on evaluation results.

Researchers awarded sites up to three points on each of these three criteria, using publicly available information, review of evaluation reports, and gap-filling interviews with program leaders. All the included sites scored at least six of the possible nine points across these three criteria.

Since a goal of this publication was to showcase a variety of types of evaluations, the potential sites were coded as to such additional characteristics as: internal vs. external evaluator, type of evaluation design, type of online learning program or resource, whether the program serves a district- or state-level audience, and stage of maturity. In selecting the featured evaluations, the researchers drew from as wide a range of characteristics as possible while keeping the quality criteria high. A full description of the methodology used to study the evaluation(s) of the selected sites can be found in appendix B: Research Methodology.

The final selection included evaluations of the following online programs and resources: Alabama Connecting Classrooms, Educators, & Students Statewide Distance Learning, operated by the Alabama Department of Education; Algebra I Online, operated by the Louisiana Department of Education; Appleton eSchool, operated by Wisconsin's Appleton Area School District; Arizona Virtual Academy, a public charter school; Chicago Public Schools' Virtual High School; Digital Learning Commons in Washington state; and Thinkport, a Web site operated by Maryland Public Television and Johns Hopkins Center for Technology in Education. Additional information about each program and its evaluation(s) is included in table 1.
Table 1. Selected Variables of Profiled Online Learning Evaluations

Alabama Connecting Classrooms, Educators, & Students Statewide Distance Learning
  Type of program or resource / year initiated: Online courses and interactive video-conference classes for students across state / Piloted in spring 2006; statewide implementation in fall 2006
  Type of evaluation featured in guide (a): External; formative & summative; includes comparisons with traditional instructional settings
  Year evaluation started: 2006
  Evaluation objective: Monitoring program implementation; program improvement; sharing best practices
  Cost of evaluation: $60,000 in 2007; $600,000 in 2008
  Funding source for evaluation: Specific allocation in program budget (originating from state of Alabama)
  Data collected: Student enrollment, completion, grades; AP (b) course pass rates; student and teacher satisfaction; description of implementation and challenges
  Data collection tools: Surveys, interviews, observations
  Improvements resulting from evaluation: Teacher professional development; improvements to technology and administrative operations

Algebra I Online
  Type of program or resource / year initiated: Online algebra courses for students across the state / 2002
  Type of evaluation featured in guide (a): External and internal; formative and summative; includes comparisons with traditional instructional settings
  Year evaluation started: 2003; comparative study in 2004–05
  Evaluation objective: Determine if program is effective way to provide students with certified algebra teachers and to support the in-class teacher's certification efforts
  Cost of evaluation: $110,000 for the most labor-intensive phase, including the comparative analysis during 2004–05
  Funding source for evaluation: General program funds, grants from NCREL (c), BellSouth Foundation, and U.S. Department of Education
  Data collected: Student grades and state test scores; pre- and posttests; student use and satisfaction data; focus groups; teacher characteristics and teachers' certification outcomes
  Data collection tools: Pre- and posttests developed by evaluator, surveys
  Improvements resulting from evaluation: Teacher professional development; increased role for in-class teachers, curriculum improvements, new technologies used; expansion to middle schools

Appleton eSchool
  Type of program or resource / year initiated: Online courses for students enrolled in district's high schools (some students take all courses online) / 2002
  Type of evaluation featured in guide (a): Internal; formative; evaluation process based on internally developed rubric
  Year evaluation started: Rubric piloted in 2006
  Evaluation objective: Program improvement; sharing best practices
  Cost of evaluation: No specific allocation; approx. $15,000 to make the rubric and evaluation process available in Web format
  Funding source for evaluation: General program funds (originating from charter grant from state of Wisconsin)
  Data collected: Internal descriptions and assessments of key program components (using rubric); mentor, student, and teacher satisfaction data; course completion and grades
  Data collection tools: Internally developed rubric and surveys
  Improvements resulting from evaluation: Mentor professional development; course content improvements; expanded interactivity in courses; improvements to program Web site and printed materials; sharing of best practices

Arizona Virtual Academy
  Type of program or resource / year initiated: Virtual charter school for students enrolled in public schools and homeschool students (no more than 20%) / 2003
  Type of evaluation featured in guide (a): Formative and summative; external and internal
  Year evaluation started: 2003
  Evaluation objective: State monitoring; quality assurance; program improvement
  Cost of evaluation: No specific allocation
  Funding source for evaluation: General program funds (originating from state of Arizona)
  Data collected: Student enrollment, grades, and state test scores; parent, teacher, and student satisfaction data; internal & external assessments on key program components
  Data collection tools: Electronic surveys; externally developed rubric
  Improvements resulting from evaluation: Wide range of operational and instructional improvements

Chicago Public Schools – Virtual High School
  Type of program or resource / year initiated: Online courses for students enrolled in district's high schools / 2002
  Type of evaluation featured in guide (a): External; formative
  Year evaluation started: 2002
  Evaluation objective: Assess need for mentor training and other student supports; identify ways to improve completion and pass rates
  Cost of evaluation: Approx. $25,000
  Funding source for evaluation: District's Office of Technology Services
  Data collected: Student enrollment, course completion, grades, and test scores; student use and satisfaction data; mentor assessments of needs
  Data collection tools: Surveys, interviews, focus groups
  Improvements resulting from evaluation: Designated class periods for online learning; more onsite mentors; training for mentors

Digital Learning Commons
  Type of program or resource / year initiated: Web site with online courses and a wide array of resources for teachers and students / 2003
  Type of evaluation featured in guide (a): External and internal; formative
  Year evaluation started: 2003
  Evaluation objective: Understand usage of site; assess impact on student achievement and college readiness
  Cost of evaluation: Approx. $80,000 for the college-readiness study
  Funding source for evaluation: Bill & Melinda Gates Foundation
  Data collected: Student transcripts; student grades and completion rates; use and satisfaction data
  Data collection tools: Surveys
  Improvements resulting from evaluation: Improvements to student orientation; curriculum improvements; development of school use plans to encourage best practices

Thinkport – Maryland Public Television with Johns Hopkins
  Type of program or resource / year initiated: Web site with a wide array of resources for teachers and students / 2000
  Type of evaluation featured in guide (a): External and internal; formative and summative, including randomized controlled trial
  Year evaluation started: 2001; randomized controlled trial in 2005
  Evaluation objective: Understand usage of site; assess impact of "electronic field trip" on student performance
  Cost of evaluation: Estimated $40,000 for the randomized controlled trial (part of a comprehensive evaluation)
  Funding source for evaluation: Star Schools Grant
  Data collected: Student test scores on custom-developed content assessment; information about delivery of curriculum; use and satisfaction data
  Data collection tools: Test of content knowledge developed by evaluator, surveys, teacher implementation logs
  Improvements resulting from evaluation: Changes to teaching materials; changes to online content and format

(a) See Glossary of Common Evaluation Terms on page 65.
(b) Run by the nonprofit College Board, the Advanced Placement (AP) program offers college-level course work to high school students. Many institutions of higher education offer college credits to students who take AP courses.
(c) North Central Regional Educational Laboratory.

What's in This Guide

This guide was developed as a resource for evaluators, whether external, third-party researchers, or program administrators and other staff who are considering conducting their own internal evaluation. Some of the evaluations highlighted here were carried out by external evaluators, while others were conducted by program staff or the staff of a parent organization. In all cases, the research was undertaken by experienced professionals, and this publication is aimed primarily at readers who are familiar with basic evaluation practices. For readers who are less familiar with evaluation, a glossary of common terms is available on page 65.

Part I of this guide focuses on some of the likely challenges faced by online program evaluators, and it is organized into the following sections:

• Meeting the Needs of Multiple Stakeholders
• Building on the Existing Base of Knowledge
• Evaluating Multifaceted Online Resources
• Finding Appropriate Comparison Groups
• Solving Data Collection Problems
• Interpreting the Impact of Program Maturity
• Translating Evaluation Findings Into Action

Each section of Part I presents practical information about one of the challenges of evaluating online learning and provides examples of how the featured evaluations have addressed it. Part II synthesizes the lessons learned from meeting those challenges and offers recommendations based as well on research and conversations with experts in evaluating online learning. These are geared to program leaders who are considering an evaluation and to assist them and their evaluators as they work together to design and complete the process. Brief profiles of each of the seven online programs can be found at the end of the guide, and details about each evaluation are summarized in table 1.
PART I

Challenges of Evaluating Online Learning

Meeting the Needs of Multiple Stakeholders

Every good evaluation begins with a clearly stated purpose and a specific set of questions to be answered. These questions drive the evaluation approach and help determine the specific data collection techniques that evaluators will use. Sometimes, however, program evaluators find themselves in the difficult position of needing to fulfill several purposes at once, or needing to answer a wide variety of research questions. Most stakeholders have the same basic question—Is it working? But not everyone defines working in the same way. While policymakers may be interested in gains in standardized test scores, program leaders may be equally interested in other indicators of success, such as whether the program is addressing the needs of traditionally underrepresented subgroups, or producing outcomes that are only indirectly related to test scores, like student engagement.

Naturally, these questions will not be answered in the same way. For example, if stakeholders want concrete evidence about the impact on student achievement, evaluators might conduct a randomized controlled trial—the gold standard for assessing program effects—or a quasi-experimental design that compares test scores of program participants with students in matched comparison groups. But if stakeholders want to know, for example, how a program has been implemented across many sites, or why it is leading to particular outcomes, then they might opt for a descriptive study, incorporating such techniques as surveys, focus groups, or observations of program participants to gather qualitative process data.

When multiple stakeholders have differing interests and questions, how can evaluators meet these various expectations?

To satisfy the demands of multiple stakeholders, evaluators often combine formative and summative components (see Glossary of Common Evaluation Terms, p. 65). In the case of Alabama Connecting Classrooms, Educators, & Students Statewide Distance Learning (ACCESS), described below, the evaluators have been very proactive in designing a series of evaluations that, collectively, yield information that has utility both for program improvement and for understanding program performance. In the case of Arizona Virtual Academy (AZVA), the school's leadership team has made the most of the many evaluation activities they are required to complete by using findings from those activities for their own improvement purposes and piggybacking on them with data collection efforts of their own. In each instance, program leaders would likely say that their evaluations are first and foremost intended to improve their programs. Yet, when called upon to show achievement results, they can do that as well.

Combine Formative and Summative Evaluation Approaches to Meet Multiple Demands

From the beginning of their involvement with ACCESS, evaluators from the International Society for Technology in Education (ISTE) have taken a combined summative and formative approach to studying this state-run program that offers both Web-based and interactive videoconferencing courses. The original development proposal for ACCESS included an accountability plan that called for ongoing monitoring of the program to identify areas for improvement and to generate useful information that could be shared with other schools throughout the state. In addition, from the beginning, program leaders and state policymakers expressed interest in gathering data about the program's impact on student learning.

To accomplish these multiple goals, ISTE completed two successive evaluations for Alabama, each of which had both formative and summative components. A third evaluation is under way.

The first evaluation, during the program's pilot implementation, focused on providing feedback that could be used to modify the program, if need be, and on generating information to share with Alabama schools. Evaluation activities at this stage included a literature review and observation visits to six randomly selected pilot sites, where evaluators conducted interviews and surveys. They also ran focus groups, for which researchers interviewed respondents in a group setting. Evaluators chose these methods to generate qualitative information about how the pilot program was being implemented and what changes might be needed to strengthen it.

The second evaluation took more of a summative approach, looking to see whether or not ACCESS was meeting its overall objectives. First, evaluators conducted surveys and interviews of students and teachers, as well as interviews with school administrators and personnel at the program's three Regional Support Centers. In addition, they gathered student enrollment and achievement data, statewide course enrollment and completion rates, and other program outcome data, such as the number of new distance courses developed and the number of participating schools.

This second evaluation also used a quasi-experimental design (see Glossary of Common Evaluation Terms, p. 65) to provide information on program effects. Evaluators compared achievement outcomes between ACCESS participants and students statewide, between students in interactive videoconferencing courses and students in traditional settings, and between students who participated in online courses and those who took courses offered in the interactive videoconferencing format.
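For readers less familiar with what such group comparisons involve at the analysis stage, the short sketch below contrasts mean outcomes for an online group and a comparison group. It is only an illustration of the general technique, not the ACCESS evaluators' actual analysis; the scores and group sizes are invented, and a real quasi-experimental study would also match or statistically adjust for student background characteristics.

    # Minimal sketch of a two-group outcome comparison on hypothetical data.
    # A full quasi-experimental analysis would also control for student
    # background characteristics; this only contrasts group means.
    from statistics import mean
    from scipy import stats

    online_scores = [78, 85, 92, 70, 88, 81, 76, 90]      # program participants (invented)
    comparison_scores = [74, 80, 79, 68, 83, 77, 72, 85]  # matched comparison group (invented)

    t_stat, p_value = stats.ttest_ind(online_scores, comparison_scores, equal_var=False)

    print(f"Online group mean:     {mean(online_scores):.1f}")
    print(f"Comparison group mean: {mean(comparison_scores):.1f}")
    print(f"Welch's t = {t_stat:.2f}, p = {p_value:.3f}")

In practice, evaluators would take care to form the two groups so they are as comparable as possible before drawing any conclusions from a comparison of this kind.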
Reasons and Contexts for Formative Versus Summative Evaluations
Formative and summative evaluations can each serve important functions for programs. Formative evaluations,
sometimes called “process evaluations,” are conducted primarily to find out how a program is being implemented
and how it might be strengthened. Summative evaluations, also called “outcome evaluations,” are appropriate for
better-established programs, when program leaders have settled on their best policies and practices and want to
know, for example, what results the program is yielding.
Ideally, formative evaluations are developed as partnerships that give all stakeholders a hand in planning and help-
ing conduct the evaluation. Explicitly framing a formative evaluation as a collaboration among stakeholders can
help in more ways than one. Practitioners are more likely to cooperate with and welcome evaluators rather than feel
wary or threatened—a common reaction. In addition, practitioners who are invited to be partners in an evaluation
are more likely to feel invested in its results and to implement the findings and recommendations.
Even more than formative evaluations, summative evaluations can be perceived by practitioners as threatening
and, in many cases, program staff are not eager to welcome evaluators into their midst. Even in these situations,
however, their reaction can be mitigated if evaluators work diligently to communicate the evaluation’s goals. Evalu-
ators should make clear their intention to provide the program with information that can be used to strengthen it, or
to give the program credible data to show funders or other stakeholders. In many cases, summative evaluations do
not uncover findings that are unexpected; they merely provide hard data to back up the anecdotes and hunches of
program leaders and staff.
Program leaders who are contemplating an evaluation also will want to consider the costs of whatever type of study
they choose. Some formative evaluations are relatively informal. For example, a formative evaluation might consist
primarily of short-term activities conducted by internal staff, like brief surveys of participants, to gather feedback about
different aspects of the program. This type of evaluation is inexpensive and can be ideal for leaders seeking ongoing
information to strengthen their program. In other instances, formative evaluation is more structured and formal. For
instance, an external evaluator may be hired to observe or interview program participants, or to conduct field surveys
and analyze the data. Having an external evaluator can bring increased objectivity, but it also adds cost.
In many cases, summative evaluations are more formal and expensive operations, particularly if they are using
experimental or quasi-experimental designs that require increased coordination and management and sophisticated
data analysis techniques. Typically, external evaluators conduct summative evaluations, which generally extends
the timeline and ups the costs. Still, experimental and quasi-experimental designs may provide the most reliable
information about program effects.
Finally, program leaders should consider that an evaluation need not be exclusively formative or summative. As
the ACCESS case illustrates (see pp. 7–10), sometimes it is best for programs to combine elements of both, either
concurrently or in different years.

As of early 2008, the evaluators were conducting a third evaluation, integrating the data from the first two studies and focusing on student achievement. Looking ahead, ACCESS leaders plan to continue gathering data annually in anticipation of conducting longitudinal studies that will identify ACCESS's full impact on student progress and achievement.

Together, the carefully planned evaluation activities conducted by ACCESS's evaluators have generated several findings and recommendations that already have been used to strengthen the program. For example, their findings suggest that students participating in the distance learning courses are completing courses at high rates and, in the case of the College Board Advanced Placement (AP) courses,* are achieving scores comparable to students taught in traditional settings. These kinds of data could be critical to maintaining funding and political support for the program in the future. ACCESS also is building a longitudinal database to provide a core of data for use in future evaluations and, program leaders hope, to help determine ACCESS's long-term impact on student progress and achievement. With these data collection tools and processes in place, ACCESS has armed itself to address the needs and expectations of various stakeholders inside and outside the program.

* Run by the nonprofit College Board, the Advanced Placement program offers college-level course work to high school students. Many institutions of higher education offer college credits to students who take AP courses.

Make the Most of Mandatory Program Evaluations

While all leaders and staff of online education programs are likely to want to understand their influence on student learning, some have no choice in the matter. Many online programs must deliver summative student outcome data because a funder or regulatory body demands it. In the case of Arizona Virtual Academy (AZVA), a K–12 statewide public charter school, the program must comply with several mandatory evaluation requirements: First, school leaders are required by the state of Arizona to submit an annual effectiveness review, which is used to determine whether or not the school's charter will be renewed. For this yearly report, AZVA staff must provide data on student enrollment, retention, mobility, and state test performance. The report also must include pupil and parent satisfaction data, which AZVA collects online at the end of each course, and a detailed self-evaluation of operational and administrative efficiency.

AZVA also must answer to K12 Inc., the education company that supplies the program's curriculum for all grade levels. K12 Inc. has its own interest in evaluating how well its curriculum products are working and in ensuring that it is partnered with a high-quality school. AZVA's director, Mary Gifford, says that "from the second you open your school," there is an expectation [on the part of K12 Inc.] that you will collect data, analyze them, and use them to make decisions. "K12 Inc. has established best practices for academic achievement. They take great pride in being a data-driven company," she adds. It conducts quality assurance audits at AZVA approximately every two years, which consist of a site visit conducted by K12 Inc. personnel and an extensive questionnaire, completed by AZVA, that documents various aspects of the program, such as instruction, organizational structure, and parent-school relations. K12 Inc. also requires AZVA to produce a detailed annual School Improvement Plan (SIP), which covers program operations as well as student achievement. The plan must include an analysis of student performance on standardized state tests, including a comparison of the performance of AZVA students to the performance of all students across the state.

Each of these mandates—those of the state and those of AZVA's curriculum provider—has an important purpose. But the multiple requirements add up to what could be seen as a substantial burden for any small organization. AZVA's small central staff chooses to look at it differently. Although the requirements generate year-round work for AZVA employees, they have made the most of these activities by using them for their own purposes, too. Each of the many mandated evaluation activities serves an internal purpose: Staff members pore over test scores, course completion data, and user satisfaction data to determine how they can improve their program.
The SIP is used as a guiding document to organize information about what aspects of the program need fixing and to monitor the school's progress toward its stated goals. Although the process is time-consuming, everyone benefits: K12 Inc. is assured that the school is doing what it should, AZVA has a structured approach to improving its program, and it can demonstrate to the state and others that student performance is meeting expectations.

AZVA is able to make the most of its mandatory evaluations because the school's culture supports it: Staff members incorporate data collection and analysis into their everyday responsibilities, rather than viewing them as extra burdens on their workload. Furthermore, AZVA's leaders initiate data collection and analysis efforts of their own. They frequently conduct online surveys of parents to gauge the effectiveness of particular services. More recently, they also have begun to survey teachers about their professional development needs and their satisfaction with the trainings provided to them. "We're trying to do surveys after every single professional development [session], to find out what was most effective," says Gifford. "Do they want more of this, less of this? Was this too much time? Was this enough time? That feedback has been very good." AZVA's K–8 principal, Bridget Schleifer, confirms that teachers' responses to the surveys are taken very seriously. "Whenever a survey comes up and we see a need," she says, "we will definitely put that on the agenda for the next month of professional development."

Together, these many efforts provide AZVA with comprehensive information that helps the school address external accountability demands, while also serving internal program improvement objectives. Just as important, AZVA's various evaluation activities are integrated and support each other. For instance, the SIP is based on the findings from the evaluation activities mandated by the state and K12 Inc., and the latter's audit process includes an update on progress made toward SIP goals. More broadly, the formative evaluation activities help the school leaders to set specific academic goals and develop a plan for reaching them, which ultimately helps them improve the achievement outcomes assessed by the state. One lesson AZVA illustrates is how to make the most of evaluations that are initiated externally by treating every data collection activity as an opportunity to learn something valuable that can serve the program.

Summary

As the above program evaluations demonstrate, sometimes the best approach to meeting the needs of multiple stakeholders is being proactive. The steps are straightforward but critical: When considering an evaluation, program leaders should first identify the various stakeholders who will be interested in the evaluation and what they will want to know. They might consider conducting interviews or focus groups to collect this information. Leaders then need to sift through this information and prioritize their assessment goals. They should develop a clear vision for what they want their evaluation to do and work with evaluators to choose an evaluation type that will meet their needs. If it is meant to serve several different stakeholder groups, evaluators and program leaders might decide to conduct a multi-method study that combines formative and summative evaluation activities. They might also consider developing a multiyear evaluation plan that addresses separate goals in different years. In the reporting phase, program leaders and evaluators can consider communicating findings to different stakeholder audiences in ways that are tailored to their needs and interests.

In instances where online programs participate in mandatory evaluations, program leaders should seek to build on these efforts and use them for internal purposes as well. They can leverage the information learned in a summative evaluation to improve the program, acquire funding, or establish the program's credibility. Evaluators can help program leaders piggyback on any mandatory assessment activities by selecting complementary evaluation methods that will provide not just the required data but also information that program staff can use for their own improvement purposes.

Building on the Existing Base of Knowledge

Under normal circumstances, evaluators frequently begin their work by reviewing available research literature. They also may search for a conceptual framework among similar studies or look for existing data collection tools, such as surveys and rubrics, that can be borrowed or adapted. Yet, compared to many other topics in K–12 education, the body of research literature on K–12 online learning is relatively new and narrow. Available descriptive studies are often very specific and offer findings that are not easily generalized to other online programs or resources. Empirical studies are few. Other kinds of tools for evaluators are limited, too. Recent efforts have led to multiple sets of standards for K–12 online learning (see Standards for K–12 Online Learning, p. 13). However, there still are no widely accepted education program outcome measures, making it difficult for evaluators to gauge success relative to other online or traditional programs.

Given the existing base of knowledge on K–12 online learning, how should evaluators proceed? Of course, evaluators will first want to consult the K–12 online learning research that does exist. Although the field is comparatively limited, it is growing each year and already has generated a number of significant resources (see appendix A, p. 59). Among other organizations, the North American Council for Online Learning (NACOL) has developed and collected dozens of publications, research studies, and other resources useful for evaluators of K–12 online learning programs. In some cases, evaluators also may want to look to higher education organizations, which have a richer literature on online learning evaluation, including several publications that identify standards and best practices (see appendix A). In some cases, these resources can be adapted for K–12 settings, but in other cases, researchers have found, they do not translate well.

Another approach is to develop evaluation tools and techniques from scratch. As described below, this may be as simple as defining and standardizing the outcome measures used among multiple schools or vendors, like the evaluators of Digital Learning Commons did. Or it may be a much more ambitious effort, as when Appleton eSchool's leaders developed a new model for evaluating virtual schools with an online system for compiling evaluation data. Finally, some evaluators respond to a limited knowledge base by adding to it. For example, the evaluators of Louisiana's Algebra I Online program have published their evaluation findings for the benefit of other evaluators and program administrators.
Standards for K–12 Online Learning
As online learning in K–12 education has expanded, there has been an effort to begin to codify current best prac-
tices into a set of standards that educators can look to in guiding their own performance. Several organizations
have released sets of these emerging standards based on practitioner input to date.
In late 2006, the Educational Technology Cooperative of the Southern Regional Education Board (SREB) issued
Standards for Quality Online Courses. These standards are available along with other key resources from SREB at
http://www.sreb.org/programs/EdTech/SVS/index.asp.
A year later, NACOL published National Standards of Quality for Online Courses, which endorsed SREB’s standards
and added a few others. The national standards cover six broad topic areas: course content, instructional design,
student assessment, technology, course evaluation and management, and 21st-century skills.
NACOL also developed National Standards for Quality Online Teaching in 2008 and is currently working on pro-
gram standards. The standards for online courses and teaching can be found at http://www.nacol.org along with
many other resources.
The National Education Association also has published standards for online courses and online teachers, both
available at http://www.nea.org.

In a different but equally helpful fashion, the leaders of Appleton eSchool have contributed to the field, by developing a Web site that allows administrators of online programs to share their best practices in a public forum.

Clearly Define Outcome Measures

Evaluators must be able to clearly articulate key program goals, define outcomes that align with them, and then identify specific outcome measures that can be used to track the program's progress in meeting the goals. Presently, however, outcome measures for evaluating online learning programs are not consistently defined, which makes it difficult for stakeholders to gauge a program's success, compare it to other programs, or set improvement goals that are based on the experience of other programs. Furthermore, the lack of consistent outcome measures creates technical headaches for evaluators. A recent article coauthored by Liz Pape, president and chief executive officer of Virtual High School Global Consortium, a nonprofit network of online schools, describes the problem this way:

    Although standards for online course and program effectiveness have been identified, data-driven yardsticks for measuring against those standards are not generally agreed upon or in use. There is no general agreement about what to measure and how to measure. Even for measures that most programs use, such as course completion rates, there is variation in the metrics because the online programs that measure course completion rates do not measure in the same manner.3

The evaluators of Washington state's Digital Learning Commons (DLC) encountered just such a problem when attempting to calculate the number of online course-takers served by the program. DLC is a centrally hosted Web portal that offers a wide range of online courses from numerous private vendors. When evaluators from Cohen Research and Evaluation tried to analyze DLC's course-taking and completion rates, they found a range of reporting practices among the vendors: Some tracked student participation throughout the course, while others reported only on the number of students who completed a course and received a final grade. Also, in some cases, vendors did not differentiate between students who withdrew from a course and students who received an F, conflating two different student outcomes that might have distinct implications for program improvement.

Pape et al. note additional problems in the ways that course completion is defined:

    When does the measure begin? How is completion defined? Do students have a "no penalty" period of enrollment in the online course during which they may drop from the course and will not be considered when calculating the course completion rate? Is completion defined as a grade of 60 or 65? How are students who withdrew from the course after the "no penalty" period counted, especially if they withdrew with a passing grade?4

Following the recommendations of their evaluators, DLC staff made efforts to communicate definitions of course completion and withdrawal that were internally consistent and made sure each vendor was reporting accurate data based on these conversations. The result was a higher level of consistency and accuracy in data reporting.
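To make the measurement problem concrete, the sketch below applies two plausible completion-rate definitions to the same invented enrollment records. The record format, the no-penalty drop flag, and the passing threshold of 60 are assumptions made for illustration, not details drawn from DLC or its vendors; the point is simply that the same roster can yield quite different rates depending on the definition used.

    # Illustration with invented data: the same enrollments, two completion-rate definitions.
    # Each record is (final_grade or None if no grade was issued, dropped_during_no_penalty_period).
    enrollments = [
        (92, False), (68, False), (55, False),  # finished the course with a grade
        (None, True), (None, True),             # dropped during the no-penalty period
        (None, False),                          # withdrew later without a grade
        (0, False),                             # stopped working and received an F
    ]

    # Definition A: every enrollment counts; "completed" means any final grade was issued.
    completed_a = sum(1 for grade, _ in enrollments if grade is not None)
    rate_a = completed_a / len(enrollments)

    # Definition B: no-penalty drops are excluded; "completed" means a final grade of 60 or higher.
    eligible = [(grade, dropped) for grade, dropped in enrollments if not dropped]
    completed_b = sum(1 for grade, _ in eligible if grade is not None and grade >= 60)
    rate_b = completed_b / len(eligible)

    print(f"Definition A completion rate: {rate_a:.0%}")  # 57% of 7 enrollments
    print(f"Definition B completion rate: {rate_b:.0%}")  # 40% of 5 eligible enrollments

Stating which definition is in use, as the DLC evaluators ultimately asked each vendor to do, is what makes completion rates comparable across data sources.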
There is growing attention to the problem of undefined outcome measures in the field of evaluating online learning. A 2004 report by Cathy Cavanaugh et al. specifically recommended that standards be developed "for reporting the academic and programmatic outcomes of distance learning programs."5 A NACOL effort is under way to develop benchmarks for measuring program effectiveness and overall standards for program quality. Meanwhile, the best evaluators can do is to ensure internal consistency in the outcome measures used across all of their own data sources. Although the research base is limited, evaluators may be able to find similar studies for ideas on how to define outcomes. Moving forward, online program evaluators can proactively find ways to share research methods and definitions and reach a consensus on the best ways to measure program effectiveness.

Work Collaboratively to Develop New Evaluation Tools

In the early years of Appleton eSchool, school leaders became aware that they needed an overall evaluation system to determine the school's strengths and weaknesses. After failing to find an existing comprehensive tool that would fit their needs, Ben Vogel, Appleton's principal and Governance Board chair, and Connie Radtke, Appleton's program leader, began to develop their own evaluation process. Their goal was to design an instrument that would identify the core components necessary for students to be successful in an online program. In addition, they wanted to create a process that could be used to prompt dialogue among program leaders, staff, governance board members, and external colleagues, about the components of a successful online learning experience, in order to provide direction for future growth and enhancement.

Through extensive internal discussions and consultation with external colleagues, including staff from such online programs as Virtual High School and Florida Virtual School, Vogel and Radtke identified eight key program components and developed a rubric for measuring them called the Online Program Perceiver Instrument (OPPI). (See Key Components of Appleton eSchool's Online Program Perceiver Instrument, p. 16.) Vogel says, "[Our goal was] to figure out those universal core components that are necessary in a K–12 online program to allow students to be successful.… We were able to share [our initial thoughts] with other people in the K–12 realm, and say, 'What are we missing? And what other pieces should we add?' It just kind of developed from there."

The core components identified in the OPPI are what Appleton leaders see as the essential building blocks for supporting students in an online environment. The eight components address the online program user's entire experience, from first learning about the course or program to completing it.

When developing the OPPI, Vogel and Radtke researched many existing online program evaluations in higher education, but found them insufficient for building a comprehensive rubric at the K–12 level. Vogel notes, for example, that having face-to-face mentors or coaches for students taking online courses is critical at the K–12 level, whereas it is not considered so important for older students who are studying online in higher education. To capture this program element, Vogel and Radtke included "Program Support" as a key component in the OPPI rubric, focusing on the training given to mentors (usually parents in the Appleton eSchool model) as well as training for local school contacts who support and coordinate local student access to online courses (see table 2, Excerpt from Appleton eSchool's Online Program Perceiver Instrument, p. 17). To assess mentor perception of program quality, the evaluators surveyed them following the completion of each online course.

As part of the OPPI process, Vogel and Radtke developed a three-phase approach to internal evaluation, which they refer to as a "self-discovery" process. In the Discovery Phase, program personnel fill out a report that describes the school's practices in each of the eight areas identified in the OPPI. Then program decision-makers use the rubric to determine what level of program performance is being attained for each element: deficient, developing, proficient, or exemplary. In addition, program leaders e-mail surveys to students, mentors (usually parents), and teachers at the end of each course, giving them an opportunity to comment on the program's performance in each of the eight OPPI areas. In the Outcome Phase, results from the Discovery Phase report and surveys are summarized, generating a numerical rating in each program area. At the same time, information on student outcomes is reviewed, including student grades, grade point averages, and course completion rates. Program decision-makers synthesize all of this data in an outcome sheet and use it to set goals for future growth and development. Finally, in the Sharing of Best Practices Phase, program leaders may select particular practices to share with other programs. Appleton has partnered with other online programs to form the Wisconsin eSchool Network, a consortium of virtual schools that share resources. The Network's Web site includes a Best Practices Portfolio and schools using the OPPI are invited to submit examples from their evaluations.6
Key Components of Appleton eSchool’s Online Program


Perceiver Instrument (OPPI)
Practitioners and program administrators use the OPPI to evaluate program performance in eight different areas:
1. Program Information: System provides and updates information necessary for prospective users to understand the
program being offered and determine whether it may be a good fit for students.
2. Program Orientation: System provides an introduction or orientation that prepares students to be successful in the
online course.
3. Program Technology: System provides and supports program users’ hardware and software needs in the online
environment.
4. Program Curriculum: System provides and supports an interactive curriculum for the online course.
5. Program Teaching: System provides and supports teaching personnel dedicated to online learning and their
online students.
6. Characteristics and Skills Displayed by Successful Online Students: System identifies and provides opportunities
for students to practice characteristics necessary for success in an online environment.
7. Program Support: System provides and supports a system of support for all online students and mentors (e.g.,
parents) and coaches.
8. Program Data Collection: System collects data and uses that data to inform program decision-makers and share
information with other programs.

portfolio, and participating schools are cited for performance). They also invested about $15,000
their contributions. The entire evaluation system of their program grant funds to pay a Web de-
is Web-based, allowing for streamlined data col- veloper to design the online system for compil-
lection, analysis, and sharing. ing and displaying evaluation data. For others at-
tempting to develop this type of tailored rubric
Appleton’s decision to develop its own evaluation
and process, accessing outside expertise is critical
rubric and process provides several advantages.
to fill gaps in knowledge or capacity. Appleton
Besides resulting in a perfectly tailored evaluation
leaders collaborated extensively with experienced
process, Appleton leaders also have the ability to
colleagues from other virtual schools, particularly
evaluate their program at any time without waiting
as they were developing their rubric.
for funding or relying on a third-party evaluator.
Still, developing an evaluation process is typically
Share Evaluation Findings With Other
expensive and may not be a practical option for
Programs and Evaluators
many programs. Appleton’s leaders spent many
hours researching and developing outcome mea- Although the OPPI rubric was developed specifi-
sures (i.e., the descriptions of practice for each cally for Appleton, from the beginning Vogel and
program component under each level of program Radtke intended to share it with other programs.

16
Table 2. Excerpt From Appleton eSchool’s Online Program Perceiver Instrument*

Component 7. Program Support. The purpose of the support network is to provide additional support
to the student that complements the instructor. This includes not only a support person from home,
but also other school resources that may include a counselor, social worker, etc.

LEVEL OF PROGRAM PERFORMANCE


ELEMENT DEFICIENT DEVELOPING PROFICIENT EXEMPLARY
Overall Student No student support Student support struc- Student support structure Exemplary student sup-
­Support Structure Plan structure plan in place ture plan is in place plan is in place and is port structure plan is in
or the plan is poorly but parts of the plan clear in its purpose and place and is a proven
written and/or incom- are incomplete or may objectives. The plan model for assuring stu-
plete. Students are not be unclear. The plan provides for resources dents have necessary
receiving necessary provides for resources necessary for student support and ongoing
resources. necessary for student success. support.
success but some gaps
in the plan may exist.
Mentor responsibilities Mentor responsibilities Mentor responsibilities Mentor responsibilities Mentor responsibilities
have been developed have not been devel- have been developed are well written, clear, are extremely well writ-
and shared oped or are incomplete and shared but parts and have been shared ten, clear, and shared in
and/or unclear or have may be unclear and/ with all mentors. numerous formats with
not been appropriately or all mentors have not all mentors.
shared. received the information.
Mentors are provided No mentor training pro- Mentor training program Mentor training program Mentor training program
necessary training gram provided or there is provided and there is provided and there is provided and there
and support are no written objectives are written objectives are objectives that out- are extremely clear and
that clearly outline the that outline the purpose line the purpose of the concise objectives that
purpose of the men- of the mentor training. mentor training. Few outline the purpose of
tor training program. Some objectives may objectives are unclear. the mentor training pro-
No ongoing support is be unclear. Ongoing Ongoing support is gram. Ongoing support
available. support is available, but available. is consistently provided.
may be inconsistent.
Mentors provide No system in place to System in place to System in place to mon- Proven system in place
positive support for monitor mentor support monitor mentor support itor mentor support and/ to monitor mentor sup-
student and/or concerns that and/or concerns that or concerns that may port and/or concerns
may arise. may arise, but system arise. System is reliable that may arise. The
may break down from most of the time. system provides op-
time to time. portunities for two-way
communication.
Mentors have the abil- No system in place to System in place to System in place to allow Proven system in
ity to communicate allow mentors to com- allow mentors to com- mentors to communi- place to allow men-
with teacher municate with teacher in municate with teacher cate with teacher in a tors to communicate
a timely manner. in a timely manner but timely manner. System with teacher in a timely
system may break down is reliable most of the manner. The system
from time to time. time. provides check-ins with
teacher on a regular
basis.

* The U.S. Department of Education does not mandate or prescribe particular curricula or lesson plans. The information in this table was provided by the
­identified site or program and is included here as an illustration of only one of many resources that educators may find helpful and use at their option. The
Department cannot ensure its accuracy. Furthermore, the inclusion of information in this table does not reflect the relevance, timeliness, or completeness of this
information; nor is it intended to endorse any views, approaches, products, or services mentioned in the table.

17
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

This rubric is currently accessible free of charge teacher who may not be certified to deliver al-
through the Web site of the Wisconsin eSchool gebra instruction. But once in this classroom,
Network, described above.* Vogel explains that each student has his or her own computer and
the OPPI and its umbrella evaluation system are participates in an online class delivered by a
readily adaptable to other programs: “Internally, highly qualified (i.e., certified) algebra teacher.
this system doesn’t ask people to have an over- The in-class teacher gives students face-to-face
whelming amount of knowledge. It allows peo- assistance, oversees lab activities, proctors tests,
ple to make tweaks as needed for their particu- and is generally responsible for maintaining an
lar programs, but they don’t have to create the atmosphere that is conducive to learning. The
whole wheel over again [by designing their own online teacher delivers the algebra instruction,
evaluation system].” The OPPI system also allows answers students’ questions via an online dis-
for aggregation of results across multiple pro- cussion board, grades assignments via e-mail,
grams—a mechanism that would allow groups of provides students with feedback on homework
schools in a state, for example, to analyze their and tests, and submits grades. The online and
combined data. To assist schools using the OPPI in-class teachers communicate frequently with
for the first time, Appleton offers consultation each other to discuss students’ progress and col-
services to teach other users how to interpret and laborate on how to help students learn the par-
communicate key findings. Through their efforts ticular content being covered. This interaction
to share their evaluation tool and create the on- between teachers not only benefits students; it
line forum, Appleton leaders have developed an also serves as a form of professional develop-
efficient and innovative way to build the knowl- ment for the in-class instructors. In addition to
edge base on online learning programs. providing all students with high-quality algebra
instruction, a secondary goal of the program
In Louisiana, evaluators from Education Devel- is to increase the instructional skills of the in-
opment Center (EDC) have used more conven- class teachers and support them in earning their
tional channels for sharing findings from the mathematics teaching certificate.
evaluation of the state’s Algebra I Online pro-
gram. This program was created by the Louisiana Although its founders believed that the Alge-
Department of Education to address the state’s bra I Online model offered great promise for
shortage of highly qualified algebra teachers, es- addressing Louisiana’s shortage of mathemat-
pecially in urban and rural settings. In addition, ics teachers, when the program was launched
districts desiring to provide certified teachers in 2002 they had no evidence to back up this
access to pedagogy training and mentoring so belief. The key question was whether such a
they can build capacity for strong mathematics program could provide students with learning
instruction are eligible to participate. In Alge- opportunities that were as effective as those
bra I Online courses, students physically attend in traditional settings. If it were as effective,
class in a standard bricks-and-mortar classroom the program could provide a timely and cost-
at their home school, which is managed by a effective solution for the mathematics teacher
* Registration is required to access the OPPI and some consultation with its developers
­shortage. ­Louisiana needed hard evidence to
may be needed to implement the process fully. show whether the program was credible.

18
Following a number of internal evaluation activi- learning experiences differently than students in
ties during the program’s first two years, in 2004 traditional classrooms, the evaluators surveyed
the program’s leaders engaged an external eval- students in both environments and conducted
uation team consisting of Rebecca Carey of EDC, observations in half of the treatment classrooms.
an organization with experience in researching In total, the evaluators studied Algebra I Online
online learning; Laura O’Dwyer of Boston Col- courses and face-to-face courses in six districts.
lege’s Lynch School of Education; and Glenn
Kleiman of the Friday Institute for Educational After completing their assessment, the evaluators
Innovation at North Carolina State University, produced final reports for the Louisiana Depart-
College of Education. The evaluators were im- ment of Education and NCREL and later wrote
pressed by the program leaders’ willingness to two articles about the program for professional
undergo a rigorous evaluation. “We didn’t have journals. The first article, published in the Jour-
to do a lot of convincing,” says EDC’s Carey. nal of Research on Technology in Education,7
“They wanted it to be as rigorous as possible, described Algebra I Online as a viable model for
which was great and, I think, a little bit unusual.” providing effective algebra instruction. In the
The program also was given a boost in the form study, online students showed comparable (and
of a grant from the North Central Regional Edu- sometimes stronger) test scores, stayed on task,
cational Laboratory (NCREL), a federally funded and spent more time interacting with classmates
education laboratory. The grant funded primary about math content than students in traditional
research on the effectiveness of online learning classroom settings. The evaluators speculated
and provided the project with $75,000 beyond its that this was a result of the program’s unique
initial $35,000 evaluation budget from the state model, which brings the online students togeth-
legislature. The additional funding allowed EDC er with their peers at a regularly scheduled time.
to add focus groups and in-class observations, as The evaluators found a few areas for concern as
well as to augment its own evaluation capacity well. For example, a higher percentage of on-
by hiring an external consultant with extensive line students reported that they did not have a
expertise in research methodology and analysis. good learning experience, a finding that is both
supported and contradicted by research studies
The EDC evaluators chose a quasi-experimen- on online learning from higher education. The
tal design (see Glossary of Common Evaluation evaluation also found that the Algebra I Online
Terms, p. 65) to compare students enrolled in students felt less confident in their algebra skills
the online algebra program with those studying than did traditional students, a finding the evalu-
algebra only in a traditional face-to-face class- ators feel is particularly ripe for further research
room format. To examine the impact of the Alge- efforts. (For additional discussion, see Interpret-
bra I Online course, they used hierarchical linear ing the Impact of Program Maturity, p. 40.)
modeling to analyze posttest scores and other
data collected from the treatment and control The Algebra I Online evaluators wrote and pub-
groups. To determine if students in online learn- lished a second article in the Journal of Asyn-
ing programs engaged in different types of peer- chronous Technologies that focused on the pro-
to-peer interactions and if they perceived their gram as a professional development model for

19
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

uncertified or inexperienced math teachers.8 Given the lack of commonly used outcome mea-
In this piece, the evaluators described the pro- sures for online learning evaluations, individual
grams’ pairing of online and in-class teachers as programs should at least strive for internal consis-
a “viable online model for providing [the in-class] tency, as DLC has. If working with multiple ven-
teachers with an effective model for authentic dors or school sites, online program leaders need
and embedded professional development that is to articulate a clear set of business rules for what
relevant to their classroom experiences.” data are to be collected and how, distributing
these guidelines to all parties who are collecting
Of course, not all programs will have the re- information. Looking ahead, without these com-
sources to contribute to the research literature in mon guidelines, evaluators will be hard pressed
this manner. In Louisiana’s case, the evaluators to compare their program’s outcomes with oth-
had extensive expertise in online evaluation and ers. Some of the evaluators featured in this guide
took the time and initiative required for publish- have made contributions to the field of online
ing their findings in academic journals. In so learning evaluation, like Appleton’s leaders, who
doing, they served two purposes: providing the developed an evaluation model that can be bor-
program leaders with the evidence they needed rowed or adapted by other programs, and the
to confidently proceed with Algebra I Online evaluators of Algebra I Online, who published
and publishing much-needed research to states their study findings in professional journals.
that might be considering similar approaches.

Summary Evaluating Multifaceted Online


The literature on K–12 online learning is grow-
Resources
ing. Several publications and resources docu- Like many traditional education programs, on-
ment emerging best practices and policies in line learning resources sometimes offer par-
online learning (see appendix A). For program ticipants a wide range of learning experiences.
leaders and evaluators who are developing an Their multifaceted offerings are a boon for stu-
evaluation, the quality standards from SREB and dents or teachers with diverse interests, but can
NACOL provide a basic framework for looking at be a dilemma for evaluators seeking uniform
the quality of online courses and teachers. There findings about effectiveness. In the case of an
also is a growing body of studies from which educational Web site like Washington’s DLC, for
evaluators can draw lessons and adapt methods; example, different types of users will explore
evaluators need not reinvent the wheel. At the different resources; some students may take an
same time, they must exercise caution when ap- online course while others may be researching
plying findings, assumptions, or methods from colleges or seeking a mentor. Virtual schools that
other studies, as online programs and resources use multiple course providers present a similar
vary tremendously in whom they serve, what conundrum, and even the same online course
they offer, and how they offer it. What works may offer differentiated learning experiences if,
best for one program evaluation may not be for example, students initiate more or less con-
­appropriate for another. tact with the course instructor or receive varying

20
degrees of face-to-face support from a parent that sign up to use DLC, the program provides
or coach. (A similar lack of uniformity can be training for school personnel to assist them in
found in traditional settings with different in- implementing the Web site’s resources.
structors using varying instructional models.)
Initially, DLC’s evaluation strategy was to col-
When faced with a multifaceted resource, how lect broad information about how the Web site
is an evaluator to understand and document the is used. Later, program leaders shifted their
online learning experience, much less deter- strategy to focus on fewer and narrower topics
mine what value it adds? that could substantiate the program’s efficacy.
The evaluators focused on student achieve-
Several of the evaluations featured in this guide
ment in the online courses and on school-level
encountered this issue, albeit in distinct ways.
supports for educators to help them make the
DLC evaluators were challenged to assess how
most of DLC’s resources. Together, the series of
students experienced and benefited from the
DLC evaluations—there have been at least five
Web site’s broad range of resources. Evaluators
distinct efforts to date—combine breadth and
of Maryland Public Television’s Thinkport Web
depth, have built on each other’s findings from
site, with its extensive teacher and student re-
year to year, and have produced important for-
sources from many providers, similarly struggled
mative and summative findings (see Glossary of
to assess its impact on student achievement. In
Common Evaluation Terms, p. 65).
a very different example, the evaluators for the
Arizona Virtual Academy (AZVA) faced the chal- In the project’s first year, Debra Friedman, a lead
lenge of evaluating a hybrid course that includ- administrator at the University of Washington (a
ed both online and face-to-face components DLC partner organization), conducted an evalu-
and in which students’ individual experiences ation that sought information on whom DLC
varied considerably.
serves and what school conditions and poli-
cies best support its use. To answer these ques-
Combine Breadth and Depth to Evaluate tions, the evaluators selected methods designed
Resource-rich Web Sites
to elicit information directly from participants,
With its wide range of services and resources including discussions with DLC administrators,
for students and teachers, DLC is a sprawling, board members, school leaders, and teachers, as
diverse project to evaluate. Through this cen- well as student and teacher surveys that asked
trally hosted Web site, students can access over about their use of computers and the Internet
300 online courses, including all core subjects and about the utility of the DLC training. The
and various electives, plus Advanced Placement evaluator also looked at a few indicators of stu-
(AP) and English as a Second Language courses. dent achievement, such as the grades that stu-
DLC also offers students online mentors, college dents received for DLC online courses.
and career planning resources, and an extensive
digital library. In addition, DLC offers other re- The evaluation yielded broad findings about
sources and tools for teachers, including online operational issues and noted the need for DLC
curricula, activities, and diagnostics. For schools to prioritize among its many purposes and

21
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

a­ udiences. It also uncovered an important find- administrators (primarily to help develop survey
ing about student achievement in the online questions). To gain insight into how well stu-
courses: The greatest percentage of students dents were performing in the classes, the evalu-
(52 percent) received Fs, but the next greatest ators analyzed grades and completion rates for
percentage (37 percent) received As. To explain students enrolled in DLC courses. The evalu-
these outcomes, the evaluator pointed to the lack ation activities conducted in the second year
of uniformity in students’ motivation and needs, again pointed to the need for more school-level
and the type of academic support available to support for using DLC resources. The evalua-
them. The evaluator also noted the varying qual- tors found that some schools were excited and
ity of the vendors who provided courses, finding committed to using the Web site’s resources,
that “some vendors are flexible and responsive but were underutilizing it because they lacked
to students’ needs; others are notably inflexible. sufficient structures, such as training and time
Some are highly professional operations, others for teachers to learn about its offerings, inter-
less so.”9 This evaluation also described substan- nal communication mechanisms to track student
tial variation in how well schools were able to progress, and adequate technical support.
support the use of DLC resources. The findings
helped program administrators who, in response, When DLC’s leaders began to contemplate a
stepped up their efforts to train educators about third-year evaluation, they wanted more than
the Web portal’s resources and how to use it basic outcome data, such as student grades and
with students. The evaluation also was a jump- completion rates. “We can count how many
ing-off point for future assessments that would courses, we know the favorite subjects, and we
follow up on the themes of student achievement know the grade averages and all of that,” says
in the online courses and supports for educators Judy Margrath-Huge, DLC president and chief ex-
to help them take advantage of DLC’s offerings. ecutive officer. What they needed, she explains,
was to get at the “so what,” meaning they wanted
In the project’s second year, project leaders to understand “what difference [DLC] makes.”
launched another evaluation. This effort con-
sisted of student focus groups to identify stu- The evaluation team knew that if its effort were
dents’ expectations of DLC’s online courses, stu- to produce reliable information about DLC’s in-
dent pre-course preparation, overall experience, fluence on student achievement, it would need
and suggestions for improving the courses and to zero in on one, or just a few, of the Web
providing better support. As a separate effort, site’s many components. Some features—DLC’s
they also contracted with an independent evalu- vast digital library, for example—simply were
ator, Cohen Research and Evaluation, to learn not good candidates for the kind of study they
more about the behavior and motivations of planned to conduct. As Karl Nelson, DLC’s di-
students and other users, such as school librar- rector of technology and operations, explains,
ians, teachers, and administrators. This aspect of “It is very difficult to evaluate the effectiveness
the evaluation consisted of online surveys with of and to answer a ‘so what’ question about a
students, teachers, and school librarians; and in- library database, for example. It’s just hard to
terviews with selected teachers, librarians, and point to a kid using a library database and then

22
a test score going up.” Ultimately, says Nelson, Learning Commons’ Meeting 21st Century Learn-
DLC’s leaders chose to look primarily at the on- ing Challenges in Washington State, p. 24).
line courses, believing that this was the feature
they could best evaluate. It would be impossible to conduct a compre-
hensive evaluation of everything that DLC has
DLC leaders hired outside evaluators, Fouts & to offer, but certainly the evaluation strategy
Associates, to help them drill down into a spe- of combining breadth and depth has given it a
cific aspect of student achievement—determin- great deal of useful information. DLC’s leaders
ing the role that DLC online courses play in: 1) have used the findings from all the evaluations
enabling students to graduate from high school to improve their offerings and to demonstrate
and 2) helping students become eligible and ful- effectiveness to funders and other stakeholders.
ly prepared for college. In this evaluation, the
researchers identified a sample of 115 graduated In Maryland, Thinkport evaluators faced a similar
seniors from 17 schools who had completed DLC challenge in trying to study a vast Web site that
courses. The evaluators visited the schools to compiles educational resources for teachers and
better understand online course-taking policies students. At first, the evaluation team from Mac-
and graduation requirements and to identify DLC ro International, a research, management, and
courses on the transcripts of these 115 students. information technology firm, conducted such ac-
At the school sites, evaluators interviewed school tivities as gathering satisfaction data, reviewing
coordinators and examined student achievement Web site content, and documenting how the site
data, student transcripts, and DLC documents. was used. But over time, project leaders were
asked by funders to provide more concrete evi-
The evaluation gave DLC’s leaders what they dence about Thinkport’s impact on student per-
wanted: concrete evidence of the impact of formance. The evaluation (and the project itself)
DLC online courses. This study showed that 76 had to evolve to meet this demand.
percent of students who took an online course
through DLC did so because the class was not In response, the team decided to “retrofit” the
available at their school and that one-third of the evaluation in 2005, settling on a two-part evalu-
students in the study would not have graduated ation that would offer both breadth and depth.
without the credits from their online course. This First, the evaluators surveyed all registered us-
and other evaluation findings, show that “we are ers about site usage and satisfaction, and sec-
meeting our mission,” says Margrath-Huge. “We ond, they designed a randomized controlled
are accomplishing what we were set out to ac- trial (see Glossary of Common Evaluation Terms,
complish. And it’s really important for us to be p. 65) to study how one of the site’s most popu-
able to stand and deliver those kinds of messages lar features—an “electronic field trip”—affected
with that kind of data behind us.” DLC has used students’ learning. Several field trips had been
its evaluation findings in multiple ways, includ- developed under this grant; the one selected was
ing when marketing the program to outsiders, Pathways to Freedom, about slavery and the Un-
to demonstrate its range of offerings and its ef- derground Railroad. This particular product was
fect on students (see fig. 1, Excerpt from Digital chosen for a number of reasons: most middle

23
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

Figure 1. Excerpt From Digital Learning Commons’ Meeting 21st Century


Learning Challenges in Washington State*

* The U.S. Department of Education does not mandate or prescribe particular curricula or lesson plans. The information in this figure was provided by the
identified site or program and is included here as an illustration of only one of many resources that educators may find helpful and use at their option. The
Department cannot ensure its accuracy. Furthermore, the inclusion of information in this figure does not reflect the relevance, timeliness, or completeness of this
information; nor is it intended to endorse any views, approaches, products, or services mentioned in the figure.

24
school social studies curricula include the top- challenge is found at the micro level, as when
ic; the evaluators had observed the field trip in students have heterogeneous experiences in
classrooms in an earlier formative study and were the same online class. The leaders of Arizona’s
aware of students’ high interest in its topics and AZVA struggled with this problem when they set
activities; and Web site statistics indicated that it out to evaluate one of their online courses.
was a heavily viewed and utilized resource.
In 2006, AZVA school leaders began to ex-
The evaluators gave pre- and posttests of content periment with hybrid courses—regular online
knowledge about slavery and the Underground classes supplemented with weekly face-to-face
Railroad to students whose teachers used the lessons from a classroom teacher. The in-per-
electronic field trip and control groups of stu- son component was originally designed in a
dents whose teachers used traditional instruction. very structured way: Students received class-
They found that the electronic field trip had a room instruction every week at a specific time
very substantial positive impact on student learn- and location, and they had to commit to this
ing, particularly among students whose teachers weekly instruction for an entire semester. In ad-
had previously used it: These students of experi- dition, students could participate in the hybrid
enced teachers scored 121 percent higher on the class only if they were working either on grade
content knowledge test than the students whose level or no more than one level below grade
teachers used traditional instruction. level. These restrictions allowed the face-to-face
teachers to offer the same lessons to all students
Like the DLC evaluation, the Thinkport evalua-
during the weekly session. School leaders spe-
tion proved useful both for formative and sum-
cifically designed this structure to bring unifor-
mative purposes. Thinkport’s leaders learned
mity to students’ experiences and make it easier
that teachers who were new to the electronic
to evaluate the class. As AZVA’s director, Mary
field trip needed more training and experience
Gifford, explains, “We wanted the hybrid expe-
to successfully incorporate it into their class-
rience to be the same for all the kids so we
rooms. They also learned that once teachers
could actually determine whether or not it is
knew how to use the tool, their students learned
increasing student achievement.”
the unit’s content far better than their peers in
traditional classes. The evaluators’ two-part plan However, when program leaders surveyed par-
gave Thinkport’s leaders what they needed: ents at the semester break, Gifford says, “Par-
broad information about usage and satisfaction ents offered some very specific feedback.” They
and credible evidence that a frequently used didn’t like the semester-long, once-a-week com-
feature has a real impact on students. mitment, and they argued that the structure
prevented students from working at their own
Use Multiple Methods to Capture Wide-ranging pace. Instead, parents wanted a drop-in model
Student Experiences in Online Courses that would offer students more flexibility and
In the examples above, evaluators struggled to tailored assistance. In response, she says, “We
wrap their arms around sprawling Web resourc- totally overhauled the course for the second se-
es that lacked uniformity. Sometimes a similar mester and made it a different kind of a model.”

25
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

In the new format, teachers do not deliver pre- complementary research methods—can identify
pared lessons, but, instead, work with students the circumstances under which the program or
one-on-one or in small groups on any topic with resource is most likely to succeed or fail and can
which a student is struggling. generate useful recommendations for strength-
ening weak points. Evaluators who are studying
While the flexibility of the new model meets
multifaceted resources should consider a strat-
student needs, students naturally will have more
egy that combines both breadth and depth.
varied experiences using it and school leaders
will not be able to isolate its effect on student If studying an educational Web site that offers an
achievement. In other words, observed gains array of resources, evaluators might collect broad
could be due to any number of factors, such as information about site usage and then select one
how frequently the student drops in, whether or two particular features to examine in more
a student works one-on-one with a teacher or depth. Program leaders can facilitate this process
in a small group, and what content is covered by clearly articulating what each resource is in-
during the drop-in session. In this instance, the tended to do, or what outcomes they would hope
needs of program participants necessarily out-
to see if the resource was being used effectively.
weighed those of the evaluation.
From this list, program leaders and evaluators
can work together to determine what to study
However, because the school has a number of
other data collection efforts in place, Gifford and and how. In some instances, it might be logical
her colleagues will still be able to gather infor- to design a multiyear evaluation that focuses on
mation about whether the hybrid model is help- distinct program components in different years,
ing students. School administrators track how or collects broad information in the first year, and
frequently students attend the drop-in class and narrows in focus in subsequent years.
chart students’ academic progress through the
If evaluating a particular course or resource that
curriculum both before and after they partici-
offers students a wide range of experiences,
pate in the hybrid program. AZVA also separate-
evaluators might consider using a mix of quanti-
ly examines state test scores for students who
attend the hybrid program on a regular basis. In tative and qualitative methods to provide a well-
addition, AZVA frequently uses surveys of par- rounded assessment of it. Rich, descriptive in-
ents, students, and teachers to gather informa- formation about students’ experiences with the
tion about the effectiveness of many aspects of course or resource can be useful when trying to
their program, including the hybrid class. These interpret data about student outcomes.
kinds of activities can provide important insights
when controlled studies are impossible.
Finding Appropriate Comparison
Summary Groups
Though multifaceted resources can make it dif- When evaluators have research questions about
ficult for evaluators to gauge effectiveness, good an online program’s impact on student achieve-
evaluations—especially those using multiple, ment, their best strategy for answering those

26
questions is often an experimental design, like with face-to-face learning settings, and they
a randomized controlled trial, or a quasi-experi- took various approaches to the inherent techni-
mental design that requires them to find matched cal challenges of doing so. Despite some dif-
comparison groups. These two designs are the ficulties along the way, the evaluators of online
best, most widely accepted methods for deter- projects in Louisiana, Alabama, and Maryland all
mining program effects (see table 3, p. 28). successfully conducted comparative studies that
yielded important findings for their programs.
Evaluators who use these methods must proceed
carefully. They must first ensure that compari- Identify Well-matched Control Groups for
sons are appropriate, taking into consideration Quasi-experimental Studies
the population served and the program’s goals
and structure. For example, online programs In designing the evaluation for Louisiana’s Alge-
dedicated to credit recovery* would want to com- bra I Online, program leaders in the Louisiana
pare their student outcomes with those of other Department of Education wanted to address the
credit recovery programs because they are serv- bottom line they knew policymakers cared about
ing similiar populations. Evaluators must also be most: whether students in the program’s online
scrupulous in their efforts to find good compari- courses were performing as well as students
son groups—often a challenging task. For a par- studying algebra in a traditional classroom.
ticular online class, there may be no correspond-
To implement the quasi-experimental design of
ing face-to-face class. Or it may be difficult to
their evaluation (see Glossary of Common Evalu-
avoid a self-selection bias if students (or teach-
ation Terms, p. 65), the program’s administrators
ers) have chosen to participate in an online pro-
needed to identify traditional algebra classrooms
gram. Or the online program might serve a wide
to use as controls. The idea was to identify stan-
range of students, possibly from multiple states,
dard algebra courses serving students similar to
with a broad range of ages, or with different lev-
those participating in the Algebra I Online pro-
els of preparation, making it difficult to compare
gram and then give pre- and post-course tests to
to a traditional setting or another online program.
both groups to compare how much each group
Evaluators attempting to conduct a randomized
learned, on average. Of course, in a design such
controlled trial can encounter an even greater
as this, a number of factors besides the class
challenge in devising a way to randomly assign
format (online or face-to-face) could affect stu-
students to receive a treatment (see Glossary of
dents’ performance: student-level factors, such as
Common Evaluation Terms, p. 65) online.
individual ability and home environment; teach-
How might evaluators work around these com- er-level factors, such as experience and skill in
parison group difficulties? teaching algebra; and school-level factors, such
as out-of-class academic supports for students.
Several of the evaluations featured in this guide Clearly, it was important to the evaluation to find
sought to compare distance learning programs the closest matches possible.
* Credit recovery is a way to “recover” credit for a course that a student took previously but
without successfully earning academic credit. Credit recovery programs have a primary
Under a tight timeline, program administrators
focus of helping students stay in school and accumulate the credits needed to graduate. worked with all their participating districts to

27
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

Table 3. Evaluation Design Characteristics

Design Characteristics Advantages Disadvantages

Experimental Incorporates random assignment of participants Most sound Institutional policy guidelines
design to treatment and control groups. The purpose of or valid study may make random assignment
randomization is to ensure that all possible ex- ­design available ­impossible.
planations for changes (measured and unmea- Most accepted
surable) in outcomes are taken into account, in scientific
randomly distributing participants in both the ­community
treatment and control groups so there should be
no systematic baseline differences. Treatment
and control groups are compared on outcome
measures. Any differences in outcomes may be
assumed to be attributable to the intervention.

Quasi- Involves developing a treatment group and More practical in Finding and choosing suitable treatment
experimental a carefully matched comparison group (or most educational and comparison groups can be difficult.
­design groups). Differences in outcomes between the settings Because of nonrandom group
treatment and comparison groups are analyzed, Widely ­accepted ­assignment, the outcomes of interest in
controlling for baseline differences between them in scientific the study may have been influenced not
on background characteristics and variables of ­community only by the treatment but also by vari-
interest. ables not studied.

Source: Adapted from U.S. Department of Education, Mobilizing for Evidence-Based Character Education (2007). Available from http://www.ed.gov/programs/charactered/­
mobilizing.pdf.

identify traditional classrooms that were matched their credit, [the program’s administrators] did the
demographically to the online classes and, then, best they could in the amount of time they had.”
­administered a pretest of general mathematics Still, she adds, “the matches weren’t necessarily
ability to students in both sets of classes. External that great across the control and the experimen-
evaluators from the Education Development Cen- tal. … Had we been involved from the beginning,
ter (EDC) joined the effort at this point. Although we might have been a little bit more stringent
impressed by the work that the program staff had about how the control schools would match to
accomplished given time and budget constraints, the intervention schools and, maybe, have made
the EDC evaluators were concerned about the the selection process a little bit more rigorous.”
quality of matches between the ­treatment and
Ultimately, the evaluation team used students’
control classes. Finding good matches is a diffi-
pretest scores to gauge whether the treatment
cult task under the best of circumstances and, in
and control group students started the course
this case, it proved even more difficult for non-
with comparable skills and knowledge and em-
evaluators, that is, program and school adminis-
ployed advanced statistical techniques to help
trators. The quality of the matches was especially control for some of the poor matches.
problematic in small districts or nonpublic schools
that had fewer control classrooms from which to In their report, the evaluators also provided data
choose. EDC evaluator Rebecca Carey says, “To on differences between the control and ­treatment

28
groups (e.g., student characteristics, state test teachers, a problem they would have had if the
scores in math, size of the school), and they treatment and control classes were taught by
drew from other data sources (including surveys different people. Martha Donaldson, ACCESS’s
and observations) to triangulate their findings lead program administrator, says, “We were
(see Common Problems When Comparing On- looking to see if it makes a difference whether
line Programs to Face-to-Face Programs, p. 30). students are face-to-face with the teacher or if
To other programs considering a comparative they’re receiving instruction in another part of
study, the Algebra I Online evaluators recom- the state via the distance learning equipment.”
mend involving the evaluation team early in the To compare performance between the two
planning process and having them supervise the groups of students, evaluators gathered a range
matching of treatment and control groups. of data, including grades, scores on Advanced
Placement tests, if relevant, and enrollment and
The evaluators of Alabama’s Alabama Connect- dropout data. The design had some added logis-
ing Classrooms, Educators, & Students Statewide tical benefits for the evaluators: it was easier to
Distance Learning (ACCESS) initiative similarly have the control classrooms come from schools
planned a quasi-experimental design and needed participating in ACCESS, rather than having to
traditional classes to use as ­matches. ACCESS pro- collect data from people who were unfamiliar
vides a wide range of distance courses, including with the program.
core courses, electives, remedial courses, and
advanced courses, which are either Web-based, Despite these benefits, the strategy of using IVC
utilize interactive videoconferencing (IVC) plat- sending sites as control groups did have a few
forms, or use a combination of both technolo- drawbacks. For instance, the evaluators were
gies. In the case of IVC courses, distance learners not able to match treatment and control groups
receive instruction from a teacher who is deliver- on characteristics that might be important, such
ing a face-to-face class at one location while the as student- and school-level factors. It is pos-
distance learners participate from afar. sible that students in the receiving sites attend-
ed schools with fewer resources, for example,
When external evaluators set out to compare the and the comparison had no way to control for
achievement of ACCESS’s IVC students to that of that. For these reasons, the ACCESS evaluators
students in traditional classrooms, they decided ultimately chose not to repeat the comparison
to take advantage of the program’s distinctive between IVC sending and receiving sites the
format. As controls, they used the classrooms following year. They did, however, suggest that
where the instruction was delivered live by the such a comparison could be strengthened by
same instructor. In other words, the students at collecting data that gives some indication of
the site receiving the IVC feed were considered students’ pretreatment ability level—GPA, for
the treatment group, and students at the sending example—and using statistical techniques to
site were the control group. This design helped control for differences. Another strategy, they
evaluators to isolate the effect of the class format propose, might be to limit the study to such a
(IVC or face-to-face) and to avoid capturing the subject area as math or foreign language, where
effects of differences in style and skill among courses follow a sequence and students in the

29
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

Common Problems When Comparing Online Programs to Face-


to-Face Programs
Evaluators need to consider both student- and school- or classroom-level variables when they compare online and
face-to-face programs. At the student level, many online program participants enroll because of a particular cir-
cumstance or attribute, and thus they cannot be randomly assigned—for example, a student who takes an online
course over the summer to catch up on credits. The inherent selection bias makes it problematic to compare the
results of online and face-to-face students. Evaluators’ best response is to find, wherever possible, control groups
that are matched as closely as possible to the treatment groups; this includes matching for student demographic
characteristics; their reason for taking the course (e.g., credit recovery); and their achievement level. Classroom-
and school-level factors complicate comparisons as well. If the online program is more prevalent in certain types
of schools (e.g., rural schools) or classrooms (e.g., those lacking a fully certified teacher), then the comparison
unintentionally can capture the effects of these differences. Evaluators need to understand and account for these
factors when selecting control groups.

same course (whether online or traditional) online) and ­accompanying teacher support ma-
would have roughly similar levels of ability. terials that assist with standards alignment and
lesson planning.
Anticipate the Challenges of Conducting a
Randomized Controlled Trial In collaboration with its evaluation partner,
Macro International, the program’s parent orga-
Evaluators in Maryland had a different challenge
nization, Maryland Public Television, set out to
on their hands when they were engaged to study
study how the Pathways to Freedom electronic
Thinkport and its wide-ranging education offer-
field trip impacted student learning in the class-
ings—including lesson plans, student activities,
room and whether it added value. Rather than
podcasts,* video clips, blogs,** learning games,
conducting a quasi-experimental study in which
and information about how all of these things
the evaluator would have to find control groups
can be used effectively in classrooms. When
that demographically matched existing groups
asked by a key funder to evaluate the program’s that were receiving a treatment, the Thinkport
impact on student learning, the evaluation team evaluators wanted an experimental design study
chose to study one of Thinkport’s most pop- in which students were assigned randomly to
ular features, its collection of “electronic field either treatment or control groups.
trips,” each one a self-contained curricular unit
that includes rich multimedia content (delivered Although they knew it would require some extra
planning and coordination, the evaluators chose
*­­ Podcasts are audio files that are distributed via the Internet, which can be played
back on computers to augment classroom lessons. to conduct a randomized controlled trial. This de-
sign could provide them with the strongest and
** Blogs are regularly updated Web sites that usually provide ongoing information on
a particular topic or serve as personal diaries, and allow readers to leave their own most reliable measure of the program’s effects.
comments. But first, there were challenges to overcome. If

30
students in the treatment and control groups were more time than is typical in a regular classroom.
in the same classroom, evaluators thought, they To ensure that students in both groups would
might share information about the field trip and spend a similar amount of time on the Under-
“contaminate” the experiment. Even having treat- ground Railroad unit and have varied resources to
ment and control groups in the same school could use, the evaluators provided additional curricular
cause problems: In addition to the possibility of materials to the control teachers, including books,
contamination, the evaluators were concerned DVDs, and other supplemental materials. On each
that teachers and students in control classrooms of the six days they delivered the unit, all teach-
would feel cheated by not having access to the ers in the study completed forms to identify the
field trip and would complain to administrators. standards they were covering, and they also com-
pleted a form at the end of the study to provide
To overcome these challenges and maintain the
general information about their lessons. During
rigor of the experimental design, program lead-
the course of the unit, the evaluators found that
ers decided to randomize at the school level.
the control teachers began working together to
They recruited nine schools in two districts and
pool their resources and develop lesson plans.
involved all eighth-grade social studies teachers
The evaluators did not discourage this interaction,
in each school, a total of 23 teachers. The evalu-
believing that it increased the control teachers’
ators then matched the schools based on student
ability to deliver the content effectively and, ul-
demographics, teacher data, and student scores
timately, added credibility to the study. To com-
on the state assessment. (One small school was
pare how well students learned the content of the
coupled with another that matched it demo-
unit, the evaluators assessed students’ knowledge
graphically, and the two schools were counted
of slavery and the Underground Railroad before
as one.) The evaluators then randomly identified
and after the instructional unit was delivered.
one school in each pair as a treatment school
and one as a control school. Teachers did not The evaluators’ approach to these challeng-
know until training day whether they had been es was thoughtful and effective. By balancing
selected as a treatment or control. (The control the experimental design concept with practical
group teachers were told that they would be ­considerations, the evaluation team was able to
given an orientation and would be able to use get the information they wanted and success-
the electronic field trip after the study.) fully complete the study.

A second challenge for the evaluation team was


Summary
ensuring that teachers in the control classrooms
covered the same content as the teachers who The evaluations of Algebra I Online, ACCESS,
were using the electronic field trip—a problem and Thinkport illustrate a variety of approaches
that might also be found in quasi-experimental to constructing comparative studies. For program
designs that require matched comparison groups. leaders who are considering an evaluation that
They were concerned because the electronic will compare the performance of online and tra-
field trip devotes six class periods to the topic of ditional students, there are several important con-
slavery and the Underground Railroad—perhaps siderations. First, program leaders should work

31
Evaluating Online Learning: Challenges and Strategies for Success
I n n o vat io n s i n E d u c at io n

Highlights From the Three Comparative Analyses Featured in


This Section
The comparative analyses described in this section produced a number of important findings for program staff and
leaders to consider.
Algebra I Online (Louisiana). A quasi-experimental study that compared students’ performance on a posttest of
algebra content knowledge showed that students who participated in the Algebra I Online course performed at
least as well as those who participated in the traditional algebra I course, and on average outscored them on 18
of the 25 test items. In addition, the data suggested that students in the online program tended to do better than
control students on those items that required them to create an algebraic expression from a real-world example.
A majority of students in both groups reported having a good or satisfactory learning experience in their algebra
course, but online students were more likely to report not having a good experience and were less likely to report
feeling confident in their algebra skills. Online students reported spending more time interacting with other students
about the math content of the course or working together on course activities than their peers in traditional algebra
classrooms; the amount of time they spent socializing, interacting to understand assignment directions, and work-
ing together on in-class assignments or homework was about the same. The evaluation also suggested that teacher
teams that used small group work and had frequent communication with each other were the most successful.
ACCESS (Alabama). Overall, the evaluations of the program found that ACCESS was succeeding in expanding ac-
cess to a range of courses and was generally well received by users. The comparative analyses suggested that stu-
dents taking Advanced Placement (AP) courses from a distance were almost as likely to receive a passing course
grade as those students who received instruction in person, and showed that both students and faculty found the
educational experience in the distance courses was equal to or better than that of traditional, face-to-face courses.
The evaluation also found that in the fall semester of 2006, the distance course dropout rate was significantly lower
than nationally reported averages.
Thinkport (Maryland). The randomized controlled trial initially revealed that, compared to traditional instruction, the
online field trip did not have a significant positive or negative impact on student learning. However, further analysis
revealed that teachers using the electronic field trip for the first time actually had less impact on student learning than
those using traditional instruction, while teachers who had used the electronic field trip before had a significantly
more positive impact. In a second phase of the study, the evaluators confirmed the importance of experience with the
product: when teachers in one district used the electronic field trip again, they were much more successful on their
second try, and their students were found to have learned far more than students receiving traditional instruction.

First, program leaders should work with an evaluator to ensure that comparisons are appropriate. Together, they will want to take into account the program's goals, the student population served, and the program's structure.

Second, program leaders should clearly articulate the purpose of the comparison. Is the evaluation seeking to find out if the online program is just as effective as the traditional one, or more effective? In some instances, when online programs are being used to expand access to courses or teachers, for example, a finding of "no significant difference" between online and traditional formats can be acceptable. In these cases, being clear about the purpose of the evaluation ahead of time will help manage stakeholders' expectations.

If considering a quasi-experimental design, evaluators will want to plan carefully for what classes will be used as control groups, and assess the ways they are different from the treatment classes. They will want to consider what kinds of students the class serves, whether the students (or teachers) chose to participate in the treatment or control class, whether students are taking the treatment and control classes for similar reasons (e.g., credit recovery), and whether the students in the treatment and control classes began the class at a similar achievement level. If a randomized controlled trial is desired, evaluators will need to consider how feasible it is for the particular program. Is it possible to randomly assign students either to receive the treatment or be in the control group? Can the control group students receive the treatment at a future date? Will control and treatment students be in the same classroom or school, and if so, might this cause "contamination" of data?

Finally, there are a host of practical considerations if an evaluation will require collecting data from control groups. As we describe further in the next section, program leaders and evaluators need to work together to communicate the importance of the study to anyone who will collect data from control group participants, and to provide appropriate incentives to both the data collectors and the participants. The importance of these tasks can hardly be overstated: The success of a comparative study hinges on having adequate sets of data to compare.
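One concrete way to check whether treatment and control classes began at a similar achievement level, as recommended above, is to compare baseline scores before analyzing outcomes. The sketch below, in Python, is a generic illustration: the file prior_scores.csv and its columns (group, prior_score) are hypothetical and are not drawn from any of the featured evaluations.

    import pandas as pd
    from scipy import stats

    # Hypothetical roster: one row per student, with a "group" column
    # ("treatment" or "control") and a baseline achievement score.
    roster = pd.read_csv("prior_scores.csv")

    print(roster.groupby("group")["prior_score"].agg(["count", "mean", "std"]))

    treatment = roster.loc[roster["group"] == "treatment", "prior_score"]
    control = roster.loc[roster["group"] == "control", "prior_score"]

    # Welch's t-test flags large baseline differences that would weaken a
    # quasi-experimental comparison of end-of-course outcomes.
    t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")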
Solving Data Collection Problems

Evaluators of any kind of program frequently face resistance to data collection efforts. Surveys or activity logs can be burdensome, and analyzing test scores or grades can seem invasive. In the online arena, there can be additional obstacles. The innovative nature of online programs can sometimes create problems: Some online program administrators simply may be struggling to get teachers or students to use a new technology, let alone respond to a questionnaire about the experience. And in the context of launching a new program or instructional tool, program staffers often have little time to spend on such data collection matters as tracking survey takers or following up with nonrespondents.

Still more difficult is gaining cooperation from people who are disconnected from the program or evaluation—not uncommon when a new high-tech program is launched in a decades-old institution. Other difficulties can arise when online programs serve students in more than one school district or state. Collecting test scores or attendance data, for example, from multiple bureaucracies can impose a formidable burden. Privacy laws present another common hurdle when online evaluators must deal with regulations in multiple jurisdictions to access secondary data. The problem can be compounded when officials are unfamiliar with a new program and do not understand the evaluation goals.

When faced with these data collection challenges, how should evaluators respond, and how can they avoid such problems in the first place?

Among the evaluations featured in this guide, data collection problems were common.


When study participants did not cooperate with data collection efforts, the evaluators of Chicago Public Schools' Virtual High School and Louisiana's Algebra I Online program handled the problem by redesigning their evaluations. Thinkport evaluators headed off the same problem by taking proactive steps to ensure cooperation and obtain high data collection rates in their study. When evaluators of Washington's Digital Learning Commons (DLC) struggled to collect and aggregate data across multiple private vendors who provided courses through the program, the program took steps to define indicators and improve future evaluation efforts. As these examples show, each of these challenges can be lessened with advance planning and communication. While these steps can be time-consuming and difficult, they are essential for collecting the data needed to improve programs.

Redesign or Refocus the Evaluation When Necessary

In 2004, Chicago Public Schools (CPS) established the Virtual High School (CPS/VHS) to provide students with access to a wide range of online courses taught by credentialed teachers. The program seemed like an economical way to meet several district goals: Providing all students with highly qualified teachers; expanding access to a wide range of courses, especially for traditionally underserved students; and addressing the problem of low enrollment in certain high school courses. Concerned about their students' course completion rates, CPS/VHS administrators wanted to learn how to strengthen student preparedness and performance in their program. Seeing student readiness as critical to a successful online learning experience, the project's independent evaluator, TA Consulting, and district administrators focused on student orientation and support; in particular, they wanted to assess the effectiveness of a tutorial tool developed to orient students to online course taking.

At first, evaluators wanted a random selection of students assigned to participate in the orientation tutorial in order to create treatment and control groups (see Glossary of Common Evaluation Terms, p. 65) for an experimental study. While the district approved of the random assignment plan, many school sites were not familiar with the evaluation and did not follow through on getting students to take the tutorial or on tracking who did take it.

To address this problem, the researchers changed course midway through the study, refocusing it on the preparedness of in-class mentors who supported the online courses. This change kept the focus on student support and preparation, but sidestepped the problem of assigning students randomly to the tutorial. In the revised design, evaluators collected data directly from participating students and mentors, gathering information about students' ability to manage time, the amount of on-task time students needed for success, and the level of student and mentor technology skills. The evaluators conducted surveys and focus groups with participants, as well as interviews with the administrators of CPS/VHS and Illinois Virtual High School (IVHS), the umbrella organization that provides online courses through CPS/VHS and other districts in the state. They also collected data on student grades and participation in orientation activities. To retain a comparative element in the study, evaluators analyzed online course completion data for the entire state: They found that CPS/VHS course completion rates were 10 to 15 percent lower than comparable rates for all students in IVHS in the fall of 2004 and spring of 2005, but in the fall of 2005, when fewer students enrolled, CPS/VHS showed its highest completion rate ever, at 83.6 percent, surpassing IVHS's informal target of 70 percent.10
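Completion-rate comparisons of this kind are simple to compute once enrollment records are available in a consistent format. The following sketch is illustrative only; the file enrollments.csv and its columns (program, term, completed) are hypothetical, and the numbers it would print depend entirely on the data supplied, not on the CPS/VHS or IVHS figures reported above.

    import pandas as pd

    # Hypothetical enrollment records: one row per course enrollment, with a
    # program label, a term label, and a 0/1 completion flag.
    enrollments = pd.read_csv("enrollments.csv")

    # Completion rate (percent of enrollments completed) by program and term.
    rates = (
        enrollments.groupby(["program", "term"])["completed"]
        .mean()
        .mul(100)
        .round(1)
    )
    print(rates)

    # Side-by-side view by term, with the gap between two programs.
    by_term = rates.unstack("program")
    by_term["gap"] = by_term.iloc[:, 0] - by_term.iloc[:, 1]  # assumes exactly two programs
    print(by_term)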

Through these varied efforts, the researchers were able to get the information they needed. When their planned data collection effort stalled, they went back to their study goals and identified a different indicator for student support and a different means of collecting data on it. And although the experimental study of the student orientation tutorial was abandoned, the evaluators ultimately provided useful information about this tool by observing and reporting on its use to program administrators.

Evaluators very commonly have difficulty collecting data from respondents who do not feel personally invested in the evaluation, and online program evaluators are no exception. In the case of Louisiana's Algebra I Online program, evaluators faced problems getting data from control group teachers. Initially, the state department of education had hoped to conduct a quasi-experimental study (see Glossary of Common Evaluation Terms, p. 65) comparing the performance of students in the online program with students in face-to-face settings. Knowing it might be a challenge to find and collect data from control groups across the state, the program administrators required participating districts to agree up front to identify traditional classrooms (with student demographics matching those of online courses) that would participate in the collection of data necessary for ongoing program evaluation. It was a proactive move, but even with this agreement in place, the external evaluator found it difficult to get control teachers to administer posttests at the end of their courses. The control teachers had been identified, but they were far removed from the program and its evaluation, and many had valid concerns about giving up a day of instruction to issue the test. In the end, many of the students in the comparison classrooms did not complete posttests; only about 64 percent of control students were tested compared to 89 percent of online students. In 2005, hurricanes Katrina and Rita created additional problems, as many of the control group classrooms were scattered and data were lost.

In neither the Chicago nor the Louisiana case could the problem of collecting data from unmotivated respondents be tackled head-on, with incentives or redoubled efforts to follow up with nonrespondents, for example. (As is often the case, such efforts were prohibited by the projects' budgets.) Instead, evaluators turned to other, more readily available data. When planning an evaluation, evaluators are wise to try to anticipate likely response rates and patterns and to develop a "Plan B" in case data collection efforts do not go as planned. In some cases, the best approach may be to minimize or eliminate any assessments that are unique to the evaluation and rely instead on existing state or district assessment data that can be collected without burdening students or teachers participating in control groups. The Family Educational Rights and Privacy Act (FERPA) allows districts and schools to release student records to a third party for the purpose of evaluations.11

Both the Chicago and Louisiana examples serve as cautionary tales for evaluators who plan to undertake an experimental design. If evaluators plan to collect data from those who do not see themselves benefiting from the program or its evaluation, the evaluation will need adequate money to provide significant incentives or, at a minimum, to spend substantial time and effort on communication with these individuals.


Take Proactive Steps to Boost Response Rates

In the evaluation of Thinkport's electronic field trip, Maryland Public Television was more successful in collecting data from control group classrooms. This program's evaluation efforts differed from others in that it did not rely on participation from parties outside of the program. Instead, evaluators first chose participating schools, then assigned them to either a treatment group that participated in the electronic field trip right away or to a control group that could use the lessons in a later semester. Teachers in both groups received a one-hour introduction to the study, where they learned about what content they were expected to cover with their students, and about the data collection activities in which they were expected to participate. Teachers were informed on that day whether their classroom would be a treatment or control classroom; then control teachers were dismissed, and treatment teachers received an additional two-hour training on how to use the electronic field trip with their students. Control teachers were given the option of receiving the training at the conclusion of the study, so they could use the tool in their classrooms at a future point. As an incentive, the program offered approximately $2,000 to the social studies departments that housed both the treatment and control classes.

Evaluators took other steps that may have paved the way for strong compliance among the control group teachers. They developed enthusiasm for the project at the ground level by first approaching social studies coordinators who became advocates for and facilitators of the project. By approaching these content experts, the evaluators were better able to promote the advantages of participating in the program and its evaluation. Evaluators reported that coordinators were eager to have their teachers receive training on a new classroom tool, especially given the reputation of Maryland Public Television. In turn, the social studies coordinators presented information about the study to social studies teachers during opening meetings and served as contacts for interested teachers. The approach worked: Evaluators were largely able to meet their goal of having all eighth-grade social studies teachers from nine schools participate, rather than having teachers scattered throughout a larger number of schools. Besides having logistical benefits for the evaluators, this accomplishment may have boosted compliance with the study's requirements among participating teachers.

Evaluators also took the time to convince teachers of the importance of the electronic field trip. Before meeting with teachers, evaluators mapped the field trip's academic content to the specific standards of the state and participating schools' counties. They could then say to teachers: "These are the things your students have to do, and this is how the electronic field trip helps them do it." Finally, evaluators spent time explaining the purpose and benefits of the evaluation itself and communicating to teachers that they played an important role in discovering whether a new concept worked.

The overall approach was even more successful than the evaluators anticipated and helped garner commitment among the participating teachers, including those assigned to the control group. The teachers fulfilled the data collection expectations outlined at the onset of the study, including keeping daily reports for the six days of the instructional unit being evaluated and completing forms about the standards that they were teaching, as well as documenting general information about their lessons.

By involving control teachers in the experimental process and giving them full access to the treatment, the evaluators could collect the data required to complete the comparative analysis they planned.

Besides facing the challenge of collecting data from control groups, many evaluators, whether of online programs or not, struggle even to collect information from regular program participants. Cognizant of this problem, evaluators are always looking for ways to collect data that are unobtrusive and easy for the respondent to supply. This is an area where online programs can actually present some advantages. For example, Appleton eSchool evaluators have found that online courses offer an ideal opportunity to collect survey data from participants. In some instances, they have required that students complete surveys (or remind their parents to do so) before they can take the final exam for the course. Surveys are e-mailed directly to students or parents, ensuring high response rates and eliminating the need to enter data into a database. Appleton's system keeps individual results anonymous but does allow evaluators to see which individuals have responded.

With the help of its curriculum provider, K12 Inc., the Arizona Virtual Academy (AZVA) also frequently uses Web-based surveys to gather feedback from parents or teachers about particular events or trainings they have offered. Surveys are kept short and are e-mailed to respondents immediately after the event. AZVA administrators believe the burden on respondents is relatively minimal, and report that the strategy consistently leads to response rates of about 60 percent. The ease of AZVA's survey strategy encourages program staff to survey frequently. Furthermore, K12 Inc. compiles the results in Microsoft Excel, making them easy for any staff member to read or merge into a PowerPoint presentation (see fig. 2, Example of Tabulated Results From an Arizona Virtual Academy Online Parent Survey, p. 38). Because the school receives anonymous data from K12 Inc., parents know they may be candid in their comments. With immediate, easy-to-understand feedback, AZVA staff are able to fine-tune their program on a regular basis.

Web sites and online courses can offer other opportunities to collect important information with no burden on the participant. For example, evaluators could analyze the different pathways users take as they navigate through a particular online tool or Web site, or how much time is spent on different portions of a Web site or online course. If users are participants in a program and are asked to enter an identifying number when they sign on to the site, this type of information could also be linked to other data on participants, such as school records.
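Usage data of the kind just described, such as the pathways users follow and the time they spend on each portion of a site, can often be derived from routine server logs. The sketch below is a generic illustration rather than a description of any featured program's system; the file pageviews.csv and its columns (user_id, page, timestamp) are hypothetical.

    import pandas as pd

    # Hypothetical page-view log: one row per view, with a participant
    # identifier, the page (or course section) viewed, and a timestamp.
    log = pd.read_csv("pageviews.csv", parse_dates=["timestamp"])
    log = log.sort_values(["user_id", "timestamp"])

    # Time on a page is approximated by the gap until the same user's next view.
    log["next_timestamp"] = log.groupby("user_id")["timestamp"].shift(-1)
    log["seconds_on_page"] = (log["next_timestamp"] - log["timestamp"]).dt.total_seconds()

    # Average time spent on each portion of the site.
    print(log.groupby("page")["seconds_on_page"].mean().round(1))

    # Most common page-to-page pathways.
    log["next_page"] = log.groupby("user_id")["page"].shift(-1)
    print(log.groupby(["page", "next_page"]).size().sort_values(ascending=False).head(10))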
base. Appleton’s system keeps individual results Define Data Elements Across Many Sources
anonymous but does allow evaluators to see
Another common problem for online evalua-
which individuals have responded.
tors is the challenge of collecting and aggregat-
With the help of its curriculum provider, K12 ing data from multiple sources. As noted earlier,
Inc., the Arizona Virtual Academy (AZVA) also Washington state’s DLC offers a wide array of
frequently uses Web-based surveys to gather online resources for students and educators. To
feedback from parents or teachers about particu- understand how online courses contribute to
lar events or trainings they have offered. Surveys students’ progress toward a high school diploma
are kept short and are e-mailed to respondents and college readiness, the program evaluators
immediately after the event. AZVA administrators conducted site visits to several high schools of-
believe the burden on respondents is relatively fering courses through DLC. In reviewing stu-
minimal, and report that the strategy consistent- dent transcripts, however, the evaluators discov-
ly leads to response rates of about 60 percent. ered that schools did not have the same practices
The ease of AZVA’s survey strategy encourages for maintaining course completion data and that


Figure 2. Example of Tabulated Results From an Arizona Virtual Academy Online Parent Survey About the Quality of a Recent Student Workshop*

Title I Student Workshop Survey—Phoenix

At this time 12 responses have been received. 100% returned, 12 surveys deployed via e-mail.

Please select all that apply regarding the Phoenix student workshop. (Select all that apply)    Total Count    % of Total
My child enjoyed the student workshop.    10    83%
My child enjoyed socializing with peers at the workshop.    12    100%
The activities were varied.    10    83%
My child enjoyed working with the teachers.    11    92%
My child had a chance to learn some new math strategies.    8    67%
My child had the opportunity to review math skills.    10    83%
The activities were too hard.    0    0%
The activities were too easy.    1    8%

Please use the space below to let us know your thoughts about the student workshop. (Text limited to 250 characters.)
(Text input)
Recipient Response

All 3 math teachers were very nice and understanding and they did not yell at you like the brick and mor-
tar schools, which made learning fun.
My son did not want to go when I took him. When I picked him up he told me he was glad he went.
She was bored, “it was stuff she already knew,” she stated. She liked the toothpick activity. She could
have been challenged more. She LOVED lunch!
My child really enjoyed the chance to work in a group setting. She seemed to learn a lot and the games
helped things “click.”
She was happy she went and said she learned a lot and had lots of fun. That is what I was hoping for
when I enrolled her. She is happy and better off for going.
THANKS FOR THE WORKSHOP NEED MORE OF THEM
The teachers were fun, smart and awesome, and the children learn new math strategies.
Thank you keep up the good work.

Source: Arizona Virtual Academy

* The U.S. Department of Education does not mandate or prescribe particular curricula or lesson plans. The information in this figure was provided by the
identified site or program and is included here as an illustration of only one of many resources that educators may find helpful and use at their option. The
Department cannot ensure its accuracy. Furthermore, the inclusion of information in this figure does not reflect the relevance, timeliness, or completeness of
this information; nor is it intended to endorse any views, approaches, products, or services mentioned in the figure.
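A tabulation like the one shown in figure 2 can be produced from a raw survey export with only a few lines of code. The sketch below is a generic illustration, not a description of the K12 Inc. reporting system; the file survey_responses.csv, its column names, and the number of surveys deployed are hypothetical, with one 0/1 column per check-box item.

    import pandas as pd

    # Hypothetical export: one row per respondent, one 0/1 column per item.
    responses = pd.read_csv("survey_responses.csv")
    surveys_deployed = 12  # assumed known from the e-mail distribution list

    received = len(responses)
    print(f"{received} responses received, {received / surveys_deployed:.0%} returned")

    items = responses.drop(columns=["respondent_id"], errors="ignore")
    summary = pd.DataFrame({
        "Total Count": items.sum(),
        "% of Total": (items.mean() * 100).round().astype(int).astype(str) + "%",
    })
    print(summary)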

In reviewing student transcripts, however, the evaluators discovered that schools did not have the same practices for maintaining course completion data and that DLC courses were awarded varying numbers of credits at different schools—the same DLC course might earn a student .5 credit, 1 credit, 1.5 credits, or 2 credits. This lack of consistency made it difficult to aggregate data across sites and to determine the extent to which DLC courses helped students to graduate from high school or complete a college preparation curriculum.

The evaluation's final report recommended developing guidelines to show schools how to properly gather student-level data on each participating DLC student in order to help with future evaluations and local assessment of the program's impact.
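A shared data definition of the kind recommended in that report can be as simple as an agreed-upon credit value for each course, applied uniformly when records are combined. The sketch below is hypothetical; the course codes, credit values, and transcripts.csv file are invented for illustration and do not reflect the DLC's actual guidelines.

    import pandas as pd

    # Agreed-upon credit value for each online course, used by every school.
    STANDARD_CREDITS = {
        "ALG1-ONLINE": 1.0,
        "SPAN1-ONLINE": 1.0,
        "HEALTH-ONLINE": 0.5,
    }

    # Hypothetical transcript extract: school, student_id, course_code, local_credits.
    transcripts = pd.read_csv("transcripts.csv")

    # Replace each school's local credit value with the shared definition so
    # records can be aggregated across sites; unknown courses become NaN.
    transcripts["standard_credits"] = transcripts["course_code"].map(STANDARD_CREDITS)

    mismatches = transcripts[transcripts["local_credits"] != transcripts["standard_credits"]]
    print(f"{len(mismatches)} transcript rows use a nonstandard credit value")
    print(transcripts.groupby("student_id")["standard_credits"].sum().head())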
Plan Ahead When Handling Sensitive Data

Any time evaluators are dealing with student data, there are certain to be concerns about privacy. Districts are well aware of this issue, and most have guidelines and data request processes in place to make sure that student data are handled properly. Of course, it is critically important to protect student privacy, but for evaluators, these regulations can create difficulties. For example, in Chicago, Tom Clark, of TA Consulting, had a close partnership with the district's Office of Technology Services, yet still had difficulty navigating through the district's privacy regulations and lengthy data request process. Ultimately, there was no easy solution to the problem: This evaluator did not get all of the data he wanted and had to make do with less. Still, he did receive a limited amount of coded student demographic and performance data and, by combining it with information from his other data collection activities, he was able to complete the study.

District protocols are just one layer of protection for sensitive student data; researchers also must abide by privacy laws and regulations at the state and federal levels. These protections can present obstacles for program evaluations, either by limiting evaluators' access to certain data sets or by requiring a rigorous or lengthy data request process. For some online program evaluators, the problem is exacerbated because they are studying student performance at multiple sites in different jurisdictions.

Of course, evaluators must adhere to all relevant privacy protections. To lessen impacts on the study's progress, researchers should consider privacy protections during the design phase and budget their time and money accordingly. Sometimes special arrangements also can be made to gain access to sensitive data while still protecting student privacy. Researchers might sign confidentiality agreements, physically travel to a specific site to analyze data (rather than have data released to them in electronic or hardcopy format), or even employ a neutral third party to receive data sets and strip out any identifying information before passing it to researchers.

Evaluators also can incorporate a variety of precautions in their own study protocol to protect students' privacy. In the Thinkport evaluation, for example, researchers scrupulously avoided all contact with students' names, referring to them only through a student identification number. The teachers participating in the study took attendance on a special form that included name and student identification number, then tore the names off the perforated page and mailed the attendance sheet to the evaluators. Through such procedures, evaluators can maintain students' privacy while still conducting the research needed to improve programs and instructional tools.
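Many of these precautions can be enforced with a small script run by the agency that holds the data, before any file is released. The sketch below is illustrative; the input file, its column names, and the secret value used to recode IDs are hypothetical, and any real release would follow the applicable data-sharing agreement.

    import hashlib
    import pandas as pd

    # Kept by the releasing agency and never shared with researchers, so the
    # recoded IDs cannot be reversed outside that agency.
    SECRET_SALT = "known-only-to-the-district"

    def pseudonymize(student_id) -> str:
        """Replace a real student ID with a stable, non-reversible code."""
        return hashlib.sha256((SECRET_SALT + str(student_id)).encode()).hexdigest()[:12]

    records = pd.read_csv("student_records.csv")

    # Drop direct identifiers and recode the ID before the file is released.
    deidentified = records.drop(columns=["first_name", "last_name", "birth_date"])
    deidentified["student_id"] = deidentified["student_id"].map(pseudonymize)
    deidentified.to_csv("student_records_deidentified.csv", index=False)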


Summary

As these examples show, program leaders and evaluators can often take proactive steps to avoid and address data collection problems. To ensure cooperation from study participants and boost data collection rates, they should build in plenty of time (and funds, if necessary) to communicate their evaluation goals with anyone in charge of collecting data from control groups. Program leaders also should communicate with students participating in evaluation studies, both to explain its goals and to describe how students will benefit ultimately from the evaluation. These communication efforts should begin early and continue for the project's duration. Additionally, program leaders should plan to offer incentives to data collectors and study participants.

Evaluators may want to consider administering surveys online, to boost response rates and easily compile results. In addition, there may be ways they can collect data electronically to better understand how online resources are used (e.g., by tracking different pathways users take as they navigate through a particular online tool or Web site, or how much time is spent on different portions of a Web site or online course).

When collecting data from multiple agencies, evaluators should consider in advance the format and comparability of the data, making sure to define precisely what information should be collected and communicating these definitions to all those who are collecting and maintaining it.

Evaluators should research relevant privacy laws and guidelines well in advance and build in adequate time to navigate the process of requesting and collecting student data from states, districts, and schools. They may need to consider creative and flexible arrangements with agencies holding student data, and should incorporate privacy protections into their study protocol from the beginning, such as referring to students only through identification numbers.

Finally, if data collection problems cannot be avoided, and evaluators simply do not have what they need to complete a planned analysis, sometimes the best response is to redesign or refocus the evaluation. One strategy involves finding other sources of data that put more control into the hands of evaluators (e.g., observations, focus groups).

Interpreting the Impact of Program Maturity

Online learning programs are often on the cutting edge of education reform and, like any new technology, may require a period of adaptation. For example, a district might try creating a new online course and discover some technical glitches when students begin to actually use it. Course creators also may need to fine-tune content, adjusting how it is presented or explained. For their part, students who are new to online course taking may need some time to get used to the format. Perhaps they need to learn new ways of studying or interacting with the teacher to be successful. If an evaluation is under way as all of this is going on, its findings may have more to do with the program's newness than its quality or effectiveness.

Because the creators of online programs often make adjustments to policies and practices while perfecting their model, it is ideal to wait until the program has had a chance to mature.

At the same time, online learning programs are often under pressure to demonstrate effectiveness right away. Although early evaluation efforts can provide valuable formative information for program improvement, they can sometimes be premature for generating reliable findings about effectiveness. Worse, if summative evaluations (see Glossary of Common Evaluation Terms, p. 65) are undertaken too soon and show disappointing results, they could be damaging to programs' reputations or future chances at funding and political support.

What steps can evaluators and program leaders take to interpret appropriately the impact of the program's maturity on evaluation findings?

Several of the programs featured in this guide had evaluation efforts in place early on, sometimes from the very beginning. In a few cases, evaluators found less-than-positive outcomes at first and suspected that their findings were related to the program's lack of maturity. In these instances, the evaluators needed additional information to confirm their hunch and also needed to help program stakeholders understand and interpret the negative findings appropriately. In the case of Thinkport, for example, evaluators designed a follow-up evaluation to provide more information about the program as it matured. In the case of the Algebra I Online program, evaluators used multiple measures to provide stakeholders with a balanced perspective on the program's effectiveness.

Conduct Follow-up Analyses for Deeper Understanding

In evaluating the impact of Thinkport's electronic field trip about slavery and the Underground Railroad, evaluators from Macro International conducted a randomized controlled trial (see Glossary of Common Evaluation Terms, p. 65), with the aim of understanding whether students who used this electronic field trip learned as much as students who received traditional instruction in the same content and did not use the field trip.

Initially, the randomized controlled trial revealed a disappointing finding: The electronic field trip did not impact student performance on a test of content knowledge any more or less than traditional instruction. A second phase of the study was initiated, however, when the evaluators dug deeper and analyzed whether teachers who had used an electronic field trip before were more successful than those using it for the first time. When they disaggregated the data, the evaluators found that students whose teachers were inexperienced with the electronic field trip actually learned less compared to students who received traditional instruction; however, students whose teachers had used the electronic field trip before learned more than the traditionally taught students. In the second semester, the evaluators were able to compare the effect of the treatment teachers using the electronic field trip for the first time to the effect of those same teachers using it a second time. This analysis found that when teachers used the electronic field trip a second time, its effectiveness rose dramatically. The students of teachers using the field trip a second time scored 121 percent higher on a test of knowledge about the Underground Railroad than the students in the control group (see Glossary of Common Evaluation Terms, p. 65) who had received traditional instruction on the same content.
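The follow-up analysis described here amounts to comparing learning gains within subgroups of the treatment group defined by teacher experience. The sketch below illustrates that logic only; the file scores.csv and its columns (group, teacher_experienced, pretest, posttest) are hypothetical, and it does not reproduce the actual Thinkport analysis or its results.

    import pandas as pd

    scores = pd.read_csv("scores.csv")
    scores["gain"] = scores["posttest"] - scores["pretest"]

    control_gain = scores.loc[scores["group"] == "control", "gain"].mean()
    print(f"control mean gain: {control_gain:.2f}")

    # Split treatment students by whether their teacher had used the tool before.
    treatment = scores[scores["group"] == "treatment"]
    print(treatment.groupby("teacher_experienced")["gain"].agg(["count", "mean"]))

    # Percentage difference from the control group for each subgroup
    # (assumes the control group's mean gain is not zero).
    subgroup_means = treatment.groupby("teacher_experienced")["gain"].mean()
    print(((subgroup_means - control_gain) / abs(control_gain) * 100).round(1))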


The evaluators also looked at teachers' responses to open-ended survey questions to understand better why second-time users were so much more successful than first-timers. Novice users reported that "they did not fully understand the Web site's capabilities when they began using it in their classes and that they were sometimes unable to answer student questions because they didn't understand the resource well enough themselves."12 On the other hand, teachers that had used the electronic field trip once before "showed a deeper understanding of its resources and had a generally smoother experience working with the site."

In this instance, teachers needed time to learn how to integrate a new learning tool into their classrooms. Although not successful at first, the teachers who got past the initial learning curve eventually became very effective in using the field trip to deliver content to students. This finding was important for two reasons. First, it suggested an area for program improvement: to counter the problem of teacher inexperience with the tool, the evaluators recommended that the program offer additional guidance for first-time users. Second, it kept the program's leaders from reaching a premature conclusion about the effectiveness of the Pathways to Freedom electronic field trip.

This is an important lesson, say Thinkport's leaders, for them and others. As Helene Jennings, vice-president of Macro International, explains, "There's a lot of enthusiasm when something is developed, and patience is very hard for people. … They want to get the results. It's a natural instinct." But, she says, it's important not to rush to summative judgments: "You just can't take it out of the box and have phenomenal success." Since the 2005 evaluation, the team has shared these experiences with other evaluators at several professional conferences. They do this, says Macro International Senior Manager Michael Long, to give other evaluators "ammunition" when they are asked to evaluate the effectiveness of a new technology too soon after implementation.

Use Multiple Measures to Gain a Balanced Perspective

Teachers are not the only ones who need time to adapt to a new learning technology; students need time as well. Evaluators need to keep in mind that students' inexperience or discomfort with a new online course or tool also can cloud evaluation efforts—especially if an evaluation is undertaken early in the program's implementation. The Algebra I Online program connects students to a certified algebra teacher via the Internet, while another teacher, who may or may not be certified, provides academic and technical support in the classroom. When comparing the experiences of Algebra I Online students and traditional algebra students, EDC evaluators found mixed results. On the one hand, online students reported having less confidence in their algebra skills. Specifically, about two-thirds of students from the control group (those in traditional classes) reported feeling either confident or very confident in their algebra skills, compared to just under half of the Algebra I Online students. The online students also were less likely to report having a good learning experience in their algebra class. About one-fifth of Algebra I Online students reported that they did not have a good learning experience in the class, compared to only 6 percent of students in regular algebra classes. On the other hand, the online students showed achievement gains that were just as high or higher than those of traditional students. Specifically, the Algebra I Online students outscored students in control classrooms on 18 of 25 posttest items, and they also tended to do better on those items that required them to create an algebraic expression from a real-world example.

In an article they published in the Journal of Research on Technology in Education, the evaluators speculated about why the online students were less confident in their algebra skills and had lower opinions of their learning experience: "It may be that the model of delayed feedback and dispersed authority in the online course led to a 'lost' feeling and prevented students from being able to gauge how they were doing."13 In other words, without immediate reassurance from the teacher of record, students may have felt they weren't "getting it," when, in fact, they were.

This example suggests that students' unfamiliarity with a new program can substantially affect their perceptions and experiences. The evaluators in this case were wise to use a variety of measures to understand what students were experiencing in the class. Taken alone, the students' reports about their confidence and learning experience could suggest that the Algebra I Online program is not effective. But when the evaluators paired the self-reported satisfaction data with test score data, they were able to see the contradiction and gain a richer understanding of students' experiences in the program.

Summary

A lack of program maturity is not a reason to forego evaluation. On the contrary, evaluation can be extremely useful in the early phases of program development. Even before a program is designed, evaluators can conduct needs assessments to determine how the target population can best be served. In the program's early implementation phase, evaluators can conduct formative evaluations that aim to identify areas for improvement. Then, once users have had time to adapt, and program developers have had time to incorporate what they've learned from early feedback and observations, evaluators can turn to summative evaluations to determine effectiveness.

When disseminating findings from summative evaluations, program leaders should work with their evaluator to help program stakeholders understand and interpret how program maturity may have affected evaluation findings. The use of multiple measures can help provide a balanced perspective on the program's effectiveness. Program leaders also may want to consider repeating a summative evaluation to provide more information about the program as it matures.

Given the sophistication of many online learning programs today, it takes an extraordinary amount of time and money up front to create them. This, and stakeholders' eagerness for findings, makes these programs especially vulnerable to premature judgments. Evaluators have an important message to communicate to stakeholders: Evaluation efforts at all stages of development are critical to making sure investments are well spent, but they need to be appropriate for the program's level of maturity.

Translating Evaluation Findings Into Action

As the phases of data collection and analysis wind down, work of another sort begins. Evaluators present their findings and, frequently, their recommendations; then program leaders begin the task of responding to them. Several factors contribute to the ease and success of this process: the strength of the findings, the clarity and specificity of the recommendations, how they are disseminated, and to whom.


The relationship between the evaluators and the program leaders is key: When evaluators (external or internal) have ongoing opportunities to talk about and work with program staff on improvements, there is greater support for change. Conversely, if evaluators are fairly isolated from program leaders and leave the process once they have presented their recommendations, there is less support and, perhaps, a reduced sense of accountability among program staff. Of course, while frequent and open communication is important, maintaining objectivity and a respectful distance are also critical to obtaining valid research findings. Collaborations between researchers and practitioners should not be inappropriately close.

When program leaders try to act on evaluation findings, the structure and overall health of the organization play a role as well. Some program leaders meet with substantial barriers at this stage, particularly if they are trying to change the behavior of colleagues in scattered program sites or other offices or departments. The problem is compounded if there is a general lack of buy-in or familiarity with the evaluation.

In such circumstances, how can program leaders and evaluators translate findings into program improvements? Among the programs featured in this guide, there are a variety of approaches to using evaluation findings to effect change. In the CPS/VHS, for example, program leaders have used evaluation findings to persuade reluctant colleagues to make needed changes and have repeatedly returned to the evaluation recommendations to guide and justify internal decisions. Meanwhile, AZVA program staff use a structured, formal process for turning evaluation recommendations into program improvements, including establishing timelines, staff assignments, and regular status reports. The AZVA system, though time-consuming, has helped program administrators implement changes.

Use Evaluation Findings to Inform and Encourage Change

Chicago's CPS/VHS is managed and implemented collaboratively by three CPS offices: the Office of Technology Services eLearning, the Office of High School Programs, and the Office of Research, Evaluation, and Accountability. In 2005, Chief eLearning Officer Sharnell Jackson initiated an external evaluation to understand the cause of low course completion rates and find ways to help struggling students.

Evaluator Tom Clark of TA Consulting found great variation in students' ability to work independently, manage their time, and succeed without having an instructor physically present. The evaluation report recommended several ways to offer more support for struggling students, including having a dedicated class period in the school schedule for completing online course work and assigning on-site mentors to assist students during these periods. But when program administrators tried to implement these recommendations, they had difficulty compelling all participating schools to change. CPS is a large district with a distributed governance structure, making it difficult for the central office to force changes at the school-site level.

Facing resistance from schools, the program's administrators tried several different tacks to encourage implementation of the recommendations.

First, they took every opportunity to communicate the evaluation findings to area administrators and principals of participating schools, making the case for change with credible data from an external source. Some school leaders resisted, saying they simply did not have the manpower or the funds to assign on-site mentors. Still, they could not ignore the compelling data showing that students needed help with pacing, study skills, and troubleshooting the technology; without this help many were failing. The strength of these findings, along with financial assistance from the district to provide modest stipends, convinced school site leaders to invest in mentors. Crystal Brown, senior analyst in CPS's Office of Technology Services, reports that most CPS/VHS students now have access to an on-site mentor, "whereas before they just told a counselor, 'I want to enroll in this class,' and then they were on their own." According to Brown, program leaders have also found the evaluation useful for prodding principals to provide professional development for mentors and for persuading mentors to participate. They use the evaluation findings "whenever we train a mentor," she says, "and that's how we get a lot of buy-in."

CPS/VHS administrators also are careful to set a good example by using the evaluation findings and recommendations to guide their own practices at the central office. To date, they have implemented several "high priority" recommendations from the report. For example, program leaders strengthened mentor preparation by instituting quarterly trainings for mentors and establishing a shared online workspace that provides guidelines and advice for mentors. The district also has implemented Advancement Via Individual Determination* programs in many high schools to boost students' study skills and support their achievement in the online program. As CPS/VHS expands, some of the earlier problems have been avoided by getting site administrators to agree up front to the practices recommended by the evaluation. Brown says, "We constantly reiterate what this study recommended whenever we have any type of orientation [for] a new school that's enrolling."

Finally, in some instances, the program administrators changed program requirements outright and forced participating schools to comply. Beginning this year, for example, all online classes must have a regularly scheduled time during the school day. (There are a few exceptions made for very high-performing students.) This change ensures that students have dedicated computer time and mentor support to help them successfully complete their course work on time. In addition, participating students are now required to attend an orientation for the online courses where they receive training on study skills.

Take a Structured Approach to Improvement

Changing behavior and policy can be difficult in a large organization and, as the above example shows, program administrators must be creative and persistent to make it happen. Sometimes a small organization, such as an online school with a small central staff, has a distinct advantage when trying to implement evaluation recommendations. With a nimble staff and a strong improvement process in place, AZVA, for example, has been especially effective in making program changes based on findings from its many evaluation efforts.

* Advancement Via Individual Determination (AVID) is a program that prepares 4th through 12th grade students for four-year college eligibility.


Several factors explain AZVA's success in translating evaluation findings into program improvements. First, its evaluation process generates detailed recommendations from both outsiders and insiders. That is, staff from their main content provider, K12 Inc., visit AZVA approximately every other year and conduct a quality assurance audit to identify areas for program improvement. Following the audit, K12 Inc. develops a series of recommendations and, in turn, AZVA creates a detailed plan that shows what actions will be taken to address each recommendation, including who will be responsible and the target date for completion. For example, when the 2005 audit recommended that AZVA create a formal feedback loop for teachers, the school assigned a staff member to administer monthly electronic surveys to collect information from teachers about the effectiveness of their professional development, their training and technology needs, their perceptions of parent training needs, and their suggestions for enhancing community relations. AZVA has responded in similar fashion to many other recommendations generated by K12 Inc.'s site visit, addressing a range of organizational, instructional, and operational issues (see table 4, Excerpts From AZVA's Next Steps Plan, in Response to Recommendations From the K12 Quality Assurance Audit, p. 47).

In addition to the audit, K12 Inc. also requires that AZVA complete an annual School Improvement Plan (SIP), which consists of two parts: a self-evaluation of school operations in general and a Student Achievement Improvement Plan (SAIP) that specifically focuses on student outcomes. In developing these plans, AZVA staff articulate a series of goals and specific objectives for improvement, again including strategies and timelines for meeting each objective. A strength of this process is its specificity. For example, one key SAIP goal was to improve student achievement in math, and the administrative team set the specific goal of decreasing by 5 percent the number of students who score "far below the standards" on the state standards test and increasing by 5 percent the number who meet or exceed state standards. To accomplish this, AZVA staff took action on several fronts: they aligned their curriculum to the state's testing blueprint, developed a new curriculum sequencing plan, implemented additional teacher and parent training, worked with students to encourage test preparation and participation, and developed individual math learning plans for students.

Another strength of AZVA's process is that it requires program staff to review evaluation recommendations regularly and continually track the progress that has been made toward them. The SIP and SAIP are evolving plans that are regularly updated and revised by "basically everyone that has any role in instruction," such as the director of instruction, the high school director, and the special education manager, says AZVA's director, Mary Gifford. As part of this process, team members continually return to the document and track how much progress has been made in reaching their goals. There also is external accountability for making progress on SIP and SAIP goals: approximately once a quarter, AZVA staff review the plans via conference calls with K12 Inc.

Finally, AZVA's approach succeeds because it permeates the work of the entire school. As Bridget Schleifer, the K–8 principal, explains, "Evaluation is built into everybody's role and responsibility." Staff members at all levels are expected to take evaluation recommendations seriously and to help to implement changes based on them.

Table 4. Excerpts From AZVA's Next Steps Plan, In Response to Recommendations From the K12 Quality Assurance Audit*

Board and Organizational Structure

Recommendation: AZVA should create additional mechanisms to centralize communications.
Actions: The AZVA administration will survey school staff to determine communications needs. The administration will evaluate technology to assist with centralizing communications. The administration will develop a communications plan and begin implementation in fall 2005.
Timeline: 12/31/2005. Responsible: Janice Gruneberg. Status: Will begin in summer 2005.

Recommendation: AZVA should create a formal feedback loop with teachers.
Actions: The AZVA administration will develop a survey instrument to collect information from teachers regarding the effectiveness of professional development, outside training needs and effectiveness of outside training opportunities, technology needs, parent training needs, and community relations suggestions. The survey will be administered at monthly professional development meetings and available in an electronic format to capture contemporaneous feedback.
Timeline: 8/1/2005. Responsible: Jacque Johnson-Hirt. Status: Will begin development in July.

Instruction

Recommendation: AZVA should consider designing new teacher training and ongoing teacher training for each year in chunks and stages, rather than as one long week of training; this structure would allow practice and reflection.
Actions: The director of instruction is developing the new teacher training agenda and sessions. Summer projects will be assigned in May; lead teachers will work with the director of instruction to develop training modules.
Timeline: 8/1/2005. Responsible: Jacque Johnson-Hirt. Status: On track to meet timeline.

Recommendation: As AZVA continues to grow, it should give attention to scalability in all initiatives from teacher mentoring and group Professional Development and training to student work samples.
Actions: The director of instruction is revising the Parent Orientation Guide (includes work sample guidelines). Lead teachers and regular education teachers will have summer projects based on teacher training and mentoring needs.
Timeline: 8/1/2005. Responsible: Jacque Johnson-Hirt. Status: On track to meet timeline.

Recommendation: AZVA should consider a way for administration to review aggregate views of low-performing students as the school continues to grow (one suggestion could be having teachers submit the aggregate view, highlighted with red or yellow depending on level of attention needed).
Actions: AZVA administration has requested a part-time administrative position to manage and analyze student-level data. The administration has also requested a formal registrar position to capture data on incoming students. The assistant director for operations is revising the teacher tracking tool with this goal in mind.
Timeline: 9/15/2005. Responsible: Janice Gruneberg, Jacque Johnson-Hirt. Status: On track to meet timeline.

Community Relations

Recommendation: AZVA should continue to develop ways to encourage students and parents to attend outings and encourage increased participation in school activities.
Actions: The director of instruction is working with teachers to develop lesson-based outings based on parent feedback. A master outing schedule and plans will be developed and included in the revised Parent Orientation Guide.
Timeline: 7/30/2005. Responsible: Jacque Johnson-Hirt. Status: Administrative team is making summer project assignments in May; on track to meet timeline.

Special Education

Recommendation: AZVA should monitor the special education students' state test scores from year to year in correlation with their IEP goals to ensure proper growth as well as comparable growth to same age non-disabled peers.
Actions: The administration is requesting a part-time staff position to manage and analyze student-level data. The SPED team is currently developing a process to measure progress on IEP goals, including test score gains. State test scores will be available in June.
Timeline: 7/15/2005. Responsible: Lisa Walker. Status: SPED team developing process; on track to meet timeline.

* The U.S. Department of Education does not mandate or prescribe particular curricula or lesson plans. The information in this table was provided by the identified site or program and is included here as an illustration of only one of many resources that educators may find helpful and use at their option. The Department cannot ensure its accuracy. Furthermore, the inclusion of information in this table does not reflect the relevance, timeliness, or completeness of this information; nor is it intended to endorse any views, approaches, products, or services mentioned in the table.


Summary

As the above examples show, whether and how evaluation findings lead to program improvements is a function not only of the quality of the evaluation, but of many other contextual and organizational factors. Program leaders can facilitate program change by working from the beginning to create an ongoing relationship between evaluators and program staff. Throughout the process, there should be opportunities for staff members to discuss the evaluation, its findings, and its implications for improving the program. Off-site staff and partners should be included as well. In short, program leaders should communicate early and often about the evaluation with anyone whose behavior might be expected to change as a result of its findings.

Once findings and recommendations are available, program leaders might want to consider using a structured, formal process for turning those recommendations into program improvements. One approach is to decide on a course of action, set a timeline, and identify who will be responsible for implementation. Whether using a formal process or not, program leaders should revisit recommendations regularly and continually track the progress that has been made toward them.

In some instances, recommended changes to an online program or resource may be technically hard to implement. For example, it may be difficult and expensive to make changes to the content or format of an online course. It also may be quite costly to change, repair, or provide the hardware needed to improve an online program. Insufficient funding may cause other difficulties as well. If the program has relied on external funding that has subsequently run out, there may be pressure to dilute the program's approach; for example, districts might feel pressured to keep an online course but eliminate the face-to-face mentors that support students as they proceed through it.

Online program evaluators need to keep these kinds of practical challenges in mind when formulating recommendations and should consider ranking their suggestions both in order of importance and feasibility. Program leaders should develop a plan for addressing the highest priority recommendations first. In situations where program funds are running low, evaluators can provide a much-needed external perspective, reminding stakeholders of the project's goals and helping them identify the most critical program elements to keep, even if it means serving fewer participants. More broadly, communication and persistence are essential when attempting to translate evaluation findings into action.

PART II

Recommendations for Evaluating Online Learning
Part I draws on seven featured evaluations to illustrate the challenges of evaluating online learning and to describe how different evaluators have addressed them. Part II offers recommendations that synthesize the lessons learned in these examples and also draw from research and conversations with experts in evaluating online learning. These recommendations aim to provide guidance to program leaders who are contemplating an evaluation and to assist program leaders and evaluators who are already working together to complete one.

General Recommendations

Several recurring themes can be found in the seven featured evaluations, as well as in the research and comments from experts in the field of online learning. These sources point to a handful of overarching recommendations:

• Begin with a clear vision for the evaluation. Determine what you want the evaluation to accomplish and what questions you hope to answer. Program leaders and evaluators may want to consider the questions listed in the box on page 51 as they get started.

• Determine the most appropriate evaluation methods for meeting your goals. Consider the different types of evaluations discussed in the guide (see Part I for examples of formative and summative; internal and external; experimental, quasi-experimental, and other types of evaluations) and the costs and benefits of each. What type of evaluation is appropriate given your stated purpose? What research methods will best capture program effects?

• Budget to meet evaluation needs. Limited budgets are a common barrier to evaluators. When designing evaluations, consider whether there are available funds to cover all planned data collection and analysis activities, plus the costs of any needed background research, internal communications and reporting, and incentives for study participants. If available funds are not sufficient, scale back the evaluation and focus on the highest-priority activities.

• Develop a program culture that supports evaluation. Discuss evaluation with staff members, clearly explaining its value and their roles in collecting and analyzing data. Incorporate data collection and analysis activities into staff members' everyday responsibilities instead of treating these tasks as one-time efforts or burdens. Make external evaluators less threatening to program staff members by creating opportunities for dialogue between the evaluator and program leaders and staff. Incorporate evaluation data into the process of development of annual program goals and strategic planning activities.

• Communicate early and often with anyone who will be affected by the evaluation.


• Dedicate adequate time and money to communicating with internal and external stakeholders at all phases of the evaluation. However, be sensitive to program concerns about keeping evaluation results confidential and avoiding media scrutiny. Always keep those managing the program "in the loop." Communicating through the program director is often the best course of action.

Recommendations for Meeting the Needs of Multiple Stakeholders

• Identify up front the various stakeholders who will be interested in the evaluation and what specifically they will want to know. Consider conducting interviews or focus groups to collect this information.

• Use this information to determine the evaluation's main purpose(s) and to develop questions that will guide the study. An evaluation that clearly addresses stakeholder needs and interests is more likely to yield findings and recommendations they will find worthy and champion.

• If trying to fulfill both program improvement and accountability purposes, consider using evaluation approaches that can generate both formative and summative information (see Glossary of Common Evaluation Terms, p. 65). Be realistic at this stage and keep in mind the constraints of the evaluation budget and timeline.

• Think early on about how the evaluation will incorporate student learning outcomes for accountability purposes. Consider a range of academic outcomes that might be appropriate to study, including scores on state-mandated and other standardized tests, course completions, grades, and on-time graduation.

• If the program has multiple components, determine the ones with the best potential to have a direct measurable impact on student achievement, and focus your study of student outcomes there.

• Consider using "dashboard indicators," the two or three most critical goals to be measured. How can the measures be succinctly communicated? Can they be graphed over time to demonstrate improved performance over time? (For example, an online program trying to increase access to Advanced Placement (AP)* courses might have a dashboard indicator composed of the number of schools accessing online AP courses, the AP exam pass rate, and the number of students taking AP exams as a percentage of total number of AP students in online AP courses. A brief computational sketch of such indicators follows this list.)

• In the reporting phase, think about what findings will be of most interest to different stakeholders. Consider communicating findings to different audiences in ways that are tailored to their needs and interests.

• If already participating in mandatory evaluation activities, think about how those findings can be used for other purposes, such as making internal improvements. Disseminate any mandatory evaluation reports to the staff and discuss how their findings can be used to strengthen the program. Consider developing additional data collection efforts to supplement the mandatory evaluation.

* Run by the nonprofit College Board, the Advanced Placement program offers college-level course work to high school students. Many institutions of higher education offer college credits to students who take AP courses.
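The "dashboard indicators" recommendation above lends itself to simple arithmetic. The sketch below is hypothetical and is not drawn from any of the featured programs: it assumes a program keeps yearly counts of participating schools, online AP enrollments, exam takers, and passing scores (all figures and field names invented), and derives the three example indicators so they can be tabulated or graphed over time.

# Hypothetical dashboard-indicator calculation for an online AP program (illustrative data only).
yearly_records = {
    2006: {"schools": 12, "ap_students_online": 240, "exam_takers": 150, "exams_passed": 90},
    2007: {"schools": 18, "ap_students_online": 310, "exam_takers": 220, "exams_passed": 143},
}

def dashboard(record):
    # Derive the three example indicators for one year from the raw counts.
    return {
        "schools_offering_online_ap": record["schools"],
        "ap_exam_pass_rate": record["exams_passed"] / record["exam_takers"],
        "share_of_online_ap_students_taking_exam": record["exam_takers"] / record["ap_students_online"],
    }

for year in sorted(yearly_records):
    values = dashboard(yearly_records[year])
    # One line per year; the same values could feed a simple trend chart.
    print(year, {k: (round(v, 2) if isinstance(v, float) else v) for k, v in values.items()})

Tracking the same three numbers every year, rather than redefining measures annually, is what makes the over-time comparison in the recommendation meaningful.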
Recommendations for Utilizing and Building on the Existing Base of Knowledge

• Look to established standards for online learning to determine elements of high-quality programs.

Important Questions for K–12 Online Learning Evaluators and Practitioners

In a synthesis of research on K–12 online learning, Rosina Smith of the Alberta Online Consortium, Tom Clark of TA Consulting, and Robert Blomeyer of Blomeyer & Clemente Consulting Services developed a list of important questions for researchers of K–12 online learning to consider.14 That list, which drew from the work of Cathy Cavanaugh et al.,15 has been adapted here for program and evaluation practitioners. Of course, no evaluation can cover all of these questions; this list is meant to provide a starting point for discussing evaluation goals and topics of interest.

Learner outcomes. What is the impact of the K–12 online learning program on student achievement? What factors can increase online course success rates? What impact does the program have on learner process skills, such as critical and higher-order thinking? How are learner satisfaction and motivation related to outcomes?

Learner characteristics. What are the characteristics of successful learners in this K–12 online learning program, and can success be predicted? How do learner background, preparation, and screening influence academic outcomes in the program?

Online learning features. What are the most effective combinations of media and methods in the online learning program? How do interaction, collaboration, and learner pacing influence academic outcomes? What is the impact of the K–12 online learning program when used as a supplement, in courses, or in full programs of study?

Online teaching and professional development. What are the characteristics of successful teachers in this K–12 online learning program? Are the training, mentoring, and support systems for these teachers effective?

Education context. What kinds of programs and curricula does the K–12 online learning program offer? How can the program best be used to improve learner outcomes in different content areas, grade levels, and academic programs? How can it help participating schools meet NCLB requirements? How do resources, policy, and funding impact the success of the program? Can an effective model of K–12 online learning be scaled up and sustained by the program?

• Program evaluation has a long history, and its basic approaches can be adapted to fit any unique educational program. The Program Evaluation Standards (1994) from the Joint Committee on Standards for Educational Evaluation and other guides (see Evaluation Methods and Tools in appendix A) can be helpful to those developing an online learning evaluation.

• Don't reinvent the wheel unless you have to—consult the list of distance education and evaluation resources in appendix A and identify program peers you can contact about their evaluation activities.

• Participate in the community of evaluators and researchers studying K–12 online learning. Share evaluation tools and processes with others. Make them available online. Consider publishing in professional journals. Seek out networking venues such as conferences.

• Use caution when interpreting evaluation findings from other programs, or adapting their methods. Online learning programs "come in many shapes and sizes," and findings about one online program or set of Web resources are not always generalizable. Take time to understand the program being studied and its context before deciding if the findings or methods are relevant and appropriate.

• If developing new tools for collecting data or new processes for analyzing it, work collaboratively with leaders of similar programs or experts from other agencies to fill gaps in knowledge.
Recommendations for Evaluating Multifaceted Resources

• Multifaceted programs include a range of program activities and may struggle to define a clear, central purpose. Evaluators can help such programs define and refine their purposes and goals. Establishing key performance measures aligned to funder-mandated and project goals can help program managers prioritize and help evaluators identify what is most important to study.

• When evaluating a multifaceted educational program, consider an evaluation strategy that combines both breadth and depth. For example, collect broad information about usage and select a particular program feature or resource to examine more deeply.

• Where feasible, consider a multiyear evaluation plan that narrows its focus in each successive year, or examines a different resource each year.

• To select a particular program feature or resource for deeper examination, think about what each resource is intended to do, or what outcomes one would hope to see if the resource was being used effectively. Work with an evaluator to determine which resource is best suited for a more thorough evaluation.

Recommendations for Programs Considering a Comparison Study

• Seek to determine if a comparison study is appropriate, and, if it is, whether there is a viable way to compare program participants with others. Consider the online program's goals, the student population being served, and the program's structure. For example, online programs dedicated to credit recovery would not want to compare their student outcomes with those of the general student population.

• Clearly articulate the purpose of the comparison. Is the evaluation seeking to find out if the online program is just as effective as the traditional one or more effective? For example, "just as effective" findings are desirable and appropriate when online programs are being used to expand education access.

• If considering a quasi-experimental design (see Glossary of Common Evaluation Terms, p. 65), plan carefully for what classes will be used as control groups, and thoroughly assess the ways they are similar to and different from the treatment classes. What kinds of students does each class serve? Did the students (or teachers) choose to participate in the class or were they assigned to it randomly? Are students taking the treatment and control classes for similar reasons (e.g., credit recovery, advanced learning)? Are the students at a similar achievement level? Where feasible, use individual student record data to match treatment and comparison students or to hold constant differences in prior learning characteristics across the two groups. (A hypothetical matching sketch appears at the end of this list.)

• If considering a randomized controlled trial (see Glossary of Common Evaluation Terms, p. 65), determine how feasible it is for the particular program. Is it possible to assign students randomly either to receive the treatment or be in the control group? Other practical considerations: Can the control group students receive the treatment at a future date? What other incentives can be offered to encourage them to participate? Will control and treatment students be in the same classroom or school? If so, might this cause "contamination" between treatment and control groups?

• In the event a randomized controlled trial or quasi-experimental study is planned, plan to offer meaningful incentives to participating individuals and schools. When deciding on appropriate incentives, consider the total time commitment that will be asked of study participants.

• Study sites with no vested financial interest are more likely to withdraw or to fail to carry through with study requirements that differ from typical school practices. If compliance appears unlikely, do not attempt an experimental or quasi-experimental study, unless it is an explicit requirement of a mandated evaluation.

• When designing data management systems, keep in mind the possibility of comparisons with traditional settings. Collect and organize data in a way that makes such comparisons possible.
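To make the record-matching suggestion in the quasi-experimental bullet above concrete, the sketch below pairs each treatment student with the comparison student whose prior-year score is closest. It is a simplified, hypothetical illustration (greedy nearest-neighbor matching on a single covariate, with invented student IDs and scores); real studies typically match on several characteristics at once or use statistical controls instead.

# Hypothetical one-covariate matching: pair each online-course (treatment) student with the
# comparison student whose prior-year score is closest. All data are invented for illustration.
treatment = {"T01": 612, "T02": 580, "T03": 655}            # student ID -> prior-year score
comparison = {"C01": 600, "C02": 578, "C03": 660, "C04": 590}

def match_on_prior_score(treated, pool):
    # Greedy nearest-neighbor match without replacement on a single covariate.
    available = dict(pool)
    pairs = {}
    for t_id, t_score in treated.items():
        best = min(available, key=lambda c_id: abs(available[c_id] - t_score))
        pairs[t_id] = best
        del available[best]          # each comparison student is used only once
    return pairs

print(match_on_prior_score(treatment, comparison))
# e.g., {'T01': 'C01', 'T02': 'C02', 'T03': 'C03'}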
Recommendations for Gathering Valid Evaluation Data

• Build in adequate time to fully communicate the purpose and design of the evaluation to everyone involved. Inform program staff who will play a role in the evaluation, as well as anyone who will help you gather evaluation evidence. Explain how study participants will benefit from the evaluation.

• Be prepared to repurpose the methods you use to conduct an evaluation, without losing sight of the evaluation's original purpose. Collecting multiple sources of evidence related to the same evaluation question can help to ensure that the evaluator can answer the question if a data source becomes unavailable or a research method proves infeasible.

• Seek out valid and reliable instruments for gathering data. Existing data-gathering instruments that have been tested and refined can offer higher-quality data than locally developed instruments.

• Consider administering surveys online, to boost response rates and easily compile results.

• Consider if there are innovative ways to collect data electronically about how online resources are used (e.g., tracking different pathways users take as they navigate through a particular online tool or Web site, or how much time is spent on different portions of a Web site or online course). A hypothetical log-analysis sketch appears at the end of this list.

• If response rates are too low, consider redesigning or refocusing the evaluation. Find other indicators that get at the same phenomenon. Find other data sources that put more control into the hands of evaluators (e.g., observations, focus groups).

• If collecting and aggregating data across multiple sources, define indicators clearly and make sure that data are collected in compatible formats. Define exactly what is to be measured, and how, and distribute these instructions to all parties who are collecting data.

• Research relevant data privacy regulations well in advance.

• Determine the process and build in adequate time for requesting student data from states, districts, or schools. Determine early on whether data permissions will be needed and from whom, and how likely it is that sensitive data will actually be available. Have a "Plan B" if it is not.

• If the program does not have good access to the complete school record of enrolled students, encourage it to support school improvement and program evaluator needs by collecting NCLB subgroup and other key student record data as part of its regular program registration process.

• Consider creative and flexible arrangements for protecting student privacy (e.g., sign confidentiality agreements, physically travel to a specific site to analyze data, employ a disinterested third party to receive data sets and strip out any identifying information).


• Incorporate precautions into the study protocol to protect students' privacy. When possible, avoid contact with students' names and refer to students through student identification numbers.
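The electronic usage-data idea a few bullets above, and the practice of referring to students by identification number rather than by name, can be combined in one small hypothetical sketch: it reads an invented clickstream log keyed by student ID and totals the time spent in each course section. The log format, field names, and figures are assumptions made for illustration; an actual learning management system would supply its own export format.

# Hypothetical clickstream summary: time spent per course section, keyed by student ID.
# Each log row is (student_id, section, timestamp_in_seconds); the data are invented.
from collections import defaultdict

log = [
    ("S1001", "unit1", 0), ("S1001", "unit2", 300), ("S1001", "quiz1", 900),
    ("S1002", "unit1", 0), ("S1002", "quiz1", 450),
]

def seconds_per_section(events):
    # Credit the gap between consecutive events to the earlier event's section.
    by_student = defaultdict(list)
    for student, section, ts in events:
        by_student[student].append((ts, section))
    totals = defaultdict(float)
    for student, visits in by_student.items():
        visits.sort()
        for (start, section), (end, _next_section) in zip(visits, visits[1:]):
            totals[(student, section)] += end - start
    return dict(totals)

print(seconds_per_section(log))
# {('S1001', 'unit1'): 300.0, ('S1001', 'unit2'): 600.0, ('S1002', 'unit1'): 450.0}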
Recommendations for Taking Program Maturity Into Account

• All education programs go through developmental stages. Different kinds of evaluation are appropriate at each stage.16 Based on when they started and other factors, different components of a program may be in different development stages during an evaluation.

• In the early stages of implementation, focus on formative evaluation efforts. Hold off on summative evaluations until users have had time to adapt to the program, and there has been adequate time to revise the program design based on early feedback and observations.

• Once a stable program model has been achieved, study of education outcomes is appropriate. Finally, as the program or program component matures, its long-term sustainability and replicability should be considered.

• Program leaders and evaluators should work together to help program stakeholders understand and interpret how program maturity can affect evaluation findings.

• Program leaders and evaluators should work together to establish baselines and benchmarks, so that progress over time toward program goals can be measured.

Recommendations for Translating Evaluation Findings Into Action

• Online learning programs often exist within larger learning organizations and policy and practice frameworks that encourage or inhibit program success. Use evaluation results to encourage needed changes in program context.

• Key stakeholders, such as parents and students, teachers, schools, administrators, policymakers, and the general public, can be informed through the communication of evaluation results. Communication of evaluation results can help online learning programs demonstrate their value or worth, demonstrate accountability, and dispel myths about the nature of online learning.

• If the resulting recommendations are actionable, online learning programs should move quickly to implement them. Don't just file the evaluation away; make it a living document that informs the program.

• Work to create an ongoing relationship among evaluators, program leaders, and staff. Create ongoing opportunities to talk about the evaluation, its findings, and its implications for improving the program.

• Engage early with anyone whose behavior will be expected to change as a result of the evaluation findings—including personnel at distant sites who help manage the program. Introduce them to evaluators and communicate the purpose of the evaluation.

• Institutionalize process improvement and performance measurement practices put in place during the evaluation. Consider using a structured, formal process for turning evaluation recommendations into program improvements, including timelines and staff assignments.

• Review evaluation recommendations regularly and continually track the progress that has been made toward them.

• Rank evaluation recommendations both in order of importance and feasibility. Develop a plan for addressing the highest priority recommendations first.

• Continue to conduct evaluation activities internally or externally to continuously improve your program over time.

Featured Online Learning Evaluations

Alabama Connecting Classrooms, Educators, & Students Statewide Distance Learning
Alabama

The Alabama Connecting Classrooms, Educators, & Students Statewide (ACCESS) Distance Learning Initiative aims to provide all Alabama high school students "equal access to high quality instruction to improve student achievement." ACCESS was developed to support and expand existing distance learning initiatives in Alabama and to heighten their impact on student achievement. In particular, the aim was to provide more courses to students in small and rural schools. ACCESS courses are Web-based, utilize interactive videoconferencing (IVC) platforms, or combine both technologies. ACCESS offers courses in core subjects, plus foreign languages, electives, remedial courses, and advanced courses, including Advanced Placement (AP)* and dual credit courses. The courses are developed and delivered by Alabama-certified teachers. In addition to distance learning courses for students, ACCESS also provides teachers with professional development and multimedia tools to enhance instruction. ACCESS is coordinated by the Alabama State Department of Education and three regional support centers. By the end of 2006, ACCESS was serving almost 4,000 online users and over 1,000 IVC users and, ultimately, is intended to reach all public high schools in the state.

* Run by the nonprofit College Board, the Advanced Placement program offers college-level course work to high school students. Many institutions of higher education offer college credits to students who take AP courses.

Algebra I Online
Louisiana

Algebra I Online began as a program under the Louisiana Virtual School, an initiative of the Louisiana Department of Education that provides the state's high school students with access to standards-based high school courses delivered by certified Louisiana teachers via the Internet. The program has two goals: To increase the number of students taught by highly qualified algebra teachers, especially in rural and urban areas, and to help uncertified algebra teachers improve their skills and become certified.


In Algebra I Online courses, students participate in a face-to-face class that meets at their home school as part of the regular school day, and each student has a computer connected to the Internet. A teacher who may or may not be certified to deliver algebra instruction is physically present in the classroom to facilitate student learning, while a highly qualified, certified teacher delivers the algebra instruction online. The online instructor's responsibilities include grading assignments and tests, and submitting course grades; the in-class teacher oversees the classroom and works to create an effective learning environment. The online and in-class teachers communicate regularly during the year to discuss student progress. The algebra course is standards-aligned and incorporates e-mail, interactive online components, and video into the lessons. By the 2005–06 school year, 350 students in 20 schools were taking Algebra I Online courses, and five participating teachers had earned secondary mathematics certification.

Appleton eSchool
Wisconsin

Based in Wisconsin's Appleton Area School District, Appleton eSchool is an online charter high school intended to provide high-quality, self-paced online courses. A few students choose to take their entire high school course load through Appleton eSchool, but most use the online courses to supplement those available at their home school. The school is open to all district high school students, but it makes special efforts to include those with significant life challenges. Appleton eSchool offers core subjects, electives (e.g., art, economics, Web design, "thinking and learning strategies"), and AP courses. Students can gain access to their Web-based courses around the clock and submit their assignments via the Internet. They can communicate with teachers using e-mail and online discussions, and receive oral assessments and tutoring by telephone and Web conference tools. Teachers regularly communicate student progress with a contact at the student's local school and with each student's mentor (usually the student's parent), whose role is to provide the student with assistance and encouragement. By 2007–08, the school was serving 275 students enrolled in over 500 semester courses. In addition, the 2007 summer session included another 400 semester course enrollments.

Arizona Virtual Academy
Arizona

Arizona Virtual Academy (AZVA) is a public charter school serving students in grades kindergarten through 11 across Arizona. By 2006–07, AZVA was serving approximately 2,800 students up through the 10th grade and piloting a small 11th-grade program. AZVA offers a complete selection of core, elective, and AP courses, including language arts, math, science, history, art, music, and physical education. Courses are developed by veteran public and private school teachers and supplied to AZVA by a national curriculum provider, K12 Inc. In 2006–07, AZVA used Title I funds17 to add a supplemental math program for struggling students in grades 3 through 8. In all AZVA courses, certified teachers provide instruction and keep track of students' progress. Students participate in structured activities, and also study independently under the guidance of an assigned mentor.
When families enroll with AZVA, the program sends them curricular materials, accompanying textbooks and workbooks, supplemental equipment and supplies, and a computer and printer, the latter two on loan. Students are assessed regularly, both for placement in the appropriate course level and to determine their mastery of course content.

Chicago Public Schools' Virtual High School
Chicago

The Chicago Public Schools' Virtual High School (CPS/VHS) is intended to expand access to high-quality teachers and courses, especially for students who traditionally have been underserved. The program is a partnership with the Illinois Virtual High School (IVHS)—a well-established distance learning program serving the entire state. CPS/VHS offers a variety of online courses. To participate, students must meet prerequisites and are advised of the commitment they will need to make to succeed in the class. CPS/VHS leaders are clear that, in this program, online does not mean independent study: Courses run for a semester, students are typically scheduled to work online during the regular school day, and attendance during that time is required. On-site mentors are available to help students as they progress through the online classes. Despite this regular schedule, the program still offers flexibility and convenience not available in a traditional system because students can communicate with instructors and access course materials outside of regular class hours. Today, CPS/VHS offers over 100 online courses, and new classes are added when the district determines that the course will meet the needs of approximately 60 to 75 students.

Digital Learning Commons
Washington

Based in the state of Washington, the Digital Learning Commons (DLC) is a centrally hosted Web portal that offers a wide range of services and resources to students and teachers. DLC's core goal is to offer education opportunities and choices where they have not previously existed due to geographic or socioeconomic barriers. Through DLC, middle and high school students can access over 300 online courses, including all core subjects and various electives, plus Advanced Placement and English as a Second Language classes. DLC students also have access to online student mentors (made possible through university partnerships), and college and career-planning resources. DLC hosts an extensive digital library, categorized by subject area, for students, teachers, and parents. In addition, it includes resources and tools for teachers, including online curricula, activities, and diagnostics. It is integrated with Washington's existing K–20 Network—a high-speed telecommunications infrastructure that allows Washington's K–12 schools to use the Internet and interactive videoconferencing. When schools join DLC, program staff help them create a plan for using the portal to meet their needs and, also, provide training to help school faculty and librarians incorporate its resources into the classroom.


Thinkport
Maryland

Maryland Public Television has partnered with Johns Hopkins University Center for Technology in Education to create Thinkport, a Web site that functions as a one-stop shop for teacher and student resources. Thinkport brings together quality educational resources and tools from trusted sources, including the Library of Congress, the U.S. Department of Education, PBS, the Kennedy Center, the National Council of Teachers of English, National Geographic, and the Smithsonian Institution. The site is organized into four sections: classroom resources, career resources for teachers, instructional technology, and family and community resources. About 75 percent of the site's content is for teachers, including lesson plans and activities for students, all of which include a technology component. Thinkport also offers podcasts, video clips, blogs, and games, with accompanying information about how they can be used effectively in classrooms. One of Thinkport's most popular features is its collection of electronic field trips—a number of which were developed by Maryland Public Television under the U.S. Department of Education's Star Schools grant, a federal program that supports distance learning projects for teachers and students in underserved populations. Each field trip is essentially an online curricular unit that focuses in depth on a particular topic and includes many interactive components for students and accompanying support materials for teachers.

APPENDIX A

Resources
The resources listed below are intended to provide readers with ready access to further information about evaluating online learning. This is not a complete list, and there may be other useful resources on the topic. Selection was based on the criteria that resources be relevant to the topic and themes of this guide; current and up-to-date; from nationally recognized organizations, including but not limited to federal or federally funded sources; and that they offer materials free of charge. This listing offers a range of research, practical tools, policy information, and other resources.

Distance Learning

International Society for Technology in Education

The International Society for Technology in Education (ISTE) is a nonprofit organization with more than 85,000 members. ISTE provides services, including evaluation, to improve teaching, learning, and school leadership through the use of technology. In 2007, ISTE published a comprehensive overview of effective online teaching and learning practices, What Works in K–12 Online Learning. Chapter topics include virtual course development, online learning in elementary classrooms, differentiating instruction online, professional development for teachers of virtual courses, and the challenges that virtual schools will face in the future.
http://www.iste.org

The North American Council for Online Learning

The North American Council for Online Learning (NACOL) is an international K–12 nonprofit organization focused on enhancing K–12 online learning quality. NACOL publications, programs, and research areas include the National Standards of Quality for Online Courses, National Standards for Quality Online Teaching, state needs assessments for online courses and services, online course quality evaluations, online professional development, virtual education program administration, funding, and state and federal public policy. In addition, NACOL has published a primer on K–12 online learning, which includes information about teaching, learning, and curriculum in online environments, and evaluating online learning.
http://www.nacol.org

North Central Regional Educational Laboratory/Learning Point Associates

The North Central Regional Educational Laboratory (NCREL) was a federally funded education laboratory until 2005. Learning Point Associates conducted the work of NCREL and now operates a regional educational laboratory (REL) with a new scope of work and a new name: REL Midwest. NCREL publications about online learning are currently available on the Learning Point Associates Web site, including A Synthesis of New Research on K–12 Online Learning (2005), which summarizes a series of research projects sponsored by NCREL and includes recommendations for online research, policy, and practice.
http://www.ncrel.org/tech/elearn.htm

Southern Regional Education Board

The Southern Regional Education Board (SREB) is a nonprofit organization that provides education reform resources to its 16 member states. The SREB Educational Technology Cooperative focuses on ways to help state leaders create and expand the use of technology in education. SREB's Web site provides the Standards for Quality Online Courses and Standards for Quality Online Teaching.


In 2005, SREB partnered with the AT&T Foundation to create the State Virtual Schools Alliance, to assist SREB's 16 member states to increase middle- and high school students' access to rigorous academic courses through state-supported virtual schools. Through AT&T Foundation grant funding, the alliance facilitates collaboration and information and resource sharing between states in order to create and improve state virtual schools.
http://www.sreb.org

Star Schools Program

The Star Schools Program, housed in the U.S. Department of Education's Office of Innovation and Improvement, supports distance education programs that encourage improved instruction across subjects and bring technology to underserved populations. Information about the program and its evaluation is available on the Department of Education's Web site. The Star Schools Program is funding this guide.
http://www.ed.gov/programs/starschools/index.html

Evaluation Methods and Tools

Institute of Education Sciences

In 2002, the Education Sciences Reform Act created the Institute of Education Sciences (IES) as an institute dedicated to conducting and providing research, evaluation, and statistics in the education field. IES maintains a registry of evaluators and has published the two guides described below. The Regional Educational Laboratory Program is also housed within IES.
http://ies.ed.gov

Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User Friendly Guide

Published by the U.S. Department of Education, this guide is meant to help practitioners understand the meaning of "rigorous evidence" and important research and evaluation terms, such as "randomized controlled trials." Understanding these terms and the different types of evidence in the education field can help when designing an internal evaluation or working with an external agency.
http://www.ed.gov/rschstat/research/pubs/rigorousevid/index.html

Random Assignment in Program Evaluation and Intervention Research: Questions and Answers

This document, published by the Institute of Education Sciences in the U.S. Department of Education, discusses the purpose of education program evaluation generally, and answers specific questions about studies that use random assignment to determine program effectiveness.
http://ies.ed.gov/ncee/pubs/randomqa.asp

Regional Educational Laboratory Program

The Regional Educational Laboratory (REL) Program is a network of 10 applied research laboratories that serve the education needs of the states within a designated region by providing access to scientifically valid research, studies, and other related technical assistance services. The labs employ researchers experienced in scientific evaluations who can provide technical assistance on evaluation design. The REL Web site provides contact information for each of the labs.
http://ies.ed.gov/ncee/edlabs/regions

Local School System Planning, Implementation, and Evaluation Guide

This guide from Maryland Virtual Learning Opportunities is a useful resource when considering program implementation. Planning considerations are divided into a three-part checklist of planning, implementation, and evaluation. Additionally, suggested roles and responsibilities are provided for both district- and school-based personnel. The guide can be found online at the following URL, using the link on the left-hand side for the "Planning, Implementation, and Evaluation Guide."
http://mdk12online.org/docs/PIEGuide.pdf

Online Program Perceiver Instrument

The Online Program Perceiver Instrument (OPPI) was designed by the staff at Appleton eSchool in Wisconsin, as an online evaluation tool to be used by Wisconsin's network of virtual schools. The tool is featured as an internal evaluation process in this guide and is currently available for all online learning programs across the country to use. The Web site provides information about the framework and an overview of how the instrument works, as well as contact information and directions on how to become a member of the OPPI network.
http://www.wisconsineschool.net/OPPI/OPPI.asp

Program Evaluation Standards

This document is a 1994 publication of the Joint Committee on Standards for Educational Evaluation, a coalition of professional associations concerned with the quality of evaluation. The standards address utility, feasibility, propriety, and accuracy issues, and are intended for use in checking the design and operation of an evaluation.
http://www.wmich.edu/evalctr/jc

2005 SETDA National Leadership Institute Toolkit on Virtual Learning

The State Educational Technology Directors Association (SETDA) was developed to provide national leadership and facilitate collaboration between states on education technology issues. Each year, SETDA hosts a National Leadership Institute and develops toolkits intended to help educators effectively use virtual learning. The 2004–05 toolkit includes a lengthy section on program evaluation.
http://www.setda.org/toolkit/toolkit2004/index.htm

Resources From Higher Education

Council for Higher Education Accreditation

The Council for Higher Education Accreditation (CHEA) has a number of publications identifying standards and best practices in distance learning. These include: Accreditation and Assuring Quality in Distance Learning and Best Practices for Electronically Offered Degree and Certificate Programs, available at the following Web sites, respectively:
http://www.chea.org/Research/Accred-Distance-5-9-02.pdf
http://www.ncahlc.org/download/Best_Pract_DEd.pdf

Quality Matters

Quality Matters is a multi-partner project funded in part by the U.S. Department of Education's Fund for the Improvement of Postsecondary Education (FIPSE). Quality Matters has created a rubric and process for certifying the quality of online courses.
http://www.qualitymatters.org/index.htm

Sloan Consortium

The Sloan Consortium (Sloan-C) is a consortium of institutions and organizations committed to quality online education. It aims to help learning organizations improve the quality of their programming, and has a report identifying five "pillars" of quality higher education online programs: learning effectiveness, student satisfaction, faculty satisfaction, cost effectiveness, and access. Sloan-C also has a Web site that collects information about best practices within each of these areas.
http://www.sloan-c.org

These pages identify resources created and maintained by other public and private organizations. This information is provided for the reader's convenience. The U.S. Department of Education is not responsible for controlling or guaranteeing the accuracy, relevance, timeliness, or completeness of this outside information. Further, the inclusion of these resources does not reflect their importance, nor is it intended to endorse any views expressed, or products or services offered.

APPENDIX B

Research
Methodology
The research approach underlying this guide is a combination of case study methodology and benchmarking of best practices. Used in businesses worldwide as they seek to continuously improve their operations, benchmarking has more recently been applied to education for identifying promising practices. Benchmarking is a structured, efficient process that targets key operations and identifies promising practices in relationship to traditional practice, previous practice at the selected sites (lessons learned), and local outcome data. The methodology is further explained in a background document,18 which lays out the justification for identifying promising practices based on four sources of rigor in the approach:

• Theory and research base
• Expert review
• Site evidence of effectiveness
• Systematic field research and cross-site analysis

The steps of the research process were: defining a study scope; seeking input from experts to refine the scope and inform site selection criteria; screening potential sites; selecting sites to study; conducting site interviews, visits, or both; collecting and analyzing data to write case reports; and writing a user-friendly guide.

Site Selection Criteria and Process

In this guide, the term "online learning program" is used to describe a range of education programs and settings in the K–12 arena, including distance learning courses offered by universities, private providers, or teachers at other schools; stand-alone "virtual schools" that provide students with a full range of online courses and services; and Web portals that provide teachers, parents, and students with a variety of online tools and supplementary education materials. As a first step in the study underlying this guide, researchers compiled a list of evaluations of K–12 online programs that had been conducted by external evaluators, research organizations, foundations, and program leaders. This initial list, compiled via Web and document searches, was expanded through referrals from a six-member advisory group and other knowledgeable experts in the field. Forty organizations and programs were on the final list for consideration.

A matrix of selection criteria was drafted and revised based on feedback from advisors. The three quality criteria were:

• The evaluation included multiple outcome measures, including student achievement.
• The findings from the evaluation were widely communicated to key stakeholders of the program being studied.
• Program leaders acted on evaluation results.

Researchers then rated each potential site on these three criteria, using publicly available information, review of evaluation reports, and gap-filling interviews with program leaders. All the included sites scored at least six of the possible nine points across these three criteria.
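The screening arithmetic described in the preceding paragraph can be pictured in a brief sketch. The appendix does not spell out the rating rubric, so the example below simply assumes each of the three quality criteria was rated on a 0-to-3 scale and summed, with 6 of the 9 possible points as the inclusion threshold; the site names and ratings are invented.

# Hypothetical site-screening arithmetic: three criteria, each rated 0-3, threshold 6 of 9.
candidate_sites = {
    "Site A": (3, 2, 3),   # (outcome measures, communication of findings, action on results)
    "Site B": (2, 2, 1),
    "Site C": (3, 3, 2),
}

THRESHOLD = 6
included = {name: sum(scores) for name, scores in candidate_sites.items() if sum(scores) >= THRESHOLD}
print(included)   # {'Site A': 8, 'Site C': 8}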


Because the goal of the publication was to showcase a variety of types of evaluations, the potential sites were coded as to such additional characteristics as internal vs. external evaluator, type of evaluation design, type of online learning program, organization unit: district or state, and stage of maturity. The final selection was made to draw from as wide a range on these characteristics as possible while keeping the quality criteria high, as described above.

Data Collection

Data were collected through a combination of on-site and virtual visits. Because the program sites themselves were not brick-and-mortar, phone interviews were generally sufficient and cost-effective. But some site visits were conducted face-to-face to ensure access to all available information. Semistructured interviews were conducted with program leaders, other key program staff, and evaluators. Key interviews were tape recorded to ensure lively descriptions and quotes using natural language. While conducting the case studies, staff also obtained copies of local documents, including evaluation reports and plans documenting use of evaluation findings. Program leaders and evaluators were asked to:

• Describe the rationale behind the evaluation and, if applicable, the criteria for choosing an external evaluator;
• Explain the challenges and obstacles that were faced throughout the evaluation process, and how they were addressed;
• Tell how the study design was affected by available resources;
• If the evaluation was conducted externally, describe the relationship between the program and contractor;
• Provide the framework used to design and implement the evaluation;
• Tell how the appropriate measures or indicators were established;
• Explain how the indicators are aligned to local, state, and/or national standards, as well as program goals;
• Describe the data collection tools;
• Explain the methods used for managing and securing data;
• Describe how data were interpreted and reported; and
• Share improvements made in program services and the evaluation process.

Analysis and Reporting

A case report was written about each program and its evaluation and reviewed by program leaders and evaluators for accuracy. Drawing from these case reports, program and evaluation documentation, and interview transcripts, the project team identified common themes about the challenges faced over the course of the evaluations and what contributed to the success of the evaluations. This cross-site analysis built on both the research literature and on emerging patterns in the data.

This descriptive research process suggests promising practices—ways to do things that others have found helpful, lessons they have learned—and offers practical how-to guidance. This is not the kind of experimental research that can yield valid causal claims about what works. Readers should judge for themselves the merits of these practices, based on their understanding of why they should work, how they fit the local context, and what happens when they actually try them. Also, readers should understand that these descriptions do not constitute an endorsement of specific practices or products.

Using the Guide

Ultimately, readers of this guide will need to select, adapt, and implement practices that meet their individual needs and contexts. Evaluators of online programs, whether internal or external, may continue to study the issues identified in this guide, using the ideas and practices and, indeed, the challenges, from these program evaluations as a springboard for further discussion and exploration. In this way, a pool of promising practices will grow, and program leaders and evaluators alike can work together toward finding increasingly effective approaches to evaluating online learning programs.
Glossary of Common Evaluation Terms
Control group refers to the population of subjects (e.g., students) who do not receive or participate in the treatment being studied (e.g., an online class) but whose performance or other outcomes are being compared to those of students who will receive or participate in the treatment.

Formative evaluations generate information aimed at helping program stakeholders better understand a program or its participants, often by examining the delivery or implementation of a program. Findings from these evaluations are generally used to make program improvements or influence future decisions.

Hierarchical linear modeling, also called multi-level modeling, is used for the same purpose as regression analysis—to understand what factors are the best predictors of an outcome, such as a test score. But researchers use hierarchical linear modeling to take into account factors at different levels of an education system, such as the characteristics of the class or school in which students are situated. Hierarchical linear modeling helps statisticians address the fact that students are generally not grouped randomly within classes or schools and that classroom- and school-level factors are often related to student outcomes.

Quasi-experiments are experimental studies in which subjects are not assigned at random to treatment and control groups, as with RCTs (see below). Quasi-experimental studies may be used, for example, when controlled trials are infeasible (e.g., when evaluators cannot assign students randomly to participate in a treatment) or are considered too expensive. According to the U.S. Department of Education's What Works Clearinghouse, strong evidence of a program's effectiveness can be obtained from a quasi-experimental study based on one of three designs: one that "equates" treatment and control groups, either by matching groups based on key characteristics of participants or by using statistical methods to account for differences between groups; one that employs a discontinuity design in which participants are assigned to the treatment and control groups based on a cutoff score on a pretreatment measure that typically assesses need or merit; or one that uses a "single-case design" involving repeated measurement of a single subject (e.g., a student or a classroom) in different conditions or phases over time.19


Randomized controlled trials (RCTs) are experimental studies that randomly assign some study participants to receive a treatment (e.g., participation in a class or program) and others to not receive the treatment. This latter is known as the control group. In an RCT, evaluators compare the outcomes (e.g., test scores) of the treatment group with those of the control group; these results are used to determine the effectiveness of the treatment. RCTs can provide strong evidence of a program's effectiveness.

Regression analysis is a statistical technique used in research to determine the factors or characteristics (e.g., gender, family income level, whether a student participated in a particular program) that are the best predictors of an outcome. Regression analyses help statisticians isolate the relationships between individual factors and an outcome and, thus, are useful when trying to understand the relationship of a program to student achievement.

Summative evaluations examine the effects or outcomes of a program. Findings from these evaluations are generally used to assess how well a program is meeting its stated goals.

Treatment group refers to the population of subjects (in this case, students) who receive or participate in the treatment being studied (e.g., an online class).
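As a compact illustration of the regression analysis and hierarchical linear modeling entries above, the notation below sketches a generic model; it is not drawn from any of the featured evaluations, and the variable names (Online, PriorScore) are invented for the example. The first line is an ordinary regression of a student outcome on program participation and a prior score; the second pair of lines shows the two-level form used in hierarchical linear modeling, in which each school (or classroom) j receives its own intercept.

% Illustrative single-level regression for student i
Y_i = \beta_0 + \beta_1\,\mathrm{Online}_i + \beta_2\,\mathrm{PriorScore}_i + \varepsilon_i

% Illustrative two-level (hierarchical) form for student i nested in school j
Y_{ij} = \beta_{0j} + \beta_1\,\mathrm{Online}_{ij} + \beta_2\,\mathrm{PriorScore}_{ij} + \varepsilon_{ij},
\qquad \beta_{0j} = \gamma_{00} + u_{0j}

In the two-level form, the school-level term u_{0j} captures differences among schools, which is how multilevel models account for students being grouped nonrandomly within classes and schools.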

Notes
1. For further description and discussion of online resources, online courses, virtual schools, virtual classrooms, and virtual schooling, see Cathy Cavanaugh and Robert L. Blomeyer, eds., What Works in K–12 Online Learning (Eugene, OR: International Society for Technology in Education, December 2007).

2. John Watson and Jennifer Ryan, Keeping Pace with K–12 Online Learning: A Review of State-Level Policy and Practice (November 2007). Available online from the North American Council for Online Learning (NACOL) at http://www.nacol.org (last accessed June 27, 2008).

3. Liz Pape, Mickey Revenaugh, John Watson, and Matthew Wicks, "Measuring Outcomes in K-12 Online Education Programs: The Need for Common Metrics," Distance Learning 3, no. 3 (2006): 58.

4. Ibid., p. 58.

5. Cathy Cavanaugh, Kathy Jo Gillan, Jeff Kromrey, Melinda Hess, and Robert L. Blomeyer, The Effects of Distance Education on K–12 Student Outcomes: A Meta-analysis (Naperville, IL: North Central Regional Educational Laboratory, 2004), p. 25. Available at http://www.ncrel.org/tech/distance/k12distance.pdf (last accessed June 30, 2008).

6. Available online at http://www.wisconsineschool.net (last accessed Feb. 15, 2008).

7. Laura M. O'Dwyer, Rebecca Carey, and Glenn Kleiman, "A Study of the Effectiveness of the Louisiana Algebra I Online Course," Journal of Research on Technology in Education 39, no. 3 (2007): 289–306.

8. Laura M. O'Dwyer, Rebecca Carey, and Glenn Kleiman, "The Louisiana Algebra I Online Initiative as a Model for Teacher Professional Development: Examining Teacher Experiences," Journal of Asynchronous Technologies 11, no. 3 (September 2007).

9. Debra Friedman, Evaluation of the First Phase of the Washington Digital Learning Commons: Critical Reflections on the First Year (Seattle, WA: Digital Learning Commons, 2004). Available at http://www.learningcommons.org/about/files/YearOneEvaluation.pdf (last accessed Apr. 14, 2008).

10. Tom Clark and Elizabeth Oyer, Ensuring Learner Success in CPS/VHS Courses: 2004–05 Report and Recommendations (Chicago: TA Consulting & Evaluation Solutions, 2006).

11. For more information on FERPA, see http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html.

12. Michael Long and Helene Jennings, Maryland Public Television's Pathways to Freedom Electronic Field Trip: A Randomized Controlled Trial Study of Effectiveness (Calverton, MD: Macro International, 2005).

13. Laura M. O'Dwyer, Rebecca Carey, and Glenn Kleiman, "A Study of the Effectiveness of the Louisiana Algebra I Online Course," Journal of Research on Technology in Education 39, no. 3 (2007): 289–306.

14. Rosina Smith, Tom Clark, and Robert L. Blomeyer, A Synthesis of New Research on K–12 Online Learning (Naperville, IL: Learning Point Associates, November 2005). Available at http://www.ncrel.org/tech/synthesis/synthesis.pdf (last accessed Mar. 5, 2008).

15. Cathy Cavanaugh, Kathy Jo Gillan, Jeff Kromrey, Melinda Hess, and Robert L. Blomeyer, The Effects of Distance Education on K–12 Student Outcomes: A Meta-analysis (Naperville, IL: North Central Regional Educational Laboratory, 2004). Available at http://www.ncrel.org/tech/distance/k12distance.pdf (last accessed June 27, 2008).

16. Gerald Stahler, "Improving the Quality of Evaluations of Federal Human Services Demonstration Programs," Evaluation and Program Planning 18 (1995): 129–141.

17. Title I of the Elementary and Secondary Education Act (ESEA) is intended to provide financial assistance to local education agencies and schools with high numbers or percentages of economically disadvantaged children to help ensure all children meet state academic standards.

18. Nikola Filby, "Approach to Methodological Rigor in the Innovation Guides," working paper, WestEd, San Francisco, 2006.

19. What Works Clearinghouse, "Evidence Standards for Reviewing Studies" (Washington, D.C.: U.S. Department of Education, 2006), p. 1. Available at http://ies.ed.gov/ncee/wwc/pdf/study_standards_final.pdf (last accessed Mar. 24, 2008).

The Department of Education’s mission is to promote student achievement
and preparation for global competitiveness by fostering educational
excellence and ensuring equal access.

www.ed.gov
