
Batería IV Woodcock-Muñoz

Pruebas de aprovechamiento

Barbara J. Wendling • Nancy Mather • Fredrick A. Schrank

Manual del examinador


Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento

Examiner’s Manual
Barbara J. Wendling ◆ Nancy Mather ◆ Fredrick A. Schrank
Reference Citations
■ To cite the entire Batería IV battery, use:
Woodcock, R. W., Alvarado, C. G., Schrank, F. A., McGrew, K. S., Mather, N., &
Muñoz-Sandoval, A. F. (2019). Batería IV Woodcock-Muñoz. Itasca, IL: Riverside
Assessments, LLC.
■ To cite the Batería IV Pruebas de aprovechamiento, use:
Woodcock, R. W., Alvarado, C. G., Schrank, F. A., Mather, N., McGrew, K. S., & Muñoz-
Sandoval, A. F. (2019). Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento. Itasca,
IL: Riverside Assessments, LLC.
■ To cite this manual, use:
Wendling, B. J., Mather, N., & Schrank, F. A. (2019). Examiner’s Manual. Batería IV
Woodcock-Muñoz: Pruebas de aprovechamiento. Itasca, IL: Riverside Assessments, LLC.

Copyright © 2019 by Riverside Assessments, LLC. All rights reserved. No part of this work
may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopying and recording, or by any information storage or retrieval system
without the prior written permission of Riverside Assessments, LLC unless such copying
is expressly permitted by federal copyright law. Requests for permission to make copies of
any part of the work should be addressed to Riverside Insights, Attention: Permissions, One
Pierce Place, Suite 900W, Itasca, Illinois 60143.
Published in Itasca, IL
Batería III Woodcock-Muñoz, WJ-III, WJ-R, Woodcock-Johnson, Woodcock Language
Proficiency Battery, and Woodcock-Muñoz Language Survey are registered trademarks of
Riverside Assessments, LLC.
The Batería IV logo, Bilingual Verbal Ability Tests, Dean-Woodcock, Riverside Insights,
the Riverside Insights logo, Scales of Independent Behavior–Revised, WJ IV, WJ IV
Interpretation and Instructional Interventions Program, Woodcock Interpretation and
Instructional Interventions Program, and Woodcock Language Proficiency Battery are
trademarks of Riverside Assessments, LLC.
The MindHub is a registered trademark of the Institute for Applied Psychometrics (IAP)
and Interactive Metronome.
All other trademarks are the property of their respective owners.
The Batería IV tests are not to be used in any program operating under statutes or
regulations that require disclosure of specific item content and/or correct responses to the
public, including examinees or their parents. Any unauthorized distribution of the specific
item content and/or correct responses is prohibited by copyright law.
For technical information, please visit http://www.wj-iv.com or call Riverside Insights
Customer Service at 800.323.9540.
About the Authors of the Batería IV
Richard W. Woodcock
Richard W. Woodcock, EdD, has an extensive background in education and psychology. He
has held a variety of positions, including elementary school teacher, school psychologist,
director of special education, university professor, and editor of a test publishing company.
He earned his doctorate from the University of Oregon with a dual major in statistics
and psycho-education, and he completed a post-doctoral fellowship in neuropsychology at
Tufts University School of Medicine. He has published more than 135 professional books and
articles.
Since 1957, Dr. Woodcock has held positions and appointments at Western Oregon
University, the University of Northern Colorado, George Peabody College for Teachers, the
University of Arizona, the University of Southern California, the University of Virginia, and
Vanderbilt University. He is currently serving as a research professor in the Department
of Psychology and Philosophy at Texas Woman’s University. Dr. Woodcock is a Fellow of
the American Psychological Association and the American Academy of School Psychology
and a Diplomate of the American Board of Professional Psychology. In 2013 the Texas
Statewide Evaluation Project Conference honored Dr. Woodcock with its first Lifetime
Achievement Award.
Among Dr. Woodcock’s publications are the Colorado Braille Battery, the Peabody Rebus
Reading Program, the Goldman-Fristoe-Woodcock Auditory Skills Test Battery, the Woodcock
Reading Mastery Tests®–Revised, the Woodcock-Johnson® Psycho-Educational Battery–Revised,
the Batería Woodcock-Muñoz: Pruebas de habilidad cognitiva–Revisada, the Batería Woodcock-
Muñoz: Pruebas de aprovechamiento–Revisada, the Woodcock Language Proficiency Battery®–
Revised, the Scales of Independent Behavior–Revised™, the Woodcock-Muñoz Language Survey®,
the Woodcock Diagnostic Reading Battery, the Woodcock-Johnson III Tests of Cognitive Abilities,
the Woodcock-Johnson III Tests of Achievement, the Dean-Woodcock™ Neuropsychological
Assessment System, the Bilingual Verbal Ability Tests™, the Batería III Woodcock-Muñoz®:
Pruebas de habilidades cognitivas, the Batería III Woodcock-Muñoz: Pruebas de aprovechamiento,
and the Woodcock-Muñoz Language Survey–Revised.

Criselda G. Alvarado
Criselda Guajardo Alvarado received her BS degree with teaching certificates in elementary
education, psychology, bilingual education, and mathematics, and her MEd as an educational
diagnostician from the University of Texas Rio Grande Valley. She received her EdD in
Curriculum and Instruction with a specialization in bilingual education at the University of
Houston.
Dr. Alvarado devoted her professional life to education and assessment. She began her
career as a bilingual education teacher in an elementary school in Donna, Texas, before
becoming a middle school mathematics and reading teacher in Robstown, Texas. Dr. Alvarado
then moved to Houston, Texas, and worked as a bilingual elementary school teacher. She
also worked as an educational diagnostician and special education supervisor in several
elementary, middle, and high schools as well as in a juvenile justice charter school. She was
an adjunct professor at the University of Houston from 1994 to 1999 and in 2016, and she
was an assistant professor at the University of Houston–Clear Lake from 2012 to 2015. In
1994, she founded Education and Evaluation Consultants, where she worked for various
school districts and state departments as a consultant on dyslexia, special education, bilingual
education, and educational assessment. For several years, Dr. Alvarado taught in and directed
an all-volunteer after-school tutorial program for children and an English as a second
language (ESL) program for adults in her local community.
From 1991 to 2015, Dr. Alvarado was the director of bilingual studies at Measurement
Learning Consultants, where she worked on several English and Spanish standardization
projects, including the Woodcock-Muñoz Language Survey, Batería Woodcock-Muñoz: Pruebas
de habilidades cognitivas–Revisada, Batería Woodcock-Muñoz: Pruebas de aprovechamiento–
Revisada, Woodcock-Johnson III Tests of Cognitive Abilities, Woodcock-Johnson III Tests of
Achievement, Batería III Woodcock-Muñoz: Pruebas de habilidades cognitivas, and Batería III
Woodcock-Muñoz: Pruebas de aprovechamiento. She is a coauthor of the Woodcock-Muñoz
Language Survey–Revised, the Bilingual Verbal Ability Tests, and the Woodcock-Muñoz
Language Survey III.

Fredrick A. Schrank
Fredrick A. (Fred) Schrank is the senior author of the Woodcock-Johnson IV (WJ IV™) family
of tests. Fred is a licensed psychologist in the state of Washington and is a board-certified
specialist in school psychology by the American Board of Professional Psychology (ABPP).
He began his professional career as a counselor and diagnostician in the Dodgeville, North
Fond du Lac, and De Forest, Wisconsin, public school districts while simultaneously earning
an EdS degree in Educational Psychology from the University of Tennessee–Knoxville. He
subsequently earned a PhD from the University of Wisconsin–Madison. Dr. Schrank then
taught in graduate-level professional preparation programs at Truman State University
(Missouri) and the University of Puget Sound (Washington) prior to a three-decade career
devoted almost exclusively to the development and publication of the Woodcock-Johnson
family of tests. As part of a commitment to professional psychology, he has served as an oral
examiner for the American Board of School Psychology (ABSP) and was elected president of
the American Academy of School Psychology (AASP).
Dr. Schrank is the lead author on several WJ IV test batteries, intervention programs,
books, and book chapters, including the Woodcock-Johnson IV Tests of Cognitive Abilities, the
Woodcock-Johnson IV Tests of Oral Language, the Woodcock-Johnson IV Tests of Achievement, the
Woodcock-Johnson IV Tests of Early Cognitive and Academic Development, the WJ IV Interpretation

and Instructional Interventions Program, and Essentials of WJ IV Cognitive Abilities Assessment. In
2018, Dr. Schrank was named a J. William Fulbright Specialist by the U.S. Department of State,
Bureau of Educational and Cultural Affairs. He is hosted by the Department of Educational and
Counselling Psychology of McGill University in Montreal, Quebec.

Nancy Mather
Nancy Mather is a professor at the University of Arizona in the Department of Disability
and Psychoeducational Studies. She holds an MA in Behavior Disorders and a PhD from
the University of Arizona in Special Education and Learning Disabilities. She completed
a postdoctoral fellowship under the mentorship of Dr. Samuel Kirk at the University of
Arizona.
Dr. Mather assisted Dr. Richard Woodcock with several aspects of test development for
the Woodcock-Johnson Psycho-Educational Battery–Revised (WJ-R®), including coauthoring
the Examiner’s Manuals for the WJ-R Tests of Cognitive Ability and the WJ-R Tests of
Achievement. She has been a coauthor of both the Woodcock-Johnson III (WJ III®) and the
WJ IV and has coauthored two books on the interpretation and application of the WJ IV—
Essentials of WJ IV Tests of Achievement and Woodcock-Johnson IV: Reports, Recommendations,
and Strategies.
She has served as a learning disabilities teacher, a diagnostician, a university professor,
and an educational consultant. Dr. Mather conducts research in the areas of reading and
writing development and conducts workshops on assessment and instruction both nationally
and internationally. She has published numerous articles and has coauthored several
books linking assessment and intervention, including Learning Disabilities and Challenging
Behaviors: A Guide to Intervention and Classroom Management (3rd ed.), Evidence-Based
Interventions for Students with Learning and Behavioral Challenges, Essentials of Assessment
Report Writing (2nd ed.), Essentials of Evidence-Based Academic Interventions, Writing
Assessment and Instruction for Students with Learning Disabilities (2nd ed.), and Essentials of
Dyslexia: Assessment and Intervention.

Kevin S. McGrew
Kevin S. McGrew is director of the Institute for Applied Psychometrics (IAP), LLC, a private
research and consulting organization he established in 1998. He also is a visiting lecturer
in Educational Psychology (School Psychology Program) at the University of Minnesota
and director of research for Interactive Metronome, a neurotechnology and rehabilitation
company. He holds a PhD in Educational Psychology (Special Education) from the University
of Minnesota and an MS in School Psychology and a BA in Psychology from Minnesota State
University–Moorhead.
Dr. McGrew was a practicing school psychologist for 12 years in Iowa and Minnesota.
From 1989 to 2000, he was a professor in the Department of Applied Psychology at St.
Cloud State University, St. Cloud, Minnesota. He has served as a measurement consultant
to a number of psychological test publishers, national research studies, and organizations.
Since 2009 Dr. McGrew has served as a consulting expert, via declarations or testimony, to
the courts regarding the measurement of intelligence and psychometric issues relevant to
intellectual assessment in death penalty cases (capital punishment cases involving individuals
with intellectual disabilities). Since 2014 he has served as an intelligence theory and testing
consultant at the Dharma Bermakna Foundation and the Universitas Gadjah Mada for the
Indonesia AJT Cognitive Assessment Development Project.

He has authored numerous publications and made state, national, and international
presentations in his primary areas of research interest in human intelligence, intellectual
assessment, human competence, applied psychometrics, and the Cattell-Horn-Carroll (CHC)
theory of cognitive abilities. He is a frequent speaker at state, national, and international
conferences. He is an active distributor of theoretical and research information via three
professional blogs and The MindHub® web portal.

Ana F. Muñoz-Sandoval
Dr. Ana F. Muñoz-Sandoval was associate director of Measurement Learning Consultants
for two decades. During that time, she directed and authored the Spanish adaptation of
several assessment instruments. Most recently, she authored the Batería III Woodcock-
Muñoz: Pruebas de habilidades cognitivas, the Batería III Woodcock-Muñoz: Pruebas de
aprovechamiento, and the Bilingual Verbal Ability Tests™. Dr. Muñoz-Sandoval also coauthored
the Woodcock-Muñoz Language Survey, the Woodcock Language Proficiency Battery–Revised
(Spanish Form), the Batería Woodcock-Muñoz: Pruebas de habilidad cognitiva–Revisada, and
the Batería Woodcock-Muñoz: Pruebas de aprovechamiento–Revisada. In 1997, Dr. Muñoz-
Sandoval participated in the development and standardization of the Woodcock-Johnson III.
By invitation, she has presented assessment workshops in Argentina, Brazil, Costa Rica, and
Mexico.
Dr. Muñoz-Sandoval holds teaching credentials from Mendoza, Argentina, where she
lived before coming to the United States in 1970. She has also lived in Italy, Nepal, Pakistan,
and South Africa. She studied German at the Tribhuvan University in Kathmandu, Nepal.
She devoted a decade to teaching Spanish language at the university level. She received a
BA degree in Anthropology from the State University College at Buffalo. That was followed
by an MS in Student Personnel Administration. She earned her EdD from the University of
Southern California with specialization in international and multicultural education.

Contributing Author
Barbara J. Wendling coauthored the Batería IV Examiner’s Manuals with Nancy Mather and
Fredrick Schrank. Barbara is an educational consultant with expertise in assessment, test
interpretation, and academic interventions. She holds an MA in Learning Disabilities, and she
has over 17 years of experience as an educator and diagnostician in Illinois public schools
and 20 years of experience in educational and assessment publishing.
Barbara has coauthored several books on assessment and intervention, including Essentials
of Evidence-Based Academic Interventions, Writing Assessment and Instruction for Students with
Learning Disabilities, and Essentials of Dyslexia: Assessment and Intervention. In addition,
she has coauthored the following books on the use and interpretation of the Woodcock-
Johnson: Essentials of the WJ III Tests of Achievement Assessment, Essentials of the WJ III Tests
of Cognitive Abilities Assessment (2nd ed.), and Essentials of the WJ IV Tests of Achievement.
She is also coauthor of the WJ III and WJ IV versions of the Woodcock Interpretation and
Instructional Interventions Program™.

Acknowledgments
The Batería IV Woodcock-Muñoz was developed from the contributions of many individuals
of all ages and from all walks of life, each motivated by a desire or call to make a valuable
contribution to Spanish-language cognitive ability, oral language, and achievement
assessment. First and foremost, there would be no Batería IV without the volunteerism of
the standardization examinees who contributed their time and effort and the examiners
who administered the tests, gathered the data, and built the initial foundation for test
interpretation.
In retrospect, a few key people have made such significant contributions that even special
mention seems inadequate as an expression of their impact. Mary Ruef provided initial
project guidance, inspiration, and encouragement that was derived from her many years of
experience developing prior editions of the Batería and Woodcock-Muñoz Language Survey.
Christina Kowalczyk led a remarkable effort in coordinating the preparation of the
standardization materials and the scoring of the test results.
Jennifer Alvarado and Vanessa Cisneros were exceptionally competent and supportive
project staff at the Texas office of Education & Evaluation Consultants. In particular, Ms.
Alvarado rose to the task of completing many of the details of manuscript and data delivery
to Riverside Insights™ following the untimely death of her mother-in-law and mentor,
Criselda Alvarado. The consistency and accuracy of the Spanish test materials are in large part
thanks to the painstaking and meticulous translation and editing work of Ms. Cisneros and
Martha Rivera.
The critical task of managing all of the nuances of the Batería IV blueprint and converting
standardization data into derived scores was accomplished through the highly skilled efforts
of Erica LaForte. With the assistance of Michael Custer, she led the complex data analysis
procedures and developed the score interpretation plan.
Cindy Currie provided valuable development support by managing the flow of materials
through the production and publication process and reviewing test materials in careful detail.
Brenda Gilliam and Virginia Gonzalez served as content consultants during the later
stages of the project. They assisted with the item selection for several tests, reviewed
manuscript and page proofs, and consulted on verbiage for administration instructions.
Their real-world knowledge of bilingual education and their experience administering prior
versions of the Batería Woodcock-Muñoz to English learners proved to be an invaluable asset
to the project.
Last but not least, the Batería IV never would have become a reality without the supportive
product management and development leaders at Riverside Insights, including Katy Genseke,
Jamie Whitaker, and Mark Ledbetter.
FAS
BJW

Dedication
This edition of the Batería Woodcock-Muñoz is dedicated to the life and memory of our
colleague and friend Criselda G. Alvarado, a consummate educator and assessment
professional, who is primarily responsible for the development of the Spanish-language
adaptation and translation of the test materials you hold in your hands today. To those of
us who knew her well, she will always be remembered for the missionary zeal that she
commanded—which we are assured will continue its expression in the form of increased
understanding and better educational planning for hundreds of thousands of Spanish-
speaking bilingual individuals.
Dr. Alvarado’s professional legacy is embodied in the Batería IV; in the minds of the countless
assessment professionals she trained; and in the healthy, engaged lives
of the students and colleagues she encouraged, mentored, and supported. To the authors of
the WJ IV, and to the thousands of psychologists and diagnosticians whom she so positively
influenced, the Batería IV will always be viewed as a testament to a life well lived in pursuit
of making a difference in equity of educational experience for English learners.
FAS
KSM
NM

Table of Contents
About the Authors of the Batería IV   iii
Acknowledgments   vii
Dedication   viii

Chapter 1: Overview   1
Comparison to the Batería III Pruebas de aprovechamiento   5
Organization of the Batería IV Pruebas de aprovechamiento   6
Components of the Batería IV Pruebas de aprovechamiento   8
Test Book (Libro de pruebas)   9
Examiner’s Manual (Manual del examinador)   9
Technical Manual   9
Online Scoring and Reporting Program   9
Test Record (Protocolo de pruebas)   9
Response Booklet (Folleto de respuestas)   9
Relationship of the Batería IV to the CHC Theory of Cognitive Abilities   10
Uses of the Batería IV Pruebas de aprovechamiento   11
Use With the Batería IV COG   11
Use With the WJ IV OL   11
Diagnosis   11
Determination of Variations and Comparisons   11
Educational Programming   12
Planning Individual Programs   12
Guidance   12
Assessing Growth   13
Program Evaluation   13
Research   13
Psychometric Training   13
Examiner Qualifications   14
Confidentiality of Test Materials and Content   14

Chapter 2: Descriptions of the Batería IV APROV Tests and Clusters   17
Batería IV APROV Tests   17
Prueba 1: Identificación de letras y palabras (Test 1: Letter-Word Identification)   18
Prueba 2: Problemas aplicados (Test 2: Applied Problems)   18
Prueba 3: Ortografía (Test 3: Spelling)   18
Prueba 4: Comprensión de textos (Test 4: Passage Comprehension)   19
Prueba 5: Cálculo (Test 5: Calculation)   19
Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language Expression)   19

Prueba 7: Análisis de palabras (Test 7: Word Attack)   19
Prueba 8: Lectura oral (Test 8: Oral Reading)   20
Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency)   20
Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency)   20
Prueba 11: Fluidez en escritura de frases (Test 11: Sentence Writing Fluency)   20
Prueba 12: Rememoración de lectura (Test 12: Reading Recall)   20
Prueba 13: Números matrices (Test 13: Number Matrices)   20
Batería IV APROV Clusters   21
Reading Clusters   21
Math Clusters   22
Written Language Clusters   22
Cross-Domain Clusters   23

Chapter 3: General Administration and Scoring Procedures   25


Practice Administration   25
Exact Administration   25
Brisk Administration   26
Preparation for Testing   26
Arranging the Test Setting   26
Setting Up the Testing Materials   27
Establishing Rapport   27
Identifying Information   27
Language Background Information   28
Academic Language Exposure   28
Administration and Scoring   28
Test Selection   28
Order of Administration   29
Time Requirements   29
Suggested Starting Points (Puntos de partida sugeridos)   29
Basals and Ceilings (Niveles básicos y máximos)   30
Meeting Basal and Ceiling Criteria   30
Tests Requiring the Response Booklet (Folleto de respuestas)   35
Timed Tests   35
Examinee Requests for Information   35
Examiner Queries   35
Evaluating Test Behavior   35
“Qualitative Observation” (Observación cualitativa) Checklists   37
Scoring (Calificación)   38
Item Scoring   38
Use of Judgment in Scoring Responses   38
Additional Notations for Recording Responses   38
Scoring Multiple Responses   39
Computing Raw Scores   39
Obtaining Age- and Grade-Equivalent Scores   39
Using the Online Scoring and Reporting Program   39

Accommodations   40
Recommended Accommodations   41
Young Children   41
English Learners   42
Individuals With Learning and/or Reading Difficulties   43
Individuals With Attentional and Behavioral Difficulties   43
Individuals With Hearing Impairments   45
Individuals With Visual Impairments   48
Individuals With Physical Impairments   52
Interpretive Cautions   52
Use of Derived Scores   52

Chapter 4: Administering and Scoring the Batería IV APROV Tests   53
Batería IV APROV Tests   53
Prueba 1: Identificación de letras y palabras (Test 1: Letter-Word Identification)   53
Prueba 2: Problemas aplicados (Test 2: Applied Problems)   54
Prueba 3: Ortografía (Test 3: Spelling)   55
Prueba 4: Comprensión de textos (Test 4: Passage Comprehension)   56
Prueba 5: Cálculo (Test 5: Calculation)   57
Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language Expression)   57
Prueba 7: Análisis de palabras (Test 7: Word Attack)   59
Prueba 8: Lectura oral (Test 8: Oral Reading)   60
Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency)   61
Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency)   62
Prueba 11: Fluidez en escritura de frases (Test 11: Sentence Writing Fluency)   63
Prueba 12: Rememoración de lectura (Test 12: Reading Recall)   64
Prueba 13: Números matrices (Test 13: Number Matrices)   65

Chapter 5: Scores and Interpretation   67


Levels of Interpretive Information   67
Age- and Grade-Based Norms   69
Types of Scores   70
Raw Score (Puntaje bruto)   70
W Score (Puntuación W )   71
Grade Equivalent (Equivalente de grado)   71
Age Equivalent (Equivalente de edad)   72
W Difference Score (Diferencia W )   72
Relative Proficiency Index (Índice de proficiencia relativa)   72
Instructional Zone (Zona de instrucción)   73
CALP Levels (Niveles CALP)   73
Percentile Rank (Rango percentil)   75
Standard Score (Puntuación estándar)   75
Standard Error of Measurement (Error estándar de medición)   76
Interpreting Tests   76
Interpreting the Reading Tests   77
Interpreting the Math Tests   82

Interpreting the Written Language Tests   86
Interpreting Variations and Comparisons   90
Intra-Ability Variations   91
Ability/Achievement Comparisons   95
Comparative Language Index (CLI)   97
Discrepancy Scores   97
Implications Derived From Test Results   98

References   99

Appendix A: Norming and Calibration Site States and Cities   105

Appendix B: Batería IV Pruebas de aprovechamiento Examiner Training Checklist   121

Appendix C: Batería IV General Test Observations Checklist   129

Appendix D: Glossary of Batería IV Terms in English and Spanish   131
Tests in the Cognitive Battery   131
Tests in the Achievement Battery   132
Clusters   132
Test Components and Elements   133
Scores   134

Appendix E: Batería IV Technical Supplement   135


Translation/Adaptation   135
Calibration Study   137
Construction of Calibration Study Forms   137
Calibration Study Data Collection   138
Calibration and Equating of Items   139
Review of Item Statistics   140
Item Bias Analysis   140
Assembly and Evaluation of Final Test Forms   141
Reliability   141
Test Reliabilities   142
Cluster Reliabilities   142

List of Tables
Table 1-1  Comparison of the WJ IV and Batería IV Tests   2
Table 1-2  Comparison of the WJ IV and Batería IV Clusters   3
Table 1-3  Organization of the Batería IV APROV Tests   7
Table 1-4  Organization of the Batería IV APROV Clusters   8
Table 2-1  Batería IV APROV Selective Testing Table   18

Table 3-1  Batería IV APROV Tests Useful for Individuals With Hearing Impairments   47
Table 3-2  Batería IV APROV Tests Useful for Individuals With Visual Impairments   51
Table 5-1  Hierarchy of Batería IV APROV Test Information   68
Table 5-2  APROV Clusters That Yield a CALP Level   73
Table 5-3  CALP Levels and Corresponding Implications   74
Table 5-4  Classification of Standard Score and Percentile Rank Ranges   76
Table 5-5  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 1: Identificación de letras y palabras    79
Table 5-6  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 4: Comprensión de textos   81
Table 5-7  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 9: Fluidez en lectura de frases   82
Table 5-8  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 2: Problemas aplicados   84
Table 5-9  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 5: Cálculo   85
Table 5-10  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 10: Fluidez en datos matemáticos    86
Table 5-11  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 3: Ortografía   88
Table 5-12  Percentage by Age of Occurrence of Qualitative Observations for
WJ IV ACH Test 6: Writing Samples   89
Table 5-13  Percentage by Age of Occurrence of Qualitative Observations for
Prueba 11: Fluidez en escritura de frases   90
Table 5-14  Batería IV Intra-Ability Variation and Ability/Achievement Comparison
Procedures   91
Table 5-15  Batería IV Intra-Achievement Variations   93
Table 5-16  Batería IV Academic Skills/Academic Fluency/Academic
Applications Variations   94
Table 5-17  Batería IV Oral Language/Achievement Comparisons   97
Table E-1  Translated and Adapted Tests of the Batería IV   136
Table E-2  Distribution of the Batería IV Calibration Sample by Age Group   138
Table E-3  Distribution of Sampling Variables in the Batería IV Calibration Study   139
Table E-4  Percentage of Batería IV Calibration Items Flagged for
Potential Gender DIF   141
Table E-5  Reliability Coefficients for Batería IV Nonspeeded Tests by Age Group   143
Table E-6  Test-Retest Reliability Coefficients From the WJ IV/Batería IV
Speeded Test-Retest Study   145
Table E-7  Reliability Coefficients for Batería IV Clusters by Age Group   145

List of Figures
Figure 1-1  Components of the Batería IV APROV.   8
Figure 3-1  Recommended arrangement for administering the test.   27
Figure 3-2  Suggested Starting Points table for Prueba 2: Problemas aplicados.   30

Figure 3-3  Example of Item 1 used as the basal on Prueba 1: Identificación de letras
y palabras.   32
Figure 3-4  Determination of basal and ceiling with two apparent basals and two
apparent ceilings.   34
Figure 3-5  The “Test Session Observations Checklist” from the Test Record.   36
Figure 3-6  “Qualitative Observation” checklist for Prueba 1: Identificación de letras
y palabras.   37
Figure 4-1  Reading error types in Prueba 8: Lectura oral.   60
Figure 4-2  Example of Test Record and “Qualitative Observation Tally” for
Prueba 8: Lectura oral.   61
Figure 5-1  Comparison of the traditional and extended percentile rank scales with the
standard score scale (M = 100, SD = 15).   75
Figure 5-2  Various skills measured by the Batería IV APROV reading tests.   77
Figure 5-3  Various skills measured by the Batería IV APROV math tests.   83
Figure 5-4  Various skills measured by the Batería IV APROV writing tests.   87
Figure 5-5  Three types of intra-ability variation models in the Batería IV.   92
Figure 5-6  Four types of ability/achievement comparison models in the Batería IV.  95

Chapter 1

Overview
The Batería IV Woodcock-Muñoz (Batería IV; Woodcock, Alvarado, Schrank, McGrew, Mather,
& Muñoz-Sandoval, 2019a) is a comprehensive, Spanish-language psychoeducational
assessment system that includes two test batteries: the Batería IV Woodcock-Muñoz: Pruebas
de habilidades cognitivas (Batería IV COG; Woodcock, Alvarado, Schrank, McGrew, Mather,
& Muñoz-Sandoval, 2019b) and the Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento
(Batería IV APROV; Woodcock, Alvarado, Schrank, Mather, McGrew, & Muñoz-Sandoval,
2019). Tests included in the Batería IV are either adaptations or translations of tests from the
Woodcock-Johnson IV (WJ IV; Schrank, McGrew, & Mather, 2014). The Batería IV batteries
can be used in conjunction with the Woodcock-Johnson IV Tests of Oral Language (WJ IV OL;
Schrank, Mather, & McGrew, 2014) to form a broad-based assessment of cognitive abilities,
achievement, and comparative oral language abilities. The WJ IV OL includes oral language
tests in English as well as Spanish to provide information on relative language proficiency and
language dominance.
Some of the Batería IV tests can be used with Spanish-speaking individuals as young as 24
months, but the majority of the Batería IV tests are best suited for use with individuals from 5
to 95 years of age. Spanish-language calibration data, based on a sample of 601 native Spanish
speakers, are equated to the large, nationally representative WJ IV norming sample of 7,416
individuals ranging from 2 to 90+ years of age. The Spanish standardization data were used to
calibrate the new test items and to equate the items to the scales underlying the WJ IV tests.
The equating procedure produces a psychometrically sound interpretive model that allows an
examiner to describe an individual’s performance on the Spanish tests in terms of comparable
ability in English. See Appendix E for technical information about the development, norming,
and calibration of the Batería IV. Additional information about the WJ IV norming sample is
provided in the Woodcock-Johnson IV Technical Manual (McGrew, LaForte, & Schrank, 2014),
available via download from the online scoring and reporting program.
The Batería IV is based on the Cattell-Horn-Carroll (CHC) theory of cognitive abilities,
sometimes referred to as CHC theory version 2 (McGrew et al., 2014; Schneider & McGrew,
2012, 2018). As part of the translation-adaptation process, some of the most educationally
and diagnostically useful WJ IV tests were selected for inclusion in the Batería IV.
Consequently, there are some differences in the number of tests and clusters that are available
in the Batería IV. Table 1-1 lists the tests in the WJ IV and the Batería IV. Table 1-2 lists the
clusters in the WJ IV and the Batería IV.

Overview 1
Table 1-1. Comparison of the WJ IV and Batería IV Tests

WJ IV: Woodcock-Johnson IV Tests of Cognitive Abilities (two Test Books)
Batería IV: Batería IV Woodcock-Muñoz: Pruebas de habilidades cognitivas (one Test Book)

Test 1: Oral Vocabulary | Prueba 1: Vocabulario oral
Test 2: Number Series | Prueba 2: Series numéricas
Test 3: Verbal Attention | Prueba 3: Atención verbal
Test 4: Letter-Pattern Matching | Prueba 4: Pareo de letras idénticas
Test 5: Phonological Processing | Prueba 5: Procesamiento fonético
Test 6: Story Recall | Prueba 6: Rememoración de cuentos
Test 7: Visualization | Prueba 7: Visualización
Test 8: General Information | Prueba 8: Información general
Test 9: Concept Formation | Prueba 9: Formación de conceptos
Test 10: Numbers Reversed | Prueba 10: Inversión de números
Test 11: Number-Pattern Matching | Prueba 11: Pareo de números idénticos
Test 12: Nonword Repetition | Prueba 12: Repetición de palabras sin sentido
Test 13: Visual-Auditory Learning | (Can be administered from the Batería III COG Standard Test Book and scored using Batería IV norms)
Test 14: Picture Recognition | (Can be administered from the Batería III COG Extended Test Book and scored using Batería IV norms)
Test 15: Analysis-Synthesis
Test 16: Object-Number Sequencing
Test 17: Pair Cancellation | Prueba 13: Cancelación de pares
Test 18: Memory for Words
(Included from WJ IV OL Test 4: Rapid Picture Naming) | Prueba 14: Rapidez en la identificación de dibujos

WJ IV: Woodcock-Johnson IV Tests of Achievement (two Test Books)
Batería IV: Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento (one Test Book)

Test 1: Letter-Word Identification | Prueba 1: Identificación de letras y palabras
Test 2: Applied Problems | Prueba 2: Problemas aplicados
Test 3: Spelling | Prueba 3: Ortografía
Test 4: Passage Comprehension | Prueba 4: Comprensión de textos
Test 5: Calculation | Prueba 5: Cálculo
Test 6: Writing Samples | Prueba 6: Expresión de lenguaje escrito*
Test 7: Word Attack | Prueba 7: Análisis de palabras
Test 8: Oral Reading | Prueba 8: Lectura oral
Test 9: Sentence Reading Fluency | Prueba 9: Fluidez en lectura de frases
Test 10: Math Facts Fluency | Prueba 10: Fluidez en datos matemáticos
Test 11: Sentence Writing Fluency | Prueba 11: Fluidez en escritura de frases
Test 12: Reading Recall | Prueba 12: Rememoración de lectura
Test 13: Number Matrices | Prueba 13: Números matrices
Test 14: Editing
Test 15: Word Reading Fluency
Test 16: Spelling of Sounds
Test 17: Reading Vocabulary
Test 18: Science
Test 19: Social Studies
Test 20: Humanities

WJ IV: Woodcock-Johnson IV Tests of Oral Language
Batería IV: (No separate oral language battery)

Test 1: Picture Vocabulary
Test 2: Oral Comprehension
Test 3: Segmentation
Test 4: Rapid Picture Naming | (Included in Batería IV COG Test 14)
Test 5: Sentence Repetition
Test 6: Understanding Directions
Test 7: Sound Blending
Test 8: Retrieval Fluency
Test 9: Sound Awareness
Test 10: Vocabulario sobre dibujos | (If administered, can be included in Batería IV reports)
Test 11: Comprensión oral | (If administered, can be included in Batería IV reports)
Test 12: Comprensión de indicaciones | (If administered, can be included in Batería IV reports)

*The Spanish test Expresión de lenguaje escrito (Written Language Expression) is similar to the English Writing Samples test but is easier to score.

Table 1-2. Comparison of the WJ IV and Batería IV Clusters

WJ IV: Woodcock-Johnson IV Tests of Cognitive Abilities
Batería IV: Batería IV Woodcock-Muñoz: Pruebas de habilidades cognitivas

General Intellectual Ability (GIA) | Habilidad intelectual general
Brief Intellectual Ability (BIA) | Habilidad intelectual breve
Gf-Gc Composite | Gf-Gc combinado
Comprehension-Knowledge (Gc) | Comprensión-conocimiento
Fluid Reasoning (Gf) | Razonamiento fluido
Short-Term Working Memory (Gwm) | Memoria de trabajo a corto plazo
Cognitive Processing Speed (Gs) | Velocidad de procesamiento cognitivo
Auditory Processing (Ga) | Procesamiento auditivo
Long-Term Storage and Retrieval (Glr) | Almacenamiento y recuperación a largo plazo*
Visual Processing (Gv) | Procesamiento visual*
Quantitative Reasoning
Auditory Memory Span
Number Facility | Destreza numérica
Perceptual Speed | Rapidez perceptual
Vocabulary | Vocabulario (requires WJ IV OL Prueba 10)
Cognitive Efficiency | Eficiencia cognitiva
Reading Aptitude | Aptitud de lectura
Math Aptitude | Aptitud matemática
Writing Aptitude | Aptitud de escritura

WJ IV: Woodcock-Johnson IV Tests of Achievement
Batería IV: Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento

Reading | Lectura
Broad Reading | Lectura amplia
Basic Reading Skills | Destrezas básicas en lectura
Reading Comprehension | Comprensión de lectura
Reading Fluency | Fluidez en la lectura
Reading Rate
Mathematics | Matemáticas
Broad Mathematics | Matemáticas amplias
Math Calculation Skills | Destrezas en cálculos matemáticos
Math Problem Solving | Resolución de problemas matemáticos
Written Language | Lenguaje escrito
Broad Written Language | Lenguaje escrito amplio
Basic Writing Skills
Written Expression | Expresión escrita
Academic Skills | Destrezas académicas
Academic Fluency | Fluidez académica
Academic Applications | Aplicaciones académicas
Academic Knowledge
Phoneme-Grapheme Knowledge
Brief Achievement | Aprovechamiento breve
Broad Achievement | Aprovechamiento amplio

WJ IV: Woodcock-Johnson IV Tests of Oral Language
Batería IV: (Requires administration of WJ IV OL Pruebas 10–12)

Broad Oral Language | Amplio lenguaje oral
Oral Language | Lenguaje oral
Listening Comprehension | Comprensión auditiva

*Requires the administration of one test from the Batería III COG.

Multiple goals guided the WJ IV revision blueprint that underlies the Batería IV. First,
this comprehensive assessment system was designed to be on the cutting edge of practice.
It facilitates exploring individual strengths and weaknesses across cognitive, linguistic,
and academic abilities; complements response to intervention (RTI) models; and reframes
variations and ability/achievement comparisons. Second, the blueprint pushed the tests
beyond CHC theory as it was conceived in the Woodcock-Johnson III (WJ III®; Woodcock,
McGrew, & Mather, 2001, 2007) and the Batería III Woodcock-Muñoz (Muñoz-Sandoval,
Woodcock, McGrew, & Mather, 2005, 2007a). Whereas the third editions of these tests
focused primarily on broad CHC abilities, the fourth editions focus on the most important
broad and narrow CHC abilities for describing cognitive performance and understanding
the nature of learning problems (McGrew, 2012; McGrew & Wendling, 2010; Schneider &
McGrew, 2012, 2018). Some of the tests and clusters emphasize narrow CHC abilities, and
others were designed to reflect the importance of cognitive complexity through the influence
of two or more narrow abilities on task requirements. Additional goals addressed ease and
flexibility of use. New features allow novice examiners to use the tests with confidence while
providing experienced examiners with a rich array of interpretive options to customize and
enhance their evaluations. The structure of the WJ IV and Batería IV systems also facilitates
examiner use by creating comprehensive cognitive, achievement, and oral language batteries
that can be used in conjunction with one another or as standalone batteries.
The WJ IV and Batería IV interpretation plan includes a full array of derived scores for
reporting results. The accompanying online scoring and reporting program quickly calculates
and reports all derived scores.
This manual describes the Batería IV APROV, which can be used independently or in
combination with the WJ IV OL and/or the Batería IV COG batteries. Although the manual is
written in English, all test and cluster names are presented in Spanish. At first mention, each
name appears in Spanish with the English name in parentheses; thereafter, only the Spanish
name is used. Test components and types of scores are presented once in English with the
Spanish in parentheses, and all subsequent references are in English. Examiners can refer to Tables 1-1
and 1-2 in this chapter and the glossary in Appendix D of this manual for assistance with
translation.

Comparison to the Batería III Pruebas de aprovechamiento


The Batería IV APROV is a revised and expanded version of the Batería III Woodcock-Muñoz:
Pruebas de aprovechamiento (Batería III APROV; Muñoz-Sandoval, Woodcock, McGrew,
& Mather, 2005, 2007b). While many of the features of the Batería III APROV have been
retained, extensive renorming and the addition of several new tests, clusters, and interpretive
procedures improve and increase the diagnostic power of this instrument.
Following is a summary of the major changes from the Batería III APROV to the
Batería IV APROV.
■ The Batería IV APROV includes a core set of tests (Tests 1 through 6) that are used for
calculating the Lectura (Reading), Matemáticas (Mathematics), Lenguaje escrito (Written
Language), Destrezas académicas (Academic Skills), Aplicaciones académicas (Academic
Applications), and Aprovechamiento breve (Brief Achievement) clusters and that provide
the basis for the intra-achievement variation procedure. Additional tests may be added
to the core variation procedure on a selective testing basis, and any derived clusters are
also evaluated in a pattern of strengths and weaknesses (PSW) analysis.
■ The Batería IV APROV has just one Test Book, which contains 13 tests measuring academic
performance in reading, written language, and mathematics.
■ Four new tests have been added to the Batería IV APROV: Lectura oral (Oral Reading),
Rememoración de lectura (Reading Recall), Expresión de lenguaje escrito (Written
Language Expression), and Números matrices (Number Matrices).
■ There are 17 clusters, including 5 new clusters: Lectura (Reading), Fluidez en la lectura
(Reading Fluency), Matemáticas (Mathematics), Lenguaje escrito (Written Language),
and Aprovechamiento breve (Brief Achievement).

■ Several Batería III APROV tests have been relocated. The Spanish forms of Vocabulario
sobre dibujos (Picture Vocabulary), Comprensión de indicaciones (Understanding
Directions), and Comprensión oral (Oral Comprehension) are now in the WJ IV Tests
of Oral Language. These three tests may be used in conjunction with the Batería
IV tests and entered into the online scoring and reporting program for inclusion in
interpretation and reporting. Rememoración de cuentos (Story Recall) is now in the
Batería IV COG.
■ Three Batería III APROV tests have been replaced: Quantitative Concepts has been
replaced by Números matrices (Number Matrices), a stronger measure of mathematical
reasoning; Writing Samples has been replaced by Expresión de lenguaje escrito (Written
Language Expression), a test that is easier to score; and Reading Vocabulary has been
replaced by Rememoración de lectura (Reading Recall), a more comprehensive measure
of reading comprehension.
■ The addition of the new test Lectura oral (Oral Reading), when combined with Fluidez
en lectura de frases (Sentence Reading Fluency), allows a Fluidez en la lectura (Reading
Fluency) cluster score to be obtained.
■ There are no audio-recorded tests in the Batería IV APROV.
■ Scoring is available through the web-based online scoring and reporting program that is
included with the purchase of each Test Record.
■ The following Batería III APROV tests have been eliminated from the Batería IV
APROV: Editing, Spelling of Sounds, Academic Knowledge, Sound Awareness, Story
Recall–Delayed, and Punctuation and Capitalization.
■ Three test names were changed to more accurately reflect the task: Writing Fluency is
now Prueba 11: Fluidez en escritura de frases (Test 11: Sentence Writing Fluency); Math
Fluency is now Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency);
and Reading Fluency is now Prueba 9: Fluidez en lectura de frases (Test 9: Sentence
Reading Fluency).
■ The procedures for evaluating ability/achievement comparisons and intra-ability
variations have been simplified and offer increased flexibility for the examiner.
∘ Three types of intra-ability variations are available: intra-cognitive, intra-
achievement, and academic skills/academic fluency/academic applications.
∘ Four types of ability/achievement comparisons are available: general intellectual
ability (GIA), Gf-Gc composite, scholastic aptitude, and oral language ability
(requires administration of the Spanish forms of three tests from WJ IV OL).
■ The Batería III predicted achievement/achievement discrepancy procedure has been
replaced with the scholastic aptitude/achievement comparison procedure. There are four
specific aptitude clusters: two for reading, one for math, and one for writing. Each of
these four aptitude clusters contains four cognitive tests that best predict performance
in the specific achievement area.

Organization of the Batería IV Pruebas de aprovechamiento


One goal of the revision was to increase ease of use and flexibility of the Batería IV APROV,
and the organization of the tests reflects this goal. For example, Tests 1 through 6 represent a
core set of tests that yields clusters in Lectura (Reading), Lenguaje escrito (Written Language),
Matemáticas (Mathematics), Destrezas académicas (Academic Skills), Aplicaciones académicas
(Academic Applications), and Aprovechamiento breve (Brief Achievement) and serve as the
basis for the intra-achievement variation procedure. Additional tests can be selected to
address the individual’s specific referral questions.

An examiner seldom needs to administer all of the tests or complete all of the interpretive
options for a single person. The importance of selective testing becomes apparent as the
examiner gains familiarity with the Batería IV APROV. An analogy to craftsmanship is
appropriate: The Batería IV APROV provides an extensive tool chest that can be used
selectively by a variety of skilled assessment professionals. Different assessments require
different combinations of tools.
Table 1-3 lists the tests included in the Batería IV APROV and indicates which tests are
administered using the Response Booklet and which tests are timed. The table groups the
tests by content area rather than by order of appearance in the Test Book.

Table 1-3. Organization of the Batería IV APROV Tests

Reading
  Prueba 1: Identificación de letras y palabras
  Prueba 4: Comprensión de textos
  Prueba 7: Análisis de palabras
  Prueba 8: Lectura oral
  Prueba 9: Fluidez en lectura de frases [RB] [timed]
  Prueba 12: Rememoración de lectura
Mathematics
  Prueba 2: Problemas aplicados [RB]
  Prueba 5: Cálculo [RB]
  Prueba 10: Fluidez en datos matemáticos [RB] [timed]
  Prueba 13: Números matrices [RB]
Writing
  Prueba 3: Ortografía [RB]
  Prueba 6: Expresión de lenguaje escrito [RB]
  Prueba 11: Fluidez en escritura de frases [RB] [timed]

[RB] = requires the Response Booklet
[timed] = timed test

Table 1-4 illustrates the 17 clusters, or groupings of tests, that are available from the
Batería IV APROV. These clusters are the primary source of interpretive information to help
identify performance levels, determine educational progress, and identify an individual’s
strengths and weaknesses.

Table 1-4. Organization of the Batería IV APROV Clusters

Reading
  Lectura
  Lectura amplia
  Destrezas básicas en lectura
  Comprensión de lectura
  Fluidez en la lectura
Mathematics
  Matemáticas
  Matemáticas amplias
  Destrezas en cálculos matemáticos
  Resolución de problemas matemáticos
Writing
  Lenguaje escrito
  Lenguaje escrito amplio
  Expresión escrita
Cross-Domain Clusters
  Destrezas académicas
  Fluidez académica
  Aplicaciones académicas
  Aprovechamiento breve
  Aprovechamiento amplio

Components of the Batería IV Pruebas de aprovechamiento


The Batería IV APROV contains one Test Book easel, this Examiner’s Manual, a package of
Test Records and examinee Response Booklets, scoring guides, and an optional carrying case.
Figure 1-1 shows the components of the Batería IV APROV. In addition, access to download
the WJ IV Technical Manual from the online scoring and reporting program is included with
each Batería IV test kit.

Figure 1-1.
Components of the
Batería IV APROV.

Test Book (Libro de pruebas)
The Test Book is in an easel format positioned so the stimulus pictures or words face
the examinee and the directions face the examiner. Specific administration directions are
provided page by page for all tests.

Examiner’s Manual (Manual del examinador)


The Examiner’s Manual includes detailed information for using the Batería IV APROV.
Chapter 1 is an overview. Chapter 2 provides descriptions of the tests and the clusters.
General administration and scoring procedures and accommodations for special populations
are discussed in Chapter 3. Chapter 4 provides specific administration and scoring
instructions for each test. Chapter 5 provides a discussion of the scores and levels of
interpretive information that are available.
This manual also includes several appendices. Appendix A contains a list of norming and
calibration sites. Appendices B and C contain reproducible checklists to assist examiners
in building competency with the Batería IV APROV. Appendix B is the “Batería IV Pruebas
de aprovechamiento Examiner Training Checklist,” a test-by-test form that may be used as
an observation or self-study tool. Appendix C is the “Batería IV General Test Observations
Checklist,” which covers general testing procedures and may be used by an experienced
examiner when observing a new examiner. Appendix D presents an English/Spanish glossary
for the names of tests, clusters, scores, components, and other terms used in the Batería IV.
Appendix E contains technical information about the Batería IV Spanish calibration study.

Technical Manual
The WJ IV Technical Manual may be downloaded as a PDF from the online scoring and
reporting program and provides a summary of the development, standardization, and technical
characteristics of the WJ IV; most of this technical information also applies to the Batería IV.

Online Scoring and Reporting Program


The online scoring and reporting program eliminates the time-consuming norm table
searches required when scoring a test by hand and reduces the possibility of clerical errors.
The automated online scoring quickly and accurately provides all derived scores for the tests
and clusters and computes variations and comparisons.

Test Record (Protocolo de pruebas)


The Test Record includes guidelines for examiner scoring and is used to record identifying
information, observations of behavior, examinee responses, raw scores, and other information
that may be helpful in interpreting test results. Built-in scoring tables for each test enable the
examiner to immediately obtain estimated age- and grade-equivalent scores.

Response Booklet (Folleto de respuestas)


The Response Booklet provides space for the examinee to respond to items requiring written
responses or mathematical calculations. Prueba 3: Ortografía (Test 3: Spelling), Prueba 5:
Cálculo (Test 5: Calculation), Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language
Expression), Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency), Prueba
10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency), and Prueba 11: Fluidez en
escritura de frases (Test 11: Sentence Writing Fluency) all require the Response Booklet. In
addition, a worksheet is provided in the Response Booklet for Prueba 2: Problemas aplicados
(Test 2: Applied Problems) and Prueba 13: Números matrices (Test 13: Number Matrices).

Relationship of the Batería IV to the CHC Theory of Cognitive Abilities


The Batería IV APROV, Batería IV COG, and WJ IV OL are three parts of a comprehensive
diagnostic system. Interpretation of the Batería IV tests and clusters is based on the Cattell-
Horn-Carroll (CHC) theory of cognitive abilities. Additional information on CHC theory can
be found in the Batería IV Pruebas de habilidades cognitivas Examiner’s Manual (Wendling,
Mather, & Schrank, 2019), as well as in the WJ IV Technical Manual.
The Batería IV COG has seven CHC factors. Two of the CHC factors, fluid reasoning (Gf)
and comprehension-knowledge (Gc), can be traced to Cattell (1941, 1943, 1950) and his
work on Gf-Gc, or fluid and crystallized intelligence. Later, Horn (1965) identified short-
term memory (Gsm), long-term retrieval (Glr), processing speed (Gs), and visual-spatial
thinking (Gv) as distinct abilities. Auditory processing (Ga) was identified by Horn and
Stankov (1982). The CHC abilities have been refined and integrated by Woodcock (McArdle
& Woodcock, 1998; Woodcock, 1988, 1990, 1993, 1994, 1998) and McGrew (1997, 2005,
2009) and recently revised by Schneider and McGrew (2012, 2018).
The Batería IV APROV contains tests that tap two other identified cognitive abilities:
quantitative knowledge (Gq; identified by Horn, 1988, 1989) and reading-writing ability
(Grw; identified by Carroll and Maxwell, 1979, and Woodcock, 1998). The Batería IV APROV
also includes additional measures of comprehension-knowledge (Gc), long-term storage and
retrieval (Glr), and auditory processing (Ga). Because most achievement tests require the
integration of multiple cognitive abilities, information about processing can be obtained by
a skilled examiner. For example, processing speed (Gs) is involved in all speeded or timed
tasks, including Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency),
Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency), and Prueba 11: Fluidez
en escritura de frases (Test 11: Sentence Writing Fluency).
Gq is represented by Prueba 2: Problemas aplicados (Test 2: Applied Problems), Prueba 5:
Cálculo (Test 5: Calculation), Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts
Fluency), and Prueba 13: Números matrices (Test 13: Number Matrices).
Grw is represented by Prueba 1: Identificación de letras y palabras (Test 1: Letter-Word
Identification), Prueba 3: Ortografía (Test 3: Spelling), Prueba 4: Comprensión de textos (Test
4: Passage Comprehension), Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language
Expression), Prueba 8: Lectura oral (Test 8: Oral Reading), Prueba 9: Fluidez en lectura de
frases (Test 9: Sentence Reading Fluency), Prueba 11: Fluidez en escritura de frases (Test 11:
Sentence Writing Fluency), and Prueba 12: Rememoración de lectura (Test 12: Reading Recall).
Glr, especially the narrow ability of meaningful memory, is required in Prueba 12:
Rememoración de lectura (Test 12: Reading Recall). Associative memory, another narrow
Glr ability, is required in many of the tests that measure decoding, encoding, or recall of
math facts.
Ga, in particular the narrow ability of phonetic coding, is required in Prueba 7: Análisis de
palabras (Test 7: Word Attack).

Uses of the Batería IV Pruebas de aprovechamiento
The procedures followed in developing and standardizing the Batería IV APROV have
produced an instrument that can be used with confidence in a variety of settings. The
wide age range and breadth of coverage allow the Batería IV APROV tests to be used for
educational, clinical, or research purposes from the preschool to the geriatric level. Because
the Batería IV APROV is co-normed with both the Batería IV COG and the Spanish forms of
three tests from the WJ IV OL, accurate predictions and comparisons can be made among the
batteries.

Use With the Batería IV COG


When the Batería IV APROV is used with the Batería IV COG, the relationships between
cognitive abilities and achievement can be explored and strengths and weaknesses can be
documented. Further, in cases where an ability/achievement discrepancy is desired, actual
discrepancy norms are available.

Use With the WJ IV OL


When the Batería IV APROV is used with the WJ IV OL, the relationship between oral
language ability and academic achievement can be explored using the oral language/
achievement comparison procedure. For example, Amplio lenguaje oral (Broad Oral Language)
can be used as an ability measure and selected as an option when making ability/achievement
comparisons. In addition, if the parallel Spanish and English tests are administered,
information on language dominance and relative language proficiency can be obtained.
Understanding the role of oral language in academic performance is often an important
component of an evaluation for a specific learning disability. The tests from the WJ IV OL
must be administered within 30 days of the Batería IV administration. The Test Record from
the WJ IV OL must be committed in the online scoring and reporting program prior to
running the Batería IV report.

Diagnosis
An examiner can use the Batería IV APROV to determine and describe a profile of an
individual’s academic strengths and weaknesses. Additionally, test results help determine how
certain factors affect related aspects of development. For example, a weakness in phoneme/
grapheme knowledge may interfere with overall development in reading and spelling.
Similarly, a weakness in spelling may help explain an individual’s difficulties on school
assignments requiring writing.
An examiner also can use the Batería IV APROV for a more in-depth evaluation after an
individual has failed a screening procedure (e.g., a kindergarten screening) or to substantiate
the results of other tests or prior evaluations.

Determination of Variations and Comparisons


The information provided by the Batería IV APROV, the Batería IV COG, and the Spanish
forms of three tests from the WJ IV OL is particularly appropriate for documenting the
nature of, and differentiating between, intra-ability (intra-achievement, academic skills/
academic fluency/academic applications, intra-cognitive) variations and ability/achievement
discrepancies or comparisons (general intellectual ability/achievement, Gf-Gc/other ability,
scholastic aptitude/achievement, oral language ability/achievement).

The Batería IV intra-ability variations are useful for understanding an individual’s strengths
and weaknesses, diagnosing and documenting the existence of specific abilities and disabilities,
and acquiring the most relevant information for educational and vocational planning. Analysis
of this in-depth assessment data, which goes well beyond the historical and traditional
singular focus on ability/achievement discrepancy data, can be linked more directly to
recommendations for service delivery and the design of an appropriate educational program.
Although many unresolved issues characterize the appropriate determination and
application of discrepancy information in the field of learning disabilities, an ability/
achievement discrepancy may be used as part of the selection criteria for learning disability
(LD) programs. Even though a discrepancy may be statistically significant, this type of
comparison is rarely appropriate as the sole criterion for determining the existence or
nonexistence of a learning disability or for determining eligibility for special services.
Analyses of other abilities and an understanding of the relationships and interactions
among various abilities and skills are needed to determine whether a person does or does
not have a learning disability. Given the problems inherent in employing and interpreting
ability/achievement discrepancies, multiple sources of information, including background
information (e.g., educational history, classroom performance), as well as clinical experience,
are needed to make an accurate diagnosis.

Educational Programming
When combined with behavioral observations, work samples, and other pertinent
information, Batería IV APROV results will help the skilled clinician make decisions
regarding educational programming. The test results indicate a student's most appropriate
instructional level and the types of services that may be needed. The Batería IV APROV also
can assist in vocational planning, particularly when successful job performance depends on
specific types of skills, such as reading, writing, or mathematics performance.

Planning Individual Programs


The Batería IV APROV reliability and validity characteristics meet basic technical
requirements for use as a basis for planning individual programs (McGrew et al., 2014).
In schools, Batería IV APROV results can be useful in setting broad instructional goals
when developing an Individualized Education Program (IEP) or in recommending
accommodations or curricular adjustments for an individual. Batería IV APROV results can
be helpful in determining the instructional needs of individuals working toward a General
Equivalency Diploma (GED) or preparing to take a minimum competency examination. In
a rehabilitation setting, the Batería IV APROV can provide information to help establish an
appropriate service delivery program. To develop an individualized program, the examiner
can use information regarding the examinee’s strengths and weaknesses among the various
achievement areas. The data may indicate the need for a more in-depth assessment within a
specific achievement area, such as mathematics, using criterion-referenced, curriculum-based
measurements or informal assessments.

Guidance
The Batería IV APROV can provide guidance in educational and clinical settings. The
results of the evaluation can help teachers, counselors, social workers, and other personnel
understand the nature of an individual’s academic strengths and weaknesses and determine
the necessary levels of assistance. The Batería IV APROV also can provide valuable
information to help parents understand their child’s particular academic problems or needs.

Assessing Growth
The Batería IV APROV can provide a record of functioning and growth throughout an
individual’s lifetime. The Batería IV APROV also can be used to assess changes in a person’s
performance following a specific time interval, such as after a year of receiving special
educational services.

Program Evaluation
The Batería IV APROV can provide information about program effectiveness at all levels
of education, from preschool through adult. For example, Batería IV APROV tests can be
administered to evaluate the effects of specific school programs or the relative performance
levels (in a certain skill) of students in a class or school.
The continuous-year feature of the Batería IV school-age norms meets the reporting
requirements for educational programs. This feature is especially useful because it provides
norms based on data gathered continuously throughout the school year as opposed to norms
based on data gathered at, perhaps, two points in the school year and then presented as fall
and spring norms.
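The benefit of continuous-year norms can be illustrated with a small sketch. The numbers and the function below are hypothetical (the actual Batería IV norm construction is described in the WJ IV Technical Manual); the sketch shows only the general idea that, with reference points gathered throughout the school year, a reference value can be interpolated for a student's exact decimal grade placement rather than borrowed from a distant fall or spring testing point.

```python
def interpolate_norm(grade, norm_points):
    """Linearly interpolate a reference (median) score for a decimal grade
    placement from sorted (grade, median) anchor points."""
    if grade <= norm_points[0][0]:
        return norm_points[0][1]
    if grade >= norm_points[-1][0]:
        return norm_points[-1][1]
    for (g0, v0), (g1, v1) in zip(norm_points, norm_points[1:]):
        if g0 <= grade <= g1:
            frac = (grade - g0) / (g1 - g0)
            return v0 + frac * (v1 - v0)

# Hypothetical medians gathered continuously across grade 3 (arbitrary units).
continuous = [(3.0, 480.0), (3.2, 483.0), (3.4, 486.0),
              (3.6, 489.0), (3.8, 492.0), (4.0, 495.0)]
print(interpolate_norm(3.5, continuous))  # 487.5, midway between the 3.4 and 3.6 anchors
```

A student tested at grade 3.5 is thus compared against a reference value for grade 3.5 itself, which is the practical advantage over two-point fall/spring norming.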

Research
The possibilities for using the Batería IV APROV in research are unlimited. The wide age
range and breadth of coverage are important advantages underlying its use for research at all
age levels, from preschool through geriatric. Computer scoring allows easy storage of clinical
data. Because the Batería IV APROV tests are individually administered, the researcher has
more control over the quality of the data obtained.
The Batería IV APROV provides predictor or criterion measures that can be used in
studies investigating a variety of experimental effects. Additionally, the wide age range
allows longitudinal or cohort research data to be gathered using the same set of tests and
test content. In educational research, the Batería IV APROV provides a comprehensive set
of related measures for evaluating the comparative efficacy of several programs or services
or for evaluating the effectiveness of curricular interventions. The Batería IV APROV also is
useful for describing the characteristics of examinees included in a sample or experimental
condition and for pairing students in certain experimental designs.
The range of interpretive information available for each test and cluster includes error
analysis, description of developmental status (age and grade equivalents), description of
quality of performance (relative proficiency indexes [RPIs] and instructional zones), and
comparison with grade or age mates to determine group standing (percentile ranks and
standard scores). The W score and standard score scales (discussed in Chapter 5) are both
equal-interval scales that can be used in statistical analyses based on the assumption of equal-
interval metrics. As described in the WJ IV Technical Manual, the W score is the preferred
metric for most statistical analyses.
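For example, because the W scale is equal-interval, W-score gains over a retest interval can be analyzed directly with ordinary parametric statistics. The sketch below uses invented W scores purely for illustration:

```python
from math import sqrt
from statistics import mean, stdev

# Invented pre/post W scores for eight examinees (illustration only)
pre  = [478, 485, 490, 472, 481, 495, 468, 488]
post = [486, 494, 497, 480, 483, 503, 479, 494]

# Paired t statistic computed directly on the W-score gains,
# which is legitimate only because the scale is equal-interval
gains = [b - a for a, b in zip(pre, post)]
t = mean(gains) / (stdev(gains) / sqrt(len(gains)))
print(f"mean gain = {mean(gains):.1f} W units, t = {t:.2f}")
# → mean gain = 7.4 W units, t = 7.98
```

The same property does not hold for percentile ranks or for age and grade equivalents, which is why those scores should not be averaged or entered into analyses that assume equal intervals.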

Psychometric Training
Because this manual presents the basic principles of individual clinical assessment alongside
specific administration, scoring, and interpretive information, the Batería IV APROV is an
ideal instrument for introducing individualized assessment in
college and university courses. The Batería IV APROV provides new examiners with a broad
foundation in the administration, scoring, and interpretation of individualized assessments.
Experience in clinical assessment with the Batería IV APROV provides a solid foundation for
learning to administer and interpret other test instruments.

Examiner Qualifications
The examiner qualifications for the Batería IV APROV have been informed by the joint
Standards for Educational and Psychological Testing (American Educational Research
Association [AERA], American Psychological Association [APA], & National Council on
Measurement in Education [NCME], 2014).
Any person administering the Batería IV APROV needs thorough knowledge of the exact
administration and scoring procedures and an understanding of the importance of adhering
to standardized procedures. To become proficient in administering the Batería IV APROV,
examiners need to study the administration and scoring procedures carefully and follow the
procedures precisely. This Examiner’s Manual provides guidelines for examiner training and
includes specific instructions for administering and scoring each test. In addition, examiners
who administer the Batería IV APROV must be fluent and literate in Spanish.
Competent interpretation of the Batería IV APROV requires a higher degree of knowledge
and experience than is required for administering and scoring the tests. Graduate-level
training in educational assessment and a background in diagnostic decision making are
recommended for individuals who will interpret Batería IV APROV results. Only trained
and knowledgeable professionals who are sensitive to the conditions that may compromise,
or even invalidate, standardized test results should make interpretations and decisions.
The level of formal education recommended to interpret the Batería IV APROV is typically
documented by successful completion of an applicable graduate-level program of study that
includes, at a minimum, a practicum-type course covering administration and interpretation
of standardized tests of academic achievement. In addition, many qualified examiners possess
state, provincial, or professional certification, registration, or licensure in a field or profession
that includes as part of its formal training and code of ethics the responsibility for rendering
educational assessment and interpretation services.
Because professional titles, roles, and responsibilities vary among states (or provinces),
or even from one school district to another, it is impossible to equate competency to
professional titles. Consequently, the Standards for Educational and Psychological Testing
(AERA, APA, & NCME, 2014) suggest that it is the responsibility of each school district to
be informed by this statement of examiner qualifications and subsequently determine who,
under its aegis, is qualified to administer and interpret the Batería IV APROV.

Confidentiality of Test Materials and Content


Professionals who use the Batería IV APROV (including examiners, program administrators,
and others) are responsible not only for maintaining the integrity of the test by following
proper administration, scoring, and interpretation procedures but also for maintaining
test security. Test security has two aspects: (a) carefully storing the test materials and (b)
protecting test content.
If the Batería IV test materials are stored in an area accessible to people with a
nonprofessional interest in the tests, the materials should be kept in locked cabinets. Also,
the test materials should not be left unattended in a classroom where students can see the
materials and look at the test items.
The issue of test confidentiality is important. Test content should not be shared with
curious nonprofessionals or made available for public inspection. Disclosing specific test
content invalidates future administrations. As noted on the copyright page of this manual
and the Test Book, the Batería IV is not to be used in programs that require disclosure of test
items or answers.

An examiner should not inform examinees of the correct answers to any of the questions
during or after testing. When discussing test results, examiners may describe the nature of
the items included in a test, but they should not review specific test content. Examiners
should use examples similar to the test items without revealing actual items.
Questions often arise about the federal requirement that families be given access to
certain educational records. To comply with this requirement, a school or school district
may be required to permit “access” to test protocols; however, “access” does not include the
right to make copies of the materials provided. The Family Educational Rights and Privacy
Act of 1974 (FERPA) provides that parents are to be given the right to "inspect and review" the
educational records of their children (20 U.S.C. § 1232g; 34 CFR § 99.10). The right to inspect
and review is defined as including the right to a response from the participating agency “to
reasonable requests for explanations and interpretations of the records” (34 CFR §99.10(c))
and, if circumstances prevent inspection or review, the agency may either (a) provide a copy
or (b) make other arrangements that allow for inspection and review (34 CFR §99.10(d)).
So long as the test protocols are made available to the parent, or the parent’s representative,
for review, all requirements of the law are met without violating the publisher’s rights or the
obligations of the educational institution to keep the test materials confidential. There is,
therefore, no obligation to provide copies or to permit the parent, or the legal representative
of the parent, to make copies.
Similar concerns arise when a party seeks to introduce testing materials in a trial or
other legal proceeding. In such cases, it is important that the court take steps to protect the
confidentiality of the test and to prevent further copying or dissemination of any of the test
materials. Such steps include: (a) issuing a protective order prohibiting parties from copying
the materials, (b) requiring the return of the materials to the qualified professional upon
the conclusion of the proceedings, and (c) ensuring that the materials and all references to
the content of the materials will not become part of the public record of the proceedings. To
ensure that these protections are obtained, Riverside Insights should be contacted whenever it
appears likely that testing materials will be introduced as evidence in a legal proceeding.
Examiners or school districts with questions about copyright ownership or confidentiality
obligations should contact Riverside Insights at the toll-free telephone number listed on the
copyright page of this manual.

Chapter 2

Descriptions of the Batería IV APROV Tests and Clusters
The Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento (Batería IV APROV) contains 13
tests measuring three curricular areas—reading, mathematics, and written language. Specific
combinations, or groupings, of these 13 tests form clusters for interpretive purposes. (For
administration and scoring procedures, see Chapters 3 and 4 of this manual.)
The tests combine to form 17 cluster scores, including an Aprovechamiento breve (Brief
Achievement) score and an Aprovechamiento amplio (Broad Achievement) score. Although
tests are the basic administration components of the Batería IV APROV, clusters of tests
provide the primary basis for test interpretation. Cluster interpretation minimizes the danger
of generalizing from the score for a single narrow ability to a broad, multifaceted ability or
skill. Cluster interpretation results in higher validity because more than one component
of a broad ability constitutes the score that serves as the basis for interpretation. In some
situations, however, the narrow abilities and skills that are measured by the individual
tests should be considered. This is particularly important when significant differences
exist between or among the tests in a cluster. In these cases, more information is obtained
by analyzing performance on each test, which may indicate the need for further testing.
Occasions exist when it is more meaningful to describe a narrow ability than it is to report
performance on a broad ability. To increase the validity of narrow ability interpretation,
the Batería IV provides clusters for a number of important narrow abilities. These narrow
abilities often have more relevance for informing instruction and intervention (McGrew &
Wendling, 2010).

Batería IV APROV Tests


The Selective Testing Table, presented in Table 2-1, illustrates the scope of the Batería IV
APROV interpretive information via the combinations of tests that form various clusters.
Note that Tests 1 through 6, the core set of tests, provide a number of important interpretive
options, including Lectura (Reading), Lenguaje escrito (Written Language), Matemáticas
(Mathematics), Destrezas académicas (Academic Skills), Aplicaciones académicas (Academic
Applications), and Aprovechamiento breve (Brief Achievement) clusters, and are required for
calculating the intra-achievement variation procedure (see Chapter 5 for a description of the
variation procedures).

Descriptions of the Batería IV APROV Tests and Clusters 17


Table 2-1.
Batería IV APROV Selective Testing Table

Tests required to create each cluster:

Reading clusters
  Lectura: Tests 1, 4
  Lectura amplia: Tests 1, 4, 9
  Destrezas básicas en lectura: Tests 1, 7
  Comprensión de lectura: Tests 4, 12
  Fluidez en la lectura: Tests 8, 9

Mathematics clusters
  Matemáticas: Tests 2, 5
  Matemáticas amplias: Tests 2, 5, 10
  Destrezas en cálculos matemáticos: Tests 5, 10
  Resolución de problemas matemáticos: Tests 2, 13

Writing clusters
  Lenguaje escrito: Tests 3, 6
  Lenguaje escrito amplio: Tests 3, 6, 11
  Expresión escrita: Tests 6, 11

Cross-domain clusters
  Aprovechamiento breve: Tests 1, 2, 3
  Aprovechamiento amplio: Tests 1–6, 9–11
  Destrezas académicas: Tests 1, 3, 5
  Fluidez académica: Tests 9, 10, 11
  Aplicaciones académicas: Tests 2, 4, 6

Batería IV APROV tests:
  APROV 1 Identificación de letras y palabras
  APROV 2 Problemas aplicados
  APROV 3 Ortografía
  APROV 4 Comprensión de textos
  APROV 5 Cálculo
  APROV 6 Expresión de lenguaje escrito
  APROV 7 Análisis de palabras
  APROV 8 Lectura oral
  APROV 9 Fluidez en lectura de frases
  APROV 10 Fluidez en datos matemáticos
  APROV 11 Fluidez en escritura de frases
  APROV 12 Rememoración de lectura
  APROV 13 Números matrices

Prueba 1: Identificación de letras y palabras (Test 1: Letter-Word Identification)
Identificación de letras y palabras measures the examinee’s word identification skills, a reading-
writing (Grw) ability. The initial items require the individual to identify letters that appear in
large type on the examinee’s side of the Test Book. The remaining items require the person to
read aloud individual words correctly. The examinee is not required to know the meaning of
any word. The items become increasingly difficult as the selected words appear less frequently
in written Spanish. Identificación de letras y palabras has a median reliability of .92 in the 5 to
19 age range and .94 in the adult age range.

Prueba 2: Problemas aplicados (Test 2: Applied Problems)


Problemas aplicados requires the person to analyze and solve math problems, a quantitative
knowledge (Gq) ability. To solve the problems, the person must listen to the problem,
recognize the procedure to be followed, and then perform relatively simple calculations.
Because many of the problems include extraneous information, the individual must decide
not only the appropriate mathematical operations to use but also which numbers to include
in the calculation. Item difficulty increases with more complex calculations. This test has a
median reliability of .90 in the 5 to 19 age range and .92 in the adult age range.

Prueba 3: Ortografía (Test 3: Spelling)


Ortografía, a reading-writing (Grw) ability, requires the person to write words that are
presented orally. The initial items measure prewriting skills, such as drawing lines and tracing
letters. The next set of items requires the person to produce uppercase and lowercase letters.



The remaining items measure the person’s ability to spell words correctly. The items become
increasingly difficult as the words become more complex. This test has a median reliability
of .90 in the 5 to 19 age range and .93 in the adult age range.

Prueba 4: Comprensión de textos (Test 4: Passage Comprehension)


Comprensión de textos measures the ability to use syntactic and semantic cues to identify a
missing word in text, a reading-writing (Grw) ability. The initial Comprensión de textos items
involve symbolic learning, or the ability to match a rebus (pictographic representation of a
word) with an actual picture of the object. The next items are presented in a multiple-choice
format and require the person to point to the picture represented by a word or phrase or to
point to the word represented by a picture. The remaining items require the person to read a
short passage and identify a missing key word that makes sense in the context of that passage
(a cloze approach to reading comprehension assessment). The items become increasingly
difficult by removing pictorial stimuli and by increasing passage length, level of vocabulary,
and complexity of syntax. Comprensión de textos has a median reliability of .89 in the 5 to 19
age range and .92 in the adult age range.

Prueba 5: Cálculo (Test 5: Calculation)


Cálculo is a test of math achievement measuring the ability to perform mathematical
computations, a quantitative knowledge (Gq) ability. The initial items in Cálculo require
the individual to write single numbers. The remaining items require the person to perform
addition, subtraction, multiplication, division, and combinations of these basic operations, as
well as some geometric, trigonometric, logarithmic, and calculus operations. The calculations
involve negative numbers, percentages, decimals, fractions, and whole numbers. Because the
calculations are presented in a traditional problem format in the Response Booklet, the person
is not required to make any decisions about what operations to use or what data to include.
Cálculo has a median reliability of .93 in the 5 to 19 age range and .93 in the adult age range.

Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language Expression)
Expresión de lenguaje escrito measures the examinee’s skill in writing responses to a variety of
demands, a reading-writing (Grw) ability. Early items require writing one word to complete a
sentence. Later items require writing a complete sentence. The person’s responses are evaluated
for their quality of expression. Item difficulty increases by increasing passage length, the level of
vocabulary, and the sophistication of the content. The individual is not penalized for errors in
basic writing skills, such as spelling or punctuation. Expresión de lenguaje escrito has a median
reliability of .89 in the 5 to 19 age range and .79 in the adult age range.

Prueba 7: Análisis de palabras (Test 7: Word Attack)


Análisis de palabras measures a person’s ability to apply phonic and structural analysis skills
to the pronunciation of unfamiliar printed words, a reading-writing (Grw) ability. The initial
items require the individual to produce the sounds for single letters. The remaining items
require the person to read aloud letter combinations that are phonically consistent or are
regular patterns in Spanish orthography but are nonsense or low-frequency words. The items
become more difficult as the complexity of the nonsense words increases. Análisis de palabras
has a median reliability of .91 in the 5 to 19 age range and .93 in the adult age range.



Prueba 8: Lectura oral (Test 8: Oral Reading)
Lectura oral is a measure of story reading accuracy and prosody, a reading-writing (Grw)
ability. The individual reads aloud sentences that gradually increase in difficulty. Performance
is scored for both accuracy and fluency of expression. Lectura oral has a median reliability of
.91 in the 5 to 19 age range and .90 in the adult age range.

Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency)
Fluidez en lectura de frases measures reading comprehension under timed conditions,
requiring both reading-writing (Grw) and cognitive processing speed (Gs) abilities. The task
involves reading simple sentences silently and quickly in the Response Booklet, deciding if
the statement is true or false, and then circling Yes or No. The difficulty level of the sentences
gradually increases to a moderate level. The individual attempts to complete as many items as
possible within a 3-minute time limit. Fluidez en lectura de frases has test-retest reliabilities of
.95 in the 7 to 11 age range, .93 in the 14 to 17 age range, and .89 in the adult age range.

Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency)
Fluidez en datos matemáticos measures speed of computation, or the ability to solve simple
addition, subtraction, and multiplication facts quickly, requiring both quantitative knowledge
(Gq) and cognitive processing speed (Gs) abilities. The person is presented with a series of
simple arithmetic problems in the Response Booklet. This test has a 3-minute time limit.
Fluidez en datos matemáticos has test-retest reliabilities of .95 in the 7 to 11 age range, .97 in
the 14 to 17 age range, and .95 in the adult age range.

Prueba 11: Fluidez en escritura de frases (Test 11: Sentence Writing Fluency)
Fluidez en escritura de frases measures an individual’s skill in formulating and writing simple
sentences quickly, requiring both reading-writing (Grw) and cognitive processing speed (Gs)
abilities. Each sentence must relate to a given stimulus picture in the Response Booklet and
must include a given set of three words. The words gradually require the formulation of more
complex sentence structures. This test has a 5-minute time limit. It has test-retest reliabilities
of .83 in the 7 to 11 age range, .76 in the 14 to 17 age range, and .88 in the adult age range.

Prueba 12: Rememoración de lectura (Test 12: Reading Recall)


Rememoración de lectura is a measure of reading comprehension (a reading-writing [Grw]
ability) and meaningful memory (a long-term storage and retrieval [Glr] ability). The
individual reads a short story silently and then retells as much of the story as he or she can
recall. This test has a median reliability of .96 in the 5 to 19 age range and .97 in the adult
age range.

Prueba 13: Números matrices (Test 13: Number Matrices)


Números matrices is a measure of quantitative reasoning, requiring both quantitative
knowledge (Gq) and fluid reasoning (Gf) abilities. A matrix is presented and the individual
must identify the missing number. Although the test is not timed, there is a general guideline
of either 30 seconds or 1 minute per problem. It has a median reliability of .91 in the 5 to 19
age range and .93 in the adult age range.



Batería IV APROV Clusters
There are 17 clusters available for interpretation (see Table 2-1).

Reading Clusters
Five reading clusters are available.

Lectura (Reading)
The Lectura cluster is a measure of reading achievement (a reading-writing [Grw] ability),
including reading decoding and the ability to comprehend connected text while reading.
This cluster is a combination of Prueba 1: Identificación de letras y palabras and Prueba 4:
Comprensión de textos. It has a median reliability of .94 in the 5 to 19 age range and .96 in the
adult age range.

Lectura amplia (Broad Reading)


The Lectura amplia cluster provides a comprehensive measure of reading achievement (a
reading-writing [Grw] ability), including reading decoding, reading comprehension speed,
and the ability to comprehend connected text while reading. This cluster is a combination
of Prueba 1: Identificación de letras y palabras, Prueba 4: Comprensión de textos, and Prueba 9:
Fluidez en lectura de frases. It has a median reliability of .96 in the 5 to 19 age range and .97
in the adult age range.

Destrezas básicas en lectura (Basic Reading Skills)


The Destrezas básicas en lectura cluster is an aggregate measure of sight vocabulary, phonics,
and structural analysis that provides a measure of basic reading skills (a reading-writing
[Grw] ability). This cluster is a combination of Prueba 1: Identificación de letras y palabras
and Prueba 7: Análisis de palabras. It has a median reliability of .95 in the 5 to 19 age range
and .96 in the adult age range.

Comprensión de lectura (Reading Comprehension)


The Comprensión de lectura cluster is an aggregate measure of comprehension and reasoning
(reading-writing [Grw] and, to a lesser extent, long-term storage and retrieval [Glr] abilities).
It is a combination of Prueba 4: Comprensión de textos and Prueba 12: Rememoración de lectura.
This cluster has a median reliability of .94 in the 5 to 19 age range and .95 in the adult age
range.

Fluidez en la lectura (Reading Fluency)


The Fluidez en la lectura cluster provides a measure of several aspects of reading fluency,
including prosody, automaticity, accuracy, and reading comprehension under timed conditions
(reading-writing [Grw] and cognitive processing speed [Gs] abilities). It is a combination
of Prueba 8: Lectura oral and Prueba 9: Fluidez en lectura de frases. This cluster has a median
reliability of .95 in the 5 to 19 age range and .95 in the adult age range.



Math Clusters
Four math clusters are available.

Matemáticas (Mathematics)
The Matemáticas cluster provides a measure of math achievement (quantitative knowledge
[Gq] ability), including problem solving and computational skills. This cluster includes
Prueba 2: Problemas aplicados and Prueba 5: Cálculo. It has a median reliability of .95 in the 5
to 19 age range and .96 in the adult age range.

Matemáticas amplias (Broad Mathematics)


The Matemáticas amplias cluster provides a comprehensive measure of math achievement,
including problem solving, number facility, automaticity, and reasoning (quantitative
knowledge [Gq] and cognitive processing speed [Gs] abilities). This cluster includes Prueba
2: Problemas aplicados, Prueba 5: Cálculo, and Prueba 10: Fluidez en datos matemáticos. It has a
median reliability of .97 in the 5 to 19 age range and .97 in the adult age range.

Destrezas en cálculos matemáticos (Math Calculation Skills)


The Destrezas en cálculos matemáticos cluster is an aggregate measure of computational skills
and automaticity with basic math facts, and it provides a measure of basic mathematical skills
(quantitative knowledge [Gq] and cognitive processing speed [Gs] abilities). This cluster
includes Prueba 5: Cálculo and Prueba 10: Fluidez en datos matemáticos. It has a median
reliability of .96 in the 5 to 19 age range and .97 in the adult age range.

Resolución de problemas matemáticos (Math Problem Solving)


The Resolución de problemas matemáticos cluster provides a measure of mathematical
knowledge and reasoning (quantitative knowledge [Gq] and fluid reasoning [Gf] abilities).
It is an aggregate measure of problem solving, analysis, and reasoning. This cluster is a
combination of Prueba 2: Problemas aplicados and Prueba 13: Números matrices. It has a
median reliability of .94 in the 5 to 19 age range and .96 in the adult age range.

Written Language Clusters


Three written language clusters are available.

Lenguaje escrito (Written Language)


The Lenguaje escrito cluster provides a measure of written language achievement, including
spelling of single-word responses and quality of expression (reading-writing [Grw] ability).
This cluster includes Prueba 3: Ortografía and Prueba 6: Expresión de lenguaje escrito. It has a
median reliability of .93 in the 5 to 19 age range and .92 in the adult age range.

Lenguaje escrito amplio (Broad Written Language)


The Lenguaje escrito amplio cluster provides a comprehensive measure of written language
achievement, including spelling of single-word responses, fluency of production, and quality
of expression (reading-writing [Grw] and cognitive processing speed [Gs] abilities). It
includes Prueba 3: Ortografía, Prueba 6: Expresión de lenguaje escrito, and Prueba 11: Fluidez en
escritura de frases. This cluster has a median reliability of .94 in the 5 to 19 age range and .94
in the adult age range.



Expresión escrita (Written Expression)
The Expresión escrita cluster is an aggregate measure of meaningful written expression and
fluency (reading-writing [Grw] and cognitive processing speed [Gs] abilities). This cluster
is a combination of Prueba 6: Expresión de lenguaje escrito and Prueba 11: Fluidez en escritura
de frases. It has a median reliability of .91 in the 5 to 19 age range and .88 in the adult age
range.

Cross-Domain Clusters
Five cross-domain clusters are available, including two general academic proficiency cluster
scores, Aprovechamiento breve (Brief Achievement) and Aprovechamiento amplio (Broad
Achievement). Various combinations of tests are used to form three additional cluster
scores: Destrezas académicas (Academic Skills), Fluidez académica (Academic Fluency),
and Aplicaciones académicas (Academic Applications). These three clusters (skills, fluency,
and applications) contain tests of reading, math, and written language and can be used to
determine whether the person exhibits significant strengths and/or weaknesses among these
three types of tasks across academic areas.

Aprovechamiento breve (Brief Achievement)


The Aprovechamiento breve cluster is a combination of three tests: Prueba 1: Identificación
de letras y palabras, Prueba 2: Problemas aplicados, and Prueba 3: Ortografía. This cluster
represents a screening of the person’s performance across reading, writing, and math. It has a
median reliability of .96 in the 5 to 19 age range and .97 in the adult age range.

Aprovechamiento amplio (Broad Achievement)


The Aprovechamiento amplio cluster is a combination of the nine tests (Tests 1 through 6 and
Tests 9 through 11) included in the Lectura amplia, Matemáticas amplias, and Lenguaje escrito
amplio clusters. The Aprovechamiento amplio cluster represents a person’s overall performance
across the various achievement domains. It has a median reliability of .98 in the 5 to 19 age
range and .99 in the adult age range.

Destrezas académicas (Academic Skills)


The Destrezas académicas cluster is an aggregate measure of reading decoding, math
calculation, and spelling of single-word responses, providing an overall score of basic
achievement skills. It is a combination of Prueba 1: Identificación de letras y palabras, Prueba
3: Ortografía, and Prueba 5: Cálculo. This cluster has a median reliability of .94 in the 5 to 19
age range and .96 in the adult age range.

Fluidez académica (Academic Fluency)


The Fluidez académica cluster provides an overall index of academic fluency. It is a
combination of Prueba 9: Fluidez en lectura de frases, Prueba 10: Fluidez en datos matemáticos,
and Prueba 11: Fluidez en escritura de frases. This cluster has a median reliability of .97 in the
5 to 19 age range and .97 in the adult age range.

Aplicaciones académicas (Academic Applications)


The Aplicaciones académicas cluster is a combination of Prueba 2: Problemas aplicados,
Prueba 4: Comprensión de textos, and Prueba 6: Expresión de lenguaje escrito. These three tests
require the individual to apply academic skills to academic problems. This cluster has a
median reliability of .95 in the 5 to 19 age range and .95 in the adult age range.



Chapter 3

General Administration and Scoring Procedures
To become proficient in administering and scoring the Batería IV Pruebas de aprovechamiento
(Batería IV APROV), examiners should carefully study the general administration and scoring
procedures in this chapter and the specific procedures for each test in Chapter 4 and in
the Test Book. Additionally, two appendices of this manual provide reproducible checklists
to help examiners build competency administering and scoring the tests. Appendix B,
the “Batería IV Pruebas de aprovechamiento Examiner Training Checklist,” is a test-by-test
form that may be used as a self-study or observation tool. Appendix C is the “Batería IV
General Test Observations Checklist,” which may be used by an experienced examiner when
observing a new examiner.

Practice Administration
After thoroughly studying this Examiner’s Manual, the Test Book, the Test Record, and
the Response Booklet, both experienced and novice examiners should administer several
practice tests. When administering practice tests, try to replicate an actual testing situation,
pretending that the practice session is an actual administration. Do not discuss the test or the
answers to specific items. After completing each practice administration, record any questions
that arose during the practice session. Before administering another practice test, answer
the questions by reviewing the Examiner’s Manual or consulting an experienced examiner.
While administering practice tests, strive for these two goals: exact administration and brisk
administration.

Exact Administration
The goal of standardized testing is to see how well a person can respond when given
instructions identical to those presented to individuals in the norming sample. When learning
to administer the Batería IV APROV tests, study the contents of the Test Book, paying
particular attention to the information on the introductory page of each test, the specific
instructions on the test pages, and the boxes with special instructions.
The first page after the tab in each test provides general information and instructions
specific to that test. Review this information frequently. This page usually includes
administration information, scoring information, suggested starting points, basal and ceiling
requirements, and information about materials required to administer the test.
The directions for administering each item are located on the examiner’s side of the
pages in the Test Book. The directions include the script to be read to the examinee (printed
in bold blue type) and, if applicable, specific pointing instructions. Always use the exact

General Administration and Scoring Procedures 25


wording. Do not change, reword, or modify the instructions in any way or the results will be
compromised.
The Test Book examiner pages frequently include boxes containing supplemental
administration and scoring information. This information outlines procedures to follow if an
individual responds incorrectly to a sample item or if he or she responds incorrectly or does
not respond to a test item.
During the first couple of practice administrations, be certain to administer the tests
correctly, regardless of how long it takes. At this beginning stage, testing may proceed
quite slowly.

Brisk Administration
After the initial practice sessions, strive for a brisk testing pace. Inefficient testing procedures
bore the examinee, invite distraction, and increase testing time. It is not appropriate to stop
testing and visit with the examinee during the testing session. When the person has finished
responding to an item, immediately begin the next item.
In most instances, an examinee does not need a break before beginning the next test.
Each test begins with easy questions presented in a different format, thus providing a built-in
change of pace from one test to the next. Using a brisk testing pace enhances rapport and
helps an examinee maintain attention.
Continue to practice administering the tests until the two goals of exact and brisk
administration have been met.

Preparation for Testing


Before actual test administration, arrange the test setting, set up the test materials, and
establish rapport with the examinee.

Arranging the Test Setting


As recommended in the Standards for Educational and Psychological Testing (AERA et al.,
2014), select a testing room that is quiet and comfortable and has adequate ventilation
and lighting. If possible, the only two people in the room should be the examiner and the
examinee. To avoid interruptions, post a sign such as the following on the door:

Testing—Please Do Not Disturb—Thank You

The room should have a table (or other flat working space of adequate size) and two
chairs, one being an appropriate size for the examinee. A suitable seating arrangement allows
the examiner to view both sides of the Test Book easel, point to all parts of the examinee’s
page and the Response Booklet, and record responses on the Test Record out of the
examinee’s view. The examinee should be able to view only the examinee’s test pages. When
the Test Book easel is set up for administration, it becomes a screen allowing the examiner to
record responses on the Test Record out of the examinee’s view.
The best seating arrangement is one in which the examiner and the examinee sit
diagonally across from each other at the corner of a table. This arrangement is illustrated in
Figure 3-1 for a right-handed examiner. The arrangement (seating and setup of materials)
should be reversed for a left-handed examiner.


Figure 3-1.
Recommended arrangement
for administering the test.

Another possible seating arrangement is for the examiner and the examinee to sit directly
across the table from each other. With this arrangement, the table must be narrow and low
enough so that the examiner can see over the upright Test Book easel and accurately point to
the examinee’s page when necessary.

Setting Up the Testing Materials


The materials necessary for administering the Batería IV APROV are the Test Book, the
accompanying Test Record and Response Booklet, and at least two sharpened pencils with
erasers. For timed tests, a stopwatch or a watch or clock with a second hand is necessary.

Establishing Rapport
In most instances, the examiner will have little difficulty establishing a good relationship
with the examinee. Do not begin testing unless the person seems relatively at ease. If he
or she does not feel well or will not respond appropriately, do not attempt testing. Often
examiners begin the testing session with a short period of conversation while completing
the “Identifying Information” portion of the Test Record. A brief explanation of the test is
provided in the “Introduction” section in the front of the Test Book.
To help put the individual at ease, smile frequently throughout the testing session and call
the person by name. Between tests, let the examinee know that he or she is doing a good job,
using such comments as “Bien” (Fine) and “Muy bien” (Good). Encourage a response even
when items are difficult. It is fine to say, “¿Te atreves a hacer este?” (Would you like to take
a guess on that one?), but the comments should not reveal whether answers are correct or
incorrect. Do not say, “Muy bien” (Good) only after correct responses or pause longer after
incorrect responses before proceeding to the next item.

Identifying Information
For the most part, the “Identifying Information” section on the first page of the Test Record
is self-explanatory. For younger examinees, verify the date of birth using school records
or with a parent. Prior to testing, check to see if the person should be wearing glasses or a
hearing aid.


If an examinee is not attending school (i.e., kindergarten through college), it is not
necessary to record a grade placement unless it would be useful to compare the examinee’s
performance with the average performance of students at some specified grade placement.
For example, if an adult is applying for admission to a college, that adult’s performance might
be compared with the average performance of students starting college (13.0). Or, if a child
is being considered for early entrance into the first grade, that child’s performance might be
compared with the average performance of students beginning grade 1 (1.0). If the person
is tested during the summer months, record the grade that he or she has just completed.
If an individual is enrolled in some type of nongraded program, record the normal grade
placement for students of this person’s age at that time of the school year; this may provide
the most appropriate grade level for test interpretation. Another option is to record the exact
starting and stopping dates of the examinee’s school year. This option may be appropriate for
students enrolled in year-round schools or in schools with starting and stopping dates that
fall more than 2 weeks before or after the default dates of August 16 and June 15. When the
exact starting and ending dates are entered into the online scoring and reporting program, the
program automatically calculates the exact grade placement in tenths of the school year.
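The grade-placement arithmetic can be illustrated with a short sketch. This is a hypothetical linear interpolation across the school year using the manual's default dates of August 16 and June 15; the online scoring and reporting program is the authoritative source, and its exact algorithm may differ.

```python
from datetime import date

def grade_placement(grade, test_date, start=None, end=None):
    """Estimate grade placement in tenths of the school year.

    Hypothetical sketch only: interpolates linearly between the school
    year's starting and ending dates, defaulting to August 16 and
    June 15 as described in the manual.
    """
    if start is None:
        year = test_date.year if test_date.month >= 8 else test_date.year - 1
        start = date(year, 8, 16)
    if end is None:
        end = date(start.year + 1, 6, 15)
    # Fraction of the school year elapsed, clamped to the 0-9 tenths range.
    fraction = (test_date - start).days / (end - start).days
    tenths = min(max(int(fraction * 10), 0), 9)
    return grade + tenths / 10
```

Under these assumptions, a first grader tested on the default starting date would receive a placement of 1.0, rising to 1.9 by the final default date.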

Language Background Information


Information related to the native language and any other language(s) spoken by the examinee
should be recorded on the Test Record. Although not required, this language information
is vital in ensuring more accurate interpretation of test results. The “Language Background
Information” section allows examiners to document the language use and exposure of the
examinee. In this section, the examiner is asked to indicate whether the examinee can be
considered a native English speaker, a second-language learner of English, a native English
speaker learning a foreign language or heritage language, or a simultaneous bilingual
individual. Other information recorded in this section includes the examinee’s native
language; the language(s) spoken by others in the examinee’s home; and the language(s)
spoken by the examinee at home, with peers, and in the classroom.

Academic Language Exposure


The “Academic Language Exposure” section of the Test Record elicits information about
current and prior language programs and the amount of time the examinee has spent in these
programs, as well as information about academic instruction outside of the United States. In
the case of an examinee just entering a formal education setting or entering a new setting,
there is also a space for the examiner to record information about the examinee’s upcoming
educational enrollment.

Administration and Scoring


This section contains general procedures for administering and scoring the Batería IV APROV.

Test Selection
It is important to select tests that are appropriate for the individual being evaluated. Consider
the individual’s age, developmental level, and achievement levels as part of this test selection
process. For example, it would be inappropriate to give a test that requires reading ability to a
young child with limited reading experience. Whereas some tests, such as Prueba 1:
Identificación de letras y palabras and Prueba 4: Comprensión de textos, have a number of
prereading items, other tests do not. For example, on Prueba 9: Fluidez en lectura de frases, the
individual is asked to read each sentence, decide whether it is true or false, and circle yes
or no. If this test is administered to a person who cannot read, the individual may randomly
mark yes or no without reading the sentences at all and obtain a score that would not be a
valid indicator of his or her reading skill.
Examiners are encouraged to use selective testing principles for choosing the most
appropriate set of tests for each individual. To help examiners determine whether or not a test
is appropriate for an individual, many of the Batería IV APROV tests provide sample items
and practice exercises. Examiners are directed to discontinue a test without administering
the test items if the examinee does not get a specified number of sample items correct. Other
tests provide early cut-offs if an individual’s performance is limited.

Order of Administration
In most cases, administer the first six tests in the order that they appear in the Test Book.
These are the core tests (Tests 1 through 6), and they have been organized to alternate
between different tasks and achievement areas (e.g., reading and math) to facilitate optimal
attention and interest. However, the tests may be administered in any order. For example,
testing may begin with Prueba 5: Cálculo, rather than with Prueba 1: Identificación de letras y
palabras. Furthermore, testing may be discontinued between the administration of any two
tests. The decision to administer any of the remaining tests should be based upon the referral
question(s) and the examinee’s age and interests. These additional tests may be administered
in any order with one or two exceptions.
If an examinee struggles with a certain type of task, as a general rule, do not administer
two such tests in a row (e.g., timed tests, reading tests, or tests involving sustained writing,
such as Prueba 6: Expresión de lenguaje escrito or Prueba 11: Fluidez en escritura de frases).
Additionally, if you are planning to administer Prueba 9: Fluidez en lectura de frases, Prueba
10: Fluidez en datos matemáticos, and Prueba 11: Fluidez en escritura de frases, these timed
tests should be interspersed in the administration sequence rather than administered
consecutively.

Time Requirements
Always schedule adequate time for testing. Generally, experienced examiners will
require approximately 40 minutes to administer the core set of tests (Tests 1 through 6).
Administration of Prueba 6: Expresión de lenguaje escrito requires about 15 to 20 minutes,
whereas the other tests require about 5 to 10 minutes each. When administering each test,
allow a reasonable amount of time for a person to respond and then suggest moving on to the
next item. Also allow more time for a specific item if the person requests it or if the specific
test directions permit additional time.
Very young individuals or those who have unique characteristics that may impact
test administration may require additional testing time. These individuals may produce
a scattering of correct responses requiring administration of a greater number of items.
Some people may respond more slowly, change their answers more frequently, or require
more prompting and querying. In addition, an examiner may inadvertently begin at an
inappropriate starting point, which extends the testing time.

Suggested Starting Points (Puntos de partida sugeridos)


On most of the Batería IV APROV tests, the first page(s) after the tab provide special
instructions or procedures to be followed and indicate where to begin. For example,
the instructions may say that all examinees should take the sample items or that certain
examinees should go to a specific starting point in the test. The starting points located
on the Suggested Starting Points table are determined by an estimate of the individual’s
present achievement level rather than by the age or grade placement (see Figure 3-2). Using
suggested starting points with basal and ceiling levels (discussed in the following section)
reduces unnecessary testing time. It is usually apparent whether the person performs
markedly above or below the estimated achievement level after completing the first few tests.
After determining how an examinee will perform, use the starting point that seems most
appropriate.

Figure 3-2.
Suggested Starting Points table for Prueba 2: Problemas aplicados.

Puntos de partida sugeridos

Aprovechamiento estimado   Preescolar a                             3.er a       9.o grado
del examinado              Kindergarten  1.er grado   2.o grado     8.o grado    a adulto
Comience con               Ítem 1        Ítem 9       Ítem 17       Ítem 22      Ítem 27
                           Página 43     Página 49    Página 53     Página 55    Página 57
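A scoring aid could represent a Suggested Starting Points table as a simple lookup keyed on the examinee's estimated achievement level rather than age or grade placement. The entries below are a hypothetical encoding of the Prueba 2 table shown in Figure 3-2.

```python
# Hypothetical encoding of the Suggested Starting Points table for
# Prueba 2: Problemas aplicados (Figure 3-2). Keys are estimated
# achievement levels; values are (first item, Test Book page).
STARTING_POINTS = {
    "Preescolar a Kindergarten": (1, 43),
    "1.er grado": (9, 49),
    "2.o grado": (17, 53),
    "3.er a 8.o grado": (22, 55),
    "9.o grado a adulto": (27, 57),
}

def starting_item(estimated_level):
    # Select the first item to administer from the estimated
    # achievement level, as the manual describes.
    item, page = STARTING_POINTS[estimated_level]
    return item
```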

Basals and Ceilings (Niveles básicos y máximos)


Many of the Batería IV APROV tests require the examiner to establish a basal and a ceiling.
Exceptions are timed tests, such as Prueba 9: Fluidez en lectura de frases, and tests that require
the administration of a preselected block of items, such as Prueba 6: Expresión de lenguaje
escrito. Not administering items that are extremely easy or difficult minimizes the number of
items administered and maximizes the individual’s tolerance for the testing situation.
The purpose of basal and ceiling requirements is to limit the number of items administered
but still be able to estimate, with high probability, the score that the examinee would have
obtained if all items were administered.

Meeting Basal and Ceiling Criteria


When required, the basal and ceiling criteria are included in each test in the Test Book and
are stated briefly at the top of each test on the Test Record. Because the basal and ceiling
criteria are not the same for each test, review the criteria before testing.
For example, in Prueba 1: Identificación de letras y palabras, the basal criterion is met when
the examinee responds correctly to the 6 lowest-numbered items administered or when Item
1 has been administered. If the basal is not obtained, test backward until the examinee has
met the basal criterion or until the page with Item 1 has been administered. Then return to
the point at which testing was interrupted and continue testing.
Using the same example of Prueba 1: Identificación de letras y palabras, the ceiling criterion
is met when the examinee responds incorrectly to the last 6 consecutive items administered
or when the page with the last test item has been administered.
The best practice is to test by complete pages when stimulus material appears on the
examinee’s side of the Test Book. If an examinee reaches a ceiling in the middle of a test
page and there is no stimulus material on the examinee’s side, the examiner may discontinue
testing. Because examinees do not see any of the pages that fall below the basal level or above
the ceiling level, they are essentially unaware that the test has additional items.


No Apparent Basal or No Apparent Ceiling
Sometimes, upon completing a test, an individual may not show a consecutive set of correctly
answered items at the beginning of the test (i.e., a basal level). This is expected for a young
child or an individual who is performing at a low level of ability on that test. Figure 3-3
shows an example of an examinee who began Prueba 1: Identificación de letras y palabras with
Item 1. The person missed Item 1. The examiner continued testing to establish the ceiling.
Although the examinee answered 6 consecutive items correctly (Items 2 through 7), they are
not the lowest-numbered items administered. In this case, with no apparent basal, Item 1 is
used as the basal. The examinee would not receive credit for Item 1. In situations where the
testing begins with Item 1, give credit only for the items the person answers correctly. Testing
continued by complete pages until the ceiling was reached (6 consecutive items incorrect). In
this example, the total Number Correct (Número de respuestas correctas) for this test is six.
In other instances, an individual with a high level of ability may not reach a ceiling level at
the end of a test. In cases with no apparent ceiling, the last test item is used as the ceiling.


Figure 3-3.
Example of Item 1 used as the basal on Prueba 1: Identificación de letras y palabras.
[Test Record excerpt: Item 1 scored 0; Items 2–7 scored 1; Items 8–13 scored 0;
Número de respuestas correctas (0–78) = 6.]


Two Apparent Basals or Two Apparent Ceilings
When scoring an individual’s responses, a pattern of two apparent basals may appear. When
this occurs, use the lowest-numbered set of consecutive correct responses as the true basal. In
the same respect, a pattern may exist with two apparent ceilings. In this case, use the highest-
numbered set of consecutive incorrect responses as the true ceiling. These guidelines will
ensure that the examinee’s ability is more accurately estimated. An examiner should continue
testing if there is a clinically informed reason (other than chance) to believe that a person
may fail an item below an apparent basal or may correctly answer an item above an apparent
ceiling. The basal and ceiling criteria are simply guides to minimize testing time and reduce
examinee frustration. When calculating the raw score for a test, take into account all the
items the person passed and all the items he or she missed.
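The basal, ceiling, and raw-score rules described above can be summarized as a short routine. This is an illustrative sketch only (reaching Item 1 or the last test item also satisfies the criteria, and item-level scoring conventions vary by test); it is not the published scoring procedure.

```python
def basal_met(responses, run=6):
    # Basal: the `run` lowest-numbered items administered were all
    # answered correctly, or testing reached Item 1.
    items = sorted(responses)
    return 1 in responses or all(responses[i] for i in items[:run])

def ceiling_met(responses, run=6):
    # Ceiling: the `run` highest-numbered items administered were all
    # answered incorrectly. (Reaching the last test item also ends
    # testing; that case is not modeled here.)
    items = sorted(responses)
    return len(items) >= run and not any(responses[i] for i in items[-run:])

def raw_score(responses):
    # Every item below the lowest item administered falls below the
    # basal and is credited as correct; administered items count
    # only when answered correctly.
    return (min(responses) - 1) + sum(responses.values())

# Example data: the administered items from the Figure 3-4
# illustration (1 = correct, 0 = incorrect).
figure_3_4 = {i: 1 for i in range(14, 26)}
figure_3_4[26] = 0
figure_3_4.update({i: 1 for i in range(27, 33)})
figure_3_4[33] = 0
figure_3_4.update({34: 1, 35: 1, 36: 0, 37: 0})
figure_3_4.update({i: 0 for i in range(38, 45)})
figure_3_4[45] = 1
figure_3_4.update({i: 0 for i in range(46, 54)})
```

With this example, the basal is met at Items 14–19, the ceiling at Items 48–53, and the raw score is 13 credited items below the basal plus 21 correct responses, or 34.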
Figure 3-4 illustrates how a basal and a ceiling were determined on Prueba 1: Identificación
de letras y palabras for a sixth-grade boy referred for reading difficulties. The examiner
initially estimated that this boy’s reading ability was similar to that of students in grade 3.
Step 1. After referring to the Suggested Starting Points table for Prueba 1: Identificación de
letras y palabras, the examiner began this test with Item 30, the suggested starting point for
an individual whose reading ability was estimated at grade 3. The entire page of items (Items
30 through 37) was administered. The basal level was not established because the person
missed Item 33.
Step 2. The examiner then turned back one page and presented Items 22 through 29.
The examinee missed Item 26. Although 6 consecutive items (Items 27 through 32) were
answered correctly, the basal level was still not established because the person did not answer
the 6 lowest-numbered items administered (Items 22 through 27) correctly.
Step 3. The examiner went back one more page and administered Items 14 through 21, all
of which the examinee answered correctly. The basal level for Prueba 1: Identificación de letras
y palabras was then established because the person answered the 6 lowest-numbered items
administered (Items 14 through 19) correctly.
Step 4. The examiner then returned to the point at which testing was interrupted and
resumed testing with Item 38. Because there is stimulus material on the examinee’s side of
the Test Book, the examiner administered all of the items on that page (Items 38 through
45). The examinee missed seven consecutive items (Items 38 through 44); however, a ceiling
was not yet established because the individual answered the last item on the page (Item 45)
correctly. Because the examiner could not be confident that this examinee’s true ceiling level
had been reached, testing continued.
Step 5. The examiner administered all the items on the next page (Items 46 through 53)
and obtained a ceiling when the examinee answered all of them incorrectly.
Step 6. The examiner stopped testing with Item 53 because the ceiling level had been
reached and the page was completed. The examiner then totaled the number of correct
responses and included a point for each item below the basal to obtain the raw score of 34.
The total of 34 was entered in the Number Correct box on the Test Record.


Figure 3-4.
Determination of basal and ceiling with two apparent basals and two apparent ceilings.
[Test Record excerpt for Prueba 1: Identificación de letras y palabras, annotated with
Steps 1–6: testing began with Item 30, moved back two pages to establish the basal at
Items 14–19, resumed at Item 38, and reached the ceiling at Items 48–53; Número de
respuestas correctas (0–78) = 34.]


Tests Requiring the Response Booklet (Folleto de respuestas)
The Batería IV APROV Response Booklet includes test material that the examinee uses to
complete any test requiring writing or calculating. The Response Booklet is needed when
administering Prueba 3: Ortografía, Prueba 5: Cálculo, Prueba 6: Expresión de lenguaje escrito,
Prueba 9: Fluidez en lectura de frases, Prueba 10: Fluidez en datos matemáticos, and Prueba 11:
Fluidez en escritura de frases. In addition, the front cover of the Response Booklet is designed
as a worksheet that the examinee can use with Prueba 2: Problemas aplicados and Prueba 13:
Números matrices. Provide the examinee with the Response Booklet and a sharpened pencil with
an eraser when directed to do so by the Test Book instructions. At the completion of each test,
collect the Response Booklet and pencil.

Timed Tests
Prueba 9: Fluidez en lectura de frases, Prueba 10: Fluidez en datos matemáticos, and Prueba
11: Fluidez en escritura de frases are timed tests. Prueba 9: Fluidez en lectura de frases and
Prueba 10: Fluidez en datos matemáticos have a 3-minute time limit, and Prueba 11: Fluidez en
escritura de frases has a 5-minute time limit. Although Tests 9 through 11 are in a numeric
sequence, it is recommended that these three timed tests not be administered consecutively.
The time limits are noted in both the Test Book and the Test Record. Administer these
tests using a stopwatch. If not using a stopwatch, write the exact starting and finishing times
in minutes and seconds in the space provided on the Test Record. For example, 17:23 would
indicate that the test started at 17 minutes and 23 seconds after the hour. The test then would
end exactly 3 minutes later at 20 minutes and 23 seconds (20:23) after the hour. A watch or
clock with a second hand is also useful for administering tests with the instruction to proceed
to the next item if an examinee has not responded to an item within a specified period of time.
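The finishing-time arithmetic described above amounts to adding the time limit to the recorded starting minutes, modulo 60. A minimal sketch:

```python
def finish_time(start_mmss, limit_minutes):
    # Stopping time (minutes:seconds after the hour) for a timed
    # test, given the starting time recorded on the Test Record.
    minutes, seconds = (int(part) for part in start_mmss.split(":"))
    return f"{(minutes + limit_minutes) % 60:02d}:{seconds:02d}"
```

For the manual's example, a 3-minute test started at 17:23 ends at 20:23 after the hour.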

Examinee Requests for Information


Occasionally an examinee will request other information, and it is generally easy to recognize
at once whether supplying that information is appropriate. Even after testing has
been completed, never tell the person whether specific answers are correct or incorrect. If an
individual requests information that cannot be supplied, respond with a comment such as “No
estoy autorizado a darte [darle] esa información” (I’m not allowed to give you that information).

Examiner Queries
For certain responses, the Query keys in the Test Book provide prompts designed to elicit
another answer from the examinee. For example, Item 36 on Prueba 5: Cálculo requires the
examinee to reduce the fraction to obtain credit. The query on this item is a reminder to
ask the examinee to simplify his or her answer. Use professional judgment when querying
responses that are not listed in the Query key. For example, if an individual provides a
response that seems to be partially correct, it is permissible to query with a comment such as
“Dame [Deme] otra respuesta” (Tell me another answer).

Evaluating Test Behavior


Good testing practice requires careful observation and documentation of the examinee’s
behaviors under standardized test administration conditions.

Test Session Observations Checklist


The “Test Session Observations Checklist” is a brief, seven-category behavior rating scale
intended to systematize and document a number of salient examiner observations. The
categories include levels of conversational proficiency, cooperation, and activity; attention and
concentration; self-confidence; care in responding; and response to difficult tasks. As noted in
Figure 3-5, a range of possible responses is provided for each category.

Figure 3-5. Test Session Observations Checklist


The “Test Session Observations Checklist” from the Test Record. Check only one category for each item.

Level of conversational proficiency
❑ 1. Very advanced
❑ 2. Advanced
❑ 3. Typical for age/grade
❑ 4. Limited
❑ 5. Very limited

Level of cooperation
❑ 1. Exceptionally cooperative throughout the examination
❑ 2. Cooperative (typical for age/grade)
❑ 3. Uncooperative at times
❑ 4. Uncooperative throughout the examination

Level of activity
❑ 1. Seemed lethargic
❑ 2. Typical for age/grade
❑ 3. Appeared fidgety or restless at times
❑ 4. Overly active for age/grade; resulted in difficulty attending to tasks

Attention and concentration
❑ 1. Unusually absorbed by the tasks
❑ 2. Attentive to the tasks (typical for age/grade)
❑ 3. Distracted often
❑ 4. Consistently inattentive and distracted

Self-confidence
❑ 1. Appeared confident and self-assured
❑ 2. Appeared at ease and comfortable (typical for age/grade)
❑ 3. Appeared tense or worried at times
❑ 4. Appeared overly anxious

Care in responding
❑ 1. Very slow and hesitant in responding
❑ 2. Slow and careful in responding
❑ 3. Prompt but careful in responding (typical for age/grade)
❑ 4. At times responded too quickly
❑ 5. Impulsive and careless in responding

Response to difficult tasks
❑ 1. Noticeably increased level of effort for difficult tasks
❑ 2. Generally persisted with difficult tasks (typical for age/grade)
❑ 3. Attempted but gave up easily
❑ 4. Would not try difficult tasks at all

When using this checklist, it is necessary to possess knowledge of the behaviors that can
be considered both typical and atypical for the age or grade level of the individual who is
being assessed. A wide range of behaviors may be considered typical within any age or grade
level. The checklist is designed so that a “typical” rating in each category is easily identified.
For example, typical examinees are cooperative during the examination, seem at ease and
comfortable, are attentive to the tasks, respond promptly but carefully, and generally persist
with difficult tasks. These behaviors are indicated as “Typical for age/grade” on the checklist.
For other categories, particularly those that reveal marked differences from age to age,
examiners will need to apply a finer knowledge of age- or grade-appropriate behaviors. For
example, “typical” levels of activity or conversational proficiency would be quite different
for a 5-year-old than for a 9-year-old child. For some age or grade levels, ratings such as
“Appeared fidgety or restless at times” could be included within the range of behaviors that
is “Typical for age/grade” rather than a separate category. In such instances, it would be more
accurate to check “Typical for age/grade” than “Appeared fidgety or restless at times” because
the former conveys the concept of age- or grade-appropriate behavior.
Use the “Test Session Observations Checklist,” located on the Test Record, immediately after
test administration. Each of the items describes a category of observations. For each item, place
a check mark in the box corresponding to the quality that best describes the behavior of the
individual who was assessed. Only one category should be checked for each item. If any item
does not apply to the individual, or if the categories do not convey an adequate description of
the examinee’s test session behaviors, leave the item blank. Also note any other behaviors of
clinical interest. This type of qualitative information may affect interpretation of test results.
Be sure to respond to the question “Do you have any reason to believe this testing session
may not represent a fair sample of the examinee’s abilities?” located on the Test Record cover.
If Yes is checked in response to this question, complete the sentence “These results may not
be a fair estimate because… _______.” Examples of reasons for questioning the validity of the
test results may include suspected or known problems with an examinee’s hearing or vision,
emotional problems of a nature that interfere with the person’s ability to concentrate, and
certain background factors.

The seven scales included in the “Test Session Observations Checklist” were derived
from a review of related scales and research on test session observations. The checklist can
help to qualitatively describe behaviors that may facilitate or inhibit cognitive, linguistic,
and academic performance. Additionally, certain responses to one or more of the categories
may impact the interpretation of an examinee’s scores. For example, an individual’s test
performance may have been impaired by distractibility during testing. Another person’s
performance may have been facilitated by an increase in effort when difficult tasks were
presented. In summary, the examinee’s observed behavior can provide valuable clinical
information, especially when the behavior in the test session can be compared with his or her
behavior in the classroom and other settings.

“Qualitative Observation” (Observación cualitativa) Checklists


The first 11 tests each have a “Qualitative Observation” checklist on the Test Record. The
purpose of these checklists is to document examinee performance on the test through
qualitative observations, or in the case of Prueba 8: Lectura oral, a quantitative observation.
Although these checklists are optional, important insights can be gained about the person’s
performance from documented observations about how the individual completed the task.
For example, on Prueba 1: Identificación de letras y palabras, the examiner may observe that
the examinee read the words accurately but quite slowly, indicating a lack of automaticity. Or
the examiner may observe that the examinee did not apply phoneme-grapheme relationships.
Figure 3-6 illustrates the possible observations for Prueba 1: Identificación de letras y palabras.

Figure 3-6. Observación cualitativa


The “Qualitative Observation” checklist for Prueba 1: Identificación de letras y palabras:

¿Cuál de las siguientes frases describe mejor la facilidad con que el examinado identificó las palabras en la prueba Identificación de letras y palabras? (Marque solo una respuesta). [Which of the following statements best describes the ease with which the examinee identified the words on the Identificación de letras y palabras test? (Mark only one response.)]

❏ 1. Identificó las palabras rápidamente y con exactitud, sin mucho esfuerzo (Con habilidad automática para identificación de las palabras). [Identified the words rapidly and accurately, without much effort (automatic word-identification skill).]
❏ 2. Identificó los ítems iniciales rápidamente y con exactitud, e identificó los ítems más difíciles mediante una creciente aplicación de las relaciones fonema-grafema (Normal). [Identified the initial items rapidly and accurately, and identified the more difficult items through increasing application of phoneme-grapheme relationships (Typical).]
❏ 3. Identificó los ítems iniciales rápidamente y con exactitud, pero tuvo dificultades para aplicar las relaciones fonema-grafema en los últimos ítems. [Identified the initial items rapidly and accurately, but had difficulty applying phoneme-grapheme relationships on the later items.]
❏ 4. Necesitó de más tiempo y mayor atención a las relaciones fonema-grafema para determinar la respuesta (Sin habilidad automática para identificación de las palabras). [Needed more time and greater attention to phoneme-grapheme relationships to determine the response (No automatic word-identification skill).]
❏ 5. No logró aplicar las relaciones fonema-grafema. [Was unable to apply phoneme-grapheme relationships.]
❏ 6. Ninguna de las anteriores, no se le observó, o no aplica. [None of the above, not observed, or not applicable.]

Scoring (Calificación)
Because the examinee’s pattern of correct and incorrect responses is needed to determine basal
and ceiling levels, complete the item scoring during test administration (except for the timed
tests). Some raw scores (number correct or number of points) can be calculated between
tests, while others are calculated after all testing is completed. After the raw scores are totaled,
estimated age- and grade-equivalent scores are readily available from the “Scoring Tables” on
the Test Record. Use the online scoring and reporting program to complete all other scoring.

Item Scoring
With the exception of three tests (Prueba 6: Expresión de lenguaje escrito, Prueba 8: Lectura
oral, and Prueba 12: Rememoración de lectura), score each item administered by placing a 1
or a 0 in the appropriate space on the Test Record: 1 = correct response, 0 = incorrect or no
response. (Detailed scoring procedures for Prueba 6: Expresión de lenguaje escrito, Prueba 8:
Lectura oral, and Prueba 12: Rememoración de lectura are included in Chapter 4.) For items
not administered, leave the corresponding spaces on the Test Record blank. After a test has
been administered and completely scored, the only blank spaces should be items below the
basal and above the ceiling levels or items not included in the assigned block of items.
The correct and incorrect keys accompanying many of the items in the Test Book are
guides that demonstrate how certain responses are scored. Not all possible correct and
incorrect answers are listed. Judgment is required when scoring some responses. In the keys,
the first response listed is the answer given most frequently during the standardization.

Use of Judgment in Scoring Responses


Occasionally, an examinee’s response does not fall clearly into the correct or incorrect
category or it is difficult to decide if the item should be scored correct or incorrect on the
basis of the key. In this case, record the actual response on the Test Record and then score
it later upon completion of testing. Until a decision has been made, do not use the item(s)
to determine a basal or ceiling. Continue testing until the basal or ceiling criterion is met
without including the unscored item(s). If, after further consideration, it is still not clear how
to score several responses, balance the scores (1s and 0s). For example, if two questionable
responses remain, score one item 1 and the other 0.
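The balancing rule above can be expressed as a small sketch. The function name and layout are illustrative only and are not part of the Batería IV scoring materials.

```python
def balance_questionable_items(num_questionable):
    """Assign scores to items that remain questionable after review,
    alternating 1s and 0s so that credit is balanced overall."""
    scores = []
    for i in range(num_questionable):
        # Alternate: first unresolved item scored 1, the next 0, and so on.
        scores.append(1 if i % 2 == 0 else 0)
    return scores

# Two questionable responses remain: score one item 1 and the other 0.
print(balance_questionable_items(2))  # [1, 0]
```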

Additional Notations for Recording Responses


In addition to using 1s and 0s to score items, writing the following abbreviations on
the Test Record margins may be helpful when recording an examinee’s responses. These
supplementary symbols can provide additional information about the person’s testing
behavior.
Q: Query—indicates the examiner asked a question to clarify the response
DK: Don’t Know—indicates the examinee responded, “I don’t know.”
NR: No Response—indicates the examinee did not respond to the item
SC: Self-Correction—indicates the examinee correctly changed a response
When possible, incorrect responses should be recorded verbatim on the Test Record
for diagnostic purposes. In addition to providing data for error analysis, recording actual
responses allows comparison of an individual’s current responses with future responses if the
test is administered again.

Scoring Multiple Responses
If a person gives more than one answer to an item, score the last answer given as correct
or incorrect. Do not base the score on the initial response. This procedure should be used
even if an examinee changes a response given much earlier in the testing session. The new
response, whether correct or incorrect, is used as the final basis for scoring that item.
If an individual provides both a correct and an incorrect answer to an item, query the
response by saying something like “¿Cuál es?” (Which is it?). Score the final response.

Computing Raw Scores


For Prueba 6: Expresión de lenguaje escrito, Prueba 8: Lectura oral, and Prueba 12:
Rememoración de lectura, the raw score is the number of points or number correct in the
given block or group of items. For all other tests, the raw score is the number of correct
responses (or the number of points earned) plus credit for every item in the test below the basal.
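The general rule can be illustrated with a brief sketch. The item layout and function name are hypothetical; they do not reproduce the actual Test Record or scoring software.

```python
def raw_score(item_scores, basal_item):
    """Compute a raw score: full credit for every item below the basal
    item plus the 1/0 scores actually recorded at and above it.

    item_scores: dict mapping administered item numbers to 1 or 0.
    basal_item: the lowest item number of the established basal.
    """
    credit_below_basal = basal_item - 1  # items 1..basal-1 receive credit
    administered = sum(score for item, score in item_scores.items()
                       if item >= basal_item)
    return credit_below_basal + administered

# Basal established at item 6; items 6-12 administered and scored.
scores = {6: 1, 7: 1, 8: 1, 9: 0, 10: 1, 11: 0, 12: 0}
print(raw_score(scores, 6))  # 5 (below basal) + 4 (correct) = 9
```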
Do not include points for sample items in the calculation of raw scores. Although
responses to the sample items are recorded on the Test Record, they are indented and appear
in screened boxes and are clearly distinct from the actual test items.
Record the raw score in the screened Number Correct or Number of Points box at the end
of each test on the Test Record. The scoring for each test can be completed before moving
to the next test or as the examinee is working on a timed test, such as Prueba 11: Fluidez en
escritura de frases.

Obtaining Age- and Grade-Equivalent Scores


Receive immediate feedback regarding the examinee’s level of performance during the testing
session by computing the raw score and checking the estimated age- or grade-equivalent score.
These results may suggest the possible need for further testing, perhaps in the same test session.
To obtain estimated age- and grade-equivalent scores, calculate the examinee’s raw score,
locate that score in the first column of the “Scoring Table” provided for each test on the Test
Record, and encircle the entire row including the raw score. The circled row will include the
number correct or number of points, the estimated age equivalent (AE), and the estimated
grade equivalent (GE).
The “Scoring Tables” on the Test Record provide estimates of the actual AE or GE. In some
cases, these scores will be the same as those produced by the online scoring and reporting
program. In other cases, however, differences will exist between the estimated AE/GE and
the actual AE/GE. For example, timed tests may have differences between the estimated and
actual scores. When discussing AEs or GEs or including these scores in reports, use the actual
scores from the online scoring and reporting program rather than the estimated ones from the
Test Record.

Using the Online Scoring and Reporting Program


The online scoring and reporting program calculates derived scores, variations, comparisons,
and discrepancies.
Enter identifying information, raw scores, “Test Session Observations Checklist” information,
and “Qualitative Observation” information directly from the Test Record into the online scoring
and reporting program. The online scoring and reporting program automatically calculates the
examinee’s chronological age and tenth-of-school-year grade placement (based on a standard
school year). If the student is enrolled in a year-round school or a school with starting or ending
dates that fall more than 2 weeks before or after the default range (i.e., August 16 through June

General Administration and Scoring Procedures 39


15), use the option for entering exact starting and ending dates of the school year. Due to the
wide variation in starting and ending dates for schools and districts, use this option regularly to
increase the precision of the grade norms accessed by the online scoring and reporting program.
After the examiner enters the starting and ending dates into the online scoring and reporting
program, it automatically calculates the exact grade placement, in tenths of the school year.
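One way to picture the tenth-of-school-year calculation is sketched below. This is an assumption-laden illustration, not the online scoring and reporting program's actual algorithm: it assumes placement advances linearly from grade.0 on the first day of school to grade.9 by the last day.

```python
from datetime import date

def grade_placement(grade, start, end, test_date):
    """Estimate grade placement in tenths of the school year, assuming
    linear progression from grade.0 at the start of school to grade.9
    at the end (an illustrative assumption, not the published norms)."""
    total_days = (end - start).days
    elapsed_days = (test_date - start).days
    fraction = max(0.0, min(1.0, elapsed_days / total_days))
    return round(grade + fraction * 0.9, 1)

# A third grader tested in mid-January of a standard school year
# (August 16 through June 15).
print(grade_placement(3, date(2018, 8, 16), date(2019, 6, 15),
                      date(2019, 1, 15)))  # 3.5
```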
The online scoring and reporting program includes separate data entry fields for the Batería IV
APROV and the Batería IV COG to allow for different dates of testing and different examiners.
Additionally, the online scoring and reporting program provides the option to include the tests
from the WJ IV OL and select tests from the Batería III Woodcock-Muñoz: Pruebas de habilidades
cognitivas (Batería III COG; Muñoz-Sandoval, Woodcock, McGrew, & Mather, 2005, 2007c), if
administered, in the Batería IV report. To include these additional tests in the Batería report, they
must have administration dates no more than 30 days apart and have been committed in the
online scoring and reporting program prior to running the Batería IV report. Similarly, examiner
observations can be entered in the online scoring and reporting program. Certain changes can be
made to the Table of Scores: for example, selecting a larger standard score confidence band (68%
is recommended), changing the variation and comparison cut-score criteria (1.5 is recommended),
or including an additional score for reporting purposes.

Accommodations
The Batería IV is ideally suited to increase the participation of students with disabilities in
assessment and accountability systems. This section identifies several administration features
of the Batería IV that allow individuals with disabilities to participate more fully in the
evaluation process.
Setting
The individual administration format of the Batería IV APROV provides the opportunity
for standardized assessment on a one-to-one basis. Use of a separate location for testing
minimizes the distractions inherent in a classroom group-testing environment. If needed,
use noise buffers such as earplugs or headphones to mask external sounds. Also, incorporate
special lighting, special acoustics, or adaptive or special furniture if needed.
Timing
Use of basal and ceiling rules focuses the assessment on the examinee’s level of ability and
minimizes testing time. In addition, frequent breaks can be taken between tests, if needed.
With the exception of the timed tests, individuals can have extended time to complete tasks,
if required.
Presentation
All instructions are presented orally to the examinee, and the language of the instructions is
at a sufficiently simple level of linguistic complexity to minimize language comprehension
barriers. The instructions may be repeated or signed, if necessary. Special sample items on
many of the tests help clarify the person’s understanding. Use of large print, fewer items per
page, and increased space between items allows examinees to focus better on individual items
without being overwhelmed by simultaneous presentation of multiple items as would occur
during a group-administered assessment. Visual magnification devices and templates that
reduce glare also may be incorporated into the assessment without affecting validity.
Scheduling
Administration of the Batería IV APROV tests can be scheduled at a specific time of day
to accommodate individual examinee needs. The tests may be presented in any order to
maximize interest and performance. When an individual cannot sustain peak performance for
long periods of time, the test may be administered over several days.

Recommended Accommodations
As a general rule, the examiner should adhere to standard administration and scoring
procedures. However, at times, an examinee’s special attributes need to be accommodated.
Accommodations should be made only to reduce the effect of the examinee’s special attributes
that are not relevant to the construct being assessed. In providing accommodations and
interpreting test results for individuals with disabilities, be sensitive to the limitations
different impairments may impose on a person’s abilities and behavior.
A modification means that the content of the test has been altered. It is important to
recognize that modifications may have a compromising effect on the validity of the test
results. Modifications are usually inappropriate in cases where the disability is relevant to the
construct being measured. For example, no modification is appropriate for an individual with
limited reading skill if the test being administered is designed to measure reading ability. In
this instance, the modification would fundamentally alter the construct being measured.
Generally, the examiner should select and administer tests that do not require
modifications. The broad classes of examinees often requiring some level of accommodation
in the assessment process are young children; English learners; individuals with attentional
and learning difficulties; and individuals with hearing, visual, and physical impairments. Prior
to making accommodations, the examiner should be trained in the specific area or consult
with a professional who has such expertise. Selected portions of the Batería IV APROV may
be used for individuals with sensory impairments if their physical or sensory limitations
interfere with performance, or make performance impossible, on certain other tests.

Young Children
Assessing young children in their preschool and early school years requires an examiner who
is trained and knowledgeable in this type of assessment. Examiners must select tests that
are appropriate for the age and functional level of the examinee. Some tests may not have
an adequate floor for young or low-functioning individuals, and other tests are designed
for use with school-age children or older individuals. For example, few individuals below
age 6 would be expected to perform adequately on tests such as Prueba 9: Fluidez en lectura
de frases or Prueba 13: Números matrices. On the other hand, examinees as young as age 2
generally can perform beginning tasks on Prueba 1: Identificación de letras y palabras and
Prueba 2: Problemas aplicados.
Preparation for Testing
Some young children may be uncomfortable with unfamiliar adults and may have difficulty
separating from their caregiver or teacher. It may be necessary to spend additional time with
such a child with a familiar adult nearby prior to accompanying the child to the testing
situation. Let the young child know that the caregiver is nearby and will be around when
testing is completed. In extreme circumstances, it may be necessary to have the familiar adult
stay with the child during testing. However, under these circumstances, the caregiver must
understand the standardized conditions under which the testing must occur. Every effort
should be made to minimize the caregiver’s involvement in the test situation. If a parent must
be present during the testing session, carefully explain the testing process, including the
establishment of test basals and ceilings (i.e., that some items may be very easy for the child
and that other items may be difficult), before the actual testing begins. Also, explain to the
parent that it is important he or she not assist the child in any way during the testing session.
The parent should be asked to sit to one side behind the child so that it is not too easy for the
child to interact with the parent during the test administration.

General Guidelines
Several early development tests require the child to respond verbally. Initially, some children
may be shy and refuse to speak with an unfamiliar adult. If the child persists in not speaking,
even after several rapport-building activities between the examiner and the child, such as
playing with a preferred toy and spending some time together outside of the testing situation,
it may be best to discontinue testing and try again at a later date. It also may be beneficial to
administer tests in a different order. For example, the assessment could begin with tests that
require only a pointing response and then continue with tests that require verbal responses.
Intelligibility also is often an issue when testing young children. Instructions on many
of the tests indicate to not penalize examinees for articulation errors, dialect variations, or
regional speech patterns. Additional time conversing with or observing the child prior to
the testing situation may be necessary to discern such variations. Follow-up conversation
after testing also may be informative. Do not ask the child to repeat responses frequently, but
instead note the difficulty with intelligibility in the report.
Young children typically need more frequent breaks during the testing session than
do older students and adults. Short breaks are particularly helpful if the child has a short
attention span or high activity level, both of which are common in young children. Be careful
to provide break activities that are enjoyable but not so engaging that the child does not want
to return to the test situation. Quiet break-time activities, such as rolling a ball, working a
puzzle, walking to get a drink of water, having a short snack, or other activities with a clear
beginning and end, are typically most desirable. Many children will respond positively if
given reinforcements, such as verbal praise, smiles, stickers, or snacks, between tests. Use
of a friendly and engaging voice during the test administration may help involve the child
better in the test situation. Praise the child’s efforts but do not indicate whether responses are
correct or incorrect.
Conduct testing at a table of appropriate height for the child. It is important that the child
be able to sit independently and upright in a chair without adult assistance. Consider the
visual perspective of the young child. The child should not sit too low (e.g., on a small chair
at a big table), sit on the floor, or sit on a parent’s lap looking down on the test materials.
This is especially important on items where the child receives visual information from the
Test Book easel.
Attempt to eliminate distractions in the environment. While this is true for all examinees,
it is particularly important with young children, who may be much more easily distracted.
Colorful pictures on the wall, open window blinds, and toys around the room may make it
difficult for the child to attend to the test.
When testing young children, attempt to make the testing situation engaging, interesting,
and fun. Adjusting the pace of testing to meet the needs of the child is important. While
many young children will respond best to a brisk pace with frequent verbal praise, some
young children prefer a quieter, slower pace with limited verbalization, especially when they
are starting out in a new situation.

English Learners
The most important accommodation for students who are English learners (ELs) is having
an examiner who is knowledgeable about important issues relevant to second language
acquisition, the assessment process, and the interpretation of test results for students who are
ELs. To this end, the examiner must be familiar with the second language acquisition process,
native language attrition, language shift in dominance, cross-linguistic transfer of learning,
and the impact of special language programming and socioeconomic factors on language
learning (August & Shanahan, 2006; Cummins & Hornberger, 2008; de Leeuw, 2008;
Flege, Schirru, & MacKay, 2003; Grosjean, 2001; Thomas & Collier, 2002). The examiner
must know about the availability and limitations of tests in the student’s native language, as
well as how to interpret the test performance of individuals who are ELs.

Individuals With Learning and/or Reading Difficulties


In certain instances, it may be necessary to provide certain accommodations for examinees
with learning and/or reading problems. Often the appropriateness of an accommodation
can be determined by the reason for the referral. For example, it is not appropriate to read
the reading tests to an individual who is struggling with reading because the purpose of the
evaluation is to determine the extent and severity of the reading impairment. By reading
the test, the construct being measured is altered and the test of reading ability becomes a
measure of oral comprehension. While not appropriate in the testing situation, this type of
accommodation may be entirely appropriate when the student encounters unmanageable
reading tasks in the classroom setting.
Similarly, an examinee may complete tasks at a very slow rate. Although most of the
Batería IV tests do not have a time limit, allowing additional time is not appropriate on timed
tests. The purpose of the timed tests is to ascertain how quickly the person can perform
tasks within a specified amount of time. Some people may take an undue amount of time on
items that are too difficult for them to complete; for example, an individual may rework math
problems several times in different ways to come up with solutions. In these cases, attempt to
keep the process moving so that the pace of testing is not monotonous.
For some examinees with severe perceptual impairments, use of a card or piece of paper to
highlight or draw attention to specific items is appropriate. Individuals with poor fine motor
control may need to type responses rather than write them in the Response Booklet. Others
who are easily frustrated by tasks that become too difficult may respond better to several
short testing sessions rather than one lengthy session.
Examinees with weaknesses in specific abilities often require more encouragement and
reinforcement during the testing process than those who learn easily. Provide specific praise
and positive comments as needed to keep the individual engaged and to reinforce his or
her effort.

Individuals With Attentional and Behavioral Difficulties


Clinical expertise is needed when assessing individuals with severe behavioral or attentional
difficulties. Examiners should have specific training in this area or should consult with a
professional who has such expertise.
Preparation for Testing
It is desirable to become familiar with an examinee’s typical classroom behavior prior to
conducting the assessment. If possible, develop rapport with the person before engaging in
formal assessment procedures. Depending on the age of the individual, this could include
classroom or playground visits or an informal interview prior to the assessment. It is often
beneficial to identify specific activities that the examinee enjoys (e.g., playing a computer
game, shooting baskets on the playground). These activities can sometimes be used as
reinforcers during break times.
General Guidelines
When testing individuals with attentional and behavioral difficulties, implementing
behavioral management techniques may help avoid or reduce problem behavior and increase
the likelihood of compliance. The following, adapted from several sources (Herschell, Greco,
Filcheck, & McNeil, 2002; Prifitera, Saklofske, & Weiss, 2008; Sattler & Hoge, 2005), are
suggested techniques for managing examinee behavior.

Schedule the testing session when the person is most likely to perform at his or her best.
To ensure a more positive reaction, testing can be done in several short sessions. Short breaks
should be quiet and structured.
To help the individual stay on task, remove all distractions from the testing environment
and keep test materials that are not in use out of the examinee’s reach. Attempt to keep full
attention on the examinee and maintain a brisk testing pace. This is most easily accomplished
by knowing the test procedures thoroughly prior to the test administration and by having
all test materials set up prior to the testing session. When setting up the testing materials,
consider the examinee’s distractibility. Sitting next to, rather than across from, the person will
allow redirection of the individual’s attention to the testing task.
At the beginning of the testing situation, establish the expectations for the examinee’s
behavior; for example, the individual should remain in his or her seat, follow directions,
and sit still. During the testing session, it is important to provide reinforcement (e.g., verbal
praise) for appropriate examinee behavior and effort. Redirect or ignore inappropriate
behavior. It is also important to remind the examinee to work carefully and slowly if he or
she responds carelessly or impulsively, except on tests designed to measure those behaviors,
such as timed tests. If an individual appears frustrated, offer this reminder: “Algunas
preguntas y problemas van a ser muy fáciles y otros van a parecer difíciles. Haz [haga] lo
mejor que puedas [pueda].” (Some questions and problems will seem very easy, while others
will seem hard. Do the best you can.) Make sure the examinee is ready to start each test
before beginning administration.
Use commands that describe appropriate behavior rather than inappropriate behavior.
For example, say, “Tom, alcánceme ese lápiz” (Tom, please hand me that pencil) rather than
“Tom, deje ya de jugar con ese lápiz” (Tom, stop playing with that pencil). Using statements
that limit the person’s choices is also helpful, as in the following examples: “Cuando se siente
en su silla, entonces le mostraré su siguiente actividad.” (When you sit in your chair, then I’ll
show you our next activity.) “Si se sienta derecho, entonces podemos continuar.” (If you sit up
straight, then we can move on.) “Usted tiene dos alternativas. Puede escuchar ahora algunas
preguntas o resolver algunos problemas.” (You have two choices. You can either listen to some
questions next or solve some problems.)
One of the examiner’s responsibilities is to determine whether the test results provide a
valid representation of the examinee’s present performance level. When evaluating individuals
with challenging behaviors, attempt to ascertain the effects of the problem behavior on the
assessment process and determine how the behavior affects performance. In some situations
the problem behavior produces test results that are not representative of the person’s true
performance or capabilities. For example, during an evaluation, an examinee refused to
respond to the examiner’s oral questions. The examiner realized that the results of the
assessment were more a reflection of the noncompliant behavior than the person’s knowledge
of the subject matter. In this case, an examiner should not report the test scores, but instead
should reschedule the assessment for another time when the person is more willing to
cooperate. In other situations, it is apparent through behavioral observation that the test
results reflect something different from the intended construct. For example, on a timed task,
if the examinee’s attention needs to be redirected to the task many times, the low performance
may be indicative of attentional difficulties rather than a slow processing rate.
On rare occasions, it may be necessary to discontinue testing if an examinee shows acute
signs of frustration or anxiety or is unable to maintain attention. If the person exhibits
behavior that suggests the possibility of verbal or physical aggression, discontinue testing
and wait until a time when he or she is less volatile. Be sure to complete the "Test Session
Observations Checklist" on the cover of the Test Record. If needed, make a note of any
additional observations and include them in the written report.

44 General Administration and Scoring Procedures

Individuals With Hearing Impairments


When testing examinees who are deaf or hard of hearing, the evaluator must consider the
usefulness of the normative scores, the types of accommodations that must be made in
administering the tests, and the factors that may influence interpretation. In these cases,
the person’s primary mode of communication is more important than the degree or type of
hearing impairment. Communication modes range from sign language to aural/oral language,
with multiple gradations between. For discussion purposes, communication modes have been
grouped into three categories:
■ Sign Language: A complete visual-spatial language with its own semantics, syntax, and
pragmatics that uses the hands, body, and facial expressions. Sign language does not follow
the grammar of the oral language, and the sign language of each country often has its own
dialects. In the United States, American Sign Language (ASL) is used. In Canada, ASL is
used mostly by Anglophones, and Quebec Sign Language is used mostly by Francophones.
Spanish-speaking countries have their own sign languages, including Argentine, Colombian,
Cuban, Mexican, Peruvian, and Spanish sign languages. Most sign languages have
dictionaries that can be accessed online.
■ Sign-Supported Speech: The use of spoken language with sign used simultaneously all or
most of the time. People using this form of communication are not able to adequately
comprehend spoken language without sign accompaniment.
■ Aural/Oral Language: The use of spoken language without sign, usually aided by some
form of auditory amplification.


General Guidelines
Primary Communication Mode. The evaluator must administer verbal tests through the
examinee’s primary communication mode. To establish the primary communication mode,
consult a professional (e.g., teacher, certified interpreter) who is familiar with the person and
who has expertise in communication modes used by people who are deaf or hard of hearing.
Use of an Interpreter. Ideally, the qualified examiner would also be fluent in the person’s
communication mode. If an interpreter must be used, however, he or she must be a certified
sign language interpreter and must be sufficiently skilled and flexible to adapt to the
examinee’s primary mode of communication.
Although necessary in many cases, using an interpreter for testing can cause problems.
Young children may not yet have learned how to use an interpreter. In addition, the presence
of another person in the room may alter the child’s performance and affect the validity of
the test results. To minimize this possibility, use an interpreter with whom the examinee is
already familiar or allow time for him or her to become familiar with the interpreter before
beginning the evaluation.
In many cases, the signs that should be used to convey test instructions depend more on
the intent of the task than on the sentences being translated. To avoid misinterpretation, it
is important to work with the interpreter prior to the assessment to familiarize him or her
with the test instructions, procedures, items, unfamiliar concepts or terminology, and skills
being assessed.
Testing Environment and Amplification. Testing of examinees who are hard of hearing
should be conducted in a room with no background noise and few visual distractions. Often
hearing aids do not filter out background noise, thus making it harder for the examinee to
hear the evaluator’s voice. Check the person’s hearing aid or cochlear implant immediately
before testing to ensure that it is working correctly, turned on, and positioned properly. When
available, use a room with an amplification system, and ensure that the microphone is turned
on and that the examinee’s amplification device is switched to the proper channel to receive
the examiner’s voice.
Speech Intelligibility. Before administering tests requiring a verbal response, confirm that
the examinee’s speech is intelligible. If an oral response is unintelligible, the person should be
asked to explain further to determine whether or not the intended response is correct. Do not
penalize examinees for articulation errors, dialect variations, or regional or unusual speech
patterns, but make note of them on the Test Record for later analysis. Unless it makes the
person uncomfortable, a voice recorder could be used so responses can be verified later by a
professional (e.g., speech-language pathologist, teacher) who is familiar with the individual’s
speech patterns.
Scoring and Interpretation
Generally, examinees whose amplified hearing and speech discrimination are normal should
be able to take all of the tests following the standardized procedures, in which case the
scores should be valid. However, in each situation, use judgment concerning the validity
of the scores based on the number and degree of adaptations made. For interpretation
purposes, however, the age at which the hearing loss was diagnosed and the amplification
provided should be considered as indications of the number of years the person has had an
opportunity to gain undistorted information through hearing. Hearing loss over an extended
period of time can negatively affect an individual’s vocabulary development and acquisition of
information usually learned incidentally.
Consider the examinee’s audiogram when scoring responses. Apparent errors might be
related to the accuracy of an examinee’s speech discrimination or to the frequencies that are
impaired. For example, an individual with a hearing loss in the high frequencies may omit
certain word endings (e.g., /s/ or -ed voiced as /t/) because he or she does not hear them.
For examinees using sign-supported speech, the examiner must make judgments
concerning the degree of the examinee’s dependence on sign rather than voice. A strong
reliance on sign may suggest that even those tests marked in Table 3-1 as useful for sign-
supported speech communicators should not be administered or that increased caution
should be used when interpreting the scores. Instructions given in sign language will almost
always deviate from standardized instructions due to the linguistic differences between sign
language and spoken language, although this will not necessarily invalidate the usefulness
of the test.
Given these cautions, it is advisable to interpret the performance of examinees who are
hard of hearing in consultation with a professional in the field of hearing impairment who
is familiar with the examinee. Knowledge of the differences between spoken language and
signed communication and in the life experiences of people with hearing impairments (e.g.,
activities of daily living, limitations on incidental learning) may influence interpretation of
the scores.
Documentation of Deviations From Standardized Administration
Note any deviation from the standardized administration on the Batería IV APROV Test
Record as well as in the evaluation report. During testing, note any prompts provided to the
examinee as well as the examinee’s incorrect and questionable responses on individual items
so they can be considered in interpreting the test results. The report should state how the
examinee’s hearing impairment or the altering of standardized administration procedures
may have affected the person’s scores, possibly underestimating or overestimating actual
achievement levels.

Accommodations and Cautions Specific to the Batería IV APROV
Table 3-1 indicates which tests might be useful for each of the three communication groups
as well as the validity of the scores. The numbers in the table refer to accommodations and
cautions specific to each test that are explained below the table. All accommodations must
be specific to each individual. The notations accorded to the Aural/Oral Language column
assume that, with all of the needed accommodations provided, the examinee has normal or
near normal hearing. The more severe the hearing impairment, the more caution is called for
in using the scores. Be sure to document all accommodations and modifications clearly in the
evaluation report.
The symbols represent the following recommendations:
◆ This test is useful and can yield valid scores.
□ This test may be useful but requires cautious interpretation of the scores.
⊠ This test should be used for qualitative information only.

Table 3-1. Batería IV APROV Tests Useful for Individuals With Hearing Impairments

Test                                   | Sign Language | Sign-Supported Speech | Aural/Oral Language
1: Identificación de letras y palabras | □ 1           | □ 1, 2                | ◆ 2
2: Problemas aplicados                 | □ 3           | □ 3                   | ◆
3: Ortografía                          | ⊠ 4           | □ 4                   | □ 4
4: Comprensión de textos               | ⊠ 5, 6        | □ 7                   | ◆ 7
5: Cálculo                             | ◆             | ◆                     | ◆
6: Expresión de lenguaje escrito       | ⊠ 8, 9        | □ 9                   | ◆
7: Análisis de palabras                | ⊠             | □ 2                   | □ 2
8: Lectura oral                        | ⊠ 10          | □ 10                  | ◆
9: Fluidez en lectura de frases        | □ 5, 6        | □ 7                   | ◆ 7
10: Fluidez en datos matemáticos       | ◆             | ◆                     | ◆
11: Fluidez en escritura de frases     | ⊠ 9           | □ 9                   | ◆
12: Rememoración de lectura            | ⊠ 11          | □ 7                   | ◆ 7
13: Números matrices                   | □ 12          | □ 12                  | ◆
1 Prueba 1: Identificación de letras y palabras—This is a test of word identification for hearing examinees, but it is a reading vocabulary test for sign
communicators because the sign for a word represents its meaning rather than its sound. Additionally, for some of the stimulus words, one sign
can refer to multiple items (e.g., cup, glass, can), some are routinely fingerspelled, and some have no meaning out of context. Examinees using
sign-supported speech must be able to read the words orally.
2 Prueba 1: Identificación de letras y palabras, Prueba 7: Análisis de palabras—An examinee’s pronunciation will indicate how well he or she is able
to apply phonics skills and knowledge of Spanish orthography; however, the examinee’s internal pronunciation may be more accurate than his or
her voiced pronunciation. Additionally, pronunciation errors may be secondary to the hearing impairment (articulation) rather than indications of
limited word attack skill.
3 Prueba 2: Problemas aplicados—In some of the earlier items, the question incorporates a sign that gives the answer (e.g., “two fingers” is signed
with two fingers). In some later items, signing the problem depicts the method of solution (e.g., which operation is needed). Fewer of these
problems occur after Item 25. At this point, the items are more complex, the examiner cannot assume that the examinee will be able to read them,
and the interpreter’s accuracy is critical. Consequently, prior to the test session, it is essential that the interpreter has ample time to read all of the
items the examinee is likely to take so that he or she can develop a well-reasoned approach to signing them. When deciding whether or not to use
the scores, take into account the level of the items administered, the extent to which the signing provided clues to the answer, and, for later items,
whether or not the examinee appeared to understand the signed interpretation.
4 Prueba 3: Ortografía—The examinee who uses sign-supported speech or aural/oral Spanish may misunderstand a stimulus word due to sound
distortion. If this happens, provide additional sentences to clarify the word. Prueba 3: Ortografía should not be administered in sign. Many of the
stimulus words do not have a specific sign or are fingerspelled, and a few do not exist in sign language (e.g., is, am). Additionally, some of the
stimulus words are represented by signs that have multiple meanings (e.g., the same sign can mean already, finished, complete, and done).
5 Prueba 4: Comprensión de textos, Prueba 9: Fluidez en lectura de frases—The examinee may miss some specific items that are biased toward
hearing (e.g., completing a rhyme) or Spanish syntax.

6 Prueba 4: Comprensión de textos, Prueba 9: Fluidez en lectura de frases—If an examinee’s comprehension is weak or his or her reading speed
is slow, consider that Spanish is a second (foreign) language for most people who are deaf and who use sign language as their primary mode of
communication. The norms, however, represent the performance of people who use Spanish as their primary language and who, for the most part,
have a wider reading vocabulary and an innate sense of Spanish syntax.
7 Prueba 4: Comprensión de textos, Prueba 9: Fluidez en lectura de frases, Prueba 12: Rememoración de lectura—People who are hard of hearing
often have a more limited oral vocabulary than their hearing peers because they do not have the same access to spoken language. Rather than
demonstrating difficulty with reading speed or recall, the examinee may not know the meaning of some of the words.
8 Prueba 6: Expresión de lenguaje escrito—Explain the directions carefully and possibly change the wording if the examinee does not appear to
understand.
9 Prueba 6: Expresión de lenguaje escrito, Prueba 11: Fluidez en escritura de frases—Spelling errors made by individuals whose primary
communication mode is manual often have little phonetic relationship to the intended word. Allow time to review the responses and, if the
response word is not understandable due to a nonphonetic misspelling, ask the examinee to sign it. Even if no credit is awarded, knowing what
word the examinee intended will help with interpretation.
10 Prueba 8: Lectura oral—Because a person must know the meaning of a word to sign it, for sign communicators, this test assesses reading
vocabulary and comprehension instead of oral reading. Consequently, responses cannot be compared with the performance of hearing/speaking
peers in the norm sample. For examinees who use speech, consider that errors in pronunciation may be secondary to the hearing impairment
(articulation) rather than indications of weak decoding skills.
11 Prueba 12: Rememoración de lectura—For examinees who use sign language, this test might indicate their comprehension and recall of written
Spanish; however, they will have to fingerspell names and other words that do not have signs. The interpreter must be alerted to the importance of
the bolded words so that he or she will voice those particular words if the examinee's signed response appropriately represents them.
12 Prueba 13: Números matrices—Because of the complexity, signed instructions may have to deviate significantly from the standardized instructions
to ensure that the examinee understands the task.

Individuals With Visual Impairments


The types of visual impairment and the extent of visual functioning (i.e., the ability to use
available vision to complete activities) experienced by individuals with visual impairments
are extremely varied and person-specific; thus, the combination of accommodations necessary
for administering any particular test requires case-by-case consideration.
For discussion purposes, individuals with visual impairments have been grouped into two
categories:
Low Vision: “A person who has measurable vision but has difficulty accomplishing or
cannot accomplish visual tasks, even with prescribed corrective lenses, but who can
enhance his or her ability to accomplish these tasks with the use of compensatory visual
strategies, low vision devices, and environmental modifications” (Corn & Lusk, 2010,
p. 3). Low vision is the category that contains the greatest variation in visual impairment.
Blind: A person with sufficiently limited vision so as to need braille and/or auditory
materials for learning.
It is not recommended that the Batería IV APROV be administered to individuals who
are blind. The required adaptations to the battery would be too extensive. The problems
inherent in having multiple versions of a test produced by multiple people are myriad and
are likely to render the resulting scores useless. Tests specifically designed for people who
are blind, informal tests, criterion-referenced tests, and diagnostic teaching would be more
accurate measures of academic skills and knowledge. The progression of instruction in
braille characters, decoding and spelling skills, and math is often not comparable to the
progression of the same skills taught to sighted individuals and represented in the Batería IV.
Additionally, many of the Batería IV APROV tests include items with picture prompts that are
inaccessible to individuals who are blind and/or that assume a foundation of knowledge and
concepts that may be unfamiliar to these individuals.

Preparation for Testing
In preparing to test any individual with low vision, consider the findings of the most recent
reports regarding the examinee’s visual impairment, including (a) the effect it has on his or
her functional vision, (b) the most useful modes for reading, writing, math computation, and
responding, (c) optical devices prescribed, (d) adaptations to print and graphic materials,
and (e) recommended environmental accommodations. This information must be based on
the integrated findings of an ophthalmologic or optometric examination, a functional vision
assessment, and the individual’s needs for accommodations and assistive technology.
Corn and Lusk (2010) indicated that “clinical measures of vision (such as visual acuity
and peripheral field) do not directly correlate with how a person uses vision or is able to
function visually” (p. 3). A functional vision assessment (FVA) is needed to assess the
examinee’s visual acuity, visual fields, motility, neurological functions (e.g., visual fixation,
perception), and light and color perception. The FVA report includes recommendations
for optimizing the person’s functioning in educational and daily activities. Accordingly,
optimizing an examinee’s visual functioning for the purpose of testing academic achievement
will involve consideration of a variety of environmental factors (e.g., optical devices, lighting,
color of materials, print/picture-to-background contrast, and the distance between the
examinee and the materials) and physical factors (e.g., rate of visual fatigue). Consequently,
well in advance of testing, the examiner should consult a vision specialist who is familiar
with both the examinee and the results of his or her most recent FVA. Decisions as to
the appropriateness of any of the cautions, accommodations, and suggestions regarding
interpretation provided here will depend entirely upon the type and severity of the
individual’s visual condition and history. Therefore, collaboration with the vision specialist or
(if the examinee is a student) the teacher of visual impairments (TVI) is critical to minimize
the effect of the visual impairment on test performance and to interpret test results accurately.
General Guidelines
Orienting the Examinee to the Testing Environment. Verbally greet the examinee upon
arrival and then, according to the extent of the person’s visual limitations, help him or her
become familiar with the testing environment. For example, for people who have extremely
poor acuity or who have a very restricted visual field, describe the layout of the room.
Guide the examinee to explore the area in which he or she will be working—the physical
arrangement of the testing area, the seating arrangement, the table, and any materials on the
table.
Devices and Equipment. If the examinee uses an optical device (e.g., glasses, hand
magnifier, telescopic device, video magnifier), ask the vision specialist or TVI to determine
whether the examinee is proficient in its use. Check to make sure that the device is clean
and in good condition. Do not make substitutions such as enlarging test print because a
video magnifier is not available or relying on overhead lighting because a focused light is not
available.
Instructions. During testing, give visual guidance as needed to supplement verbal
instructions. This may include clarifying the position of the target stimulus (e.g., “en el
lado izquierdo, aproximadamente en la mitad de la página” [on the left side, about halfway
down the page]), pointing to where the examinee is to start reading or writing a response, or
pointing to a specific picture to help an examinee focus on the target.
Environment. Check with the examinee to ensure that the environmental conditions
are optimal. This may include providing an appropriate light source (e.g., incandescent,
fluorescent, and/or natural), moving the table in relationship to windows or other light
sources, adjusting light intensity or focus on the test materials, and/or providing a darkened
room.

Materials. Test materials may need to be adapted, such as providing black-lined response
sheets or a black felt-tip pen instead of a pencil or enlarging print or graphics. The examinee
may require the use of matte-finish acetate—either transparent acetate to reduce glare or
colored acetate to increase the contrast between the stimulus and background.
Physical Considerations. Seating should be arranged so that the examinee can move easily
to position his or her head at a comfortable distance from the stimulus and achieve the most
stable visual focus, the widest visual field, or the least interference from blind spots.
Altered Test Conditions. The examiner may need to mask parts of a page to reduce
visual clutter, increase the duration of test item exposure, and increase overall test time. The
examinee may need shorter test sessions to avoid visual fatigue and/or may need to use the
optical devices that he or she uses in the classroom and/or in daily living situations.
Increasing time limits for tests that were standardized with particular time limits is not
recommended. These tests are Prueba 9: Fluidez en lectura de frases, Prueba 10: Fluidez en
datos matemáticos, and Prueba 11: Fluidez en escritura de frases. Altering the standardized
administration procedures invalidates the scores. Results indicating how much slower an
examinee is than age or grade peers when reading, writing, or recalling math facts establish
documentation for accommodations of extended time. If the person’s visual limitations will
have an obvious negative effect on his or her performance on a test, omit the test or use the
results solely for qualitative purposes.
Guidelines for Interpreting Test Performance and Results
The validity and usefulness of test interpretation for examinees with visual impairments may
be increased by adhering to the following guidelines and suggestions:
1. Interpret test findings and their educational relevance in consultation with a vision
specialist or, if the examinee is in school, the TVI who is familiar with the examinee’s
visual functioning and with the most recent FVA.
2. If an examinee performs poorly relative to age or grade peers on tests that incorporate
reading comprehension, consider the limiting effect of a visual impairment on life
experiences and related vocabulary and concept development. Individuals with visual
impairments may have little or no experience with certain information that typically
is learned incidentally and through vision (e.g., a skull and crossbones indicate
poison, what Abraham Lincoln looks like).
3. When analyzing error patterns, ask the examinee to explain the thinking process
used on incorrect items. This explanation will help to determine whether the factors
contributing to the error are related to the examinee’s visual functioning or to his or
her grasp of the academic skill/concept. The vision professional can help determine
the error patterns to probe.
4. On items that the examiner reads aloud and that have the same text on the
examinee’s page, be aware that the examinee may not be able to adequately see the
text or pictures meant as prompts. If an individual has to hold the oral information in
mind, it may add to the burden on working memory and may interfere with problem
solving.
5. Look for the possible relationship between the examinee’s visual impairment and
the type of academic errors made. For example, a restricted visual field may make
it difficult for the person to maintain his or her place on a line of print, resulting
in word repetitions or omissions. Thus, the instructional implications would relate
to more efficient visual scanning, a change in position of the eyes relative to the
stimulus, or different use of the optical device (Smith, 1999).
6. In addition to the previous guidelines, remember that it is possible for a person with a
visual impairment to have comorbid disabilities, such as learning disabilities. Making
this type of determination may require further assessment and must result from a
collaborative effort among a psychologist, vision specialist or TVI, learning disabilities
specialist, general education teacher, and/or others who know the examinee well.
Accommodations and Cautions Specific to the Batería IV APROV
Many of the Batería IV APROV tests may be used with individuals with low vision as long as
the appropriate guidelines for testing are followed and optimal accommodations are made.
Table 3-2 indicates which tests might be useful when testing an individual with low vision
and the validity of the scores. The numbers in the table refer to accommodations, cautions,
and/or suggestions for interpretation that are specific to each test and that are explained
below the table. An examinee’s performance may be analyzed for instructional purposes and
scores may be used to indicate the examinee’s academic achievement in relation to normally
sighted peers. The more severe the visual impairment, the more caution is called for in using
the scores. All accommodations and modifications must be documented clearly in the evaluation
report.
The symbols represent the following recommendations:
◆ This test is useful and can yield valid scores.
□ This test may be useful but requires cautious interpretation of the scores.

Table 3-2. Batería IV APROV Tests Useful for Individuals With Visual Impairments

Test                                   | Low Vision
1: Identificación de letras y palabras | □ 1, 5
2: Problemas aplicados                 | □ 2
3: Ortografía                          | ◆ 3
4: Comprensión de textos               | □ 4
5: Cálculo                             | ◆
6: Expresión de lenguaje escrito       | □ 2
7: Análisis de palabras                | ◆
8: Lectura oral                        | □ 4
9: Fluidez en lectura de frases        | □ 6
10: Fluidez en datos matemáticos       | □ 6
11: Fluidez en escritura de frases     | □ 6
12: Rememoración de lectura            | □ 5
13: Números matrices                   | ◆ 7


1 Prueba 1: Identificación de letras y palabras—Extend or dispense with the 5-second response guideline.
2 Prueba 2: Problemas aplicados, Prueba 6: Expresión de lenguaje escrito—Point to the picture prompt(s) and text on the examinee’s page,
regardless of the test instructions.
3 Prueba 3: Ortografía—Provide whatever type of writing utensil and paper (e.g., black lined) the student normally uses in the classroom.
4 Prueba 4: Comprensión de textos, Prueba 8: Lectura oral—If the examinee has a visual impairment that interferes with his or her ability to scan
smoothly across a line of print, errors and repetitions may be due to the visual impairment rather than to a deficiency in the examinee’s academic
skill.
5 Prueba 1: Identificación de letras y palabras, Prueba 12: Rememoración de lectura—Poor performance may be due to limited vocabulary and
concepts secondary to the examinee’s limited visually based incidental learning and experiences.
6 Prueba 9: Fluidez en lectura de frases, Prueba 10: Fluidez en datos matemáticos, Prueba 11: Fluidez en escritura de frases—If the examinee’s
responses are correct but the score is low compared to similar tests without time limits, consider that the visual impairment may be interfering with
rapid symbol and/or picture recognition. Thus, the results may indicate a need for extra time for visual work but may not indicate a weakness in the
underlying language or academic skills.
7 Prueba 13: Números matrices—If the examinee is trying to mask parts of the matrix with a hand, provide a blank, unlined index card.

Individuals With Physical Impairments
Several accommodations are appropriate when testing individuals who have physical or
multiple disabilities. Be sensitive to the limits of the examinee’s physical condition and how
it may influence or limit his or her ability to perform on the test and interact with the testing
materials.
Preparation for Testing
Make appropriate physical accommodations, such as using a table of appropriate height
for a person using a wheelchair. The seating arrangement should allow the person ease of
movement and comfortable visual access to the testing materials. Consult a specialist who is
familiar with the needs of the examinee and is an expert in the use of any special equipment
or assistive technology the examinee requires.
General Guidelines
Be sensitive to the examinee’s fatigue level. Depending on the type of disability, some people
may perform better when given several rest periods or breaks during test administration.
Allow modified response modes. For example, if a person is unable to write, some
responses may be given orally (dictated) or by pointing. If an individual is unable to speak,
he or she may write, type, or sign responses to appropriate tests. If signed responses will be
used, the examiner should have expertise in the examinee’s communication mode or should
use a skilled, certified interpreter.
Test materials may need to be adapted to accommodate the examinee. For example, if the
person has poor motor control but is able to write, the Response Booklet may need to be
taped to the table and/or enlarged.

Interpretive Cautions
Many test modifications, such as altering administration procedures by providing additional
cues, are appropriate in specific circumstances. Modifying test procedures requires
understanding the examinee’s condition or Spanish-speaking limitations, as well as the nature
and purpose of each test. Keep in mind that, in many instances, the purpose of an evaluation
is to determine an individual’s unique pattern of strengths and weaknesses and then to use
this assessment data to suggest appropriate classroom accommodations and to recommend
possible teaching strategies and interventions. Although a modification may improve
test performance, the resulting score may not be an accurate reflection of an examinee’s
capabilities. Note any deviation from the standardized administration on the Test Record and
always include a statement of the modified testing conditions in the written report.

Use of Derived Scores


Valid use of the broad population normative information will depend on the extent to
which the assessment varied from standard conditions (e.g., simplification of instructions,
supplemental practice, review of test instructions). Derived scores may not be valid for
tests in which the administration deviated more than minimally from the standardized
administration. The examiner must determine whether the procedures have been altered
to the extent that the published norms must be interpreted with caution. In addition to
the statement of modified testing conditions, in some cases, the examiner should include a
statement indicating that the obtained scores are likely to be too high or too low.



Chapter 4

Administering and Scoring the Batería IV APROV Tests
This chapter contains detailed administration procedures for each of the tests in the
Batería IV Pruebas de aprovechamiento (Batería IV APROV). Comparing the information in
this chapter with the actual instructions in the Test Book will help examiners learn both
administration and scoring procedures. In addition, the test-by-test “Batería IV Pruebas de
aprovechamiento Examiner Training Checklist” in Appendix B of this manual can be a helpful
tool for examiners learning to administer the Batería IV APROV. It is recommended that
examiners first learn and practice administering the core tests (Tests 1 through 6) and then
the remaining tests.

Batería IV APROV Tests


Below are detailed instructions for administering and scoring the 13 tests.

Prueba 1: Identificación de letras y palabras (Test 1: Letter-Word Identification)
This test does not require additional materials for administration.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of reading
achievement. Consult the Suggested Starting Points table in the Test Book to determine an
appropriate starting point for the examinee.

Basal
Test by complete pages until the 6 lowest-numbered items administered are correct, or until
the page with Item 1 has been administered.

Administering and Scoring the Batería IV APROV Tests 53


Ceiling
Test by complete pages until the 6 highest-numbered items administered are incorrect, or
until the page with Item 78 has been administered.

Scoring
Score each correct response 1 and each incorrect response 0. Score words that are not read
fluently (smoothly) on the last attempt 0. Do not penalize an examinee for mispronunciations
resulting from articulation errors, dialect variations, or regional speech patterns. Record the
total number of all items answered correctly and all items below the basal in the Number
Correct box after the last Identificación de letras y palabras item on the Test Record.

Administration Procedures
Know the exact pronunciation of each item before administering the test. The correct
pronunciation is in parentheses following more difficult items. For additional help with
pronunciation, refer to a standard dictionary. Do not tell or help the examinee with any
letters or words during this test.
If the examinee’s response to a specific item is unclear, do not ask him or her to repeat the
specific item. Instead, allow the examinee to complete the entire page and then ask him or
her to repeat all of the items on that page. Score only the item in question; do not rescore the
other items.
If the examinee pronounces words letter by letter or syllable by syllable instead of reading
them fluently, tell the examinee, “Primero lee [lea] la palabra en silencio y luego dime
[dígame] la palabra completa” (First read the word silently and then say the whole word
smoothly). Give this instruction only once during administration of this test. If the examinee
gives more than one response, score the last response. Examiners may wish to record
incorrect responses for later error analysis. In addition, examiners may wish to complete
the “Qualitative Observation” checklist on the Test Record to document how the person
performed the task.

Prueba 2: Problemas aplicados (Test 2: Applied Problems)


When prompted, give the examinee the worksheet in the Response Booklet and a pencil with
an eraser.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of math
achievement. Consult the Suggested Starting Points table in the Test Book to determine an
appropriate starting point for the individual.

Basal
Test by complete pages until the 5 lowest-numbered items administered are correct, or until
the page with Item 1 has been administered.

Ceiling
Test by complete pages until the 5 highest-numbered items administered are incorrect, or
until the page with Item 50 has been administered.



Scoring
Score each correct response 1 and each incorrect response 0. Unit labels (e.g., dólares,
centímetros) are not required unless specified in the scoring key. If a unit label is required,
both the answer and the label must be correct to receive credit. If a unit label is not required
and the examinee provides a correct answer and a correct label, score the item as correct.
However, if the examinee provides an incorrect unit label, required or not, score the item as
incorrect. Record the total number of all items answered correctly and all items below the
basal in the Number Correct box after the last Problemas aplicados item on the Test Record.
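The unit-label rule above reduces to a small decision procedure. This Python sketch is illustrative only; the function and parameter names are hypothetical:

```python
# Illustrative sketch of the unit-label scoring rule for Problemas aplicados.

def score_applied_problem(answer_ok, label_given, label_ok, label_required):
    """Return 1 (correct) or 0 (incorrect) for a single item."""
    if label_given and not label_ok:
        return 0  # an incorrect label fails the item, required or not
    if label_required and not (label_given and label_ok):
        return 0  # a required label must be present and correct
    return 1 if answer_ok else 0

# Correct answer, no label, label not required -> correct
print(score_applied_problem(True, False, False, False))  # 1
# Correct answer but incorrect label (label not required) -> incorrect
print(score_applied_problem(True, True, False, False))   # 0
```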

Administration Procedures
If the examinee requests them or appears to need them, you may provide the worksheet
(Hoja de trabajo) in the Response Booklet and a pencil with an eraser before being prompted
to do so in the Test Book. In all cases, provide the Response Booklet and a pencil as directed
at Item 27. Any question may be repeated whenever the examinee requests it. Because the
focal construct of this test is not the person's reading ability, read all items to the examinee.
Completing the "Qualitative Observation" checklist on the Test Record can help characterize
the examinee's performance on this task.

Prueba 3: Ortografía (Test 3: Spelling)


When prompted, give the examinee the Response Booklet and a pencil with an eraser.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of spelling skill.
Consult the Suggested Starting Points table in the Test Book to determine an appropriate
starting point for the person.

Basal
Test until the 6 lowest-numbered items administered are correct, or until Item 1 has been
administered.

Ceiling
Test until the 6 highest-numbered items administered are incorrect, or until Item 52 has been
administered.

Scoring
Score each correct response 1 and each incorrect response 0. Do not penalize for poor
handwriting or reversed letters as long as the reversal does not form a different letter. For example,
a reversed lowercase c would not be penalized, but a reversed lowercase b would be penalized
because it becomes the letter d. Accept upper- or lowercase responses as correct unless a case is
specified. Record the total number of all items answered correctly and all items below the basal
in the Number Correct box after the last Ortografía item on the Test Record.

Administration Procedures
Know the exact pronunciation of each test item before administering the test. The correct
pronunciation is in parentheses following more difficult items. Request printed responses;
however, accept cursive responses. Completing the “Qualitative Observation” checklist on the
Test Record can help describe the examinee’s automaticity on this task.



Prueba 4: Comprensión de textos (Test 4: Passage Comprehension)
This test does not require additional materials for administration.

Starting Point
Begin with the Introduction for examinees functioning at the preschool to kindergarten level.
Begin with Item 7 for all examinees functioning at the grade 1 level. For all other examinees,
administer Sample Item B and then select a starting point based on an estimate of the examinee’s
present level of reading achievement. Consult the Suggested Starting Points table following
Sample Item B in the Test Book to determine an appropriate starting point for the individual.

Basal
Test by complete pages until the 6 lowest-numbered items administered are correct, or until
the page with Item 1 has been administered.

Ceiling
Test by complete pages until the 6 highest-numbered items administered are incorrect, or
until the page with Item 54 has been administered.

Scoring
Score each correct response 1 and each incorrect response 0. Unless noted, accept only one-
word responses as correct. If an examinee gives a two-word or longer response, ask for a one-
word answer. Score a response correct if it differs from the correct response(s) listed only in
verb tense or number (singular/plural), unless otherwise indicated by the scoring key. Score
a response incorrect if the person substitutes a different part of speech, such as a noun for a
verb, unless otherwise indicated by the scoring key. Do not penalize for mispronunciations
resulting from articulation errors, dialect variations, or regional speech patterns.
Record the total number of all items answered correctly and all items below the basal in
the Number Correct box after the last Comprensión de textos item on the Test Record. Do not
include points for the introduction or sample items.

Administration Procedures
Examinees should read the passages silently; however, some individuals, especially younger
children, may read aloud. If this happens, ask the person to read silently. If the individual
continues to read aloud, do not insist on silent reading. Do not tell the examinee any words
on this test.
The examinee needs to identify the specific word that goes in the blank. If he or she reads
the sentence aloud with a correct answer, say, “Responde [Responda] con una sola palabra”
(Tell me one word). If the examinee cannot provide the word, score the item incorrect.
For Items 15 and higher, if the examinee does not respond to an item in about 30 seconds,
encourage a response. If the person still does not respond, score the item 0, point to the next
item and say, "Prueba [Pruebe] con esta" (Try this one). The 30 seconds is a guideline, not
a time limit. If an examinee requires more time to complete an item, more time may be
given. For example, if a response is encouraged after 30 seconds and the examinee indicates
he or she is still reading or needs more time, it is permissible to give more time.
Mark the one description on the “Qualitative Observation” checklist on the Test Record
that best describes the person’s performance on this task.



Prueba 5: Cálculo (Test 5: Calculation)
When prompted, give the examinee the Response Booklet and a pencil with an eraser.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of computational
skill. Consult the Suggested Starting Points table in the Test Book to determine an appropriate
starting point for the person.

Basal
Test until the 6 lowest-numbered items administered are correct, or until Item 1 has been
administered.

Ceiling
Test until the 6 highest-numbered items administered are incorrect, or until Item 57 has been
administered.

Scoring
Score completed items on this test before moving to another test to verify the basal and
ceiling and to complete any queries. Score each correct response 1 and each incorrect
response 0. If the examinee skips an item before the last completed item, score the item 0.
Score poorly formed or reversed numbers correct on this test. Score transposed numbers (e.g.,
12 for 21) incorrect. Record the total number of all items answered correctly and all items
below the basal in the Number Correct box after the last Cálculo item on the Test Record. Do
not include points for sample items.

Administration Procedures
If testing begins with Sample Item A and the examinee responds incorrectly to one or both
of the sample items, discontinue testing and record a score of 0 for this test. Make sure
to complete any queries listed in the Test Book, such as for the items involving reducing
fractions. Do not point to the signs or remind the examinee to pay attention to the signs
during this test. Use the “Qualitative Observation” checklist on the Test Record to help
describe the person’s rate and automaticity on this task.

Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language Expression)
When prompted, give the examinee the Response Booklet and a pencil with an eraser.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of writing ability.
Administer the appropriate block of items as indicated in the Suggested Starting Points table
in the Test Book, on the page after the Expresión de lenguaje escrito tab.

Continuation Instructions
This test uses continuation instructions instead of basal and ceiling rules. Administer all
items in the selected block and then follow the continuation instructions. The continuation
instructions appear at the end of each block of items in the Test Book and on the Test Record.



Scoring
Score items in Blocks A through F (Items 1 through 36) as 1 or 0. Score items in Block G
(Items 37 through 40) as 2, 1, or 0. Use the scoring keys included with each item to assign
a score. For items requiring the examinee to write a complete sentence, the keys provide
scoring instructions and examples of correct and incorrect responses. For items with stimulus
words, assign a score based on whether the examinee successfully demonstrates that he or
she understands the meaning of the stimulus word(s) and provides a complete sentence.
Do not penalize the examinee for spelling errors unless the misspelling interferes with
understanding the examinee’s response or the misspelling forms another real word. Do not
penalize the examinee for punctuation, capitalization, or usage errors unless otherwise
indicated in the item scoring key. Do not penalize the examinee for poor handwriting unless
the response is illegible after you have reminded the examinee to write neatly.
There are five reminders that may be given only once during the test. Provide the
appropriate reminder only after the first occurrence of each error type.
■ If the examinee responds with an incomplete sentence on items requiring a complete
sentence, score the item 0 and say, "Recuerda [Recuerde] que debes [debe] escribir una
oración completa" (Remember to write a complete sentence).
■ If the examinee changes a stimulus word in any way, score the item 0 and say,
"Recuerda [Recuerde] que no debes [debe] cambiar las palabras de ninguna forma"
(Remember, do not change the word(s) in any way).
■ If the examinee does not use all the stimulus words, score the item 0 and say, "Recuerda
[Recuerde] que debes [debe] usar todas las palabras" (Remember to use all the words).
■ If the examinee responds with two or more sentences, score the item 0 and say,
"Recuerda [Recuerde] que debes [debe] escribir solo una oración" (Remember to write
only one sentence).
■ If the examinee appears to be struggling to use the words in only the order presented,
say, "Recuerda [Recuerde] que puedes [puede] usar las palabras en cualquier orden"
(Remember, you can use the words in any order). Although this reminder may be given
only once, it does not result in an automatic item score of 0.
You may want to write these reminders on a card and keep it visually accessible during
testing. Although each reminder may be given only once, you may repeat the instructions
for the test items, if necessary.
Record the total number of points for each administered block in the appropriate boxes on
the Test Record.

Administration Procedures
If an examinee’s response to an item is illegible or difficult to read, ask him or her to write
as neatly as possible. The examiner may read any words to the examinee during this test
or repeat the instructions, if necessary. When an examinee asks if spelling is important or
how to spell a word, encourage the individual to just do the best he or she can. Do not spell
any words for the examinee. The overall quality of the individual’s written sentences can be
described by completing the “Qualitative Observation” checklist on the Test Record.
This test may be administered simultaneously to a small group of two or three individuals
if, in the examiner’s judgment, this procedure will not affect any examinee’s performance.



Prueba 7: Análisis de palabras (Test 7: Word Attack)
This test does not require additional materials for administration.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of reading skill.
The table in the Test Book presents suggested starting points.

Basal
Test by complete pages until the 6 lowest-numbered items administered are correct, or until
the page with Item 1 has been administered.

Ceiling
Test by complete pages until the 6 highest-numbered items administered are incorrect, or
until the page with Item 34 has been administered.

Scoring
Score each correct response 1 and each incorrect response 0. Score words that are not read
fluently (smoothly) on the last attempt 0. Do not penalize an examinee for mispronunciations
resulting from articulation errors, dialect variations, or regional speech patterns. Record the
total number of all items answered correctly and all items below the basal in the Number
Correct box after the last Análisis de palabras item on the Test Record. Do not include points
for sample items.

Administration Procedures
It is essential to know the exact pronunciation of each test item before administering the test.
The correct pronunciation is in parentheses following more difficult items. Say the phoneme
(the most common sound of the letter), not the letter name, when letters are printed within
slashes, such as /p/.
If the examinee has any special speech characteristics resulting from articulation errors or
dialect variations, become familiar with the examinee’s speech pattern before administering
this test.
If the examinee’s response to a specific item is unclear, do not ask him or her to repeat the
specific item. Instead, allow the person to complete the entire page and then ask him or her
to repeat all of the items on that page. Score only the item in question; do not rescore the
other items.
If the examinee pronounces words letter by letter or syllable by syllable instead of reading
them fluently, tell the individual, “Primero lee [lea] la palabra en silencio y luego dime
[dígame] la palabra completa” (First read the word silently and then say the whole word
smoothly). Give this instruction only once during the administration of this test. Score the
examinee’s last response. The examiner may wish to record incorrect responses for later
error analysis. In addition, the examiner may wish to complete the “Qualitative Observation”
checklist on the Test Record to document how the person performed the task.



Prueba 8: Lectura oral (Test 8: Oral Reading)
This test does not require additional materials for administration.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of reading skill.
Consult the Suggested Starting Points table in the Test Book to determine an appropriate
starting point for the individual.

Continuation Instructions
This test uses continuation instructions instead of basal and ceiling rules. Follow the
continuation instructions to determine which additional sentences should be administered
and when to discontinue testing. The continuation instructions are located at the bottom of
the examiner pages in the Test Book and on the Test Record.

Scoring
When the examinee reads a sentence with no errors, score the item 2. If the examinee makes
one error on the sentence, score the item 1. When the examinee makes two or more errors,
score the item 0. Types of reading errors include mispronunciations, omissions, insertions,
substitutions, hesitations of more than 3 seconds, repetitions, transpositions, or ignoring
punctuation. If the examinee self-corrects within 3 seconds, do not count the word as an
error. Do not penalize the examinee for mispronunciations resulting from articulation errors,
dialect variations, or regional speech patterns. Record the number of points earned in the
Number of Points box after the last Lectura oral item on the Test Record.
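The 2-1-0 sentence rule above can be stated compactly. A minimal, purely illustrative Python sketch (the function name and error counts are hypothetical):

```python
# Illustrative sketch of the per-sentence scoring rule for Lectura oral:
# 0 errors -> 2 points, 1 error -> 1 point, 2 or more errors -> 0 points.

def sentence_points(error_count):
    if error_count == 0:
        return 2
    if error_count == 1:
        return 1
    return 0

print([sentence_points(e) for e in (0, 1, 2, 5)])  # [2, 1, 0, 0]
```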

Administration Procedures
It is essential to know the exact pronunciation of each test item. The correct pronunciation is
in parentheses following more difficult words.
Become familiar with the types of reading mistakes that count as errors on this test. Figure
4-1 lists the types of reading errors that are shown in the Test Book. Sentences are reproduced
on the Test Record to facilitate scoring. During the test, follow along on the Test Record as the
examinee reads each sentence and mark each error with a slash (/) at the point in the sentence
where the error occurs. In most cases, the slash will be placed on the printed word that was the
error (i.e., mispronunciation, omission, substitution, transposition, hesitation, or repetition).
For an inserted word, place the slash between the two printed words where the insertion
occurred. If the examinee ignores punctuation (e.g., does not pause at a comma or raise his or
her voice for a question mark), place the slash on the punctuation mark that was ignored. The
examiner can also record and total each type of error in the “Qualitative Observation Tally” on
the Test Record. Figure 4-2 illustrates a portion of a completed Test Record and tally.

Figure 4-1. Reading error types in Prueba 8: Lectura oral.
Mala pronunciación—Pronounces the word incorrectly
Omisión—Leaves out a word
Inserción—Adds a word or words
Sustitución—Says a word that is incorrect but that maintains the sentence meaning (e.g., "ballenos" for ballenas)
Vacilación—Does not pronounce the word within 3 seconds. If this happens, say, "Pasa [Pase] a la siguiente palabra"
(Go on to the next word).
Repetición—Repeats a word or words
Transposición—Reads words in the wrong order (e.g., "bajan y suben" instead of suben y bajan)
Ignora puntuación—Does not observe punctuation (e.g., fails to pause for a comma or fails to raise voice for a question
mark)



Figure 4-2. Example of a completed Test Record and "Qualitative Observation Tally" for Prueba 8: Lectura oral.
(Figure content not reproduced. The example shows the note that basal and ceiling do not apply to this test, a 2, 1, or
0 score recorded for each sentence in Items 1-15, slashes marking the point of each error, and a tally of errors by type.
In the example, the examinee earned 6 points on Items 1-5, so testing continues with Items 6-10 according to the
continuation instructions printed after each block of five items.)

Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency)

When prompted, give the examinee the Response Booklet and a pencil with an eraser. This
test requires a stopwatch or a watch or clock with a second hand.

Starting Point
All examinees complete the sample items and practice exercise and then begin with Item 1.

Time Limit
Discontinue testing after exactly 3 minutes and collect the examinee’s pencil and Response
Booklet. Record the exact finishing time in minutes and seconds on the Test Record. It is
important to record the exact finishing time because examinees who do well and finish in less
than 3 minutes will receive a higher score than individuals who continue to work for the full
3 minutes.

Scoring
Score each correct response 1 and each incorrect response 0. Ignore skipped items. Use the
Fluidez en lectura de frases Scoring Guide overlay to score this test. Record both the total
number of items answered correctly and the total number of items answered incorrectly
within the 3-minute time limit in the Fluidez en lectura de frases Number Correct and
Number Incorrect boxes on the Test Record. To obtain the estimated age and grade
equivalents on the Test Record, subtract the Number Incorrect from the Number Correct.
Enter both the Number Correct and the Number Incorrect into the online scoring and
reporting program. Do not include points for sample items or practice exercises.
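The value used to look up the estimated age and grade equivalents on the Test Record is the simple difference of the two recorded counts. An illustrative sketch (the function name and values are hypothetical):

```python
# Illustrative sketch: both counts are entered into the scoring program;
# their difference is used only for the estimated age/grade equivalents.

def fluency_lookup_value(number_correct, number_incorrect):
    return number_correct - number_incorrect

print(fluency_lookup_value(42, 5))  # 37
```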

Administration Procedures
If the examinee has 2 or fewer correct on Practice Exercises C through F, discontinue testing
and record a score of 0 in the Fluidez en lectura de frases Number Correct box on the Test
Record.
The sentences are intended to be read silently. Remind the examinee to read silently if he
or she begins reading aloud. If the person appears to be answering items without reading the
sentences, remind him or her to read each sentence. If the individual stops at the bottom of
a page, remind him or her to continue to the top of the next column or to the next page. If
the examinee starts to erase a response, provide a reminder to cross out the answer he or she
does not want.
This test may be administered simultaneously to a small group of two or three individuals
if, in the examiner’s judgment, this procedure will not affect any person’s performance.
However, do not administer this test to individuals who cannot read.

Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency)
When prompted, give the examinee the Response Booklet and a pencil with an eraser. This
test requires a stopwatch or a watch or clock with a second hand.

Starting Point
All examinees begin with Item 1.

Time Limit
Discontinue testing after exactly 3 minutes and collect the examinee’s pencil and Response
Booklet. Record the exact finishing time in minutes and seconds on the Test Record. It is
important to record the exact finishing time because examinees who do well and finish in less
than 3 minutes will receive a higher score than individuals who continue to work for the full
3 minutes.
If the examinee has 3 or fewer correct after 1 minute, discontinue testing, and record a
time of 1 minute and the Number Correct (0 to 3) on the Test Record.

Scoring
Score each correct response 1 and each incorrect response 0. Use the Fluidez en datos
matemáticos Scoring Guide overlay to score this test. Do not penalize for poorly formed or
reversed numbers. However, score transposed numbers (e.g., 12 for 21) incorrect. Record the
total number of calculations answered correctly within the 3-minute time limit in the Fluidez
en datos matemáticos Number Correct box on the Test Record.

Administration Procedures
Do not point to the signs or remind the examinee to pay attention to the signs during testing.
Watch to make sure the examinee is going from left to right, row by row, down the page.
Some examinees may choose to work left to right on the first row, right to left on the second
row, and so on, which is acceptable. However, if the examinee starts skipping around, remind
him or her to proceed across the page, one row at a time. If the examinee stops at the bottom
of the page, remind him or her to continue to the top of the next page. If the examinee starts
to erase a response, remind the examinee to cross out the answer he or she does not want.
This test may be administered simultaneously to a small group of two or three individuals
if, in the examiner’s judgment, this procedure will not affect any person’s performance.

Prueba 11: Fluidez en escritura de frases (Test 11: Sentence Writing Fluency)
When prompted, give the examinee the Response Booklet and a pencil with an eraser. This
test requires a stopwatch or a watch or clock with a second hand.

Starting Point
All examinees complete the sample items and then begin with Item 1.

Time Limit
Discontinue testing after exactly 5 minutes and collect the examinee’s pencil and Response
Booklet. Record the exact finishing time in minutes and seconds on the Test Record. It is
important to record the exact finishing time because examinees who do well and finish in less
than 5 minutes will receive a higher score than individuals who continue to work for the full
5 minutes.
If an examinee has 3 or fewer correct responses within the first 2 minutes, discontinue
testing. Record a time of 2 minutes and the Number Correct (0 to 3) on the Test Record.

Scoring
Score each correct response 1 and each incorrect response 0. Score any skipped items
incorrect. Do not penalize an examinee for errors in punctuation, capitalization, or spelling or
for poor handwriting unless the response is illegible. Score illegible items incorrect.
Sometimes it may not be immediately apparent whether to score an item correct or
incorrect. A few general guidelines will assist in scoring the Fluidez en escritura de frases test.
To receive credit for an item, the examinee must use all three stimulus words in a complete
sentence. As noted in the Test Book instructions, the examinee may not change the stimulus
word in any way. If, for example, the examinee alters the tense of a verb or changes a noun
from singular to plural, score the item incorrect. A minor change in a word may make it
easier for the examinee to write a sentence, thus altering the difficulty level of the item.
However, if a stimulus word is miscopied or misspelled, the item can still receive credit as
long as the miscopying did not result in a change in tense, part of speech, or number.
To receive credit, the response must be a reasonable sentence. Some examinees may
produce awkward sentences. If the meaning is clear, score the response correct. Score
sentences with the understood subject you, such as “Dress the pretty doll,” correct. If the
examinee uses a symbol for a word, such as an ampersand (&) or plus sign (+) for the word
and, or an abbreviation like w/ instead of the full word with, give credit if the response meets
all other criteria.
If a word that is critical to the sentence meaning is omitted, score the response incorrect.
The omission of a critical word often makes the response an incomplete sentence. However,
do not penalize an examinee for the accidental omission of a less meaningful word in a
sentence, such as the articles a, the, or an.
If, after reviewing these guidelines, it is still unclear how to score two or more items,
balance the scores given to these responses. For example, if two responses are unclear, score
one item 1 and the other item 0. Do not always give the examinee the benefit of the doubt
when scoring questionable responses.
Record the total number of sentences written correctly within the 2-minute cutoff or
5-minute time limit in the Fluidez en escritura de frases Number Correct box on the Test
Record. Do not include points for sample items.

Administration Procedures
If the examinee receives a 0 on Sample Items B through D after the error correction
procedure, discontinue testing and record a score of 0 in the Fluidez en escritura de frases
Number Correct box on the Test Record. If the examinee stops at the bottom of a page,
remind him or her to continue to the top of the next page.
In this test, the examiner may read any of the stimulus words to the examinee if requested
by the examinee. This test may be administered simultaneously to a small group of two or
three individuals if, in the examiner’s judgment, this procedure will not affect any person’s
performance.

Prueba 12: Rememoración de lectura (Test 12: Reading Recall)


This test does not require additional materials for administration.

Starting Point
Select a starting point based on an estimate of the examinee’s present level of reading ability.
Consult the Suggested Starting Points table in the Test Book to determine an appropriate
starting point for the examinee.

Continuation Instructions
This test uses continuation instructions instead of basal and ceiling rules. Follow the
continuation instructions in the Test Book to determine which additional stories should be
administered and when to discontinue testing. Because the continuation instructions on the
Test Record are abbreviated, consult the complete continuation instructions in the Test Book.

Scoring
On the Test Record, the elements to be scored are separated by slash marks (/). Place a check
mark above each element that the examinee recalls correctly during the retelling. Score each
correctly recalled element 1 and each incorrectly recalled element 0. Score elements not
recalled at all (correctly or incorrectly) 0. Scoring is based on a key word (shown in bold
type) in each element. The examinee must recall the specific element, a synonym, or a word
that preserves the meaning to receive credit. For example, if the element to be recalled is
“perro” and, when retelling the story, the examinee says “perrito,” score the element correct.
However, if the element is “1519” and the examinee says, “En los años 1500,” score the
response incorrect. The examinee may recall the elements in any order.
Record the number of elements the examinee recalls correctly for each set of two stories
and enter the total in the Number of Points box for each set on the Test Record. Enter these
numbers in the online scoring and reporting program and enter an X if a set of stories was
not administered. Use the Number of Points for each set of stories administered to obtain
an estimated age and grade equivalent from the “Scoring Table” on the Test Record. If more
than two sets of stories are administered, use the column corresponding to the last two sets
administered to obtain the estimated age and grade equivalents.

Administration Procedures
Direct the examinee to read the story once silently. If necessary, remind the examinee of
this rule. Turn the page after the examinee has finished reading the story once. Prompt the
examinee as directed to retell the story. Do not tell the examinee any words on this test. It is
important to be familiar with the stories and required elements before administering this test.
This will facilitate scoring elements, particularly if the examinee retells them out of sequence.

Prueba 13: Números matrices (Test 13: Number Matrices)


When prompted, give the examinee the Response Booklet and a pencil with an eraser.
While this test is not a timed test, each item has either a 30-second or 1-minute guideline.
Therefore, it is recommended that the examiner use a stopwatch or a watch or clock with a
second hand to monitor response times.

Starting Point
Select the appropriate sample item based on an estimate of the person’s present achievement
level. Begin with Sample Item A for examinees functioning at the kindergarten to grade 8
level. For all other examinees, administer Sample Item B and then select a starting point
based on an estimate of the examinee’s present level of ability. Consult the Suggested Starting
Points table following Sample Item B in the Test Book to determine an appropriate starting
point for the individual.

Basal
Test by complete pages until the 6 lowest-numbered items administered are correct, or until
the page with Item 1 has been administered.

Ceiling
Test by complete pages until the 6 highest-numbered items administered are incorrect, or
until the page with Item 30 has been administered.

Scoring
Score each correct response 1 and each incorrect response 0. To be correct, an answer must
solve the problem both horizontally and vertically. Record the total number of all items
answered correctly and all items below the basal in the Number Correct box after the last
Números matrices item on the Test Record. Do not include points for sample items.
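The requirement that an answer solve the matrix both horizontally and vertically can be sketched in code. The 2 × 2 additive matrix, the function name, and the checking rule below are hypothetical illustrations; actual items vary in size and pattern, and this is not the battery's scoring logic.

```python
# Hypothetical sketch of the "solves both horizontally and vertically"
# criterion for a 2 x 2 matrix whose rule is a constant difference across
# each row and down each column. Matrix values and logic are illustrative only.

def solves_both_ways(matrix, candidate):
    """matrix is [[a, b], [c, None]], with None marking the missing cell."""
    (a, b), (c, _) = matrix
    row_ok = (candidate - c) == (b - a)  # horizontal: matches the row pattern
    col_ok = (candidate - b) == (c - a)  # vertical: matches the column pattern
    return row_ok and col_ok

m = [[2, 4], [6, None]]
print(solves_both_ways(m, 8))  # True: completes both the row and the column
print(solves_both_ways(m, 7))  # False: completes neither direction
```

In this invented item, a response of 8 earns credit because it extends the +2 row pattern and the +4 column pattern; 7 would be scored 0 even though it is numerically close.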

Administering and Scoring the Batería IV APROV Tests 65


Administration Procedures
Follow all verbal and pointing directions carefully when administering the sample items,
including the error or no response corrections. For each item, follow the time guideline. If
the examinee is actively engaged in trying to solve the problem, the examiner may allow
more time. However, if the examinee does not appear to be trying to solve the problem,
encourage a response. If the examinee does not give a response, score the item 0 and ask him
or her to move on to the next item. If the examinee provides a response that is not a whole
number, ask him or her to solve the problem using whole numbers only.
Very young or low-functioning examinees may be confused by more than one matrix per
page. In these cases, it is permissible to use a piece of paper to present one matrix at a time.



Chapter 5

Scores and Interpretation


Calculating an examinee’s raw scores is only the beginning of the interpretation process
for the Batería IV Pruebas de aprovechamiento (Batería IV APROV). Raw scores have little
meaning until they have been converted into other scores, such as grade equivalents (GE)
or percentile ranks (PR). A wide array of interpretative options and scores is available.
Depending upon the purpose of the assessment, one type of score may be more useful than
another. For some situations and purposes, determining grade-equivalent scores and relative
proficiency indexes (RPIs) may be all that is necessary. In other situations, percentile ranks
may provide a more useful description of the individual’s test performance.
This chapter begins with a brief description of the levels of interpretive information and
various types of scores that are available for interpreting an examinee’s performance on the
Batería IV APROV. Next the chapter describes procedures for interpreting the tests, and
then the types of ability/achievement comparisons, discrepancies, and variation procedures
available and how to interpret them. The chapter concludes with a discussion of the
implications of the test results, relevant cautions, and recommendations for follow-up testing.

Levels of Interpretive Information


The range of interpretive information available for each test and cluster in the Batería IV
APROV includes information regarding testing behavior and examinee errors, developmental
status, degree of proficiency, and comparison with grade or age peers. In contrast to that
of many other test batteries, the interpretive design of the Batería IV APROV enables the
clinician to capitalize on the full range of information. Table 5-1 presents the range of
available interpretive information in four hierarchical levels (theoretically available with any
test, not just the Batería IV APROV).
A central principle inherent in the hierarchy presented in Table 5-1 is that each of the four
levels provides unique information about a person’s test performance. Information from one
level cannot be used interchangeably with information from another. For example, standard
scores (SS) cannot be used in place of age or grade equivalents, or vice versa. Each level
reports different information about the individual’s test performance.

Scores and Interpretation 67


Table 5-1.
Hierarchy of Batería IV APROV Test Information

Level 1: Qualitative (Criterion-Referenced)
Basis: Observations during testing and analysis of responses
Information and Scores: Description of the examinee’s reaction to the test situation;
performance on finely defined skills at the item content level
Uses:
■ Appreciation of the examinee’s behavior underlying the obtained test score
■ Prediction of the examinee’s behavior and reactions in instructional situations
■ Specific skill instructional recommendations

Level 2: Level of Development (Norm-Referenced)
Basis: Sum of item scores; age or grade level in the norming sample at which the average
is the same as the examinee’s score
Information and Scores: Raw score; *Rasch Ability score (example: test or cluster W score);
Age Equivalent (AE); Grade Equivalent (GE)
Uses:
■ Reporting an examinee’s level of development
■ Basis for describing the implications of developmental strengths and weaknesses
■ Basis for initial recommendations regarding instructional level and materials
■ Placement decisions based on a criterion of significantly advanced or delayed development

Level 3: Proficiency (Criterion-Referenced)
Basis: Examinee’s distance on a Rasch scale from an age or grade reference point
Information and Scores: Quality of performance on reference tasks; *Rasch Difference score
(example: test or cluster W DIFF); Relative Proficiency Index (RPI); CALP Level;
Instructional or Developmental Zone
Uses:
■ Proficiency on tasks of average difficulty for peers
■ Developmental level at which typical tasks will be perceived as easy by the examinee
■ Developmental level at which typical tasks will be perceived as very difficult by the examinee
■ Placement decisions based on a criterion of significantly good or poor proficiency

Level 4: Relative Standing in a Group (Norm-Referenced)
Basis: Relative position (a transformation of a difference score, such as dividing by the
standard deviation of the reference group)
Information and Scores: Rank order; *Standard Score (SS) (including T score, z score, NCE,
Discrepancy SD DIFF); Percentile Rank (PR) (including Discrepancy PR)
Uses:
■ Communication of an examinee’s competitive position among peers
■ Placement decisions based on a criterion of significantly high or low standing

*Equal-interval units; preferred metric for statistical analyses

The four levels of test information are cumulative; that is, each successive level builds on
information from the previous level. Information from all four levels is necessary to describe
a person’s performance completely. Level 1 provides qualitative data that are often used to
support a clinical hypothesis. Levels 2, 3, and 4 include a variety of score options from which
to select.
Level 1 information is obtained through behavioral observations during testing and
through analysis of erroneous responses to individual items. Observation of an examinee’s
behavior and analysis of specific errors can assist in understanding an individual’s test
performance and can be an important source of information when writing reports and
planning instructional or treatment programs. An example of level 1 information is the “Test
Session Observations Checklist” located on the Test Record.
Level 2 information is derived directly from the raw scores and is used to indicate an
individual’s stage of development. For most tests, raw scores are transformed into metrics that
more meaningfully convey level of development, such as age or grade equivalents.
Level 3 information indicates the quality of a person’s performance on criterion tasks of
a given difficulty level. The relative proficiency index (RPI), used throughout the Batería IV,
is an example of level 3 information. An RPI of 60/90 indicates that an examinee was 60%
successful on tasks that average persons in a reference group (either an age or a grade group)
perform with 90% success. The instructional zone (developmental zone on the Batería IV
Pruebas de habilidades cognitivas [Batería IV COG]) is another example of level 3 information.
This zone defines the range of tasks from those that a person would perceive as quite easy
(96% successful) to those that he or she would perceive as quite difficult (75% successful).
Level 4 information provides a basis for making peer comparisons. In educational and
clinical settings, percentile ranks and standard scores are the metrics most commonly used to
describe an individual’s relative standing in comparison to grade or age peers.
Although the information within each level is interchangeable, some of these metrics
are more easily interpreted than others. The scores listed within each level in Table 5-1 are
presented in order from the least to the most meaningful for most test users. For example,
in level 4, knowing the simple rank order of an individual’s score (e.g., 17th in a group of
unknown size) is not as meaningful as knowing the corresponding standard score. The
standard score, in turn, is not as meaningful as knowing the corresponding percentile rank.
In fact, standard scores are usually explained to lay persons in terms of the percentage of
individuals who fall at or below a given standard score—in other words, the percentile rank.
When selecting scores to report, keep in mind that some metrics are more easily explained
to parents, teachers, and examinees than others.
Certain scores in some levels have the characteristic of equal-interval units (Stevens,
1951) and are generally considered more appropriate for statistical analyses (see Woodcock-
Johnson IV Technical Manual [McGrew et al., 2014] for more information). These scores are
the preferred metric in that level for most statistical calculations and are identified with an
asterisk (*) in Table 5-1. In level 3 the W Difference score (W DIFF) is preferred because it is
based on the equal-interval W scale. In level 4 the standard score, rather than the percentile
rank, is preferred for statistical analyses. At any level, the statistically preferred metric may be
used for calculation and statistical purposes. The results of these procedures, such as a mean
(M) or standard deviation (SD), can then be converted into another more meaningful metric
from that level for reporting purposes.

Age- and Grade-Based Norms


Most interpretive scores are based on procedures that compare an examinee’s performance to
the performance of some well-defined group—a segment of the norming sample. The WJ IV
Technical Manual provides further details about the norming sample and the procedures used
to gather data.



A special feature of the Batería IV APROV is the option to use either grade- or age-based
norms. That is, the examinee’s test performance is compared to the average performance of
grade or age peers. Grade norms are available for kindergarten through grade 12; students
in 2-year colleges, as an extension of the K through 12 educational system; and students in
4-year colleges, including the first year of graduate school.
Age norms are based on ages 2 through 90+ years. Age and grade equivalents are not
affected by selection of age or grade norms; however, the standard scores, percentile ranks,
and relative proficiency index scores will be affected by the selection of the basis for the
norms. Generally, grade norms are preferable for school-based decisions, whereas age norms
may be more applicable in clinical settings. For example, if a 30-year-old adult who was
applying to graduate school was being evaluated, the most relevant comparison group would
be others at the same grade or level of academic completion (e.g., grade 17.0). A comparison
to an age cohort would not be as meaningful because this group would include many
people who did not attend or complete a 4-year college. If Batería IV APROV results will be
compared to results from another test that only provides age norms, age norms should be
used. The option to report age comparisons or grade comparisons is available when using the
online scoring and reporting program.

Types of Scores
This section discusses the variety of scores available for test interpretation. Included among
these scores are grade equivalents (GE), age equivalents (AE), relative proficiency indexes
(RPI), cognitive-academic language proficiency (CALP) levels, percentile ranks (PR), and
standard scores (SS). Most of these scores will be familiar to examiners who have used the
Batería III Woodcock-Muñoz or the Woodcock-Johnson IV. Several optional standard score
scales, including the normal curve equivalents (NCE) scale, also are discussed.

Raw Score (Puntaje bruto)


For most tests, the raw score is the number of correct responses, each receiving 1 raw score
point. The three exceptions in the Batería IV APROV are Prueba 6: Expresión de lenguaje
escrito, in which responses to Items 37 and higher can receive 2, 1, or 0 points; Prueba 8:
Lectura oral, in which responses can receive 2, 1, or 0 points; and Prueba 12: Rememoración de
lectura, in which the raw score is based on the number of elements recalled correctly on the
stories administered. Number Correct or Number of Points is listed in the left column of the
“Scoring Table” that appears for each test on the Test Record. Procedures for calculating the
raw score are presented in Chapter 3 of this manual.
When an examinee receives a score of 0 on any test, the examiner needs to judge whether
that score is a true assessment of the examinee’s ability or whether it reflects the individual’s
inability to perform the task. If it is the latter, it may be more appropriate to assume that the
examinee has no score for the test rather than using the score of 0 in further calculation and
interpretation.



W Score (Puntuación W)
The online scoring and reporting program converts raw scores into W scores (Woodcock,
1978; Woodcock & Dahl, 1971), which are a special transformation of the Rasch ability
scale (Rasch, 1960; Wright & Stone, 1979). The W scale has mathematical properties that
make it well suited for use as an intermediate step in the interpretation of test performance.
Among these properties are the interpretation advantages of Rasch-based measurement
(Woodcock, 1978, 1982, 1999) and the equal-interval measurement characteristic of the scale
(Stevens, 1951). The W scale for each test is centered on a value of 500, which has been set
to approximate the average performance of 10-year-old individuals. Any cluster score from
the Batería IV APROV is the average (arithmetic mean) W score of the tests included in that
cluster. For example, the cluster score for Lectura amplia is the average W score of Prueba 1:
Identificación de letras y palabras, Prueba 4: Comprensión de textos, and Prueba 9: Fluidez en
lectura de frases.
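The cluster calculation just described is a simple arithmetic mean. In the sketch below, the three W values are hypothetical stand-ins for the Lectura amplia tests; actual W scores come only from the online scoring and reporting program.

```python
# A cluster W score is the arithmetic mean of the W scores of its tests.
# The three W values below are hypothetical, standing in for Prueba 1,
# Prueba 4, and Prueba 9 of the Lectura amplia cluster.

def cluster_w(test_w_scores):
    return sum(test_w_scores) / len(test_w_scores)

print(round(cluster_w([498, 504, 500]), 1))  # 500.7
```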

Grade Equivalent (Equivalente de grado)


A grade equivalent (GE), or grade score, reflects the examinee’s performance in terms of the
grade level in the norming sample at which the median score is the same as the examinee’s
score. In other words, if the median W score on a test for students in the sixth month of
the second grade is 488, then an examinee who scored 488 would receive 2.6 as a grade
equivalent score.
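The median-matching idea can be sketched as interpolation in a norm table. The grade/median-W pairs below are invented for illustration (only the 2.6/488 pair echoes the example above); precise grade equivalents require the online scoring and reporting program.

```python
# Hypothetical sketch of grade-equivalent lookup: find the grade placement
# whose median W score matches the examinee's W score, interpolating between
# tabled points. The norm values here are invented for illustration.

GRADE_MEDIANS = [(2.0, 482), (2.6, 488), (3.0, 493)]  # (grade, median W)

def grade_equivalent(w):
    if w <= GRADE_MEDIANS[0][1]:
        return GRADE_MEDIANS[0][0]
    for (g1, w1), (g2, w2) in zip(GRADE_MEDIANS, GRADE_MEDIANS[1:]):
        if w1 <= w <= w2:
            # linear interpolation between adjacent grade placements
            return g1 + (g2 - g1) * (w - w1) / (w2 - w1)
    return GRADE_MEDIANS[-1][0]

print(grade_equivalent(488))  # 2.6, as in the example above
```

A real implementation would also attach the < and > signs at the ends of the scale rather than clamping to the tabled extremes as this sketch does.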
At the ends of the grade scale, when using the online scoring and reporting program, less
than (<) signs are used for grade scores falling below the median score obtained by children
beginning kindergarten (K.0) and greater than (>) signs are used for grade scores higher than
the median score obtained by graduate students finishing the first year of graduate school
(17.9), or, if scored by 2-year college norms, at the end of the final year of a 2-year program
(14.9). For example, a student who scored above the median for students finishing the first
year of graduate school would receive a grade equivalent of >17.9, whereas a student who
scored below the median of students entering kindergarten would receive a score of <K.0.
When hand scoring, grade equivalents can only be closely approximated. Thus, the grade
equivalents located in the “Scoring Table” for each test on the Test Record are estimates (Est).
Precise grade equivalents for tests and grade-equivalent scores for clusters are only available
when using the online scoring and reporting program.
One frequently alleged disadvantage of grade scores is that they are not useful for
instructional planning because they do not reflect the student’s ability. This is sometimes
followed by the recommendation that some other metric, such as standard scores, be
used in place of grade equivalents. (Recall from the discussion about levels of interpretive
information that standard scores provide information regarding peer comparison but do not
provide information regarding level of development.)
Grade-equivalent scores may cause interpretive problems in tests that are composed
mostly of items with a limited range of difficulty (such as the multilevel tests of many group
achievement batteries). For example, if a third-grade student earns a grade equivalent of 6.5
on a test that is intended to be administered to grade 3, it does not mean that the student
will be successful on tasks associated with the mid-sixth-grade level. Rather, it means that
the student answered correctly a high percentage of the items on a third-grade test—the same
percentage of items that an average sixth-grade student answered correctly on the third-grade
test. The student’s score in this case is more a reflection of the student’s accuracy level than
the grade level of task difficulty that this student can perform.
This problem with grade scores is eliminated when test items are distributed uniformly
in a test over a wide range of difficulty, when students are administered the subset of items
centered on their level of ability, and when the test has been normed on an appropriately
selected sample of students across a wide grade range. With the Batería IV APROV and many
other individually administered tests, grade- and age-equivalent scores reflect the actual level
of task difficulty a student can perform and thus are useful for instructional planning.

Age Equivalent (Equivalente de edad)


An age equivalent (AE), or age score, is similar to a grade equivalent, except that it reflects
performance in terms of the age level in the norming sample at which the median score is the
same as the examinee’s score. Age equivalents may be more useful in some applications than
grade equivalents, especially as they relate to the abilities of young children or adults not
attending school.
At the ends of the age scale, less than (<) signs are used for levels of performance that fall
below the median of the specified age. Greater than (>) signs are used for levels above the
median of the specified age.
When hand scoring, age equivalents can only be closely approximated. Thus, the age
equivalents located in the “Scoring Table” for each test on the Test Record are estimates (Est).
The online scoring and reporting program reports the precise age equivalents for tests and
age-equivalent scores for clusters.

W Difference Score (Diferencia W)


Level 3 scores (RPIs) and level 4 scores (standard scores, percentile ranks) are based on
test or cluster W Difference scores. The W Difference scores are the difference between an
examinee’s test or cluster W score and the median test or cluster W score for the reference
group in the norming sample (same age or same grade) with which the comparison is being
made.

Relative Proficiency Index (Índice de proficiencia relativa)


The relative proficiency index (RPI) is a variation of the relative mastery index (RMI)
score first used in the Woodcock Reading Mastery Tests (Woodcock, 1973). The RPI allows
statements to be generated about an examinee’s predicted quality of performance on tasks
similar to the ones tested.
The RPI is similar to the index used with Snellen charts to describe visual acuity. For
example, 20/20 vision indicates that a person can distinguish at 20 feet what a person with
normal vision can discern at 20 feet. A person with 20/200 vision has to be at 20 feet to
see what people with normal vision can see at 200 feet. Although the constant term in the
Snellen Index is the numerator rather than the denominator, the procedure of representing a
comparative score is similar to the procedure used for presenting the RPI.
RPIs are based on the distance along the W scale that an examinee’s score falls above or
below the average score for the reference group; this distance is the W Difference score. An RPI
of 90/90 means that the examinee would be predicted to demonstrate 90% proficiency with
similar tasks that average individuals in the comparison group (age or grade) would also
perform with 90% proficiency.
As an example, when used with the Lectura amplia cluster, the RPI predicts the
percentage of success for a person when given a variety of reading tasks that the reference
group (individuals of the same age or same grade) would perform with 90% success (the
denominator of the index). An RPI of 71/90 is interpreted to mean that when others at the
examinee’s age or grade show 90% success on reading tasks, the examinee is predicted to
show only 71% success on the same tasks. On the other hand, if the examinee’s RPI is 98/90,
the examinee is predicted to perform with 98% success those tasks that average age or grade
mates perform with 90% success.
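The RPI numerator can be reconstructed from the W Difference score under the Rasch model that underlies the W scale. The logistic form and the 9.1024 scaling constant (W units per logit) follow the published W-scale definition (Woodcock & Dahl, 1971) rather than anything stated in this chapter, so treat this as an illustrative sketch; the operational scoring program is authoritative and may round or bound values differently.

```python
import math

# Sketch of the RPI numerator, assuming the W scale's published metric of
# 9.1024 W units per logit (20 / ln 9). W DIFF is the examinee's W score
# minus the reference group's median W score.

W_PER_LOGIT = 9.1024

def rpi_numerator(w_diff):
    """Predicted percent success on tasks the reference group performs
    with 90% success."""
    logit = w_diff / W_PER_LOGIT + math.log(9)  # ln 9: the odds at 90% success
    return 100 / (1 + math.exp(-logit))

print(round(rpi_numerator(0)))    # 90: an average examinee earns RPI 90/90
print(round(rpi_numerator(-10)))  # 75: predicted success at a W DIFF of -10
print(round(rpi_numerator(10)))   # 96: predicted success at a W DIFF of +10
```

Under these assumptions a W DIFF of 0 yields exactly 90/90, and indexes such as 71/90 or 98/90 correspond to negative and positive W DIFF values, respectively.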

Instructional Zone (Zona de instrucción)


The instructional zone (called developmental zone in the Batería IV COG and the WJ IV
OL) is a special application of the RPI. An examinee will perceive tasks that fall at an RPI of
96/90 as easy, whereas he or she will perceive tasks that fall at an RPI of 75/90 as difficult.
Thus, the instructional zone identifies a range along a developmental scale that encompasses
an examinee’s present level of functioning from easy (the independent level) to difficult (the
frustration level). The lower and higher points of this zone are labeled EASY and DIFF in the
“Table of Scores” generated when using the online scoring and reporting program.
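The EASY and DIFF endpoints can be sketched by inverting the proficiency logic: at what reference W level would the examinee's predicted success be 96% or 75%? The 9.1024 W-per-logit scaling is an assumption drawn from the published W-scale definition, not from this manual, and the examinee W value is hypothetical.

```python
import math

# Hypothetical sketch of the instructional zone endpoints. EASY is the
# developmental level at which the examinee's RPI would be 96/90; DIFF is
# the level at which it would be 75/90. Scaling assumes 9.1024 W per logit.

W_PER_LOGIT = 9.1024

def zone_endpoint(examinee_w, target_success):
    """Reference W level at which predicted success equals target_success."""
    logit = math.log(target_success / (1 - target_success)) - math.log(9)
    return examinee_w - W_PER_LOGIT * logit

w = 500  # hypothetical examinee W score
print(round(zone_endpoint(w, 0.96)), round(zone_endpoint(w, 0.75)))  # 491 510
```

Under these assumptions the zone spans roughly 10 W units on either side of the examinee's score: tasks about 10 W units below that level feel easy, and tasks about 10 W units above it feel frustrating.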

CALP Levels (Niveles CALP)


Cummins (1984) formalized a distinction between two types of language proficiency: basic
interpersonal communication skill (BICS) and cognitive-academic language proficiency
(CALP). BICS is defined as language proficiency in everyday communicative contexts, or
those aspects of language proficiency that seem to be acquired naturally and without formal
schooling. CALP is defined as language proficiency in academic situations, or those aspects of
language proficiency that emerge and become distinctive with formal schooling. Classroom-
appropriate academic proficiency is further defined by literacy skills involving conceptual-
linguistic knowledge that occur in a context of semantics, abstractions, and context-reduced
linguistic forms.
The online scoring and reporting program includes the option to report CALP levels to
help describe the examinee’s language proficiency. If the option is selected, CALP levels can
be reported for several clusters in the Batería IV APROV (see Table 5-2). If administered
and selected, the Comprensión-conocimiento (Gc) cluster from the Batería IV COG and
select clusters from the WJ IV OL will also yield CALP levels. See the Batería IV Pruebas
de habilidades cognitivas Examiner’s Manual (Wendling et al., 2019) and the Woodcock-
Johnson IV Tests of Oral Language Examiner’s Manual (Mather & Wendling, 2014) for more
information. Table 5-3 illustrates the six CALP levels as well as two regions of uncertainty
and corresponding implications. The CALP levels are based on W Difference scores, and the
RPIs corresponding to these W Difference scores provide meaningful interpretations regarding
the individual’s language proficiency.

Table 5-2.
APROV Clusters That Yield a CALP Level

Reading Clusters                Writing Clusters      Cross-Domain Clusters
Lectura                         Lenguaje escrito      Destrezas académicas
Destrezas básicas en lectura    Expresión escrita     Aplicaciones académicas
Comprensión de lectura                                Aprovechamiento breve



Table 5-3.
CALP Levels and Corresponding Implications

CALP Level                      W Difference    RPI               Instructional Implications
6    Very Advanced              +31 and above   100/90            Extremely easy
5    Advanced                   +14 to +30      98/90 to 100/90   Very easy
4–5  (4.5) Fluent to Advanced   +7 to +13       95/90 to 98/90    Easy
4    Fluent                     –6 to +6        82/90 to 95/90    Manageable
3–4  (3.5) Limited to Fluent    –13 to –7       67/90 to 82/90    Difficult
3    Limited                    –30 to –14      24/90 to 67/90    Very difficult
2    Very Limited               –50 to –31      3/90 to 24/90     Extremely difficult
1    Extremely Limited          –51 and below   0/90 to 3/90      Nearly impossible
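The W Difference bands in Table 5-3 can be encoded directly as a lookup. The function below simply restates the table's cut points; it is an illustration, not the online scoring and reporting program.

```python
# Direct encoding of the W Difference bands in Table 5-3. Labels and cut
# points are taken from the table; integer W DIFF values are assumed.

CALP_BANDS = [  # (lowest W DIFF in band, CALP level label)
    (31, "6 Very Advanced"),
    (14, "5 Advanced"),
    (7, "4-5 Fluent to Advanced"),
    (-6, "4 Fluent"),
    (-13, "3-4 Limited to Fluent"),
    (-30, "3 Limited"),
    (-50, "2 Very Limited"),
]

def calp_level(w_diff):
    for lower_bound, label in CALP_BANDS:
        if w_diff >= lower_bound:
            return label
    return "1 Extremely Limited"

print(calp_level(0))    # 4 Fluent
print(calp_level(-20))  # 3 Limited
```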

Level 6, Very Advanced CALP


When compared with others of the same age or grade, an individual at level 6 demonstrates
very advanced cognitive-academic language proficiency. If provided with instruction at the
examinee’s chronological age or corresponding grade level, it is expected that an individual at
level 6 will find the language demands of the learning task extremely easy.

Level 5, Advanced CALP


When compared with others of the same age or grade, an individual at level 5 demonstrates
advanced cognitive-academic language proficiency. If provided with instruction at the
examinee’s chronological age or corresponding grade level, it is expected that an individual at
level 5 will find the language demands of the learning task very easy.

Level 4, Fluent CALP


When compared with others of the same age or grade, an individual at level 4 demonstrates
fluent cognitive-academic language proficiency. If provided with instruction at the examinee’s
chronological age or corresponding grade level, it is expected that an individual at level 4 will
find the language demands of the learning task manageable.

Level 3, Limited CALP


When compared with others of the same age or grade, an individual at level 3 demonstrates
limited cognitive-academic language proficiency. If provided with instruction at the
examinee’s chronological age or corresponding grade level, it is expected that an individual at
level 3 will find the language demands of the learning task very difficult.

Level 2, Very Limited CALP


When compared with others of the same age or grade, an individual at level 2 demonstrates
very limited cognitive-academic language proficiency. If provided with instruction at the
examinee’s chronological age or corresponding grade level, it is expected that an individual at
level 2 will find the language demands of the learning task extremely difficult.

Level 1, Extremely Limited CALP


When compared with others of the same age or grade, an individual at level 1 demonstrates
extremely limited cognitive-academic language proficiency. If provided with instruction at the
examinee’s chronological age or corresponding grade level, it is expected that an individual at
level 1 will find the language demands of the learning task nearly impossible to manage.

74 Scores and Interpretation


Percentile Rank (Rango percentil)
A percentile rank describes performance on a scale from 1 to 99 relative to the performance
of some segment of the norming sample that is at a specific age or grade level. The examinee’s
percentile rank indicates the percentage of individuals in the selected segment of the norming
sample who had scores the same as or lower than the examinee’s score. Percentile ranks are
particularly useful for describing a person’s relative standing in the population.
Extended percentile ranks (Woodcock, 1987, 1998) provide scores that extend down to a
percentile rank of one tenth (0.1) and up to a percentile rank of ninety-nine and nine tenths
(99.9). Figure 5-1 includes a comparison of the traditional and extended percentile rank scales.
If an examinee’s percentile rank is 0.2, for example, this indicates not only that the score
is below the first percentile (1.0) but also that only 2 people out of 1,000 (0.2%)
would have a score as low or lower. If an individual’s percentile rank is determined to be 99.8,
this indicates that the person’s performance is as good as or better than that of 998 persons
out of 1,000 (99.8%) in the reference group, or that only 2 out of 1,000 people would have a
score as high or higher. Extending the percentile rank scale adds approximately one and one
half standard deviations of discriminating measurement to the range of a traditional percentile
rank scale—three-fourths of a standard deviation at the top and three-fourths of a standard
deviation at the bottom of the scale.

Standard Score (Puntuación estándar)


The standard score scale used in the Batería IV APROV is based on a mean (M) of 100 and a
standard deviation (SD) of 15. This scale is the same as most deviation-IQ scales and may be
used to relate standard scores from the Batería IV to other test scores based on the same mean
and standard deviation. The Batería IV also includes extended standard scores, providing
a greater range of standard scores (0 to over 200) than do other tests. Standard scores
sometimes present a disadvantage to inexperienced users and others, such as parents or the
examinee, because the scores lack objective meaning. Consequently, the interpretation of a
standard score is often explained using its equivalent percentile rank. Figure 5-1 illustrates
the relationship between selected standard scores and the extended percentile rank scale.

Figure 5-1. Extended Percentile Rank Scale. Comparison of the traditional and extended
percentile rank scales with the standard score scale (M = 100, SD = 15).

Traditional PR   Extended PR   Standard Score
                 99.9          146
                 99.8          143
                 99.7          141
                 99.6          140
                 99.5          139
99               99.0          135
98               98            131
95               95            125
90               90            119
80               80            113
70               70            108
60               60            104
50               50            100
40               40            96
30               30            92
20               20            87
10               10            81
5                5             75
2                2             69
1                1.0           65
                 0.5           61
                 0.4           60
                 0.3           59
                 0.2           57
                 0.1           54
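As a rough illustration of the correspondence shown in Figure 5-1, the pairings can be approximated by assuming a normal distribution of standard scores with M = 100 and SD = 15. This sketch is illustrative only; the published scores come from empirical norm tables, not a formula, and the function names below are hypothetical:

```python
# Illustrative sketch only: approximates the Figure 5-1 correspondence with a
# normal model (M = 100, SD = 15); actual Batería IV scores use norm tables.
from statistics import NormalDist

norm = NormalDist(mu=100, sigma=15)

def percentile_rank(standard_score: float) -> float:
    """Percentage of the reference distribution scoring at or below this score."""
    return 100 * norm.cdf(standard_score)

def standard_score(pr: float) -> float:
    """Inverse mapping: percentile rank (between 0 and 100, exclusive) to standard score."""
    return norm.inv_cdf(pr / 100)

print(round(percentile_rank(100), 1))  # 50.0 (the mean falls at the 50th percentile)
print(round(standard_score(99.9)))     # 146, matching the top of the extended scale
```

Inverting an extended percentile rank of 0.1 likewise yields a standard score of about 54, matching the bottom of the extended scale in Figure 5-1.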



In writing reports or communicating test results to parents and others, an examiner may
prefer to use verbal labels rather than numbers to describe test performance. A classification
of standard score and percentile rank ranges is provided in Table 5-4 as a guideline for
describing an individual’s relative standing among age or grade peers. The third column
provides a set of verbal labels for the score ranges. Examiners should use caution and
professional judgment in the selection and application of verbal labels to describe a range of
scores. Although labels may assist in communicating test results, the terminology is at times
ambiguous or the meaning of the labels is misunderstood.

Table 5-4. Classification of Standard Score and Percentile Rank Ranges

Standard Score Range   Percentile Rank Range   Batería IV Classification
131 and above          98 to 99.9              Very Superior
121 to 130             92 to 97                Superior
111 to 120             76 to 91                High Average
90 to 110              25 to 75                Average
80 to 89               9 to 24                 Low Average
70 to 79               3 to 8                  Low
69 and below           0.1 to 2                Very Low
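The Table 5-4 ranges amount to a simple lookup by standard score. The sketch below is illustrative only (the classify function is hypothetical, not part of the online scoring and reporting program); the cut points come directly from the table:

```python
# Hypothetical helper: maps a standard score to the Table 5-4 verbal label.
# Cut points are taken directly from the table's standard score ranges.
def classify(standard_score: int) -> str:
    """Return the Batería IV classification for a standard score."""
    if standard_score >= 131:
        return "Very Superior"
    if standard_score >= 121:
        return "Superior"
    if standard_score >= 111:
        return "High Average"
    if standard_score >= 90:
        return "Average"
    if standard_score >= 80:
        return "Low Average"
    if standard_score >= 70:
        return "Low"
    return "Very Low"

print(classify(104))  # Average
print(classify(68))   # Very Low
```

As the surrounding text cautions, such labels should be applied with professional judgment; the lookup only reproduces the table, not the interpretation.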

The online scoring and reporting program provides the option to report an additional
standard score from a selection of four other types of standard scores: z scores, T scores,
stanines, and normal curve equivalents (NCEs). The basic standard score is the z score with
a mean of 0 and a standard deviation of 1. The T score has a mean of 50 and a standard
deviation of 10. Although T scores have been frequently used in education and industry,
they have been replaced by the deviation-IQ scale (M = 100, SD = 15) for most clinical
applications. Another standard score scale is the traditional stanine scale. Stanines have a
mean of 5 and a standard deviation of 2 and are most useful in applications in which a single-
digit gross scale of measurement is desired. The normal curve equivalent scale (Tallmadge
& Wood, 1976) has a mean of 50 and a standard deviation of 21.06 and has been used most
often for evaluating student performance in certain federally funded programs.
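Because each of these scales is a linear rescaling of the same underlying z score (stanines are additionally reported as whole numbers from 1 to 9), the relationships can be sketched as follows. This is illustrative only: the online scoring and reporting program derives these scores from its norm tables, and the function names here are hypothetical:

```python
# Illustrative scale conversions based on the means and SDs given in the text.
def to_z(ss: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Convert a standard score (M = 100, SD = 15) to a z score (M = 0, SD = 1)."""
    return (ss - mean) / sd

def to_t(z: float) -> float:
    """T score: M = 50, SD = 10."""
    return 50 + 10 * z

def to_nce(z: float) -> float:
    """Normal curve equivalent: M = 50, SD = 21.06."""
    return 50 + 21.06 * z

def to_stanine(z: float) -> int:
    """Stanine: M = 5, SD = 2, approximated by rounding and clipping to 1-9."""
    return max(1, min(9, round(5 + 2 * z)))

z = to_z(115)  # one standard deviation above the mean
print(z, to_t(z), round(to_nce(z), 2), to_stanine(z))  # 1.0 60.0 71.06 7
```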

Standard Error of Measurement (Error estándar de medición)


To provide a more accurate depiction of performance, a statistical estimate can be made of
the amount of error inherent in a score. This estimate, called the standard error of measurement
(SEM), is used to determine ranges of scores and provides an indication of the degree of
confidence professionals can have in an obtained score. One advantage derived from the
Rasch scaling of test data is that a unique calculation of the SEM is provided for each possible
test score. This is in contrast to other test development procedures that may provide only the
average SEM for the group of individuals studied.
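One common use of the SEM is to place a confidence band around an obtained score. The sketch below is illustrative only: the SEM value shown is hypothetical, and as noted above the Batería IV provides a unique Rasch-based SEM for each possible score rather than a single average value:

```python
# Illustrative only: the SEM value used here is hypothetical. The Batería IV
# reports a unique SEM for each possible test score.
from statistics import NormalDist

def confidence_band(obtained: float, sem: float, level: float = 0.68):
    """Return (low, high) for a two-sided confidence band around an obtained score."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.0 for 68%, 1.96 for 95%
    return obtained - z * sem, obtained + z * sem

low, high = confidence_band(obtained=92, sem=3, level=0.95)
print(round(low, 1), round(high, 1))  # 86.1 97.9
```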

Interpreting Tests
This section contains details on interpretation of the tests in each of the curricular areas.
Chapter 2 contains functional definitions of the abilities measured by each test. In evaluating
the practical significance of differences among test performance, consider any extenuating
circumstances that may explain these differences, as well as any unusual behaviors or
responses obtained on those tests. This information may have useful diagnostic implications.



Both the “Test Session Observations Checklist” and the “Qualitative Observation” checklists
available for Tests 1 through 11 can provide additional information about the examinee’s test
performance.
One interpretive plan is to consider each test in terms of task complexity within a
continuum. Some tasks are measures of isolated units; others require connected text,
reasoning, or motoric output. This requires an analysis of the test in terms of stimulus
material, task demands, and the expressive and receptive language requirements needed to
complete the task.
The Batería IV APROV tests may also be interpreted with respect to a well-accepted theory
of cognitive ability—the Cattell-Horn-Carroll (CHC) theory of cognitive abilities (Carroll,
1993; Cattell, 1963; Horn, 1988, 1991; Horn & Cattell, 1966; McGrew, 2005, 2009; Schneider
& McGrew, 2012; Woodcock, 1990). The Batería IV COG Examiner’s Manual and the WJ IV
Technical Manual contain more information on CHC theory.

Interpreting the Reading Tests


When interpreting the reading tests, consider the relative complexity of task demands in
each. Figure 5-2 is an interpretive model of the various skills measured by the Batería IV
reading tests.
In terms of complexity, the skills measured in these six tests range from the lower-level
ability to recognize isolated letters (the beginning items in Prueba 1: Identificación de letras y
palabras) to the higher-level ability to comprehend vocabulary and connected text (Prueba 4:
Comprensión de textos and Prueba 12: Rememoración de lectura).

Figure 5-2. Various skills measured by the Batería IV APROV reading tests, ordered from
MORE COMPLEX to LESS COMPLEX.

Connected Discourse (Translexical Level)
  Prueba 4: Comprensión de textos. Stimulus: printed passages. Task: understanding a
  written passage and completing the passage with a single word.
  Prueba 12: Rememoración de lectura. Stimulus: printed passages. Task: reading and
  recalling elements of a passage.
  Prueba 8: Lectura oral. Stimulus: printed sentences. Task: oral reading of sentences.

Rate/Automaticity
  Prueba 9: Fluidez en lectura de frases. Stimulus: printed sentences. Task: reading and
  understanding short sentences quickly.

Isolated Words (Lexical Level)
  Prueba 1: Identificación de letras y palabras (word items). Stimulus: printed words.
  Task: pronouncing real words.

Phono/Orthographic Coding
  Prueba 7: Análisis de palabras. Stimulus: printed nonsense words. Task: applying phonic
  and structural analysis skills to pronouncing nonsense words.

Isolated Letters (Sublexical Level)
  Prueba 7: Análisis de palabras. Stimulus: printed letters. Task: identifying single phonemes.
  Prueba 1: Identificación de letras y palabras (letter items). Stimulus: printed letters.
  Task: identifying single letters.



Although the Batería IV APROV reading tests are primarily measures of reading ability
(Grw), these tests require other cognitive abilities as well, such as auditory processing
(Ga), comprehension-knowledge (Gc), processing speed (Gs), or long-term storage and
retrieval (Glr).

Prueba 1: Identificación de letras y palabras (Test 1: Letter-Word Identification)


This test is a measure of reading decoding (Grw), including the ability to identify the names
of several uppercase and lowercase letters and the ability to identify words. An individual
with good sight-word recognition skills demonstrates a pattern of recognizing many words
rapidly with little effort. Low performance on Identificación de letras y palabras may be a
function of inefficient strategies for word identification or response style. In most cases,
low scores mean that the person has not developed automatic word identification skills. An
examinee with nonautomatic word identification skills may identify several words accurately
but may require increased time and greater attention to phonological and orthographical
analysis to determine the correct response. In some cases, however, an examinee may have
developed some word identification skill but is unwilling to try, is frustrated, or is afraid to
risk making an error.
The “Qualitative Observation” checklist for this test helps document how the examinee
approached the task. Data were collected during standardization on checklists that were
completed by examiners. Table 5-5 provides information about the percentage of age mates
who were assigned each rating. For example, at age 9, 4% were rated as being able to identify
words rapidly and accurately, 7% were rated as having nonautomatic word identification
skills, and 1% did not apply phoneme-grapheme relationships. The majority of 9-year-olds
(75%) were rated as identifying initial words rapidly and accurately and then identifying
more difficult items with increased application of phoneme-grapheme relationships. Thus,
this would be considered typical performance for 9-year-olds. Using this information can help
determine how typical or atypical the examinee’s performance is compared to age mates.

Prueba 4: Comprensión de textos (Test 4: Passage Comprehension)


This test is a measure of reading comprehension (Grw) and lexical knowledge (Gc). This
modified cloze task requires the ability to use syntactic and semantic cues. Low performance
on Prueba 4: Comprensión de textos may be a function of limited basic reading skills,
comprehension difficulties, or both.
The essence of passage comprehension ability, as an independent skill, is how well an
individual understands written discourse as it is being read. The requirement that passage
comprehension be defined and measured as an independent skill is no different from good
measurement in any other area of achievement or cognitive ability. Scores from measures
that are not independent (confounded) are difficult to interpret. For a measure of passage
comprehension to meet the assumption that it is an independent measure requires a
reasonable expectation that examinees have prior familiarity with the words used in the
passages and have knowledge of any concepts that are prerequisite for processing the passage
contents. If these conditions are not met, the so-called passage comprehension test score
is confounded with word recognition skill and knowledge. For many examinees, a test
passage concerned with the spectrographic analysis of white light would be more a measure
of knowledge, or ignorance, of physics vocabulary and concepts than of the capability to
understand written discourse.
Some tests of reading comprehension are actually tests of information processing that
happen to use reading as the medium of communication. Asking an individual to study a
passage and then answer questions about the content, such as to state the author’s purpose



or to predict what may happen next, does not tap skills specific to reading. It taps language
processing and cognitive skills. These are valid skills to assess in their own right, regardless
of the medium of communication (for example, printed text, an audio recording, a television
excerpt, or a mime performance). However, scores from such tests do not measure the
essence of reading comprehension, but instead reflect performance on a confounded
language-processing task with indeterminate diagnostic results. A program of remedial
instruction planned for an individual may be ineffective if it is assumed that the problem is
with the person’s reading skill when the problem is actually a symptom of a broader language
processing skill. In fact, such problems might be remediated more effectively using materials
and procedures that do not require reading. For example, if a person has an information
processing weakness that interferes with appreciating the main purpose of a passage or
anticipating what may happen next, the remediation might best be approached using a variety
of media including listening, watching television, and reading. Broadening the language
base of instruction makes it more likely that the training will generalize to all areas of
communication and thinking, including reading.

Table 5-5. Percentage by Age of Occurrence of Qualitative Observations for Prueba 1:
Identificación de letras y palabras

Rating 1: Identified words rapidly and accurately with little effort (automatic word
identification skills)
Rating 2: Identified initial items rapidly and accurately and identified more difficult items
through increased application of phoneme-grapheme relationships (typical)
Rating 3: Identified initial items rapidly and accurately but had difficulty applying
phoneme-grapheme relationships to latter items (nonautomatic word identification skills)
Rating 4: Required increased time and greater attention to phoneme-grapheme relationships
to determine the correct response
Rating 5: Was not able to apply phoneme-grapheme relationships
Rating 6: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4   Rating 5   Rating 6
2     NA         3          4          4          26         63
3     1          19         5          3          19         54
4     1          27         10         3          9          50
5     4          40         11         7          18         21
6     9          60         10         14         6          1
7     5          68         22         2          1          1
8     15         65         13         4          2          NA
9     4          75         13         7          1          NA
10    5          74         11         6          4          1
11    1          73         16         8          1          1
12    3          71         16         9          1          NA
13    1          68         19         9          3          1
14    6          56         18         15         3          1
15    3          62         27         8          NA         NA
16    8          61         20         7          2          2
17    4          65         20         10         NA         NA
18    6          60         22         12         1          NA
19    7          60         18         13         1          NA
NA = Not observed or not rated



The modified cloze procedure used in Prueba 4: Comprensión de textos requires an
examinee to dynamically apply a variety of vocabulary and comprehension skills in the
process of arriving at the point where the missing word can be supplied in a passage. It
should be noted that with good modified cloze items an examinee should be unable to
provide the answer based on local context in the passage. Consider the following three
cloze examples.
Provide the missing word:
“...do something about ______ it.”
Now, read the entire sentence and attempt to provide the missing word:
“It is another thing to do something about ______ it.”
Finally, read and answer the entire item:
“It is one thing to demonstrate that modern war is harmful to the species. It is another
thing to do something about ______ it.”
Note that the solution to this item (for example, the word preventing) required
understanding of not only the sentence containing the blank, but also the preceding sentence,
thus requiring the use of a variety of reading, language processing, and vocabulary skills.
Such a task more likely measures an examinee’s ability to understand written discourse as it is
being read than many other reading comprehension test formats.
The question is sometimes asked whether Prueba 4: Comprensión de textos is a measure
of literal or inferential comprehension. During the process of completing a typical item, the
examinee likely draws on both types of comprehension. However, the process of providing
the missing word may be a result of inferential comprehension because the examinee must
infer an acceptable word from the total context of the passage. A useful comparison is
performance on this reading task with performance on the WJ IV OL Comprensión oral test, a
parallel task that does not require reading.
The “Qualitative Observation” checklist for this test helps document how the examinee
approached the task. Table 5-6 provides information about the percentage of age mates who
were assigned each rating in the norming sample. For example, of the 10-year-olds whose
performance was rated, 83% appeared to have typical passage comprehension, 7% appeared
to read with no observed difficulties, and 9% read slowly and had difficulty identifying
the correct word. Using this information can help determine how typical or atypical the
examinee’s performance is compared to age mates.

Prueba 7: Análisis de palabras (Test 7: Word Attack)


This test measures an examinee’s ability to apply phonic and structural analysis skills in
pronouncing phonically and orthographically regular nonsense or nonwords (Grw, Ga).
The individual must recall the phoneme associated with each grapheme and then blend
or synthesize the phonemes into a word. Knowledge of word structure is required for the
multisyllabic nonsense words. The “Qualitative Observation” checklist for this test helps
document how the examinee approached the task.
In most cases, poor performance on Prueba 7: Análisis de palabras means that the examinee
has not developed or mastered phonetic decoding skills. In some cases, however, an examinee
may have developed some phonic and structural analysis skills but is unwilling to try, is
frustrated, or is afraid to risk making an error.



Prueba 8: Lectura oral (Test 8: Oral Reading)
This test measures an examinee’s ability to apply important aspects of reading fluency, such
as accuracy and prosody, when reading sentences aloud (Grw). Low performance on Prueba
8: Lectura oral may be a function of limited decoding skills, comprehension difficulties, or
both, resulting in a lack of reading fluency. Individuals with expressive language impairments
may struggle with the oral demands of this task. The “Qualitative Observation Tally” for this
test helps document the number of each error type the examinee made while reading. This
information can help with planning an appropriate intervention.

Table 5-6. Percentage by Age of Occurrence of Qualitative Observations for Prueba 4:
Comprensión de textos

Rating 1: Appeared to read passages with no observed difficulties (good use of syntactic
and semantic cues)
Rating 2: Appeared to read initial passages easily but appeared to struggle as reading
increased in difficulty (typical)
Rating 3: Appeared to read passages very slowly and had difficulty identifying a correct
word (struggled with application of syntactic and semantic cues)
Rating 4: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4
2     NA         11         11         78
3     NA         20         6          74
4     1          29         6          64
5     3          40         17         39
6     5          69         21         6
7     9          82         9          NA
8     12         78         9          NA
9     3          87         9          1
10    7          83         9          1
11    12         83         5          NA
12    6          84         9          1
13    7          84         7          3
14    15         75         9          1
15    9          82         8          1
16    16         75         8          2
17    16         74         9          1
18    7          83         10         NA
19    15         73         11         1
NA = Not observed or not rated

Prueba 9: Fluidez en lectura de frases (Test 9: Sentence Reading Fluency)


This test is a measure of reading speed and rate (Grw, Gs). The task requires the ability
to read and comprehend simple sentences quickly. Low performance on this test may be a
function of limited basic reading skills, comprehension difficulties, slow processing speed,
and/or an inability to sustain concentration. The “Qualitative Observation” checklist for this
test helps document how the examinee approached the task. Table 5-7 provides information
about the percentage of age mates who were assigned each rating in the norming sample.



For example, of the 9-year-olds whose performance was rated, 15% appeared to read the
sentences slowly and 7% appeared to read them rapidly. The majority of 9-year-olds (78%)
appeared to read at a rate typical for their age. Using this information can help determine
how typical or atypical the examinee’s performance is compared to age mates.

Table 5-7. Percentage by Age of Occurrence of Qualitative Observations for Prueba 9:
Fluidez en lectura de frases

Rating 1: Appeared to read sentences rapidly
Rating 2: Appeared to read sentences at a rate typical for peers
Rating 3: Appeared to read sentences slowly
Rating 4: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4
5     4          12         8          77
6     4          46         14         36
7     8          65         18         10
8     11         76         13         NA
9     7          78         15         NA
10    17         68         15         NA
11    20         72         8          NA
12    17         76         6          NA
13    17         75         8          NA
14    23         67         11         NA
15    21         79         NA         NA
16    17         78         5          NA
17    30         66         5          NA
18    18         73         7          2
19    29         58         11         2
NA = Not observed or not rated

Prueba 12: Rememoración de lectura (Test 12: Reading Recall)


This test is a measure of reading comprehension (Grw) and meaningful memory (Glr).
Low performance may result from a variety of factors, including limitations in attentional
control, limited basic reading skills, comprehension difficulties, or weaknesses in memory.
Additionally, weaknesses in oral language also can impact performance. For example,
individuals with expressive language difficulties may struggle with the oral retelling of the
details in a reading passage.

Interpreting the Math Tests


When interpreting the math tests, consider the relative complexity of task demands in each.
Figure 5-3 provides an interpretive model of the various skills measured by the Batería IV
APROV math tests.



Figure 5-3. Various skills measured by the Batería IV APROV math tests, ordered from
MORE COMPLEX to LESS COMPLEX.

Problem Solving and Concepts
  Prueba 2: Problemas aplicados. Stimulus: printed problems presented orally. Task:
  analyzing and solving practical problems.
  Prueba 13: Números matrices. Stimulus: rectangular array of numbers. Task: analyzing
  numerical relationships.

Skills
  Prueba 5: Cálculo. Stimulus: printed items for computation. Task: performing simple to
  complex computations.

Automaticity
  Prueba 10: Fluidez en datos matemáticos. Stimulus: printed math facts. Task: quickly
  calculating single-digit math facts (addition, subtraction, and multiplication).

Basic Math Facts
  Prueba 5: Cálculo. Stimulus: single-digit computations. Task: calculating single-digit facts.

Motoric Output
  Prueba 5: Cálculo. Stimulus: orally presented numbers. Task: writing numbers.

In terms of complexity, the skills measured in the four Batería IV math tests range from
the lower-level ability of recognizing math symbols and vocabulary to the higher-level ability
of mathematical reasoning and problem solving. Based on CHC theory, the math tests are
primarily measures of quantitative knowledge (Gq), although some math tests measure other
aspects of processing, particularly fluid reasoning (Gf ) or processing speed (Gs).

Prueba 2: Problemas aplicados (Test 2: Applied Problems)


This test is a measure of quantitative reasoning, math achievement, and math knowledge
(Gq). The task requires the ability to analyze and solve math problems. This test also
measures an aspect of fluid reasoning (Gf ). Low performance on Prueba 2: Problemas
aplicados may be a function of limited math skills, comprehension difficulties, or poor
mathematical reasoning ability. The “Qualitative Observation” checklist for this test helps
document how the examinee approached the task. Table 5-8 provides information about
the percentage of age mates who were assigned each rating in the norming sample. For
example, of the 14-year-olds whose performance was rated, 17% appeared to have limited
understanding of age-appropriate math applications, while 10% solved the problems with no
observed difficulties. Using this information can help determine how typical or atypical the
examinee’s performance is compared to age mates.



Table 5-8. Percentage by Age of Occurrence of Qualitative Observations for Prueba 2:
Problemas aplicados

Rating 1: Solved problems with no observed difficulties (good comprehension and
analytical abilities)
Rating 2: Solved initial problems with no observed difficulty but demonstrated increasing
difficulties solving the latter items (typical)
Rating 3: Appeared to have limited understanding of grade- or age-appropriate math
application tasks
Rating 4: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4
2     1          26         30         44
3     5          55         21         19
4     4          80         9          7
5     6          85         9          NA
6     9          85         5          1
7     10         86         4          NA
8     14         79         7          NA
9     9          82         8          1
10    5          87         9          NA
11    7          84         8          1
12    6          86         9          NA
13    7          82         10         1
14    10         71         17         2
15    5          77         18         1
16    5          67         26         2
17    7          66         25         3
18    5          60         35         NA
19    4          66         28         2
NA = Not observed or not rated

Prueba 5: Cálculo (Test 5: Calculation)


This test of math achievement measures the ability to perform mathematical computations
(Gq). The task requires the examinee to perform a variety of calculations ranging from simple
addition to calculus. Low performance may be a function of limited basic math skills, limited
instruction, or lack of attention. The “Qualitative Observation” checklist for this test helps
document how the examinee approached the task. Table 5-9 provides information about the
percentage of age mates who were assigned each rating in the norming sample. For example,
of the 12-year-olds whose performance was rated, 8% worked very slowly and relied on
inefficient strategies, 8% solved the problems quickly and with no observed difficulties, and
2% appeared to work too quickly. Using this information can help determine how typical or
atypical the examinee’s performance is compared to age mates.



Table 5-9. Percentage by Age of Occurrence of Qualitative Observations for Prueba 5:
Cálculo

Rating 1: Worked too quickly
Rating 2: Solved problems quickly with no observed difficulties (fluent and automatic)
Rating 3: Solved initial problems quickly with no observed difficulties but demonstrated
less automaticity with the latter items (typical)
Rating 4: Solved problems slowly and demonstrated less automaticity with the latter items
(nonautomatic)
Rating 5: Worked very slowly and relied on use of strategies that appeared to be inefficient
for age or grade level
Rating 6: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4   Rating 5   Rating 6
4     NA         NA         17         8          8          67
5     2          5          24         16         7          46
6     1          4          52         21         9          12
7     1          5          68         16         9          NA
8     NA         6          60         27         6          1
9     1          3          63         25         7          1
10    NA         2          75         14         8          1
11    2          4          76         13         5          NA
12    2          8          66         16         8          1
13    1          8          65         18         5          3
14    1          9          57         18         9          5
15    2          8          56         24         9          2
16    1          8          65         19         5          3
17    NA         8          51         31         7          3
18    NA         9          44         32         14         2
19    1          7          56         22         13         2
NA = Not observed or not rated

Prueba 10: Fluidez en datos matemáticos (Test 10: Math Facts Fluency)
This test is a measure of math achievement and number facility requiring the examinee
to solve simple addition, subtraction, and multiplication problems rapidly (Gq, Gs). Low
performance on Fluidez en datos matemáticos may be a function of limited knowledge of basic
math facts or lack of automaticity. The “Qualitative Observation” checklist for this test helps
document how the examinee approached the task. Table 5-10 provides information about the
percentage of age mates who were assigned each rating in the norming sample. For example,
while 17% of the 8-year-olds whose performance was rated solved the problems slowly, only
6% of the 11-year-olds had that same rating. Using this information can help determine how
typical or atypical the examinee’s performance is compared to age mates.



Table 5-10. Percentage by Age of Occurrence of Qualitative Observations for Prueba 10:
Fluidez en datos matemáticos

Rating 1: Solved problems quickly
Rating 2: Solved problems at a rate typical for peers
Rating 3: Solved problems slowly
Rating 4: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4
5     6          29         19         45
6     8          49         30         13
7     3          77         19         NA
8     10         72         17         NA
9     5          83         12         NA
10    12         72         14         2
11    23         71         6          NA
12    18         70         11         NA
13    22         70         8          NA
14    23         64         12         NA
15    21         74         6          NA
16    33         60         7          NA
17    25         68         7          NA
18    29         59         13         NA
19    35         55         8          2
NA = Not observed or not rated

Prueba 13: Números matrices (Test 13: Number Matrices)


This test is a measure of quantitative reasoning, an aspect of fluid reasoning (Gf ). The
task requires the ability to inductively and deductively reason with numbers to determine
a missing number in a matrix. Low performance may be a function of limited quantitative
reasoning. It may be helpful to compare an individual’s performance on this test to his or her
performance on Batería IV COG tests that require reasoning: Prueba 2: Series numéricas and
Prueba 9: Formación de conceptos.

Interpreting the Written Language Tests


When interpreting the written language tests, consider the relative complexity of written
language skills in each. Figure 5-4 is an interpretive model of the skills measured in the
Batería IV APROV written language tests.
The Batería IV measures three aspects of writing skill: spelling, writing fluency, and quality
of written expression. Additionally, the quality of an individual’s handwriting can be observed
informally on Prueba 3: Ortografía, Prueba 6: Expresión de lenguaje escrito, and Prueba 11:
Fluidez en escritura de frases. In terms of relative complexity, the skills measured in these tests
range from producing legible handwritten output to generative writing for quality of
expression in Prueba 6: Expresión de lenguaje escrito, which requires ideas, organization, task
adherence, and reasoning.



Figure 5-4. Various skills measured by the Batería IV APROV writing tests, ordered from
MORE COMPLEX to LESS COMPLEX.

Connected Discourse (Translexical Level)
  Prueba 6: Expresión de lenguaje escrito. Stimulus: various sentence prompts with
  differing task demands. Task: writing for the quality of expression.

Rate/Automaticity
  Prueba 11: Fluidez en escritura de frases. Stimulus: a picture and three words to form
  into a sentence. Task: writing short sentences quickly; requires correct syntax and
  automaticity.

Phono/Orthographic Coding
  Prueba 3: Ortografía. Stimulus: orally dictated words. Task: producing correct spellings.

Isolated Letters (Sublexical Level)
  Prueba 3: Ortografía. Stimulus: orally presented letters. Task: writing letter names.

Motoric Output (Handwriting)
  Prueba 6: Expresión de lenguaje escrito. Stimulus: responses. Task: writing legibly.

Prueba 3: Ortografía (Test 3: Spelling)


This test measures knowledge of prewriting skills and spelling (Grw). The task requires the
production of single letters or words in response to oral prompts.
Performance on Prueba 3: Ortografía may be related to several factors, including
handwriting. If an examinee is unable to complete Items 1 through 3, he or she may not have
developed the muscular control or visual-motor skill needed in beginning handwriting.
A closer analysis of Prueba 3: Ortografía items will help examiners differentiate between
phonetically accurate and phonetically inaccurate spelling errors. In analyzing an examinee’s
responses, an examiner may determine whether a difference exists in the individual’s ability
to spell words that have regular phoneme-grapheme correspondence and those that require
the memorization of visual features. In addition, the following specific error patterns may
be present in an examinee’s misspellings: (a) addition of unnecessary letters, (b) omissions
of needed letters, (c) mispronunciations or dialectal speech patterns, (d) reversals of letters,
(e) transpositions of whole words (e.g., los for sol), word parts (e.g., ropa for paro), or
consonants and/or vowels, (f) phonetic spellings of nonphonetic words, and (g) incorrect
associations of sounds with letters (e.g., enbutido for embutido).
The “Qualitative Observation” checklist for this test helps document how the examinee
approached the task. Table 5-11 provides information about the percentage of age mates who
were assigned each rating in the norming sample. For example, of the 13-year-olds whose
performance was rated, 2% spelled words easily and accurately and 27% spelled words in a
laborious, nonautomatic manner. Using this information can help determine how typical or
atypical the examinee’s performance is compared to age mates.

Table 5-11. Percentage by Age of Occurrence of Qualitative Observations for Prueba 3: Ortografía

Rating 1: Spelled words easily and accurately
Rating 2: Spelled initial items easily and accurately; spelling of latter items reflected a need for further skill development (typical)
Rating 3: Spelled words in a laborious manner (nonautomatic)
Rating 4: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4
2 NA 7 6 86
3 NA 18 10 72
4 2 39 8 51
5 2 56 16 25
6 6 76 16 3
7 15 82 3 NA
8 18 77 5 NA
9 11 78 10 1
10 4 81 14 1
11 8 80 12 1
12 6 74 19 1
13 2 65 27 7
14 2 67 26 4
15 3 76 16 5
16 5 74 19 2
17 3 77 18 2
18 3 73 24 NA
19 7 64 27 3
NA = Not observed or not rated
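Read programmatically, each row of the table is a base-rate lookup. The sketch below is illustrative only: just two ages from the table are transcribed, and the `base_rate` helper is a hypothetical name, not part of any scoring software.

```python
# Base rates transcribed from Table 5-11 (Prueba 3: Ortografía).
# Each tuple holds the percentage of the norming sample assigned
# Ratings 1-4 at that age; only two ages are shown for illustration.
BASE_RATES = {
    13: (2, 65, 27, 7),
    14: (2, 67, 26, 4),
}

def base_rate(age, rating):
    """Percentage of age mates in the norming sample given this rating."""
    return BASE_RATES[age][rating - 1]

# A 13-year-old rated 3 (laborious, nonautomatic spelling) matches 27%
# of age mates, so the observation by itself is not unusual.
print(base_rate(13, 3))  # 27
```

The same lookup applies to Tables 5-12 and 5-13; only the rating definitions and percentages change.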

Prueba 6: Expresión de lenguaje escrito (Test 6: Written Language Expression)


This test measures the ability to convey ideas in writing (Grw). The task requires the
production of meaningful written sentences in response to a variety of task criteria.
Performance on Prueba 6: Expresión de lenguaje escrito may be related to several factors,
including an examinee’s attitude toward writing, oral language performance, vocabulary, and
organizational ability. Some individuals are highly resistant to writing and produce only short,
simple sentences. In rare cases, the person may refuse to write. In many cases, such people
have experienced failure in attempting to write. In addition, an individual’s oral language
performance may affect his or her Prueba 6: Expresión de lenguaje escrito scores. Dialects and
cultural influences may affect not only the way people pronounce words, but also how they
spell the words. Many individuals with low oral vocabulary abilities will have low written
vocabulary abilities. Finally, organizational abilities may be related to performance on Prueba
6: Expresión de lenguaje escrito. One item type on this test requires the examinee to fill in a
missing middle sentence in a paragraph. Sequencing and organizational abilities, or the ability
to arrange thoughts logically in writing, may be a contributing factor.
The “Qualitative Observation” checklist helps document how the examinee approached
the task. Table 5-12 provides information about the percentage of age mates who were
assigned each rating in the norming sample on the WJ IV ACH Writing Samples test, which
is very similar to Prueba 6: Expresión de lenguaje escrito in both content and process. For
example, of the 9-year-olds whose performance was rated, 4% wrote sentences that were both
complex and detailed and 19% wrote inadequate sentences. Using this information can help
determine how typical or atypical the examinee’s performance is compared to age mates.

Table 5-12. Percentage by Age of Occurrence of Qualitative Observations for WJ IV ACH Test 6: Writing Samples

Rating 1: Sentences were both complex and detailed
Rating 2: Sentences were simple but adequate (typical)
Rating 3: Sentences were inadequate (for example, run-ons, incomplete sentences, awkward syntax, or limited content)
Rating 4: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4
4 NA 6 6 89
5 1 17 16 66
6 1 47 28 25
7 6 68 20 7
8 10 71 18 1
9 4 77 19 1
10 5 79 14 1
11 10 77 13 1
12 14 74 12 NA
13 9 71 19 1
14 11 67 19 2
15 13 64 20 2
16 20 64 12 3
17 17 70 13 1
18 12 73 14 1
19 20 67 11 2
NA = Not observed or not rated

Prueba 11: Fluidez en escritura de frases (Test 11: Sentence Writing Fluency)
This test measures the examinee’s ability to write rapidly with ease (automaticity; Grw, Gs).
The task requires the production of legible, simple sentences with acceptable syntax. Minimal
analytic attention or problem solving is necessary.
Performance on Prueba 11: Fluidez en escritura de frases may be related to several factors,
including muscular or motor control, response style, ability to sustain concentration, and
reading or spelling skills. When an examinee’s attention is focused on the mechanics of
writing rather than on the formulation or expression of ideas, writing is not automatic. Poor
muscular control may contribute to a concentration on the mechanics of the writing task and
contribute to low scores. In addition, an examinee’s response to timed tasks can influence the
quality of automaticity. A range of different response styles to this task has been observed.
Some examinees complete all tasks at a slow, consistent pace, regardless of imposed time
constraints. Other examinees work very rapidly but tend to make a lot of careless errors. In
an interpretation of the examinee’s response style, an examiner may want to determine whether
the examinee worked (a) slowly but inaccurately, (b) slowly and accurately, (c) rapidly but
inaccurately, or (d) rapidly and accurately. Also, low scores on Prueba 11: Fluidez en escritura
de frases may be related to an observed difficulty in sustaining concentration for the 5-minute
time period of the test. For example, some examinees write a few words and then look
around the room. They need to be redirected to the task. Others ask the examiner how much
time has elapsed. However, difficulty sustaining attention could be related to frustration with
writing tasks. Word recognition and spelling skill also may affect performance on this task,
especially for younger children or older students with limited skill. Although the stimulus
words are controlled in terms of reading difficulty, and the examiner is allowed to read any
requested word, some examinees may misread a word or may not ask for a pronunciation of
an unrecognized word. Some examinees with spelling difficulties will need to glance at each
stimulus word several times to copy it correctly, thus affecting writing speed.
The “Qualitative Observation” checklist for this test helps document how the examinee
approached the task. Table 5-13 provides information about the percentage of age mates who
were assigned each rating in the norming sample. For example, of the 10-year-olds whose
performance was rated, 19% had difficulty formulating or writing sentences quickly and 26%
wrote appropriate sentences at a slow pace. Using this information can help determine how
typical or atypical the examinee’s performance is compared to age mates.

Table 5-13. Percentage by Age of Occurrence of Qualitative Observations for Prueba 11: Fluidez en escritura de frases

Rating 1: Wrote sentences with remarkable ease and accuracy
Rating 2: Wrote appropriate sentences at an adequate pace (typical)
Rating 3: Wrote appropriate sentences at a slow pace
Rating 4: Had trouble formulating or writing sentences quickly
Rating 5: None of the above, not observed, or does not apply

Percentage of Occurrence in Norming Sample
Age   Rating 1   Rating 2   Rating 3   Rating 4   Rating 5
5 NA 11 NA 26 63
6 NA 22 22 44 13
7 NA 27 35 35 3
8 6 48 23 23 NA
9 NA 66 22 12 NA
10 3 52 26 19 NA
11 13 60 18 10 NA
12 5 81 10 5 NA
13 5 77 16 2 2
14 5 79 5 9 2
15 17 72 11 NA NA
16 12 75 8 2 3
17 NA 91 7 2 NA
18 13 65 16 6 NA
19 19 66 7 7 NA
NA = Not observed or not rated

Interpreting Variations and Comparisons


The Batería IV provides a procedure for norm-based evaluation of the presence and
significance of strengths and weaknesses among an individual’s cognitive, linguistic, and
achievement abilities. This information is especially appropriate for documenting the nature
of and differentiating between intra-ability variations and ability/achievement comparisons.
Table 5-14 depicts the various variation and ability/achievement comparison or discrepancy
procedures available in the Batería IV.

Table 5-14. Batería IV Intra-Ability Variation and Ability/Achievement Comparison Procedures

Intra-Ability Variation Models
  Intra-Cognitive
  Intra-Achievement
  Academic Skills/Academic Fluency/Academic Applications

Ability/Achievement Comparison Models
  General Intellectual Ability/Achievement
  Gf-Gc Composite/Other Abilities
  Scholastic Aptitude/Achievement
  Oral Language/Achievement

Intra-Ability Variations
Intra-ability variation models are bidirectional comparisons (as represented by the two-
headed arrows in Figure 5-5) that allow comparison of performance among skills and
abilities. There are three types of intra-ability variations in the Batería IV: intra-achievement
(determined with the Batería IV APROV), academic skills/academic fluency/academic
applications (determined with the Batería IV APROV), and intra-cognitive (determined with
the Batería IV COG). The two variation procedures discussed in detail here are the ones
available when using the Batería IV APROV. While a summary is presented here, consult the
Batería IV COG Examiner’s Manual for further information about the intra-cognitive variation
procedure.

Intra-Achievement Variations
This variation procedure allows comparison of one area of academic achievement to
the examinee’s expected or predicted performance as determined by his or her average
performance on other achievement areas. An intra-achievement variation is present within
individuals who have specific achievement strengths or weaknesses, such as superior
math skills relative to their expected achievement based on their average performance
in other areas of achievement. Individuals with a significant intra-achievement variation
exhibit specific strengths or weaknesses in one or more areas of achievement. This type of
information is an invaluable aid in instructional planning and can be used, for example,
to support the hypothesis of a specific difficulty as compared to generally low academic
performance across achievement domains. For example, a student may perform poorly in
mathematics but may have average abilities on tasks involving reading.
As indicated in Table 5-15, intra-achievement variations can be calculated if Batería IV
APROV Tests 1 through 6 are administered. Each test is compared to the examinee’s predicted
or expected test score based on his or her average performance on the other five tests. For
example, when considering Prueba 1: Identificación de letras y palabras, the individual’s
average performance on the remaining five tests (Tests 2 through 6) is used as the predictor
to determine his or her expected score on Prueba 1: Identificación de letras y palabras. This
expected score is then compared to the person’s obtained Prueba 1: Identificación de letras
y palabras score. If the individual’s expected score is higher than his or her actual score, a
relative weakness is identified. If the expected score is lower than the actual score, a relative
strength is identified.
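The direction of the comparison can be sketched in a few lines. This is an illustration only: the operational expected score comes from the norming sample and accounts for regression to the mean, not the raw average used here, and the standard scores in the example are hypothetical.

```python
def variation_direction(scores, target):
    """Compare a test's obtained standard score to a simple predictor
    built from the examinee's average on the remaining tests.

    Illustrative only: the Batería IV scoring software derives the
    expected score from norming-sample data rather than this raw mean.
    """
    others = [score for name, score in scores.items() if name != target]
    predictor = sum(others) / len(others)  # stands in for the expected score
    obtained = scores[target]
    if obtained < predictor:
        return "relative weakness"  # expected score higher than obtained
    if obtained > predictor:
        return "relative strength"  # expected score lower than obtained
    return "no difference"

# Hypothetical standard scores for Batería IV APROV Tests 1-6
scores = {
    "Prueba 1": 78, "Prueba 2": 101, "Prueba 3": 99,
    "Prueba 4": 103, "Prueba 5": 100, "Prueba 6": 97,
}
print(variation_direction(scores, "Prueba 1"))  # relative weakness
```

The same bidirectional logic underlies the intra-cognitive and cross-academic variation procedures; only the set of predictor tests changes.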

Figure 5-5. Three types of intra-ability variation models in the Batería IV.

Intra-Ability Variation Procedures (bidirectional comparisons)
  Intra-Cognitive Variations: comparisons among cognitive abilities
  Intra-Achievement Variations: comparisons among achievement areas
  Academic Skills/Academic Fluency/Academic Applications Variations: comparisons among the three cross-academic clusters

As an option, other tests can be included in the variation procedure. When including any
additional tests, the corresponding cluster or clusters that are created also are included in
the variation procedure. For example, if Prueba 7: Análisis de palabras is administered, the
Destrezas básicas en lectura cluster is available when combined with Prueba 1: Identificación
de letras y palabras. Therefore, both Prueba 7: Análisis de palabras and the Destrezas básicas
en lectura cluster are compared to the same expected score based on the same predictor as
Prueba 1: Identificación de letras y palabras in the variation procedure. No matter how many
tests are administered, the predictor score is always based on five tests from Batería IV
APROV Tests 1 through 6. An intra-achievement variation is present within individuals
who have specific academic strengths or weaknesses, such as superior Destrezas básicas en
lectura (Grw) relative to their expected performance based on their average performance on
the remaining five tests. If any of the optional additional tests are included in the variation
procedure, the variation is labeled Intra-Achievement (Extended).

Table 5-15. Batería IV Intra-Achievement Variations

Required From Batería IV APROV (Tests 1–6), with the optional tests and clusters that use the same predictor as the required test:

Prueba 1: Identificación de letras y palabras
  Optional: Prueba 7: Análisis de palabras; Prueba 8: Lectura oral; Destrezas básicas en lectura; Fluidez en la lectura (also requires Prueba 9: Fluidez en lectura de frases)

Prueba 2: Problemas aplicados
  Optional: Prueba 13: Números matrices; Resolución de problemas matemáticos

Prueba 3: Ortografía
  Optional: none

Prueba 4: Comprensión de textos
  Optional: Prueba 9: Fluidez en lectura de frases; Prueba 12: Rememoración de lectura; Comprensión de lectura

Prueba 5: Cálculo
  Optional: Prueba 10: Fluidez en datos matemáticos; Destrezas en cálculos matemáticos

Prueba 6: Expresión de lenguaje escrito
  Optional: Prueba 11: Fluidez en escritura de frases; Expresión escrita

Academic Skills/Academic Fluency/Academic Applications Variations


This variation procedure allows comparison of the examinee’s performance in skills, fluency,
and applications across the academic areas of reading, written language, and mathematics.
Nine Batería IV APROV tests are required: three in reading (Prueba 1: Identificación de letras
y palabras, Prueba 4: Comprensión de textos, Prueba 9: Fluidez en lectura de frases), three in
mathematics (Prueba 2: Problemas aplicados, Prueba 5: Cálculo, Prueba 10: Fluidez en datos
matemáticos), and three in written language (Prueba 3: Ortografía, Prueba 6: Expresión de
lenguaje escrito, Prueba 11: Fluidez en escritura de frases). The Destrezas académicas cluster is
composed of Prueba 1: Identificación de letras y palabras, Prueba 3: Ortografía, and Prueba 5:
Cálculo. The Fluidez académica cluster is composed of Prueba 9: Fluidez en lectura de frases,
Prueba 10: Fluidez en datos matemáticos, and Prueba 11: Fluidez en escritura de frases. The
Aplicaciones académicas cluster is composed of Prueba 2: Problemas aplicados, Prueba 4:
Comprensión de textos, and Prueba 6: Expresión de lenguaje escrito. Individuals with a
significant variation exhibit a specific strength or a weakness, such as limited academic
skills relative to their expected performance based on their average performance on the
other two cross-academic areas. This information is helpful in documenting the need for
an accommodation or modification of instruction. For example, if the individual has a
significant weakness in fluency, this may indicate he or she needs extended time or shortened
assignments. There are additional options for this variation procedure that include two
cognitive areas (cognitive processing speed, perceptual speed). These additional options are
compared to the same expected score based on the same predictor as the Fluidez académica
cluster. If any of the optional additional tests are included in the variation procedure, the
variation is labeled Academic Skills/Academic Fluency/Academic Applications (Extended).
Table 5-16 identifies the tests required for the various options in the procedure.

Table 5-16. Batería IV Academic Skills/Academic Fluency/Academic Applications Variations

Required From Batería IV APROV (Tests 1–6, 9–11):

Destrezas académicas
  Prueba 1: Identificación de letras y palabras
  Prueba 3: Ortografía
  Prueba 5: Cálculo

Fluidez académica
  Prueba 9: Fluidez en lectura de frases
  Prueba 10: Fluidez en datos matemáticos
  Prueba 11: Fluidez en escritura de frases
  Optional from Batería IV COG (uses same predictor as Fluidez académica):
    Velocidad de procesamiento cognitivo (COG Prueba 4: Pareo de letras idénticas and COG Prueba 13: Cancelación de pares)
    Rapidez perceptual (COG Prueba 4: Pareo de letras idénticas and COG Prueba 11: Pareo de números idénticos)

Aplicaciones académicas
  Prueba 2: Problemas aplicados
  Prueba 4: Comprensión de textos
  Prueba 6: Expresión de lenguaje escrito

Intra-Cognitive Variations
This variation is present within individuals who have specific cognitive strengths or
weaknesses, such as high fluid reasoning (Gf ) or poor short-term working memory (Gwm).
A strength in one ability relative to an individual’s average performance in other cognitive
abilities is of as much interest as a weakness. This profile of variations can document areas
of relative strength and weakness, provide insights for program planning, and contribute to
a deeper understanding of the types of tasks that will be especially easy or difficult for an
individual compared to his or her other abilities.
Based on Batería IV COG Tests 1 through 7, this variation procedure allows comparison
of one area of cognitive ability to the examinee’s expected or predicted score based on his
or her average performance on six of the first seven cognitive tests, each measuring some
aspect of a different CHC cognitive ability (Gc, Gf, Gwm, Gs, Ga, Glr, Gv). For example,
when considering Prueba 1: Vocabulario oral, the individual’s average performance on the
remaining six tests (Prueba 2: Series numéricas, Prueba 3: Atención verbal, Prueba 4: Pareo
de letras idénticas, Prueba 5: Procesamiento fonético, Prueba 6: Rememoración de cuentos, and
Prueba 7: Visualización) is used as the predictor to determine the person’s expected Prueba 1:
Vocabulario oral score. This expected score is then compared to the person’s obtained Prueba
1: Vocabulario oral score. An intra-cognitive variation is present within individuals who have
specific cognitive strengths or weaknesses, such as superior comprehension-knowledge (Gc)
relative to their expected performance based on their average performance in other areas of
cognitive ability. If administered, the three Spanish language tests in the WJ IV OL can be
entered into the intra-cognitive variation. See the Batería IV COG Examiner’s Manual for
more information.

Ability/Achievement Comparisons
Ability/achievement comparison models are unidirectional comparisons (as represented by
the single-headed arrows in Figure 5-6) that use certain intellectual or linguistic abilities to
predict academic performance.
The ability/achievement comparison models are procedures for comparing an individual’s
current academic performance to the performance of others of the same age or grade with
the same ability score (based upon general intellectual ability, scholastic aptitude, Gf-Gc
composite, or oral language). These four models are not intended to gauge an individual’s
potential for future success. They are, however, valid methods for evaluating the presence and
significance of discrepancies between current levels of ability and achievement. All Batería IV
ability/achievement comparisons account for regression to the mean and provide actual or
real discrepancy norms (for more information, see the WJ IV Technical Manual).
There are four options for the predictor measure in the ability/achievement comparison
procedures. The Habilidad intelectual general, Aptitud académica, or Gf-Gc combinado may
be used from the Batería IV COG as predictors or measures of ability. The Amplio lenguaje
oral cluster from the WJ IV OL may be used to predict level of achievement based upon the
individual’s level of oral language development. Each of these procedures fulfills a different
purpose. A summary of these options is presented.

Figure 5-6. Four types of ability/achievement comparison models in the Batería IV.

Ability/Achievement Comparison Models (unidirectional comparisons)
  Intellectual Ability/Achievement Comparisons: General Intellectual Ability or Scholastic Aptitudes (Batería IV COG) predict oral language (from the WJ IV OL battery) and achievement
  Gf-Gc Comparisons: the Gf-Gc composite (Batería IV COG) predicts oral language and achievement
  Oral Language Ability/Achievement Comparisons: oral language ability (from the WJ IV OL battery) predicts achievement

Three Cognitive Ability/Achievement Comparisons
In each academic area, the scholastic aptitude/achievement comparison procedure can be
used to determine if an examinee is achieving commensurate with his or her current levels of
associated cognitive abilities. The four cognitive tests that compose each aptitude provide the
most relevant theoretical and research-based predictors of present achievement levels. Unlike
a discrepancy procedure, the comparison procedure looks for consistency between
scores. In other words, a person with low reading aptitude would be expected to have low
reading skills, whereas a person with high reading aptitude would be expected to have more
advanced skills.
The general intellectual ability/achievement comparison procedure can be used to
determine the presence and severity of a discrepancy between general intellectual ability (g)
and any particular area of achievement or oral language. This ability/achievement discrepancy
procedure may be used as part of the selection criteria for learning disability (LD) programs.
When the Gf-Gc composite is the predictor, it can be used to determine the presence of
strengths and weaknesses in any area of achievement, as well as oral language and other
cognitive abilities. The Gf-Gc composite is a high g index reflecting the individual’s fluid
and crystallized intellectual abilities. This type of comparison is particularly helpful in
cases where a processing deficit (e.g., slow processing speed) attenuates the GIA estimate
of potential. More information about these procedures can be found in the Batería IV COG
Examiner’s Manual.

Oral Language/Achievement Comparisons


Some professionals, especially those in the area of reading, prefer to use the oral language
score as an ability measure. In many cases, a significant discrepancy between oral language
ability and expected or predicted academic performance may be used to help substantiate the
existence of a specific reading, math, or writing disability. Oral language ability/achievement
comparisons use standard scores from the Amplio lenguaje oral cluster to predict achievement
on any of the broad, basic skills, or applied cluster scores. Examinees with a significant
negative discrepancy between oral language ability and achievement exhibit relative strengths
in oral language with weaknesses in one or more areas of achievement. Table 5-17 lists the
clusters that can be included in this comparison. Consult the WJ IV OL Examiner’s Manual
for more information about this procedure.

Table 5-17. Batería IV Oral Language/Achievement Comparisons

Predictor From the WJ IV OL: Amplio lenguaje oral
  Prueba 10: Vocabulario sobre dibujos
  Prueba 11: Comprensión oral
  Prueba 12: Comprensión de indicaciones

Clusters From the Batería IV APROV:
  Lectura, Lectura amplia, Destrezas básicas en lectura, Comprensión de lectura, Fluidez en la lectura
  Matemáticas, Matemáticas amplias, Destrezas en cálculos matemáticos, Resolución de problemas matemáticos
  Lenguaje escrito, Lenguaje escrito amplio, Expresión escrita
  Destrezas académicas, Fluidez académica, Aplicaciones académicas

Comparative Language Index (CLI)


A unique comparison procedure, the Comparative Language Index (CLI), is available when
the parallel Spanish and English tests from the WJ IV OL have been administered. This
comparison helps document an individual’s language proficiency in each language and
helps determine which language is dominant. The three Spanish language tests are Prueba
10: Vocabulario sobre dibujos, Prueba 11: Comprensión oral, and Prueba 12: Comprensión de
indicaciones. The three English language tests are Test 1: Picture Vocabulary, Test 2: Oral
Comprehension, and Test 6: Understanding Directions. If all six tests are administered, three
clusters are available for comparison: Lenguaje oral (Oral Language), Amplio lenguaje oral
(Broad Oral Language), and Comprensión auditiva (Listening Comprehension). Examiners
who wish to include the CLI information in the Batería IV report must administer the WJ
IV OL tests and commit the results to the online scoring and reporting program within 30
days of committing the Batería IV administration and prior to running the Batería IV report.
Consult the WJ IV OL Examiner’s Manual for more information about the Comparative
Language Index.

Discrepancy Scores
The online scoring and reporting program includes two scores for use in interpreting
the presence and severity of any variation, comparison, or discrepancy. These are called
the discrepancy percentile rank (discrepancy PR) and the discrepancy standard deviation
(discrepancy SD). These scores are based on actual difference scores computed for each
individual in the norming sample. (See the WJ IV Technical Manual for more information.)

The discrepancy percentile rank indicates the percentage of the examinee’s peer group
(same age or grade and same predicted score) with a difference score that is the same as or
larger than the examinee’s difference score. For example, a discrepancy percentile rank of 1
on Destrezas básicas en lectura indicates that only 1% of the examinee’s peer group had the
same or larger negative difference score on this cluster. On the other hand, a discrepancy
percentile rank of 97 on Resolución de problemas matemáticos indicates that only 3% of the
examinee’s peer group had the same or larger positive difference score on this cluster. The
Batería IV discrepancy PR values provide the identical information typically referred to as the
“base rate” in the population.
The discrepancy SD score is a standardized z score that reports (in standard deviation
units) the difference between an individual’s difference score and the average difference
score for individuals at the same age or grade level in the norming sample who had the
same predictor score. A negative value indicates the examinee’s actual ability is lower than
predicted. A positive value indicates the examinee’s actual ability is higher than predicted.
This statement of significance can be used, instead of the percentile rank, in programs with
selection criteria such as “a difference equal to or greater than one and one-half times the
standard deviation.”
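The relationship between the two scores can be sketched as follows, assuming a small, hypothetical set of peer difference scores; the operational values come from the full norming sample, and tie handling in the published norms may differ from this simplification.

```python
from statistics import mean, pstdev

def discrepancy_scores(examinee_diff, peer_diffs):
    """Illustrative discrepancy SD (z) and discrepancy PR values.

    examinee_diff: examinee's actual minus predicted score.
    peer_diffs: difference scores for same-age or same-grade peers with
    the same predicted score (hypothetical values in this sketch).
    """
    z = (examinee_diff - mean(peer_diffs)) / pstdev(peer_diffs)
    # Percentile rank: percentage of peers whose difference score is at
    # or below the examinee's (simplified tie handling).
    pr = 100 * sum(d <= examinee_diff for d in peer_diffs) / len(peer_diffs)
    return z, pr

# A negative z means the actual score is lower than predicted; a common
# selection rule flags z <= -1.5 ("one and one-half standard deviations").
peer_diffs = [-10, -5, 0, 5, 10]  # hypothetical peer difference scores
z, pr = discrepancy_scores(-10, peer_diffs)
```

Here the examinee's difference score of −10 sits about 1.4 standard deviations below the peer average, which would not meet a 1.5 SD selection criterion.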

Implications Derived From Test Results


Use care when interpreting test scores and remember that norms are not standards of
performance. Norms simply report how scores are distributed in a representative sample of
the population. By statistical definition, one half of the individuals at any grade or age level
must be at or below that grade or age score and one half of the individuals must be at or
above that grade or age score.
Careful consideration of the information recorded for individual tests on the Test Record
and observations of unusual responses and test behavior will result in varying implications
for different examinees. One implication is that further testing should be completed using
the Batería IV COG, WJ IV OL, or other tests. Another implication relates to planning
programs or treatments. Professionals with appropriate background information about the
individual and knowledge of instructional or vocational alternatives will be able to use the
obtained information to assist in both decision making and program planning. Test patterns
will provide information about an individual’s strengths and weaknesses and, in some cases,
will provide insights relevant to necessary accommodations or appropriate instructional
recommendations.
Finally, testing is only one part of the total assessment process. Evaluators will want to
compare and integrate test results with information from many sources, including reports
from parents, teachers, employers, or medical personnel; first-hand observations of the
individual performing at home, in the classroom, in a rehabilitation clinic, or on the job; and
informal assessments and work samples.

References
American Educational Research Association (AERA), American Psychological Association
(APA), & National Council on Measurement in Education (NCME). (2014). Standards for
educational and psychological testing. Washington, DC: AERA.

August, D., & Shanahan, T. (2006). Developing literacy in second-language learners: Report of
the national literacy panel on language minority children and youth. Mahwah, NJ: Lawrence
Erlbaum.

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York,
NY: Cambridge University Press.

Carroll, J. B., & Maxwell, S. E. (1979). Individual differences in cognitive abilities. Annual
Review of Psychology, 30, 603–640.

Cattell, R. B. (1941). Some theoretical issues in adult intelligence testing. Psychological
Bulletin, 38, 592.

Cattell, R. B. (1943). The measurement of adult intelligence. Psychological Bulletin, 40,
153–193.

Cattell, R. B. (1950). Personality: A systematic theoretical and factoral study. New York, NY:
McGraw-Hill.

Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment.
Journal of Educational Psychology, 54, 1–22.

Corn, A. L., & Lusk, K. E. (2010). Perspectives on low vision. In A. L. Corn & J. N. Erin
(Eds.), Foundations of low vision: Clinical and functional perspectives (2nd ed., pp. 3–34).
New York, NY: AFB Press.

Cummins, J. (1984). Bilingualism and special education: Issues in assessment and pedagogy.
Austin, TX: Pro-Ed.

Cummins, J., & Hornberger, N. H. (2008). Bilingual education. New York, NY: Springer.

de Leeuw, E. (2008). When your native language sounds foreign: A phonetic investigation into
first language attrition (Unpublished doctoral dissertation). Queen Margaret University,
Edinburgh, Scotland. Retrieved from https://eresearch.qmu.ac.uk/handle/20.500.12289/7436.

Flege, J. E., Schirru, C., & MacKay, I. R. (2003). Interaction between the native and second
language phonetic subsystems. Speech Communication, 40, 467–491.

Grosjean, F. (2001). The bilingual’s language modes. In J. Nicol (Ed.), One mind, two
languages: Bilingual language processing (2nd ed., pp. 1–22). Oxford, England: Blackwell.

Herschell, A. D., Greco, L. A., Filcheck, H. A., & McNeil, C. B. (2002). Who is testing
whom: Ten suggestions for managing disruptive behavior of young children during testing.
Intervention in School and Clinic, 37, 140–148.

Horn, J. L. (1965). Fluid and crystallized intelligence (Unpublished doctoral dissertation).
University of Illinois, Urbana-Champaign, IL.

Horn, J. L. (1988). Thinking about human abilities. In J. R. Nesselroade & R. B. Cattell
(Eds.), Handbook of multivariate psychology (2nd ed., pp. 645–865). New York, NY:
Academic Press.

Horn, J. L. (1989). Models for intelligence. In R. Linn (Ed.), Intelligence: Measurement, theory
and public policy (pp. 29–73). Urbana, IL: University of Illinois Press.

Horn, J. L. (1991). Measurement of intellectual capabilities: A review of theory. In K. S.
McGrew, J. K. Werder, & R. W. Woodcock (Eds.), WJ-R technical manual (pp. 197–232).
Chicago, IL: Riverside Publishing.

Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized
general intelligences. Journal of Educational Psychology, 57, 253–270.

Horn, J. L., & Stankov, L. (1982). Auditory and visual factors of intelligence. Intelligence, 6,
165–185.

Linacre, J. M. (2002). What do infit and outfit, mean-square and standardized mean? Rasch
Measurement Transactions, 16(2), 878.

Linacre, J. M. (2012). WINSTEPS (Version 3.74.0) [Computer software]. Chicago, IL:
Winsteps.com.

Linacre, J. M. (2016). WINSTEPS (Version 3.92.0) [Computer software]. Beaverton, OR:
Winsteps.com. Retrieved January 1, 2016, from http://www.winsteps.com/.

Mather, N., & Wendling, B. J. (2014). Examiner’s Manual. Woodcock-Johnson IV Tests of Oral
Language. Rolling Meadows, IL: Riverside Publishing.

McArdle, J. J., & Woodcock, R. W. (1998). Human cognitive abilities in theory and practice.
Mahwah, NJ: Lawrence Erlbaum.

McGrew, K. S. (1997). Analysis of the major intelligence batteries according to a proposed
comprehensive Gf-Gc framework. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.),
Contemporary intellectual assessment: Theories, tests, and issues (pp. 151–179). New York,
NY: Guilford Press.

McGrew, K. S. (2005). The Cattell-Horn-Carroll theory of cognitive abilities. In D. P. Flanagan
& P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues
(2nd ed., pp. 136–181). New York, NY: Guilford Press.

McGrew, K. S. (2009). Standing on the shoulders of the giants of psychometric intelligence
research. Intelligence, 37, 1–10.

McGrew, K. S. (2012). Implications of 20 years of CHC cognitive-achievement research: Back
to the future and beyond CHC. Paper presented at the Richard Woodcock Institute, Tufts
University, Medford, MA.

McGrew, K. S., LaForte, E. M., & Schrank, F. A. (2014). Technical Manual. Woodcock-Johnson
IV. Rolling Meadows, IL: Riverside Publishing.

McGrew, K. S., & Wendling, B. J. (2010). Cattell-Horn-Carroll cognitive-achievement
relations: What we have learned from the past 20 years of research. Psychology in the
Schools, 47, 651–675.

Mosier, C. I. (1943). On the reliability of a weighted composite. Psychometrika, 8, 161–168.

Muñoz-Sandoval, A. F., Woodcock, R. W., McGrew, K. S., & Mather, N. (2005, 2007a).
Batería III Woodcock-Muñoz. Rolling Meadows, IL: Riverside Publishing.

Muñoz-Sandoval, A. F., Woodcock, R. W., McGrew, K. S., & Mather, N. (2005, 2007b).
Batería III Woodcock-Muñoz Normative Update: Pruebas de aprovechamiento. Rolling
Meadows, IL: Riverside Publishing.

Muñoz-Sandoval, A. F., Woodcock, R. W., McGrew, K. S., & Mather, N. (2005, 2007c).
Batería III Woodcock-Muñoz Normative Update: Pruebas de habilidades cognitivas. Rolling
Meadows, IL: Riverside Publishing.

Prifitera, A., Saklofske, D. H., & Weiss, L. G. (Eds.). (2008). WISC-IV clinical assessment and
intervention. (2nd ed., pp. 217–235). San Diego, CA: Academic Press.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen,
Denmark: Danish Institute for Educational Research.

Sackett, P. R., & Yang, H. (2000). Correction for range restriction: An expanded typology.
Journal of Applied Psychology, 85, 112–118.

Sattler, J. M., & Hoge, R. D. (2005). Assessment of children: Behavioral, social, and clinical
foundations (5th ed.). San Diego, CA: Author.

Schneider, W. J., & McGrew, K. S. (2012). The Cattell-Horn-Carroll model of intelligence. In
D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests,
and issues (3rd ed., pp. 99–144). New York, NY: Guilford Press.

Schneider, W. J., & McGrew, K. S. (2018). The Cattell-Horn-Carroll theory of cognitive
abilities. In D. P. Flanagan & E. M. McDonough (Eds.), Contemporary intellectual
assessment: Theories, tests, and issues (4th ed., pp. 73–164). New York, NY: Guilford Press.

Schrank, F. A., Mather, N., & McGrew, K. S. (2014). Woodcock-Johnson IV Tests of Oral
Language. Rolling Meadows, IL: Riverside Publishing.

Schrank, F. A., McGrew, K. S., & Mather, N. (2014). Woodcock-Johnson IV. Rolling Meadows,
IL: Riverside Publishing.

Smith, J. K. (1999). The effects of practice on the reading speed, accuracy, duration, and visual
fatigue of students with low vision when accessing standard size print with optical devices.
(Unpublished doctoral dissertation). University of Arizona, Tucson, AZ.

Stevens, S. S. (1951). Handbook of experimental psychology. New York, NY: John Wiley.

Tallmadge, G. K., & Wood, C. T. (1976, October). User’s guide, ESEA Title I evaluation and
reporting system. Mountain View, CA: RMC Research.

Thomas, W. P., & Collier, V. P. (2002). A national study of school effectiveness for language
minority students’ long-term academic achievement. Santa Cruz, CA: Center for Research on
Education, Diversity, and Excellence, University of California-Santa Cruz. Retrieved from
http://www.thomasandcollier.com/assets/2002_thomas-and-collier_2002-final-report.pdf.

U.S. Department of Education. Family Educational Rights and Privacy Act. (1974). 20 U.S.C.
§ 1232g; 34 CFR Part 99.10(c) and (d).

Wendling, B. J., Mather, N., & Schrank, F. A. (2019). Examiner’s Manual. Batería IV Pruebas
de habilidades cognitivas. Itasca, IL: Riverside Assessments, LLC.

Wolfe, E. W. (2004). Equating and item banking with the Rasch model. In E. V. Smith, Jr. &
R. M. Smith (Eds.), Introduction to Rasch measurement. Maple Grove, MN: JAM Press.

Wolfe, F., MacIntosh, R., Kreiner, S., Lange, R., Graves, R., & Linacre, J. M. (2006). Multiple
significance tests. Rasch Measurement Transactions, 19, 1044.

Woodcock, R. W. (1973). Woodcock Reading Mastery Tests. Circle Pines, MN: American
Guidance Service.

Woodcock, R. W. (1978). Development and standardization of the Woodcock-Johnson
Psycho-Educational Battery. Chicago, IL: Riverside Publishing.

Woodcock, R. W. (1982, March). Interpretation of the Rasch ability and difficulty scales
for educational purposes. Paper presented at the meeting of the National Council on
Measurement in Education, New York, NY.

Woodcock, R. W. (1987, 1998). Woodcock Reading Mastery Tests–Revised. Circle Pines, MN:
American Guidance Service.

Woodcock, R. W. (1988, August). Factor structure of the tests of cognitive ability from the 1977
and 1989 Woodcock-Johnson. Paper presented at the Australian Council on Educational
Research Seminar on Intelligence, Melbourne, Australia.

Woodcock, R. W. (1990). Theoretical foundations of the WJ-R measures of cognitive ability.
Journal of Psychoeducational Assessment, 8, 231–258.

Woodcock, R. W. (1993). An information processing view of Gf-Gc theory. Journal of
Psychoeducational Assessment [Monograph Series, Advances in Psychoeducational
Assessment: Woodcock-Johnson Psycho-Educational Battery–Revised], 80–102.

Woodcock, R. W. (1994). Measures of the abilities of Gf-Gc theory. In R. Sternberg (Ed.),
Encyclopedia of intelligence (pp. 452–456). New York, NY: Macmillan.

Woodcock, R. W. (1998). Extending Gf-Gc theory into practice. In J. J. McArdle & R. W.
Woodcock (Eds.), Human cognitive abilities in theory and practice (pp. 137–156). Mahwah,
NJ: Lawrence Erlbaum.

Woodcock, R. W. (1999). What can Rasch-based scores convey about a person’s test
performance? In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of
measurement: What every psychologist and educator should know (pp. 105–128). Mahwah,
NJ: Lawrence Erlbaum.

Woodcock, R. W., Alvarado, C. G., Schrank, F. A., Mather, N., McGrew, K. S., & Muñoz-
Sandoval, A. F. (2019). Batería IV Woodcock-Muñoz: Pruebas de aprovechamiento. Itasca, IL:
Riverside Assessments, LLC.

Woodcock, R. W., Alvarado, C. G., Schrank, F. A., McGrew, K. S., Mather, N., & Muñoz-
Sandoval, A. F. (2019a). Batería IV Woodcock-Muñoz. Itasca, IL: Riverside Assessments, LLC.

Woodcock, R. W., Alvarado, C. G., Schrank, F. A., McGrew, K. S., Mather, N., & Muñoz-
Sandoval, A. F. (2019b). Batería IV Woodcock-Muñoz: Pruebas de habilidades cognitivas.
Itasca, IL: Riverside Assessments, LLC.

Woodcock, R. W., & Dahl, M. N. (1971). A common scale for the measurement of person ability
and test item difficulty (AGS Paper No. 10). Circle Pines, MN: American Guidance Service.

Woodcock, R. W., McGrew, K. S., & Mather, N. (2001, 2007). Woodcock-Johnson III. Rolling
Meadows, IL: Riverside Publishing.

Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago, IL: MESA Press.

Zieky, M. (1993). Practical questions in the use of DIF statistics in test development. In P.
W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 337–347). Hillsdale, NJ:
Lawrence Erlbaum.

Appendix A

Norming and Calibration Site States and Cities
The authors wish to thank the more than 8,000 individuals who participated in the
Woodcock-Johnson IV (Schrank, McGrew, & Mather, 2014) national standardization and
related studies and the 600 individuals who took part in the calibration of the Batería IV, as
well as the professionals and schools who assisted in obtaining the data. The following is a
list of states and cities where data were collected.

Alabama Trussville Parker

Alabaster Vestavia Hills Peoria

Bessemer Phoenix
Alaska Pima
Birmingham
Anchor Point Pirtleville
Center Point
Homer Portal
Clay
Nikolaevsk Sahuarita
Crestline
Cullman San Simon
Arizona Scottsdale
Fairfield
Avondale Sierra Vista
Florence
Benson Surprise
Forestdale
Bisbee Tempe
Gardendale
Bouse Tolleson
Hamilton
Bowie Tucson
Helena
Buckeye Vail
Homewood
Chandler Willcox
Hoover
Douglas
Huntsville
El Mirage Arkansas
Moody
Gilbert Arkadelphia
Mountain Brook
Glendale Blytheville
Pelham
Goodyear Lowell
Riverchase
Hereford Pine Bluff
Roebuck Plaza
Laveen Redfield
Scottsboro
Mesa Springdale
Selma
Oro Valley
Tarrant

Appendix A 105
California Forest Ranch Monrovia

Acton Foster City Montebello

Alameda Fountain Valley Moreno Valley

Alhambra Fresno Mount Shasta

Aliso Viejo Garden Grove Murrieta

Alpine Gardena National City

Alta Loma Goleta North Hollywood

Altadena Gridley Northridge

Anaheim Hacienda Heights Oak Run

Anderson Half Moon Bay Oak View

Baldwin Park Hawthorne Oakdale

Banning Hayward Oceanside

Beaumont Holiday Orland

Beverly Hills Huntington Beach Oroville

Biggs Igo Pacoima

Bonita Imperial Beach Palo Alto

Brea Indio Palo Cedro

Buena Park Inglewood Paradise

Burbank Irvine Pasadena

Calexico Isleton Perris

Camarillo Jamul Pico Rivera

Canyon Lake La Canada Playa Del Rey

Carlsbad La Habra Pomona

Ceres La Mesa Rancho Cucamonga

Cerritos La Mirada Red Bluff

Chico La Verne Redding

Chula Vista Lake Elsinore Redlands

Claremont Lakeside Reseda

Colton Lathrop Riverside

Compton Lemon Grove Rosemead

Corning Linden San Diego

Cotati Long Beach San Dimas

Cottonwood Los Angeles San Francisco

Covina Los Molinos San Gabriel

Davis Magalia San Jacinto

Del Mar Malibu San Jose

Durham Manteca San Marcos

El Cajon Marysville San Mateo

Encino Menifee San Rafael

Escondido Millville San Ramon

Fontana Modesto Santa Clarita

Santa Cruz Castle Rock Winsted
Santa Maria Centennial Woodbury
Santa Monica Colorado Springs
Santa Rosa Denver Delaware
Santee Englewood Bear
Scotts Valley Evans
Shasta Greeley District of Columbia
Shasta Lake Greenwood Village Washington
Sherman Oaks Highlands Ranch
Florida
South San Francisco Larkspur
Spring Valley Littleton Apopka
Stockton Loveland Boca Raton
Studio City Parker Boynton Beach
Sun City Thornton Bradenton
Sylmar Westminster Brandon
Tarzana Brooksville
Temecula Connecticut Clearwater
Temple City Chester Clearwater Beach
Thousand Oaks Clinton Clermont
Tiburon Durham Coconut Creek
Turlock East Haven Cooper City
Tustin Essex Coral Gables
Upland Groton Coral Springs
Valencia Ivoryton Crystal River
Van Nuys Litchfield Davie
Venice Middlefield Deerfield Beach
Ventura New Britain Dunedin
Vista New Haven Fort Lauderdale
West Hollywood New London Fort Myers
Willows Oakdale Fruitland Park
Winchester Oakville Gainesville
Windsor Southington Glen Saint Mary
Woodland Stamford Green Cove Springs
Woodland Hills Stratford Greenacres
Yorba Linda Torrington Hallandale
Yreka Waterford Hallandale Beach
Watertown Hernando
Colorado West Granby Hialeah
Aurora West Haven Holiday
Boulder Westport Hollywood
Castle Pines Windsor Hudson

Jacksonville Ponte Vedra Beach Columbus
Jensen Beach Port Orange Comer
Kenneth City Port Richey Conyers
Lady Lake Port Saint Lucie Cordele
Lakewood Ranch Redington Shores Cumming
Land O Lakes Riverview Cuthbert
Largo Safety Harbor Dacula
Lauderdale Lakes Saint Augustine Decatur
Lauderhill Saint Petersburg Doraville
Leesburg Sarasota Douglasville
Lutz Seffner Duluth
Margate Southwest Ranches Dunwoody
Miami Stuart Ellenwood
Miami Beach Sunrise Fayetteville
Middleburg Tallahassee Flintstone
Miramar Tamarac Flowery Branch
Myakka City Tampa Fort Gaines
New Port Richey Tarpon Springs Fort Oglethorpe
Newport Temple Terrace Gainesville
Nokomis The Villages Grayson
North Lauderdale Trinity Helena
North Miami Valrico Kennesaw
North Miami Beach Wellington Lagrange
Oakland Park Wesley Chapel Lawrenceville
Ocala West Palm Beach Lilburn
Ocklawaha Weston Lincolnton
Ocoee Lithonia
Odessa Georgia Loganville
Oldsmar Alpharetta Lula
Orange Park Athens Marietta
Palm Beach Atlanta McDonough
Palm Beach Gardens Blakely Meansville
Palm City Bogart Milton
Palm Coast Bonaire Monroe
Palm Harbor Brooks Morganton
Palmetto Buford Morris
Parkland Calhoun Norcross
Pembroke Pines Canon Oxford
Plant City Carlton Ringgold
Plantation Chamblee Riverdale
Pompano Beach College Park Rock Springs

Rossville Nampa Evergreen Park
Rydal Rexburg Fairview Heights
Sandy Springs Rigby Frankfort
Smyrna Saint Anthony Geneva
Snellville Star Gilberts
Social Circle Teton Glendale Heights
Stockbridge Twin Falls Glenview
Stone Mountain Hampshire
Summerville Illinois Hanover Park
Suwanee Addison Harwood Heights
Sylvester Alsip Hazel Crest
Trenton Arlington Heights Highland Park
Tucker Aurora Highwood
Union City Bartlett Hinsdale
Warner Robins Batavia Hoffman Estates
Watkinsville Bensenville Hometown
Winder Berwyn Joliet
Woodstock Blue Island Justice
Bolingbrook LaGrange Park
Hawaii Buffalo Grove Lake in the Hills
Hauula Carpentersville Lansing
Honokaa Cary Libertyville
Honolulu Chicago Lincolnwood
Kaaawa Chicago Heights Lisle
Kailua Chicago Ridge Lockport
Kaneohe Cicero Lynwood
Kapolei Country Club Hills Lyons
Laupahoehoe Crest Hill Mahomet
Pearl City Crestwood Matteson
Wahiawa Crystal Lake McHenry
Waimanalo DeKalb Midlothian
Des Plaines Mokena
Idaho Dixon Montgomery
Ammon Dolton Mundelein
Boise Downers Grove Naperville
Bonners Ferry East Hazel Crest New Lenox
Caldwell East Saint Louis Oak Forest
Eagle Elgin Oak Lawn
Idaho Falls Elk Grove Village Orland Hills
Kuna Elmwood Park Orland Park
Meridian Evanston Palatine

Palos Heights Kansas Thomaston
Palos Hills Bonner Springs Westbrook
Park Ridge Chanute
Pingree Grove Maryland
De Soto
Plainfield Baltimore
Edwardsville
Prospect Heights Bel Air
Frontenac
Richton Park Bethesda
Gardner
River Grove Brandywine
Girard
Riverdale Catonsville
Gypsum
Riverside Clinton
Kansas City
Robbins Columbia
Lawrence
Rockton Cumberland
Leavenworth
Rolling Meadows Darlington
Leawood
Sauk Village Dayton
Lenexa
Schaumburg Edgewood
Linn Valley
Shirland Elkridge
Merriam
Skokie Elkton
Olathe
South Beloit Ellicott
Overland Park
South Holland Ellicott City
Pittsburg
Stickney Essex
Prairie Village
Streamwood Forest Hill
Salina
Tinley Park Fork
Scammon
Vernon Hills Gaithersburg
Shawnee
Waukegan Glenwood
Weir
Westmont Hampstead
Wheeling Kentucky Havre de Grace
Winnetka London La Plata
Woodridge Laurel
Woodstock Maine Pasadena
Worth Bucksport Perry Hall
Cape Elizabeth Pomfret
Indiana Rockville
Kennebunk
East Chicago Limington Salisbury
Fishers Naples Severn
Hammond Orland Silver Springs
Monticello Orrs Island Stevensville
Westfield Portland Sykesville
Whiting Rockland West Friendship
South Portland Woodbine
Iowa Woodstock
Spruce Head
Fort Dodge Tenants Harbor

Massachusetts Mendon Bloomfield Hills

Ashland Methuen Brighton

Auburn Nantucket Brown City

Barnstable Natick Canton

Bellingham Needham Cedar

Belmont Newton Center Line

Boston North Andover Central Lake

Brewster North Chatham Chesterfield

Brighton North Falmouth Clawson

Carver Northbridge Clinton Township

Centerville Orleans Coldwater

Chatham Osterville Colon

Chelmsford Plymouth Commerce

Cotuit Quincy Dearborn

Cummaquid Roxbury Dearborn Heights

Dennis Rutland Delton

Dennis Port Sandwich Detroit

Dorchester Shrewsbury Eastpointe

East Boston Somerville Ecorse

East Dennis South Attleboro Elk Rapids

East Harwich South Chatham Farmington

East Sandwich South Dennis Farmington Hills

Falmouth Sterling Fenton

Framingham Wakefield Ferndale

Hanover Wareham Fife Lake

Harwich Wayland Freeport

Holliston Wellfleet Garden City

Hull West Barnstable Grand Rapids

Hyannis West Harwich Hamtramck

Kingston West Yarmouth Harrison

Lancaster Weymouth Harrison Township

Lawrence Winchester Hastings

Leominster Worcester Hazel Park

Lexington Yarmouth Port Inkster

Malden Kalkaska
Michigan Kentwood
Manchester
Ann Arbor Kewadin
Manomet
Auburn Hills Kingsford
Marion
Battle Creek Kingsley
Marlborough
Berkley Lake Orion
Marstons Mills
Birmingham Lambertville
Medway

Lathrup Village Troy Maple Lake
Lawrence Twin Lake Mapleton
Lincoln Park Walled Lake Minneapolis
Livonia Warren Minnetonka
Luna Pier Waterford Mounds View
Macomb Wayland New Brighton
Madison Heights Wayne New London
Mancelona West Bloomfield North Mankato
Manton Westland Norwood Young America
Marysville White Lake Oakdale
Melvindale Williamsburg Owatonna
Mesick Wixom Plymouth
Milford Ypsilanti Rochester
New Baltimore Rockford
Northville Minnesota Roseville
Norway Andover Saint Clair
Oak Park Big Lake Saint Cloud
Oakland Blaine Saint Francis
Ottawa Lake Brooklyn Center Saint Paul
Petoskey Buffalo Shakopee
Plainwell Cambridge Shoreview
Plymouth Centerville Spring Lake Park
Pontiac Champlin Stacy
Redford Chaska Stillwater
River Rouge Chisago City Vadnais Heights
Rochester Circle Pines Wayzata
Rochester Hills Coon Rapids Wells
Romulus Delano White Bear Lake
Roseville Duluth Woodbury
Royal Oak Forest Lake Wyoming
Saint Clair Shores Fridley
Sault Sainte Marie Golden Valley Mississippi
Shelby Ham Lake Bay Springs
South Boardman Hopkins Brandon
Southfield Hugo Decatur
Spring Lake Hutchinson Forest
Springfield Lexington Hickory
Sterling Heights Lindstrom Lawrence
Taylor Lino Lakes Little Rock
Temperance Mahtomedi Meridian
Traverse City Maple Grove Newton

Pearl Park Hills New Hampshire
Richland Peculiar Allenstown
Union Pleasant Hope Ashland
Raytown Atkinson
Missouri Republic Bedford
Arnold Richmond Chichester
Ballwin Riverview Claremont
Barnhart Rogersville Concord
Battlefield Saint Ann Derry
Belton Saint Charles Dover
Bonne Terre Saint Clair Epsom
Buffalo Saint John Goffstown
Chesterfield Saint Louis Hampstead
Clarksville Saint Peters Henniker
Fair Grove Springfield Hooksett
Ferguson Strafford Hudson
Florissant Troy Laconia
Foley Villa Ridge Londonderry
Garden City Walnut Shade Manchester
Hazelwood Warsaw Merrimack
High Ridge Webster Groves Milford
Hollister Wentzville Nashua
Imperial Willard New Boston
Independence Wright City Pembroke
Joplin
Plaistow
Kansas City Montana
Portsmouth
Labadie Billings
Rye
Lake Saint Louis Livingston
Swanzey
Lebanon Missoula
Lee’s Summit New Jersey
Lowry City Nebraska
Allendale
Manchester Firth
Bloomfield
Maplewood Lincoln
Carlstadt
Marshfield Omaha
Cliffside Park
Nixa Roca
Closter
O’Fallon Seward
Dumont
Olivette Valparaiso
East Orange
Oronogo East Rutherford
Osceola
Nevada
Elmwood Park
Ozark Reno
Englewood
Pacific Fair Lawn

Fairview Wenonah Dexter
Fort Lee West New York Dix Hills
Franklin Westville East Jewett
Garfield Westwood East Northport
Hackensack Woodland Park East Rochester
Haledon Wyckoff Eastchester
Hamilton Elmsford
Jersey City New Mexico Fairport
Landing Espanola Farmington
Lincroft Las Cruces Freedom
Linden Los Alamos Gansevoort
Lodi Santa Fe Garden City
Mahwah Garden City South
Midland Park
New York Geneseo
Newark Acra Geneva
North Arlington Albany Glen Park
North Bergen Ardsley Glenmont
Northvale Arkville Glenville
Oradell Ashland Harrison
Palisades Park Astoria Hemlock
Paramus Auburn Hempstead
Paterson Ballston Lake Hensonville
Pequannock Bedford Hills Homer
Ridgefield Park Beaver Falls Honeoye Falls
Ridgewood Bethpage Huntington Station
River Vale Bloomfield Hurley
Riverdale Boiceville Irvington
Rochelle Park Brewster Jamaica
Rutherford Brownville Jewett
Saddle Brook Canandaigua Kenmore
Secaucus Castleton Lake Peekskill
Sewell Castleton on Hudson Lakeville
South Orange Churchville Lancaster
Springfield Clifton Park Lexington
Succasunna Clifton Springs Lima
Teaneck Clyde Liverpool
Titusville Cohoes Lowville
Union Beach Cortland Lyons
Wallington Croghan Macedon
Wanaque Croton on Hudson Malverne
Wayne Dansville Manchester

Maplecrest Saugerties Brevard
Marathon Savannah Cary
Marion Scarsdale Catawba
Maspeth Schenectady Chapel Hill
Massapequa Scotia Charlotte
Merrick Seneca Falls Cornelius
Middlesex Shandaken Cullowhee
Miller Place Shortsville Davidson
Mineola Sleepy Hollow Denver
Monroe Smithtown Durham
Mount Vernon Sodus Franklin
Naples Somers Garner
New Hyde Park Sparkill Gastonia
New Rochelle Spencerport Grover
New York Stillwater Hendersonville
Newark Tappan Huntersville
Niagara Falls Truxton Indian Trail
Niskayuna Uniondale Kitty Hawk
North Rose Verplanck Louisburg
North Syracuse Victor Matthews
Nunda Walworth Mint Hill
Nyack Waterloo Mount Holly
Ontario Watertown Otto
Orchard Park Watkins Glen Raleigh
Ossining Webster Roanoke Rapids
Palmyra West Hempstead Salisbury
Pelham West Henrietta Shelby
Penfield Westbury Southern Shores
Phelps White Plains Stanley
Pittsford Whitestone Statesville
Plainfield Williamson Sylva
Port Gibson Windham Vale
Prattsville Wolcott Wake Forest
Rensselaer Woodhaven Waxhaw
Rexford Woodstock Wilmington
Ridgewood Yonkers Youngsville
Rochester Yorktown Heights
Rockville Centre Ohio
Romulus North Carolina Akron
Rotterdam Apex Baltimore
Rye Brook Asheville Bay Village

Bedford Beaverton Irwin
Brookfield Bend Johnstown
Canal Winchester Cannon Beach Kingston
Canfield Clackamas Langhorne
Chagrin Falls Cornelius Lititz
Cleveland Eugene Malvern
Columbus Grants Pass Media
Elyria Gresham Milton
Fredericksburg Happy Valley Montgomery
Fremont Lake Oswego Montoursville
Guysville Manzanita Mount Pleasant
Highland Heights Nehalem Nesquehoning
Hudson Oregon City Newton
Ironton Portland North Wales
Jefferson Sandy Oakmont
Kent Seaside Oreland
Kitts Hill Tillamook Palmerton
Lakewood Tolovana Park Perkasie
Lithopolis Troutdale Philadelphia
Logan Warrenton Pittsburgh
Mentor Pittston
Miamisburg Pennsylvania Pottstown
Painesville Allison Park Quakertown
Pedro Bechtelsville Richeyville
Solon Bensalem Scranton
Strongsville Blairsville Shavertown
Tiffin Bradford Shelocta
Toledo Brockway Souderton
Twinsburg Brookville Trumbauersville
Westerville Claysburg Upper Darby
Wickliffe Conshohocken Wallingford
Wooster Dallas Warminster
Everson Warrington
Oklahoma Fort Washington West Chester
Drumright Forty Fort West Pittston
Sallisaw Freeport Wexford
Stillwater Furlong Wilkes-Barre
Gettysburg Williamsport
Oregon Homer City Williamstown
Aloha Hummelstown Wynnewood
Astoria Indiana Yardley

Rhode Island Tennessee Soddy Daisy

Pawtucket Alcoa Sunbright

Providence Apison Tellico Plains

Blaine Ten Mile


South Carolina Caryville Walland
Aiken Chattanooga Woodbury
Anderson Cleveland
Texas
Boiling Springs Corryton
Camden Abilene
Dandridge
Cassatt Addison
Dayton
Charleston Aldine
Dunlap
Clover Allen
East Ridge
Columbia Alton
Georgetown
Easley Austin
Graysville
Elgin Balch Springs
Harrison
Florence Baytown
Helenwood
Fort Mill Bellaire
Hixson
Fountain Inn Belton
Huntland
Goose Creek Blue Ridge
Huntsville
Greenville Boerne
Jacksboro
Greer Bryan
Knoxville
Hanahan Bulverde
Kodak
Hartsville Burkburnett
La Follette
Kershaw Burke
Lenoir City
Landrum Burleson
Loudon
Leesville Carrollton
Louisville
Lexington Cedar Hill
Luttrell
Lugoff Cedar Park
Maryville
Mauldin Channelview
McDonald
McBee Cleveland
Murfreesboro
McCormick Coppell
New Tazewell
Moore Corinth
Oak Ridge
Mount Pleasant Crosby
Oliver Springs
Myrtle Beach Cypress
Oneida
Rock Hill Dallas
Ooltewah
Saluda Dayton
Pigeon Forge
Simpsonville Denton
Pioneer
Summerville DeSoto
Powell
Wellford Edinburg
Sevierville
West Columbia El Paso
Seymour
Elgin
Signal Mountain

Euless McKinney The Colony
Farmers Branch Mesquite Tyler
Farmersville Mission Watauga
Flint Missouri City Weatherford
Fort Worth Murphy Wichita Falls
Fredericksburg New Braunfels Winona
Frisco New Ulm Wolfe City
Garland North Richland Hills Wylie
Georgetown Odessa
Gonzales Olney Utah
Grand Prairie Pasadena Clarkston
Grapevine Pearland Clearfield
Haltom City Pflugerville Ephraim
Helotes Pharr Layton
Hereford Pinehurst Lehi
Highlands Plano Logan
Holliday Porter Midvale
Houston Princeton Murray
Huffman Richardson Orem
Humble Richland Hills Pleasant Grove
Iowa Park Richmond Provo
Irving Rio Grande City Salt Lake City
Katy Roma Sandy
Keller Rosharon Santaquin
Kilgore Round Rock Saratoga Springs
Killeen Rowlett Spanish Fork
Kingwood Sachse Springville
La Feria San Antonio Taylorsville
Lake Dallas San Juan West Jordan
Lancaster San Saba West Valley City
LaPorte Schertz
Leander Seguin
Vermont
Lewisville Selma Essex Junction
Liberty Hill Spring Highgate
Lindale Spring Branch Lyndonville
Live Oak Stafford Passumpsic
Llano Stephenville Rochester
Longview Sterling City South Burlington
Louisville Sugar Land Swanton
Lucas Sunnyvale White River Junction
McAllen Taylor

Virginia Washington Puyallup

Aldie Auburn Redmond

Alexandria Bellevue Renton

Barboursville Bellingham Seattle

Boones Mill Blaine Sequim

Bristow Brush Prairie Shelton

Burke Burien Shoreline

Callaway Camano Island Snoqualmie

Centreville Centralia Spokane

Charlottesville Clinton Steilacoom

Christiansburg Coupeville Tacoma

Cumberland Des Moines Tukwila

Daleville Dupont University Place

Fairfax Duvall
West Virginia
Falls Church Eatonville
Anmoore
Ferrum Edmonds
Bridgeport
Forest Everett
Buckhannon
Fredericksburg Federal Way
Charleston
Glade Hill Ferndale
Clarksburg
Hampton Freeland
Fairmont
Hardy Gig Harbor
Fort Ashby
Lynchburg Kenmore
Franklin
New Castle Kennewick
Grafton
Newport Kent
Harpers Ferry
Newport News Lacey
Hedgesville
Norfolk Lake Tapps
Huntington
North Tazewell Lakewood
Keyser
Oak Hill Langley
Lost Creek
Reston Liberty Lake
Mineral Wells
Roanoke Lynnwood
Monongah
Rocky Mount Milton
Morgantown
Salem Moses Lake
Nutter Fort
Springfield Mountlake Terrace
Ona
Sterling Newcastle
Parkersburg
Troutville Oak Harbor
Ridgeley
Warrenton Okanogan
Shinnston
Winchester Olympia
Springfield
Woodbridge Packwood
West Milford
Port Angeles
Port Hadlock
Port Townsend

Wisconsin
Bonduel
Burlington
Cecil
Cedarburg
Cudahy
Delavan
Eagle River
Elkhorn
Fontana
Franklin
Germantown
Glendale
Greenfield
Hales Corners
Hartford
Lake Geneva
Lakewood
Manawa
Marathon
Milwaukee
Muskego
New Berlin
Oak Creek
Oconomowoc
Racine
Saint Francis
Salem
South Milwaukee
Superior
Townsend
Union Grove
Wabeno
Waterford
Wauwatosa
West Allis
Winneconne

Appendix B

Batería IV Pruebas de aprovechamiento Examiner Training Checklist
Name of Examiner:__________________________________ Date:___________________________________________

Name of Examinee:__________________________________ Name of Observer:________________________________

Y = Yes N = No N/O = Not Observed

Prueba 1: Identificación de letras y palabras
(circle one)
Y N N/O 1. Knows exact pronunciation of each item.

Y N N/O 2. Uses suggested starting points.

Y N N/O 3. Asks examinee to reread all items on page if response is unclear and then scores only item
in question.

Y N N/O 4. Does not tell examinee any letters or words during test.

Y N N/O 5. Gives reminder to pronounce words smoothly only once during test.

Y N N/O 6. Tests by complete pages.

Y N N/O 7. Encourages examinee to try next word after 5 seconds unless examinee is still actively
engaged in trying to pronounce word.

Y N N/O 8. Counts all items below basal as correct.

Prueba 2: Problemas aplicados


Y N N/O 1. Uses worksheet in Response Booklet.

Y N N/O 2. Uses suggested starting points.

Appendix B 121
Y N N/O 3. Reads all items to examinee.

Y N N/O 4. Provides Response Booklet and pencil at any time if examinee requests it or appears to need
it (e.g., uses finger to write on table or in air).

Y N N/O 5. Gives examinee pencil and Response Booklet at Item 27.

Y N N/O 6. Repeats any questions if requested by examinee.


Y N N/O 7. Does not require examinee responses to contain unit labels unless specified in Test Book
correct keys.

Y N N/O 8. Scores item incorrect if numeric response is wrong or if examinee provides incorrect label
(required or not).

Y N N/O 9. Tests by complete pages.


Y N N/O 10. Counts all items below basal as correct.

Prueba 3: Ortografía
Y N N/O 1. Uses Response Booklet and pencil.

Y N N/O 2. Uses suggested starting points.

Y N N/O 3. Knows correct pronunciation of all items.

Y N N/O 4. Does not penalize for poor handwriting or reversed letters as long as letter does not form
different letter (e.g., reversed b becomes d and would be an error).

Y N N/O 5. Requests printed (manuscript) responses but accepts cursive responses.

Y N N/O 6. Accepts upper- or lowercase responses unless case is specified.

Y N N/O 7. Counts all items below basal as correct.

Prueba 4: Comprensión de textos


Y N N/O 1. Begins with Introduction for examinees at preschool or kindergarten level.

Y N N/O 2. Begins with Item 7 for examinees at grade 1 level.


Y N N/O 3. Begins with Sample Item B for all other examinees and then selects appropriate
starting point.

Y N N/O 4. Does not insist on silent reading if examinee persists in reading aloud.

Y N N/O 5. Does not tell examinee any words.

Y N N/O 6. Accepts only one-word responses as correct, unless otherwise indicated by scoring key.

Y N N/O 7. Asks examinee to provide one word that goes in blank when he or she reads item aloud and
provides answer in context.

Y N N/O 8. Scores responses correct if they differ in verb tense or number, unless otherwise indicated.

Y N N/O 9. Scores responses incorrect if examinee substitutes different part of speech, unless otherwise
indicated.

Y N N/O 10. Tests by complete pages.
Y N N/O 11. Counts all items below basal as correct.

Prueba 5: Cálculo
Y N N/O 1. Uses Response Booklet and pencil.

Y N N/O 2. Uses suggested starting points.

Y N N/O 3. Discontinues testing and records score of 0 if examinee responds incorrectly to both
sample items.

Y N N/O 4. Accepts poorly formed or reversed numbers.

Y N N/O 5. Scores transposed numbers (e.g., “14” for 41) as incorrect.

Y N N/O 6. Scores items skipped by examinee as incorrect.

Y N N/O 7. Completes any applicable queries as listed in Test Book.

Y N N/O 8. Does not point out mathematical signs or operands to examinee.

Y N N/O 9. Counts all items below basal as correct.

Prueba 6: Expresión de lenguaje escrito


Y N N/O 1. Uses Response Booklet and pencil.

Y N N/O 2. Uses suggested starting points.

Y N N/O 3. Administers prescribed block of items and then follows the Continuation Instructions.

Y N N/O 4. Reads any word to examinee upon request.

Y N N/O 5. Knows the five reminders to use during administration and provides each reminder only
once, at the first occurrence of that error type.

Y N N/O 6. Does not penalize for spelling, punctuation, capitalization, or usage errors unless otherwise
indicated.

Y N N/O 7. Asks examinee to write as neatly as possible if responses are illegible or difficult to read.
Y N N/O 8. Does not penalize for handwriting errors unless words are illegible.

Y N N/O 9. Does not penalize for spelling errors unless the misspelling interferes with understanding
the examinee’s response or forms another real word.

Y N N/O 10. Scores sentences that are illegible as 0.

Y N N/O 11. Scores Items 1 through 36 as 1 or 0 points.

Y N N/O 12. Scores Items 37 through 40 as 2, 1, or 0 points.

Y N N/O 13. Does not ask examinee to read his or her response to score item.

Y N N/O 14. Enters score for each block administered and enters an X for any block not administered
into online scoring and reporting program.

Prueba 7: Análisis de palabras
Y N N/O 1. Uses suggested starting points.

Y N N/O 2. Knows correct pronunciation of each item.

Y N N/O 3. Says most common sound (phoneme) for letters printed within slashes (e.g., /p/), not
letter name.

Y N N/O 4. Reminds examinee to say words smoothly only once during test if examinee pronounces
nonword phoneme by phoneme or syllable by syllable.

Y N N/O 5. Asks examinee to reread all items on page if response is unclear and then scores only item
in question.

Y N N/O 6. Does not tell examinee any letters or words during test.

Y N N/O 7. Tests by complete pages.

Y N N/O 8. Counts all items below basal as correct.

Y N N/O 9. Records errors for further analysis.

Prueba 8: Lectura oral


Y N N/O 1. Uses suggested starting points.

Y N N/O 2. Follows Continuation Instructions to determine what to administer or when to
discontinue testing.

Y N N/O 3. Has examinee read sentences aloud.

Y N N/O 4. Knows correct pronunciation of each item.

Y N N/O 5. Scores as incorrect mispronunciations, omissions, insertions, substitutions, hesitations of
3 seconds, repetitions, transpositions, and ignoring punctuation.

Y N N/O 6. Marks slash (/) at each point on Test Record where error occurs.

Y N N/O 7. After hesitation of 3 seconds, marks word as incorrect and tells examinee to go on to
next word.
Y N N/O 8. Knows that self-corrections within 3 seconds are not counted as errors.

Y N N/O 9. Scores each sentence as 2 (no errors), 1 (one error), or 0 (two or more errors).

Y N N/O 10. Records Number of Points earned on items administered.

Prueba 9: Fluidez en lectura de frases


Y N N/O 1. Uses stopwatch.

Y N N/O 2. Uses Response Booklet and pencil.

Y N N/O 3. Begins with sample items and practice exercise for all examinees.

Y N N/O 4. Discontinues testing if examinee has 2 or fewer items correct on Practice Exercises C–F and
records score of 0 on Test Record.

Y N N/O 5. Adheres to 3-minute time limit.

Y N N/O 6. Records exact starting and stopping times if stopwatch is unavailable.
Y N N/O 7. Records exact finishing time in minutes and seconds on Test Record.
Y N N/O 8. Reminds examinee to read each sentence if he or she appears to be answering items
without reading.

Y N N/O 9. Does not tell examinee any letters or words.


Y N N/O 10. Reminds examinee to continue if he or she stops at bottom of page or column.

Y N N/O 11. Counts number of correct responses and number of errors.


Y N N/O 12. Does not count skipped items as incorrect.
Y N N/O 13. Enters both Number Correct and Number Incorrect into online scoring and
reporting program.

Y N N/O 14. Subtracts Number Incorrect from Number Correct when obtaining estimated AE/GE from
Test Record.

Y N N/O 15. Uses scoring guide overlay to facilitate scoring.

Prueba 10: Fluidez en datos matemáticos


Y N N/O 1. Uses stopwatch.

Y N N/O 2. Uses Response Booklet and pencil.

Y N N/O 3. Begins with Item 1 for all examinees.

Y N N/O 4. Discontinues testing if examinee has 3 or fewer items correct after 1 minute and records
time of 1 minute and Number Correct (0 to 3) on Test Record.

Y N N/O 5. Adheres to 3-minute time limit.

Y N N/O 6. Records exact starting and stopping times if stopwatch is unavailable.

Y N N/O 7. Records exact finishing time in minutes and seconds on Test Record.

Y N N/O 8. Does not draw attention to mathematical signs or remind examinee to pay attention to signs
during test.
Y N N/O 9. Does not penalize for poorly formed or reversed numbers.

Y N N/O 10. Reminds examinee to proceed across page from left to right, row by row, if he or she starts
skipping around.

Y N N/O 11. Reminds examinee to continue if he or she stops at bottom of first page.

Y N N/O 12. Uses scoring guide overlay to facilitate scoring.

Prueba 11: Fluidez en escritura de frases


Y N N/O 1. Uses stopwatch.

Y N N/O 2. Uses Response Booklet and pencil.

Y N N/O 3. Begins with sample items for all examinees.

Y N N/O 4. Discontinues testing if examinee has score of 0 on Sample Items B–D after error correction
and records score of 0 on Test Record.

Y N N/O 5. Discontinues testing if examinee has 3 or fewer correct after 2 minutes and records time of
2 minutes and Number Correct (0 to 3) on Test Record.

Y N N/O 6. Adheres to 5-minute time limit.

Y N N/O 7. Records exact starting and stopping times if stopwatch is unavailable.

Y N N/O 8. Records exact finishing time in minutes and seconds on Test Record.

Y N N/O 9. Reads stimulus word to examinee upon request.

Y N N/O 10. Reminds examinee to continue if he or she stops at bottom of page.

Y N N/O 11. Scores as correct all responses that are complete, reasonable sentences using all target words.
Y N N/O 12. Knows target words may not be changed in any way (e.g., verb tense or nouns changed
from singular to plural).

Y N N/O 13. Does not penalize for spelling, punctuation, or capitalization errors.
Y N N/O 14. Does not penalize for poor handwriting or spelling unless response is illegible.
Y N N/O 15. Scores skipped items as incorrect.
Y N N/O 16. Scores responses that omit critical words as incorrect.
Y N N/O 17. Scores responses that omit less meaningful words (e.g., la or este) as correct if all other
criteria are met.

Y N N/O 18. Accepts symbols (e.g., & for y) if all other criteria are met.

Prueba 12: Rememoración de lectura


Y N N/O 1. Uses suggested starting points.

Y N N/O 2. Follows Continuation Instructions to determine when to continue testing or when to stop.

Y N N/O 3. Does not tell examinee any words during test.

Y N N/O 4. Allows examinee to read each story silently only once.


Y N N/O 5. Knows elements to be scored are listed on Test Record.

Y N N/O 6. Scores element as correct if examinee uses key word (in bold) or close synonym during
retelling.

Y N N/O 7. Does not penalize for mispronunciations resulting from articulation errors, dialect
variations, or regional speech patterns.

Y N N/O 8. Scores response correct if it differs from correct response listed only in possessive case, verb
tense, or number (singular/plural), unless otherwise indicated in scoring key.

Y N N/O 9. Knows that any number that is a key word (in bold) must be recalled exactly.

Y N N/O 10. Scores derivations of names as correct (e.g., Annie for Ann).

Prueba 13: Números matrices
Y N N/O 1. Gives examinee worksheet in Response Booklet and pencil when directed.

Y N N/O 2. Uses suggested starting points.

Y N N/O 3. Provides corrective feedback as indicated for Sample Items A and B.

Y N N/O 4. Tests by complete pages.

Y N N/O 5. Allows 30 seconds for Items 1–11 and 1 minute for Items 12–30 before moving to next item.

Y N N/O 6. Allows more time if examinee is actively engaged in solving problem.

Y N N/O 7. Counts all items below basal as correct.

Y N N/O 8. Records total Number Correct.

Appendix C

Batería IV General Test Observations Checklist
Name of Examiner:__________________________________ Date:___________________________________________

Name of Examinee:__________________________________ Name of Observer:________________________________

Y = Yes N = No N/O = Not Observed

Beginning the Test Session


(circle one)
Y N N/O 1. Records examinee’s identifying information correctly, including age and grade level.

Y N N/O 2. Develops seating arrangement in which examiner can see both sides of Test Book but
examinee can see only examinee pages.

Administration
Y N N/O 3. Keeps Test Record behind Test Book and out of examinee’s view.

Y N N/O 4. Begins each test by turning to tabbed page.

Y N N/O 5. Points with left hand while recording responses with right hand (reversed for left-handed
examiner).

Y N N/O 6. Watches where and how he or she points on examinee’s page.

Y N N/O 7. Uses exact wording for examiner page instructions.

Y N N/O 8. Knows correct pronunciation of all words in test.

Y N N/O 9. Communicates to examinee that test session is enjoyable.

Y N N/O 10. Moves smoothly from one test to another.

Y N N/O 11. Administers test fluidly.

Y N N/O 12. Moves to next item after allowing examinee appropriate, but not excessive, amount of time
to respond.

Y N N/O 13. Is familiar with contents of all examiner page boxes containing supplementary instructions.

Y N N/O 14. Follows all basal and ceiling rules.

Y N N/O 15. When testing backward to obtain basal, starts with first item on preceding page and presents
all items on page if stimuli are visible to examinee.

Y N N/O 16. Administers all items on page when stimuli are visible to examinee rather than stopping in
middle of page when ceiling is reached.

Y N N/O 17. Follows Continuation Instructions correctly on Tests 6, 8, and 12.


Y N N/O 18. Encourages effort and praises examinee for putting forth his or her best effort.
Y N N/O 19. Queries whenever needed and allowed to clarify examinee’s response.
Y N N/O 20. Uses stopwatch for all timed tests.

Y N N/O 21. Presents Response Booklet as directed in Test Book.

Scoring
Y N N/O 22. Does not penalize examinee for mispronunciations resulting from articulation, speech, or
dialectal differences.

Y N N/O 23. Uses item-scoring procedures specified in manual (e.g., 1 = correct response, 0 = incorrect
response, and blanks for items not administered).

Y N N/O 24. Scores last response examinee gives.

Y N N/O 25. Calculates raw scores correctly.

Y N N/O 26. Completes “Test Session Observations Checklist.”

Y N N/O 27. Uses optional “Qualitative Observation” checklists for Tests 1–11, as appropriate.

Y N N/O 28. Enters all identifying information and scores correctly into online scoring and
reporting program.

Comments:

Suggestions for improvement and further study:

Appendix D

Glossary of Batería IV Terms in English and Spanish
Tests in the Cognitive Battery
Test 1: Oral Vocabulary Prueba 1: Vocabulario oral
Test 1A: Oral Vocabulary–Synonyms Prueba 1A: Vocabulario oral – Sinónimos
Test 1B: Oral Vocabulary–Antonyms Prueba 1B: Vocabulario oral – Antónimos
Test 2: Number Series Prueba 2: Series numéricas
Test 3: Verbal Attention Prueba 3: Atención verbal
Test 4: Letter-Pattern Matching Prueba 4: Pareo de letras idénticas
Test 5: Phonological Processing Prueba 5: Procesamiento fonético
Test 5A: Phonological Processing–Word Prueba 5A: Procesamiento fonético – Acceso
Access de palabras
Test 5B: Phonological Processing–Word Prueba 5B: Procesamiento fonético – Fluidez
Fluency de palabras
Test 5C: Phonological Processing– Prueba 5C: Procesamiento fonético –
Substitution Sustitución
Test 6: Story Recall Prueba 6: Rememoración de cuentos
Test 7: Visualization Prueba 7: Visualización
Test 7A: Visualization–Spatial Relations Prueba 7A: Visualización – Relaciones
espaciales
Test 7B: Visualization–Block Rotation Prueba 7B: Visualización – Rotación de
bloques
Test 8: General Information Prueba 8: Información general
Test 8A: General Information–Where Prueba 8A: Información general – Dónde
Test 8B: General Information–What Prueba 8B: Información general – Qué
Test 9: Concept Formation Prueba 9: Formación de conceptos
Test 10: Numbers Reversed Prueba 10: Inversión de números
Test 11: Number-Pattern Matching Prueba 11: Pareo de números idénticos
Test 12: Nonword Repetition Prueba 12: Repetición de palabras sin sentido

Test 13: Pair Cancellation Prueba 13: Cancelación de pares
Test 14: Rapid Picture Naming Prueba 14: Rapidez en la identificación de
dibujos

Tests in the Achievement Battery


Test 1: Letter-Word Identification Prueba 1: Identificación de letras y palabras
Test 2: Applied Problems Prueba 2: Problemas aplicados
Test 3: Spelling Prueba 3: Ortografía
Test 4: Passage Comprehension Prueba 4: Comprensión de textos
Test 5: Calculation Prueba 5: Cálculo
Test 6: Written Language Expression Prueba 6: Expresión de lenguaje escrito
Test 7: Word Attack Prueba 7: Análisis de palabras
Test 8: Oral Reading Prueba 8: Lectura oral
Test 9: Sentence Reading Fluency Prueba 9: Fluidez en lectura de frases
Test 10: Math Facts Fluency Prueba 10: Fluidez en datos matemáticos
Test 11: Sentence Writing Fluency Prueba 11: Fluidez en escritura de frases
Test 12: Reading Recall Prueba 12: Rememoración de lectura
Test 13: Number Matrices Prueba 13: Números matrices

Clusters
General Intellectual Ability (GIA) Habilidad intelectual general (GIA)
Brief Intellectual Ability (BIA) Habilidad intelectual breve (BIA)
Gf-Gc Composite Gf-Gc combinado
Comprehension-Knowledge (Gc) Comprensión-conocimiento (Gc)
Fluid Reasoning (Gf) Razonamiento fluido (Gf)
Short-Term Working Memory (Gwm) Memoria de trabajo a corto plazo (Gwm)
Cognitive Processing Speed (Gs) Velocidad de procesamiento cognitivo (Gs)
Auditory Processing (Ga) Procesamiento auditivo (Ga)
Long-Term Storage and Retrieval (Glr) Almacenamiento y recuperación a largo plazo
(Glr)
Visual Processing (Gv) Procesamiento visual (Gv)
Quantitative Reasoning Razonamiento cuantitativo
Auditory Memory Span Alcance de la memoria auditiva
Number Facility Destreza numérica
Perceptual Speed Rapidez perceptual
Vocabulary Vocabulario
Cognitive Efficiency Eficiencia cognitiva
Reading Lectura
Broad Reading Lectura amplia
Basic Reading Skills Destrezas básicas en lectura

Reading Comprehension Comprensión de lectura
Reading Fluency Fluidez en la lectura
Mathematics Matemáticas
Broad Mathematics Matemáticas amplias
Math Calculation Skills Destrezas en cálculos matemáticos
Math Problem Solving Resolución de problemas matemáticos
Written Language Lenguaje escrito
Broad Written Language Lenguaje escrito amplio
Written Expression Expresión escrita
Academic Skills Destrezas académicas
Academic Fluency Fluidez académica
Academic Applications Aplicaciones académicas
Brief Achievement Aprovechamiento breve
Broad Achievement Aprovechamiento amplio
Scholastic Aptitude Aptitud académica
Reading Aptitude Aptitud de lectura
Math Aptitude Aptitud matemática
Writing Aptitude Aptitud de escritura
Broad Oral Language Amplio lenguaje oral
Oral Language Lenguaje oral
Listening Comprehension Comprensión auditiva

Test Components and Elements


Introducing the test Presentación de las pruebas
Selective Testing Table Tabla de selección de pruebas
test prueba
test book libro de pruebas
Administration Overview administración de la prueba
scoring calificación
basal nivel básico
ceiling nivel máximo
block bloque
Continuation Instructions Instrucciones para continuar
Suggested Starting Points Puntos de partida sugeridos
starting points puntos de partida
Introduction Introducción
Sample Item Ítem de ejemplo
Practice Exercise Ejercicio de práctica
Test Items Ítems de la prueba
Correct Correcto

Incorrect Incorrecto
Query Si responde
Error or No Response Error o falta de respuesta
Test Record Protocolo de pruebas
Identifying Information Datos personales
Test Session Observations Checklist Información del contacto y el uso del idioma
score entry anotación de puntuaciones
Qualitative Observation Observación cualitativa
Scoring Table Tabla de puntuaciones
Response Booklet Folleto de respuestas
Scoring Guide Plantilla de calificación
audio equipment equipo de audio
audio recording grabación de audio
headphones audífonos
speakers altavoces

Scores
raw score Puntaje bruto
W score Puntuación W
W Difference score Diferencia W
age equivalent (AE) Equivalente de edad (AE)
grade equivalent (GE) Equivalente de grado (GE)
developmental zone Zona de desarrollo
instructional zone Zona de instrucción
relative proficiency index (RPI) Índice de proficiencia relativa (RPI)
Comparative Language Index (CLI) Índice comparativo de lenguaje (CLI)
CALP Levels Niveles CALP
normal curve equivalent (NCE) Equivalente de la curva normal (NCE)
percentile rank (PR) Rango percentil (PR)
standard score (SS) Puntuación estándar (SS)
standard error of measurement (SEM) Error estándar de medición (SEM)
discrepancy score Puntuación de la discrepancia
discrepancy percentile rank Rango percentil de las discrepancias
discrepancy SD score Puntuación de la discrepancia del SD

Appendix E

Batería IV Technical Supplement
The Batería IV Woodcock-Muñoz (Batería IV) is the parallel Spanish version of the Woodcock-
Johnson IV (WJ IV), and both batteries rely upon the same norming sample for the derivation
of norm-referenced scores. Support for the use and interpretation of the Batería IV scores
draws from the large body of validity evidence gathered for the prior versions of the battery
and from evidence presented in this manual and in the Woodcock-Johnson IV Technical Manual
(McGrew, LaForte, & Schrank, 2014). This evidence includes documentation of the goals and
objectives of the Batería IV revision and the procedures used for test development, norming,
and equating. Users should consult the WJ IV Technical Manual for detailed information
about the norming study and the battery’s technical characteristics. This appendix is a
supplement to that manual and includes information specific to the development, calibration,
and equating of the Batería IV forms of the tests.

Translation/Adaptation
All of the Batería IV tests are either translations or adaptations of the parallel tests in the
WJ IV. Tests that are direct translations contain the same items as the WJ IV forms of the
tests; for these tests, only the item directions were translated into Spanish. Batería IV COG
Prueba 2: Series numéricas is an example of a translated test. In this test, the stimulus material
is exactly the same on the WJ IV and the Batería IV; the directions are precisely parallel
but are in different languages. In contrast, some tests could not be translated directly and
needed to be adapted for use with Spanish-speaking individuals. A test is considered an
adaptation when the measured construct is the same in English and Spanish, but the items
were changed or adapted to be appropriate for Spanish-speaking examinees. For example, in
Batería IV APROV Prueba 3: Ortografía, most Batería IV items are different from the WJ IV
items, but the test measures the same broad and narrow abilities using the same procedure.
Table E-1 contains a list of the Batería IV tests and indicates whether each test was translated
or adapted. In general, most of the visual processing, fluid reasoning, processing speed, and
quantitative ability tests were translated, whereas the comprehension-knowledge, auditory,
long-term storage and retrieval, reading, and writing tests required adaptation.

Table E-1. Translated and Adapted Tests of the Batería IV

Test Name                                                Translated   Adapted
Pruebas de habilidades cognitivas
Prueba 1: Vocabulario oral ■
Prueba 2: Series numéricas ■
Prueba 3: Atención verbal ■
Prueba 4: Pareo de letras idénticas ■
Prueba 5: Procesamiento fonético ■
Prueba 6: Rememoración de cuentos ■
Prueba 7: Visualización ■
Prueba 8: Información general ■
Prueba 9: Formación de conceptos ■
Prueba 10: Inversión de números ■
Prueba 11: Pareo de números idénticos ■
Prueba 12: Repetición de palabras sin sentido ■
Prueba 13: Cancelación de pares ■
Prueba 14: Rapidez en la identificación de dibujos ■1
Pruebas de aprovechamiento
Prueba 1: Identificación de letras y palabras ■
Prueba 2: Problemas aplicados ■
Prueba 3: Ortografía ■
Prueba 4: Comprensión de textos ■
Prueba 5: Cálculo ■
Prueba 6: Expresión de lenguaje escrito ■2
Prueba 7: Análisis de palabras ■
Prueba 8: Lectura oral ■3
Prueba 9: Fluidez en lectura de frases ■
Prueba 10: Fluidez en datos matemáticos ■
Prueba 11: Fluidez en escritura de frases ■
Prueba 12: Rememoración de lectura ■
Prueba 13: Números matrices ■
1 This test is a direct translation of the WJ IV test with the exception of Item 104, which was changed from a football to a soccer ball in the Batería IV form.
2 The WJ IV Writing Samples test was replaced with Written Language Expression (Expresión de lenguaje escrito) in Batería IV. The two tests contain
very similar item types; however, Written Language Expression scoring is simpler and does not require the examiner to use a separate scoring
guide.
3 Two items in the English form of this test did not translate well into Spanish; therefore, these items are slightly different in Batería IV.

The Batería IV test translation and adaptation work was performed by, or under the direction
and supervision of, Dr. Criselda Alvarado. Some of the tests included in the Batería IV were
translated or adapted during the development of the earlier editions of the Batería; other tests
were brand new in the WJ IV and were translated into Spanish for the first time during the
Batería IV development. An example of one such test is APROV Prueba 8: Lectura oral.
For some adapted tests, Dr. Alvarado and her project team wrote new items to augment the
existing Spanish item pools so that the Batería IV tests would contain new content and would
be relevant for a wide range of Spanish-speaking examinees representing different linguistic
and cultural backgrounds. For instance, for COG Prueba 6: Rememoración de cuentos,
Dr. Alvarado and her team wrote 10 new stories containing a total of 113 new test items.

Calibration Study
Several tests required calibration, either because they were new tests in the Batería IV or
because they contained new items. A calibration study was conducted that included six
Batería IV tests: COG Prueba 5A: Procesamiento fonético – Acceso de palabras, COG Prueba 5C:
Procesamiento fonético – Sustitución, COG Prueba 6: Rememoración de cuentos, COG Prueba
12: Repetición de palabras sin sentido, APROV Prueba 8: Lectura oral, and APROV Prueba 12:
Rememoración de lectura. The primary goals of the study were to determine the difficulty
levels of the new Spanish items and to equate those items to the scales underlying the English
forms of the tests.1

Construction of Calibration Study Forms


For APROV Prueba 8: Lectura oral, the calibration form was almost a direct translation of
Form C of the English WJ IV Oral Reading test. The only exceptions were two sentences that
did not translate accurately into Spanish, for which comparable Spanish substitutions were
made. For all other tests, the calibration forms contained a set of new Spanish items plus a
set of “linking” items that were direct translations (or reasonable conceptual links) to items
in the English forms of the tests. Linking items were distributed across the difficulty range
of each test and served as statistical anchors in the Spanish-to-English equating process. The
percentage of linking items on each calibration test form ranged from 25% for APROV
Prueba 12: Rememoración de lectura to 43% for COG Prueba 5A: Procesamiento fonético –
Acceso de palabras.
With the exception of APROV Prueba 8: Lectura oral, the calibration form of each test was
approximately 10 to 15% longer than the targeted length for a published form of the test, to
allow for flexibility to select the best-performing items after the calibration study. Traditional
basal and ceiling rules and cutoff rules were used during administration of the calibration
forms to minimize testing time, but the rules were set conservatively to ensure that every
examinee in the study encountered all appropriately targeted items. Every examinee in the
calibration study was administered all six tests.

1 Several additional adapted tests were not included in the calibration study because adequate item data from prior Spanish calibration studies existed to support
construction of Batería IV test forms. These tests included Prueba 1: Vocabulario oral and Prueba 8: Información general in the COG battery and Prueba 1: Identificación
de letras y palabras, Prueba 2: Problemas aplicados, Prueba 3: Ortografía, Prueba 4: Comprensión de textos, Prueba 6: Expresión de lenguaje escrito, and Prueba 7: Análisis
de palabras in the APROV battery. The extant data from these earlier studies were used to equate the Spanish items for these tests to the scale underlying the English
WJ IV tests, following the procedures described under “Calibration and Equating of Items” below. In addition, the extant item data for Batería III COG Prueba 2:
Aprendizaje visual-auditivo and Prueba 13: Reconocimiento de dibujos were used to equate these two Batería III tests to the scales underlying the WJ IV forms of these
tests so that the tests (and clusters that utilize the tests) can be scored with WJ IV/Batería IV norms.

Calibration Study Data Collection
The Batería IV calibration study was conducted between December 2017 and April 2018.
In this study, the six Batería IV tests were administered to a sample of 601 native Spanish-
speaking examinees between the ages of 2 and 81 years.
All calibration study data were collected by trained examiners. At the outset of the study,
Riverside Insights project staff recruited examiners via e-mail targeted to customer databases
and member databases from the National Association of School Psychologists and the
National Latino/Latina Psychological Association. All professional norming study examiners
completed a 2-hour online training course consisting of test-by-test video modules with
embedded practice exercises and a summative quiz. Examiners were required to achieve a
minimum passing score on the quiz to be approved for participation. After completion of
online training, examiners completed one practice test administration. Practice administration
protocols were reviewed by project staff to ensure that examiners were proficient in the
administration and scoring of all norming tests. Riverside Insights project staff provided
feedback to examiners on any issues or concerns that were noted on their practice cases.
After the practice case was approved, examiners could begin recruiting and testing
calibration study participants. Riverside Insights project staff reviewed all submitted protocols
for completion and accuracy of administration procedures (e.g., adherence to basal and
ceiling rules and continuation rules). Riverside Insights project staff continually monitored
the sample acquisition to ensure adherence to the demographic variable distributions of the
sampling plan. All calibration data, including demographic information, item scores, and item
responses, were manually key-entered and verified prior to analysis.
Table E-2 presents the distribution of the calibration sample by age group.

Table E-2. Distribution of the Batería IV Calibration Sample by Age Group

         Number of     Percentage of
Age      Examinees     Calibration Study Sample
2–5      101           16.8
6–8      101           16.8
9–13      99           16.5
14–19     99           16.5
20–39    103           17.1
40+       98           16.3
Total    601           100.0

The calibration study examinees were selected from all regions of the United States. The
sample was chosen to ensure a broad representation of sex, parent or examinee education
level, and country of Hispanic origin/nativity. Table E-3 contains the distribution of these
sampling variables in the calibration study.

Table E-3. Distribution of Sampling Variables in the Batería IV Calibration Study

                                  Number in            Percentage of
Sampling Variable                 Calibration Study    Calibration Study Sample
Sex
  Male                            246                  40.9
  Female                          355                  59.1
Parent1 or Examinee Education
  High School or Less             301                  50.1
  > High School                   300                  49.9
Geographic Location
  Arizona                           3                   0.5
  California                       22                   3.7
  Connecticut                       1                   0.2
  Florida                           5                   0.8
  Illinois                         99                  16.5
  New Jersey                        5                   0.8
  New York                          7                   1.2
  Tennessee                        67                  11.2
  Texas                           375                  62.4
  Virginia                         17                   2.8
Hispanic Origin
  Cuban                             9                   1.5
  Dominican                         5                   0.8
  Guatemalan                       12                   2.0
  Mexican                         421                  70.1
  Puerto Rican                     33                   5.5
  Salvadoran                       17                   2.8
  Other/Mixed                     104                  17.3

1 Parent education is reported for examinees who are less than 18 years old.

Calibration and Equating of Items


At the completion of the data collection, data were analyzed using the Rasch model. The item
data were freely calibrated and item W difficulties were estimated. Extremely unexpected,
or “misfitting,” examinee responses were identified through an examination of the Rasch
person fit indices and were removed from the item calibration analysis. Misfitting responses
contribute excessive noise to the data and can degrade the quality of the item calibrations
(Linacre, 2002). For most tests, fewer than one tenth of 1% of the total examinee item
responses were removed during the item calibration step.
Item difficulties for the Spanish form of each test were then linked to the W scale
underlying each corresponding English test through the Rasch equating procedures (Wright
& Stone, 1979)2 described below:

2 Wolfe (2004) terms this type of equating the “equating constants” method, while Linacre (2012) refers to it as the “Fahrenheit-Celsius” method. This method differs
from the Rasch common-item-anchor equating design employed in the WJ IV norming (and described in the WJ IV Technical Manual) in that the item difficulty
parameters for each data set are estimated separately, and the difficulty measures from one set of items are then transformed onto the other scale outside of the
estimation process.

1. Identify stable common linking items. For each test, the separate Spanish and English
item difficulties for the common items were cross-plotted. Extreme outliers, identified
using a linear regression procedure, revealed some items with very different relative
W-difficulty estimates in Spanish and English. These outlier items were removed from
the common item linking set.
2. Apply the scale transformation equation. For each test, item W-difficulty means
(Ms) and standard deviations (SDs) were computed for the subsets of common
items from the Spanish and English item pools. Spanish item W-difficulty values
were then adjusted to the scale of the English item pools using the following unit
transformation equation:

De′ = (SDe / SDs)(Ds – Ms) + Me ,    (E.1)

where De′ is the item difficulty of any Spanish item transformed onto the English
item difficulty scale, SDe is the standard deviation of the English common-item
difficulties, SDs is the standard deviation of the Spanish common-item difficulties,
Ds is the difficulty of the Spanish item to be transformed, Ms is the mean of the
Spanish common-item difficulties, and Me is the mean of the English common-item
difficulties. Application of this transformation equation placed the Spanish items onto
the scale of the WJ IV English item pools.
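The unit transformation in Equation E.1 can be sketched in a few lines of code. The linking-item difficulty values shown here are hypothetical illustrations, not actual Batería IV calibration data:

```python
import statistics

def transform_difficulties(spanish_diffs, common_spanish, common_english):
    """Equation E.1: place Spanish item W difficulties onto the scale of
    the English item pool, using the means and standard deviations of the
    common (linking) items as estimated in each language."""
    m_s = statistics.mean(common_spanish)
    m_e = statistics.mean(common_english)
    sd_s = statistics.stdev(common_spanish)
    sd_e = statistics.stdev(common_english)
    # De' = (SDe / SDs) * (Ds - Ms) + Me
    return [(sd_e / sd_s) * (d - m_s) + m_e for d in spanish_diffs]

# Identity check: identical linking statistics leave the scale unchanged.
common = [470.0, 490.0, 510.0, 530.0]
print(transform_difficulties([500.0], common, common))  # → [500.0]
```

Because only the linking items' means and standard deviations enter the equation, removing unstable linking items (step 1) directly changes the transformation applied to every Spanish item.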

Review of Item Statistics


In addition to content and bias considerations, the authors relied on both classical and
Rasch-based statistical information to guide their item selection. In general, items under the
following conditions were flagged and removed from consideration:
1. Items with point-measure correlations less than .20. The item point-measure correlation
is the correlation between each examinee’s W score and his or her score (1/0) on the
item. This measure provides insight into how well each item discriminates between
low- and high-ability examinees. Items with point-measure correlations less than
.20 may not discriminate well or may be measuring something other than what is
intended by the other items in the scale.
2. Items with Rasch mean-square fit statistics greater than 1.3. Rasch fit statistics describe the difference
between an item’s expected scores (i.e., under the Rasch model) and its observed scores
(i.e., in the data). Mean-square fit statistics have an expected value of 1.0; values greater
than 1.3 indicate that there may be more “noise” than useful measurement in the
data. Low fit values (< 0.7) indicate that the item responses are more predictable than
expected; this condition may reduce the statistical information in each item response but
does not degrade measurement to the extent that values greater than 1.3 do.
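The two screening rules above can be expressed as a short routine. This is a hedged sketch: the function names are invented for illustration, the W scores and 1/0 responses are hypothetical, and the mean-square fit value is assumed to be supplied by Rasch calibration software such as WINSTEPS.

```python
# Hypothetical sketch of the two item-screening rules described above.
# The thresholds (.20 and 1.3) come from the text; everything else is invented.

def point_measure_correlation(w_scores, responses):
    """Pearson correlation between examinee W scores and 1/0 item scores."""
    n = len(w_scores)
    mean_w = sum(w_scores) / n
    mean_r = sum(responses) / n
    cov = sum((w - mean_w) * (r - mean_r) for w, r in zip(w_scores, responses))
    var_w = sum((w - mean_w) ** 2 for w in w_scores)
    var_r = sum((r - mean_r) ** 2 for r in responses)
    return cov / (var_w * var_r) ** 0.5

def flag_item(w_scores, responses, infit_mean_square):
    """Return the reasons (if any) an item would be flagged for removal."""
    flags = []
    if point_measure_correlation(w_scores, responses) < 0.20:
        flags.append("low point-measure correlation")
    if infit_mean_square > 1.3:
        flags.append("noisy fit")
    return flags
```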

Item Bias Analysis


Bias in item difficulty is often referred to as differential item functioning, or DIF. DIF occurs
when an item is more difficult for a particular subgroup of examinees, even when the overall
ability of those examinees is the same as that of other groups. For the Batería IV
calibration items, gender DIF was evaluated during item calibration using the Rasch
iterative-logit method within the WINSTEPS software (Linacre, 2012). In this method, item
difficulty calibrations, and their associated standard errors, are estimated for each item and
each subgroup individually, while all other item difficulty estimates (and examinee ability
estimates) are held constant. The difference between the subgroup item difficulty estimates for
each item, or the DIF contrast, was then evaluated using Welch's t statistic for the difference
between two means (Linacre, 2012). Items were flagged if the DIF contrast between males
and females was greater than or equal to 5.82 W points.3 Items were also flagged if significant
(p <.05) Rasch-Welch t-test4 or Mantel-Haenszel DIF5 statistics were reported. Items were
flagged regardless of the direction of the apparent bias. The percentages of flagged items with
both DIF contrast greater than or equal to 5.82 points and significance at the p <.05 level are
reported for the Batería IV calibration study tests in Table E-4.

Table E-4. Percentage of Batería IV Calibration Items Flagged for Potential Gender DIF

                                                             Number of     Total Percentage of Items
                                                             Calibration   More Difficult for
Test Name                                                    Items         Males     Females
COG Prueba 5A: Procesamiento fonético – Acceso de palabras     27          3.7       0.0
COG Prueba 5C: Procesamiento fonético – Sustitución            23          0.0       4.3
COG Prueba 6: Rememoración de cuentos                         159          3.1       7.5
COG Prueba 12: Repetición de palabras sin sentido              51          0.0       5.9
APROV Prueba 8: Lectura oral                                   27          0.0       0.0
APROV Prueba 12: Rememoración de lectura                      143          0.7       3.5
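The DIF screen described above boils down to comparing two independently estimated item difficulties against their joint standard error. The sketch below is a simplified illustration with invented names: the subgroup difficulties and standard errors would come from Rasch calibration software, and the fixed critical value of 2.0 is only a rough stand-in for the exact Rasch-Welch and Mantel-Haenszel significance tests that were actually used.

```python
# Simplified sketch of the gender-DIF screen. The 5.82 W-point threshold
# is from the text; t_critical = 2.0 roughly approximates p < .05.

def dif_contrast(diff_male, se_male, diff_female, se_female):
    """DIF contrast and a Welch-style t statistic for two difficulty estimates."""
    contrast = diff_male - diff_female
    t = contrast / (se_male ** 2 + se_female ** 2) ** 0.5
    return contrast, t

def flag_for_dif(diff_male, se_male, diff_female, se_female,
                 contrast_threshold=5.82, t_critical=2.0):
    contrast, t = dif_contrast(diff_male, se_male, diff_female, se_female)
    # Flag regardless of the direction of the apparent bias.
    return abs(contrast) >= contrast_threshold or abs(t) >= t_critical
```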

Assembly and Evaluation of Final Test Forms


After all test items had been placed onto the underlying W scales, the authors, with assistance
from several native Spanish-speaking education and language professionals, chose items
for the publication forms of the Batería IV. During the assembly of these forms, the authors
followed some general principles of test construction. For instance, a common goal across all
Batería IV tests is that items should be evenly distributed across the W-score range of the test,
with approximately three to four items per 10 W points of difficulty. Item content was chosen
that would be current and relevant to as large an audience as possible, including individuals
from a variety of Spanish-speaking countries. Finally, care was taken to ensure that no item
cued the correct response to any other item in the same test.

Reliability
Reliability refers to the precision of a test score. High reliability indicates that an individual’s
measure on a test would be unlikely to change if he or she were retested under similar
conditions. Reliability is a necessary, but not sufficient, condition for validity. Although high
reliability does not necessarily imply that a test score is valid for a specific purpose, reliability
is an important element of the overall validity argument for a test. The reliability coefficient
can be thought of as an index of the precision with which relative standing or position in a
group is measured.

3 A DIF contrast with a W-point difference greater than or equal to 5.82 W points (i.e., 0.64 × 9.1024 W points, which is the value of 1 Rasch logit) corresponds to the
commonly used Educational Testing Service (ETS) “C” classification for moderate to severe DIF (Linacre, 2016; Zieky, 1993).
4 In a test of 20 items, one would expect one item to exhibit significant DIF by chance (p <.05, the Type I error rate). Several authors (e.g., Linacre, 2012; Wolfe et
al., 2006) suggest the use of the Bonferroni correction to adjust for Type I error when performing multiple statistical tests. Because the purpose of this DIF analysis
was exploratory—items exhibiting significant DIF contrast were not rejected outright but rather were flagged for further review—no correction was applied in
these analyses. The numbers of pairwise t tests in the analysis of DIF for each test suggests that some unbiased items were likely flagged; however, this potential
overidentification was deemed acceptable for the purposes of this DIF study.
5 The Mantel-Haenszel procedure is a statistical approach that utilizes a contingency table to test the significance of score differences between a referent and a focal
group across an ability continuum.

Test Reliabilities
For the six tests that were included in the Batería IV calibration study, reliability coefficients
were calculated using item-level data from the calibration study. For all other tests in the
Batería IV battery, reliability coefficients were calculated using item-level data from the
norming study. For all nontimed, or nonspeeded, tests, internal consistency reliabilities were
calculated using the split-half procedure. Raw scores were computed based on the odd- and
even-numbered items, and correlations were computed between these sets of scores.
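The odd/even split-half procedure can be pictured with a minimal sketch (hypothetical 1/0 item data, invented function names). The final line applies the Spearman-Brown projection, described later in this section, so that the half-test correlation reflects the full test length.

```python
# Minimal sketch of odd/even split-half reliability with the
# Spearman-Brown step. Rows are examinees; entries are 1/0 item scores.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def split_half_reliability(item_scores):
    # Raw scores on the odd- and even-numbered items for each examinee.
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd, even)
    # Spearman-Brown: project the half-test correlation to full length.
    return 2 * r_half / (1 + r_half)
```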
The split-half procedure is inappropriate6 for tests containing multiple-point items
(e.g., APROV Prueba 8: Lectura oral). The reliabilities for these tests were calculated using
information provided by the Rasch model.
As described in Chapter 2 of the WJ IV Technical Manual, all Batería IV speeded tests (e.g.,
COG Prueba 4: Pareo de letras idénticas, COG Prueba 5B: Procesamiento fonético – Fluidez
de palabras) were calibrated using a rate-based metric. Although this rate-based metric is
useful for calibrating items and rank-ordering examinees, it yields inflated standard errors
for ability measures due to the limited number of possible scores for each time interval. For
this reason, the procedures for calculating Rasch-based reliability coefficients that were used
for the tests with multiple-point items were not appropriate for the speeded tests. Instead, a
test-retest study was conducted during the WJ IV norming for all speeded tests. Examinees in
three separate age groups were administered the norming form of each speeded test, followed
by a second administration of the same form of the test 1 day later. The retest interval in
this study was intentionally short to minimize changes in test scores due to changes in the
examinee’s state or latent trait. Correlations between the first and second administrations
were computed, and a correction was applied for restriction of range in the study samples
(Sackett & Yang, 2000).
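Sackett and Yang (2000) catalog a family of range-restriction corrections, and the exact case applied is not spelled out in this appendix. Purely to illustrate the idea, the classic direct-restriction correction can be sketched as follows (hypothetical inputs):

```python
# Classic direct range-restriction correction ("Thorndike Case 2"),
# shown only to illustrate the idea; the operational procedure may differ.

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    u = sd_unrestricted / sd_restricted  # ratio of population SD to sample SD
    return (r_restricted * u) / (1 + r_restricted ** 2 * (u ** 2 - 1)) ** 0.5
```

When the sample SD equals the population SD (u = 1), the correction leaves the correlation unchanged; when the sample is more homogeneous than the population (u > 1), the corrected correlation is larger.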
For the tests with subtests (COG Prueba 1: Vocabulario oral, COG Prueba 5: Procesamiento
fonético, COG Prueba 7: Visualización, and COG Prueba 8: Información general), test
reliabilities were computed using Mosier’s (1943) formula for reliability of composite scores.
Details of the procedures for computing reliabilities are included in Chapter 4 of the WJ IV
Technical Manual.
All reliability coefficients were corrected for published test length using the Spearman-
Brown correction formula. Table E-5 presents the median reliability coefficients by age group
for the nonspeeded tests included in the Batería IV. Table E-6 presents the results of the
speeded test-retest study for several age groups.

Cluster Reliabilities
Cluster reliabilities were also computed using Mosier’s (1943) formula for composite
reliability. Table E-7 presents the median cluster reliabilities for all Batería IV clusters by age
group.
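Mosier's (1943) approach derives a composite's reliability from the component tests' standard deviations, reliabilities, and intercorrelations. A minimal unit-weight sketch is below; the inputs are hypothetical, and the operational Batería IV computations may differ in detail.

```python
# Unit-weight composite reliability in the spirit of Mosier (1943).
# sds: component SDs; reliabilities: component reliability coefficients;
# correlations: component intercorrelation matrix (1.0 on the diagonal).

def composite_reliability(sds, reliabilities, correlations):
    k = len(sds)
    # Composite variance: sum of covariances among all component pairs.
    var_c = sum(sds[i] * sds[j] * (1.0 if i == j else correlations[i][j])
                for i in range(k) for j in range(k))
    # Error variance of the composite: sum of component error variances.
    error = sum(sds[i] ** 2 * (1 - reliabilities[i]) for i in range(k))
    return 1 - error / var_c
```

With a single component, the formula reduces to that component's own reliability; adding a positively correlated second component generally raises the composite's reliability, which is why cluster reliabilities tend to exceed those of their component tests.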

6 Internal consistency reliability methods, such as the split-half procedure, assume that the average correlation between items within a test is the same as the average
correlation between items from the hypothetical alternative forms created by splitting the test into two smaller tests (e.g., odd and even items). This assumption
is violated when tests contain items that produce a different range of scores for each item (as in the Batería IV tests with multiple-point item scoring). In this case,
splitting the test in half may produce tests that are no longer equivalent; the items on one half of the test may have a higher maximum possible total score than the
items on the other half.

Table E-5. Reliability Coefficients for Batería IV Nonspeeded Tests by Age Group

                                                                       Age
Test Name                                       2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
Pruebas de habilidades cognitivas
Prueba 1: Vocabulario oral                      –     –   0.89  0.90  0.90  0.89  0.89  0.86  0.86  0.84  0.84  0.89  0.89  0.88  0.88
Prueba 2: Series numéricas                      –     –     –   0.91  0.91  0.92  0.92  0.90  0.90  0.87  0.87  0.84  0.84  0.91  0.91
Prueba 3: Atención verbal                       –   0.88  0.88  0.90  0.90  0.89  0.89  0.82  0.82  0.86  0.86  0.83  0.83  0.85  0.85
Prueba 5: Procesamiento fonético                –   0.82  0.82  0.85  0.85  0.85  0.85  0.81  0.81  0.80  0.80  0.83  0.83  0.86  0.86
Prueba 6: Rememoración de cuentos             0.96  0.96  0.96  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95
Prueba 7: Visualización                       0.90  0.90  0.90  0.87  0.87  0.83  0.83  0.79  0.79  0.81  0.81  0.84  0.84  0.82  0.82
Prueba 8: Información general                 0.90  0.90  0.90  0.85  0.85  0.83  0.83  0.84  0.84  0.80  0.80  0.85  0.85  0.87  0.87
Prueba 9: Formación de conceptos              0.84  0.84  0.84  0.94  0.94  0.94  0.94  0.94  0.94  0.93  0.93  0.93  0.93  0.91  0.91
Prueba 10: Inversión de números                 –   0.80  0.80  0.83  0.83  0.84  0.84  0.82  0.82  0.84  0.84  0.89  0.89  0.87  0.87
Prueba 12: Repetición de palabras sin sentido   –   0.91  0.91  0.92  0.92  0.93  0.93  0.94  0.94  0.94  0.94  0.91  0.91  0.87  0.87
Note. A dash (–) indicates that no coefficient is reported at that age.

                                                                       Age
Test Name                                      17    18    19   20–29 30–39 40–49 50–59 60–69 70–79  80+  Median
Pruebas de habilidades cognitivas
Prueba 1: Vocabulario oral                    0.88  0.88  0.88  0.89  0.90  0.92  0.92  0.92  0.93  0.93  0.89
Prueba 2: Series numéricas                    0.92  0.92  0.92  0.86  0.88  0.89  0.90  0.90  0.93  0.93  0.91
Prueba 3: Atención verbal                     0.91  0.91  0.91  0.82  0.86  0.87  0.83  0.83  0.82  0.82  0.86
Prueba 5: Procesamiento fonético              0.86  0.86  0.86  0.87  0.88  0.91  0.91  0.91  0.90  0.90  0.85
Prueba 6: Rememoración de cuentos             0.95  0.95  0.95  0.96  0.96  0.96  0.96  0.96  0.96  0.96  0.95
Prueba 7: Visualización                       0.85  0.85  0.85  0.83  0.85  0.87  0.87  0.87  0.87  0.87  0.85
Prueba 8: Información general                 0.88  0.88  0.88  0.88  0.92  0.91  0.91  0.91  0.95  0.95  0.88
Prueba 9: Formación de conceptos              0.92  0.92  0.92  0.91  0.94  0.93  0.95  0.95  0.96  0.96  0.93
Prueba 10: Inversión de números               0.91  0.91  0.91  0.91  0.93  0.94  0.90  0.90  0.91  0.91  0.88
Prueba 12: Repetición de palabras sin sentido 0.87  0.87  0.87  0.85  0.85  0.90  0.90  0.90  0.90  0.90  0.91

Table E-5. (cont.) Reliability Coefficients for Batería IV Nonspeeded Tests by Age Group

                                                                       Age
Test Name                                       2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
Pruebas de aprovechamiento
Prueba 1: Identificación de letras y palabras 0.97  0.97  0.97  0.98  0.98  0.96  0.96  0.94  0.94  0.92  0.92  0.88  0.88  0.90  0.90
Prueba 2: Problemas aplicados                 0.92  0.92  0.92  0.91  0.91  0.92  0.92  0.89  0.89  0.87  0.87  0.89  0.89  0.90  0.90
Prueba 3: Ortografía                          0.92  0.92  0.92  0.93  0.93  0.92  0.92  0.92  0.92  0.90  0.90  0.88  0.88  0.89  0.89
Prueba 4: Comprensión de textos               0.88  0.88  0.88  0.98  0.98  0.94  0.94  0.89  0.89  0.81  0.81  0.84  0.84  0.87  0.87
Prueba 5: Cálculo                               –     –     –   0.93  0.93  0.94  0.94  0.91  0.91  0.89  0.89  0.89  0.89  0.93  0.93
Prueba 6: Expresión de lenguaje escrito         –     –   0.99  0.99  0.99  0.99  0.99  0.94  0.94  0.89  0.89  0.79  0.79  0.79  0.79
Prueba 7: Análisis de palabras                  –     –     –   0.96  0.96  0.94  0.94  0.93  0.93  0.91  0.91  0.89  0.89  0.87  0.87
Prueba 8: Lectura oral                          –     –     –   0.92  0.92  0.92  0.92  0.91  0.91  0.91  0.91  0.90  0.90  0.89  0.89
Prueba 12: Rememoración de lectura              –     –     –   0.96  0.96  0.97  0.97  0.97  0.97  0.97  0.97  0.95  0.95  0.94  0.94
Prueba 13: Números matrices                     –     –     –   0.78  0.78  0.91  0.91  0.94  0.94  0.91  0.91  0.94  0.94  0.89  0.89
Note. A dash (–) indicates that no coefficient is reported at that age.

                                                                       Age
Test Name                                      17    18    19   20–29 30–39 40–49 50–59 60–69 70–79  80+  Median
Pruebas de aprovechamiento
Prueba 1: Identificación de letras y palabras 0.91  0.91  0.91  0.91  0.91  0.93  0.95  0.95  0.94  0.94  0.94
Prueba 2: Problemas aplicados                 0.92  0.92  0.92  0.92  0.92  0.90  0.91  0.91  0.94  0.94  0.91
Prueba 3: Ortografía                          0.89  0.89  0.89  0.90  0.90  0.90  0.94  0.94  0.93  0.93  0.92
Prueba 4: Comprensión de textos               0.89  0.89  0.89  0.91  0.91  0.91  0.92  0.92  0.93  0.93  0.89
Prueba 5: Cálculo                             0.94  0.94  0.94  0.93  0.93  0.95  0.93  0.93  0.94  0.94  0.93
Prueba 6: Expresión de lenguaje escrito       0.79  0.79  0.79  0.79  0.79  0.79  0.79  0.79  0.79  0.79  0.79
Prueba 7: Análisis de palabras                0.88  0.88  0.88  0.89  0.87  0.88  0.94  0.94  0.93  0.93  0.91
Prueba 8: Lectura oral                        0.89  0.89  0.89  0.89  0.89  0.90  0.90  0.90  0.90  0.90  0.90
Prueba 12: Rememoración de lectura            0.94  0.94  0.94  0.96  0.96  0.97  0.97  0.97  0.97  0.97  0.97
Prueba 13: Números matrices                   0.92  0.92  0.92  0.90  0.92  0.91  0.95  0.95  0.93  0.93  0.92

Table E-6. Test-Retest Reliability Coefficients From the WJ IV/Batería IV Speeded Test-Retest Study

                                                              Age
Test Name                                           7–11    14–17   26–79
Pruebas de habilidades cognitivas
Prueba 4: Pareo de letras idénticas                 0.91    0.88    0.91
Prueba 11: Pareo de números idénticos               0.85    0.84    0.88
Prueba 13: Cancelación de pares                     0.89    0.89    0.95
Prueba 14: Rapidez en la identificación de dibujos  0.90    0.79    0.90
Pruebas de aprovechamiento
Prueba 9: Fluidez en lectura de frases              0.95    0.93    0.93
Prueba 10: Fluidez en datos matemáticos             0.95    0.97    0.95
Prueba 11: Fluidez en escritura de frases           0.83    0.76    0.88

Table E-7. Reliability Coefficients for Batería IV Clusters by Age Group

                                                                       Age
Cluster Name                                    2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
Pruebas de habilidades cognitivas
Habilidad intelectual general                   –     –     –   0.97  0.97  0.97  0.97  0.96  0.96  0.95  0.95  0.96  0.96  0.96  0.96
Habilidad intelectual breve                     –     –     –   0.95  0.95  0.95  0.95  0.93  0.93  0.92  0.92  0.92  0.92  0.94  0.94
Gf-Gc combinado                                 –     –     –   0.95  0.95  0.95  0.95  0.94  0.94  0.93  0.93  0.93  0.93  0.95  0.95
Comprensión-conocimiento (Gc)                   –     –   0.93  0.92  0.92  0.92  0.92  0.91  0.91  0.89  0.89  0.92  0.92  0.93  0.93
Comprensión-conocimiento – Extendida            –     –   0.96  0.93  0.93  0.93  0.93  0.93  0.93  0.92  0.92  0.95  0.95  0.94  0.94
Razonamiento fluido (Gf)                        –     –     –   0.94  0.94  0.95  0.95  0.94  0.94  0.93  0.93  0.92  0.92  0.94  0.94
Memoria de trabajo a corto plazo (Gwm)          –   0.88  0.88  0.90  0.90  0.91  0.91  0.88  0.88  0.89  0.89  0.91  0.91  0.91  0.91
Velocidad de procesamiento cognitivo (Gs)       –     –   0.94  0.94  0.94  0.94  0.94  0.94  0.94  0.94  0.94  0.94  0.94  0.93  0.93
Procesamiento auditivo (Ga)                   0.91  0.91  0.91  0.92  0.92  0.93  0.93  0.92  0.92  0.92  0.92  0.91  0.91  0.90  0.90
Destreza numérica                               –     –     –   0.88  0.88  0.88  0.88  0.88  0.88  0.88  0.88  0.90  0.90  0.90  0.90
Rapidez perceptual                              –     –   0.92  0.92  0.92  0.92  0.92  0.92  0.92  0.93  0.93  0.93  0.93  0.93  0.93
Vocabulario                                     –     –   0.95  0.91  0.91  0.90  0.90  0.89  0.89  0.88  0.88  0.93  0.93  0.92  0.92
Eficiencia cognitiva                            –     –     –   0.92  0.92  0.92  0.92  0.91  0.91  0.92  0.92  0.93  0.93  0.92  0.92
Eficiencia cognitiva – Extendida                –     –     –   0.94  0.94  0.94  0.94  0.93  0.93  0.94  0.94  0.95  0.95  0.94  0.94
Aptitud de lectura 1                            –     –   0.92  0.92  0.92  0.93  0.93  0.93  0.93  0.93  0.93  0.92  0.92  0.92  0.92
Aptitud de lectura 2                            –   0.93  0.93  0.92  0.92  0.92  0.92  0.92  0.92  0.92  0.92  0.91  0.91  0.92  0.92
Aptitud matemática                              –     –     –   0.94  0.94  0.94  0.94  0.93  0.93  0.93  0.93  0.93  0.93  0.94  0.94
Aptitud de escritura                            –     –   0.93  0.91  0.91  0.92  0.92  0.92  0.92  0.92  0.92  0.91  0.91  0.91  0.91
Note. A dash (–) indicates that no coefficient is reported at that age.

Table E-7. (cont.) Reliability Coefficients for Batería IV Clusters by Age Group

                                                                       Age
Cluster Name                                   17    18    19   20–29 30–39 40–49 50–59 60–69 70–79  80+  Median
Pruebas de habilidades cognitivas
Habilidad intelectual general                 0.97  0.97  0.97  0.96  0.97  0.97  0.97  0.97  0.97  0.97  0.97
Habilidad intelectual breve                   0.95  0.95  0.95  0.92  0.94  0.95  0.94  0.94  0.95  0.95  0.94
Gf-Gc combinado                               0.95  0.95  0.95  0.94  0.96  0.96  0.96  0.96  0.97  0.97  0.95
Comprensión-conocimiento (Gc)                 0.93  0.93  0.93  0.93  0.95  0.95  0.95  0.95  0.97  0.97  0.93
Comprensión-conocimiento – Extendida          0.95  0.95  0.95  0.96  0.96  0.96  0.96  0.96  0.97  0.97  0.95
Razonamiento fluido (Gf)                      0.95  0.95  0.95  0.92  0.94  0.94  0.95  0.95  0.96  0.96  0.94
Memoria de trabajo a corto plazo (Gwm)        0.94  0.94  0.94  0.92  0.94  0.95  0.92  0.92  0.92  0.92  0.91
Velocidad de procesamiento cognitivo (Gs)     0.94  0.94  0.94  0.93  0.93  0.94  0.94  0.94  0.94  0.94  0.94
Procesamiento auditivo (Ga)                   0.90  0.90  0.90  0.90  0.90  0.93  0.93  0.93  0.93  0.93  0.92
Destreza numérica                             0.92  0.92  0.92  0.92  0.93  0.93  0.91  0.91  0.91  0.91  0.90
Rapidez perceptual                            0.93  0.93  0.93  0.93  0.93  0.93  0.93  0.93  0.92  0.92  0.93
Vocabulario                                   0.93  0.93  0.93  0.94  0.94  0.94  0.94  0.94  0.94  0.94  0.93
Eficiencia cognitiva                          0.94  0.94  0.94  0.94  0.95  0.95  0.94  0.94  0.94  0.94  0.93
Eficiencia cognitiva – Extendida              0.96  0.96  0.96  0.95  0.96  0.96  0.95  0.95  0.95  0.95  0.95
Aptitud de lectura 1                          0.94  0.94  0.94  0.95  0.95  0.96  0.95  0.95  0.95  0.95  0.93
Aptitud de lectura 2                          0.94  0.94  0.94  0.94  0.94  0.95  0.94  0.94  0.94  0.94  0.93
Aptitud matemática                            0.94  0.94  0.94  0.94  0.94  0.95  0.95  0.95  0.95  0.95  0.94
Aptitud de escritura                          0.92  0.92  0.92  0.94  0.94  0.95  0.94  0.94  0.93  0.93  0.92

Table E-7. (cont.) Reliability Coefficients for Batería IV Clusters by Age Group

                                                                       Age
Cluster Name                                    2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
Pruebas de aprovechamiento
Lectura                                       0.96  0.96  0.96  0.99  0.99  0.97  0.97  0.95  0.95  0.93  0.93  0.92  0.92  0.94  0.94
Lectura amplia                                  –     –     –   0.99  0.99  0.97  0.97  0.97  0.97  0.96  0.96  0.96  0.96  0.96  0.96
Destrezas básicas en lectura                    –     –     –   0.98  0.98  0.97  0.97  0.96  0.96  0.95  0.95  0.93  0.93  0.94  0.94
Comprensión de lectura                          –     –     –   0.99  0.99  0.97  0.97  0.94  0.94  0.91  0.91  0.92  0.92  0.92  0.92
Fluidez en la lectura                           –     –     –   0.96  0.96  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95
Matemáticas                                     –     –     –   0.95  0.95  0.96  0.96  0.94  0.94  0.93  0.93  0.94  0.94  0.95  0.95
Matemáticas amplias                             –     –     –   0.97  0.97  0.97  0.97  0.97  0.97  0.96  0.96  0.96  0.96  0.97  0.97
Destrezas en cálculos matemáticos               –     –     –   0.96  0.96  0.97  0.97  0.96  0.96  0.96  0.96  0.95  0.95  0.97  0.97
Resolución de problemas matemáticos             –     –     –   0.91  0.91  0.94  0.94  0.95  0.95  0.93  0.93  0.95  0.95  0.94  0.94
Lenguaje escrito                                –     –     –   0.99  0.99  0.97  0.97  0.95  0.95  0.93  0.93  0.90  0.90  0.91  0.91
Lenguaje escrito amplio                         –     –     –   0.99  0.99  0.96  0.96  0.95  0.95  0.94  0.94  0.92  0.92  0.93  0.93
Expresión escrita                               –     –     –   0.99  0.99  0.95  0.95  0.92  0.92  0.91  0.91  0.87  0.87  0.87  0.87
Destrezas académicas                            –     –     –   0.99  0.99  0.97  0.97  0.95  0.95  0.93  0.93  0.92  0.92  0.94  0.94
Fluidez académica                               –     –     –   0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.97
Aplicaciones académicas                         –     –     –   0.99  0.99  0.98  0.98  0.95  0.95  0.93  0.93  0.93  0.93  0.94  0.94
Aprovechamiento breve                         0.98  0.98  0.98  0.98  0.98  0.97  0.97  0.96  0.96  0.96  0.96  0.95  0.95  0.96  0.96
Aprovechamiento amplio                          –     –     –   0.99  0.99  0.99  0.99  0.99  0.99  0.98  0.98  0.98  0.98  0.98  0.98
Note. A dash (–) indicates that no coefficient is reported at that age.

Table E-7. (cont.) Reliability Coefficients for Batería IV Clusters by Age Group

                                                                       Age
Cluster Name                                   17    18    19   20–29 30–39 40–49 50–59 60–69 70–79  80+  Median
Pruebas de aprovechamiento
Lectura                                       0.94  0.94  0.94  0.95  0.95  0.95  0.96  0.96  0.96  0.96  0.95
Lectura amplia                                0.96  0.96  0.96  0.96  0.96  0.97  0.97  0.97  0.97  0.97  0.96
Destrezas básicas en lectura                  0.94  0.94  0.94  0.94  0.94  0.95  0.97  0.97  0.96  0.96  0.95
Comprensión de lectura                        0.94  0.94  0.94  0.94  0.94  0.94  0.95  0.95  0.96  0.96  0.94
Fluidez en la lectura                         0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95
Matemáticas                                   0.96  0.96  0.96  0.96  0.96  0.96  0.96  0.96  0.97  0.97  0.96
Matemáticas amplias                           0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.97  0.98  0.98  0.97
Destrezas en cálculos matemáticos             0.97  0.97  0.97  0.96  0.96  0.97  0.97  0.97  0.97  0.97  0.97
Resolución de problemas matemáticos           0.95  0.95  0.95  0.95  0.95  0.94  0.96  0.96  0.96  0.96  0.95
Lenguaje escrito                              0.90  0.90  0.90  0.91  0.91  0.91  0.93  0.93  0.92  0.92  0.92
Lenguaje escrito amplio                       0.93  0.93  0.93  0.93  0.93  0.93  0.95  0.95  0.94  0.94  0.94
Expresión escrita                             0.87  0.87  0.87  0.87  0.87  0.87  0.88  0.88  0.88  0.88  0.88
Destrezas académicas                          0.94  0.94  0.94  0.95  0.95  0.95  0.96  0.96  0.96  0.96  0.95
Fluidez académica                             0.96  0.96  0.96  0.96  0.97  0.97  0.97  0.97  0.97  0.97  0.97
Aplicaciones académicas                       0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.95  0.96  0.96  0.95
Aprovechamiento breve                         0.96  0.96  0.96  0.96  0.96  0.96  0.97  0.97  0.97  0.97  0.96
Aprovechamiento amplio                        0.98  0.98  0.98  0.98  0.98  0.99  0.99  0.99  0.99  0.99  0.98

800.323.9540
wj-iv.com
