CAST recommends ways to improve the draft of the US Department of Education’s guidance on the Title I assessment peer review process.
September 23, 2013
U.S. Department of Education
400 Maryland Avenue, SW
Washington, DC 20202
To Whom It May Concern:
CAST appreciates the opportunity to comment on ED’s Title I assessment peer review process. CAST is an organization that works to expand learning opportunities and outcomes for all individuals through Universal Design for Learning (UDL). CAST defined the principles and practices of UDL, which were incorporated into the Higher Education Opportunity Act (HEOA) of 2008. When applying the principles of UDL, we believe that instruction represents the entire episode of learning—i.e., the entire assessment-instructional cycle.
CAST is known for its development of innovative, technology-based educational resources and strategies based on the principles of UDL. For example, CAST created Bobby, the first software to check website accessibility; WiggleWorks (with Scholastic), the first universally designed literacy program; and CAST eReader, one of the first computer-based literacy tools. Additionally, CAST held an instrumental role in the development of the National Instructional Materials Accessibility Standard (NIMAS) and currently leads the National Accessible Instructional Materials (AIM) Center. CAST has also partnered with the University of Kansas and NASDSE in the Center on Online Learning and Students with Disabilities and serves as the lead partner (with Vanderbilt University) in the National Center on the Use of Emerging Technologies to Improve Literacy Achievement for Students with Disabilities in Middle School.
Through strategic collaborations, CAST continues to work on behalf of all learners by seeding the fields of education research, policy, professional development, and product development with UDL-based solutions. Based on CAST’s extensive experience, we offer the following comments to ED regarding the Title I assessment peer review process:
(1) ED should ensure that states are not focusing exclusively on summative assessment but are also emphasizing formative assessment as part of the entire assessment-instructional cycle.
Consistent with the overarching goal of assessment, formative assessment allows educators to evaluate student understanding of the knowledge and skills embedded in college and career-ready standards.
Teachers, students, administrators, and parents benefit from the data collected in well-designed formative assessment.
The formative assessment process provides information about performance during the instructional episode so that modifications, changes, and alterations in instruction may be made to support achievement toward the instructional goals.
ED should be looking for evidence that all assessment data collected by states can be used to inform improved learning and instructional practices.
Without established and well-implemented formative assessment procedures, educators, students, and parents may not be well informed about progress toward a goal; in other words, they may not be informed until it is too late to support or change instruction. For this reason, CAST supports formative evaluation, specifically progress monitoring.
Well-developed and implemented formative assessments can lead to improvements in each learner’s attention to and analysis of their own learning process and products.
The PARCC and SBAC consortia do not appear to be addressing formative assessment at this time, as they were charged to do. We are concerned that the formative assessments that PARCC and SBAC are proposing may be more summative in nature.
(2) ED should ensure that monitoring activities consider the potential effects of assessment on classroom instruction.
Best practice suggests that assessment accommodations align with those accommodations that the student receives during classroom instruction.
There is a danger of assessment policy and procedures driving instructional practices, including the materials and tools (e.g., accessible instructional materials) used for students in the classroom. In particular, limited assessment practices could adversely impact the instructional decision-making process of the IEP team.
Schools and/or teachers may not allow certain accommodations for instruction because these accommodations are not allowed on the assessment. For example, one state was not able to provide computer-based writing tests and therefore determined that all classroom writing instruction should use paper and pencil in order to parallel the annual high-stakes assessment.
Classroom environments are naturally changing to include advances in technology and the use of multimedia tools to support all learners. Limited assessment practices can create a conflict between the use of technology in instruction and its availability on assessments, ultimately preventing schools from becoming more innovative.
By inappropriately limiting the students who may use certain accommodations on assessments, these policies may inadvertently limit the number of students who will receive and benefit from the same accommodations during classroom instruction, in violation of their rights under IDEA and Section 504.
All accommodations and supports provided during assessment need to be taught and practiced prior to use. Additionally, use of accommodations and supports should be made an essential component of training for teachers/administrators prior to assessment administration.
(3) ED should closely monitor the impact of computer adaptive testing (CAT) on all subgroups, including students with disabilities and English Language Learners.
There is a lack of research on the accuracy and viability of CAT on the various categories of students with disabilities (Laitusis et al., 2011); the majority of benefits for students with disabilities ascribed to CAT appear to be based on assumptions unsupported by existing research data.
A down-leveling of test items following an item failure could result in the presentation of out-of-level items based on standards from a lower grade. This could render the assessment out of compliance with the ESEA requirement to measure student performance against the expectations for a student’s grade level (Way, 2006; US Department of Education, 2007). Such a result could also have the effect of violating the student’s rights under IDEA and Section 504. Research suggests that maintaining alignment with content standards may be more successful if the adaptation occurs at the testlet/subtest level, rather than at the item level (Folk & Smith, 2002).
Students with uneven skill sets may fail basic items and never have the opportunity to exhibit skills on higher-level tasks; this is particularly relevant to various students with disabilities who may exhibit idiosyncratic and uneven academic skills (Thurlow et al., 2010; Almond et al., 2010; Kingsbury & Houser, 2007).
CAT approaches are reported to be efficient and accurate when item responses are limited to multiple choice and short answers (Way, et al., 2010), while the accuracy and efficiency of more varied response types may pose significant challenges to adaptive algorithms, and hence to validity.
The majority of CAT systems deployed to date may not allow, or may significantly restrict, a student’s ability to return to a previous item to review or change a response (Way, 2006), further narrowing the range of test-taking strategies a student may employ. Some solutions that support a review-and-change strategy in CAT have emerged (Yen et al., 2012; Papanastasiou & Reckase, 2007).
(4) ED should ensure that state assessment systems are valid, reliable, and fair for all students, including students with disabilities and English Language Learners.
It is important for states to provide detail regarding the item and task development process in order to ensure that there is precision with respect to the identification of intended constructs associated with individual assessment items. Without this precision, there is the danger that items or tasks will measure construct-irrelevant information for certain students and that, as a result, the inferences that are drawn from the assessment scores for these students will be invalid.
With respect to reading, states should be advised and monitored to be exact in identifying the particular constructs associated with each item in order to allow a skill such as decoding to be measured separately from higher-level reading comprehension. With today’s widely available technologies, students can independently demonstrate achievement of high levels of reading comprehension without having to decode specific elements of text. For students with visual impairments and those with specific learning disabilities, technology can support the high levels of language processing necessary for deep understanding and interpretation of text. In many such cases, college and career. A parallel argument may be constructed for application to mathematics.
(5) ED should ensure that states maintain high expectations, while allowing for the appropriate use of accommodations and supports, in order to provide optimal accessibility throughout the assessment.
Digitally based assessments have the potential to promote enhanced access to the assessment and the general education curriculum for students with disabilities. It is important for ED to advise and monitor states in the development and administration of digitally based assessments to ensure that these assessments are effectively facilitating such access.
Peer assessment review panels should include experts on accessibility as well as individuals with disabilities. The peer review process should also include opportunities for states to share their experiences with one another.
It is important for states to consider the participation needs of students from all disability categories in the assessment design to help ensure that appropriate navigation and access are available throughout the entire assessment (e.g., single switch technology for students with physical disabilities). Students from all disability categories should be included in validation studies of assessments.
ED should note the importance of states monitoring the type and quality of accommodations that are provided to students who will be taking paper-based assessments.
CAST favors offering a balance of embedded and external accommodations and assistive technologies so that students may benefit from essential AT that is not embedded, is familiar to the student from daily use during instruction, and does not violate the construct for selected assessment items. We will be very interested in learning more about the guidelines that will be provided with respect to the selection and use of locally provided accommodations and assistive and communication technologies.
ED should encourage states to ensure that the technologies used to administer the assessments are fully compatible with allowable external AT (used in both assessment and instructional settings).
ED should provide guidelines and monitor states to ensure that all accommodations and supports available during assessment are taught and practiced prior to use. Additionally, use of accommodations and supports should be made an essential component of training for teachers/administrators prior to assessment administration.
Tracey E. Hall, PhD, Senior Research Scientist
Chuck Hitchcock, MEd, Chief of Policy and Technology
Richard Jackson, EdD, Senior Research Scientist
Joanne Karger, JD, EdD, Research Scientist/Policy Analyst
David H. Rose, EdD, Chief Education Officer and Founder
Skip Stahl, MS, Senior Policy Analyst
Joy Zabala, EdD, Director of Technical Assistance, CAST and AIM Center
Almond, P., Winter, P., Cameto, R., Russell, M., Sato, E., Clarke-Midura, J., ... Lazarus, S. (2010). Technology-enabled and universally designed assessment: Considering access in measuring the achievement of students with disabilities—A foundation for research. Journal of Technology, Learning, and Assessment, 10(5). Retrieved from http://ejournals.bc.edu/ojs/index.php/jtla/article/view/1605
Folk, V. G., & Smith, R. G. (2002). Models for delivery of CBTs. In Computer-based testing: Building the foundation for future assessments (pp. 41-66).
Kingsbury, G. G., & Houser, R. L. (2007). ICAT: An adaptive testing procedure to allow the identification of idiosyncratic knowledge patterns. In D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing. Retrieved from www.psych.umn.edu/psylabs/CATCentral/
Laitusis, C. C., Buzick, H. M., Cook, L., & Stone, E. (2011). Adaptive Testing Options for Accountability Assessments. In M. Russell & M. Kavanaugh (Eds.), Assessing Students in the Margins: Challenges, Strategies, and Techniques. Charlotte, NC: Information Age Publishing.
Papanastasiou, E. C. & Reckase, M. D. (2007). A "rearrangement procedure" for scoring adaptive tests with review options. International Journal of Testing, 7(4), 387-407.
Thurlow, M., Lazarus, S. S., Albus, D., & Hodgson, J. (2010). Computer-based testing: Practices and considerations (Synthesis Report 78). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.
U.S. Department of Education. (2007). Standards and assessments peer review guidance: Information and examples for meeting requirements of the No Child Left Behind Act of 2001. Washington, DC: Author. Retrieved from www.ed.gov/policy/elsec/guid/saaprguidance.doc
Way, W. D. (2006). Practical questions in introducing computerized adaptive testing for K–12 assessment. PEM Research Reports. Iowa City, IA: Pearson Educational Measurement. Retrieved from http://www.pearsonassessments.com/NR/rdonlyres/EC965AB8-EE70-46E5-B1A5-036BE41AB899/0/RR_05_03.pdf?WT.mc_id=TMRS_Practical_Questions_in_Introducing_Computerized
Yen, Y. C., Ho, R. G., Liao, W. W., & Chen, L. J. (2012). Reducing the impact of inappropriate items on reviewable computerized adaptive testing. Educational Technology & Society, 15(2), 231-243.