Public schools need new approach to close achievement gap
In 1999, when the California legislature passed the Public School Accountability Act, the state teachers’ union campaigned against it, saying schools shouldn’t be judged based on their zip codes. Critics of public education and accountability advocates countered that pointing to family income and parental education was making excuses, and that schools should be able to overcome these socioeconomic differences.
Schools and districts embarked on 15 years of teaching the California state standards and trying to raise the scores of students on tests aligned with these standards. In general, school and district scores slowly moved up statewide. Achievement gaps between schools, districts and subgroups of students, however, persisted over this period. Affluent districts were top performers throughout the period, and high-poverty districts, despite receiving more oversight and extra resources, remained near the bottom of the rankings. Similarly, states like California did not significantly change their relative rankings on national tests.
In recent years, California’s public schools have transitioned to a new standardized testing regime, something that has happened nearly every decade since the 1970s. State policymakers are debating whether schools should be given a single numerical score, as they have been since 2000. Advocates for lower-performing students say yes: a single score is needed to hold schools accountable. The state teachers’ union says no: the score has been used to punish schools and has not helped those schools improve.
The first results from the new testing regime came from the 2014-15 school year. When the scores were published, the percentages of students meeting the standard, whether measured statewide or for a particular district, school or subgroup of students, were generally lower than in the last year of the previous testing regime. Although this drop-off in scores has occurred at every transition to a new testing regime over the past 30 to 40 years, critics and advocates were alarmed.
Regardless of whether a single performance score is assigned to a school, a district or a subgroup of students, it is inevitable that the percentage of students meeting standards in a district, at a school or in a subgroup will be compared against the performance of students in other districts, schools and subgroups. That percentage is effectively a single number that can be used for these comparisons.
Focusing on these comparisons is not productive. One reason is that there is no evidence that large subgroups of students change their relative average performance on standardized measures over time. Research has shown that the so-called achievement gap persists even in community colleges’ measures of their students’ success when large subgroups are compared. Studies have also shown that the achievement gap is similar across the public, charter and private school sectors. The gap has been there since we began measuring this way, and efforts to address it have not changed it much.
It is time to change the focus of our accountability efforts. We have tried aggregating many students’ scores for long enough without much to show for it. In our search for a better way to use measurement to improve student achievement, it is worth reviewing some factors that affect test scores. The first is the “normal” distribution that a test must produce in order to be considered a reliable measure of content knowledge. A test is designed to contain questions of sufficient challenge and variety to produce a bell-curve-like distribution of scores, with most students scoring in the middle and fewer students scoring at the bottom and top. For this reason, NCLB’s requirement that all students be proficient was ridiculed as being like Lake Wobegon, where all the children are above average. The second is the distribution of circumstance that exists even within a subgroup: students come to a test with a range of family and friend supports and resources that affect their exposure to knowledge. The third is the distribution of motivation, an internal phenomenon: motivation to achieve varies significantly from student to student, and we all know of students who have achieved great success despite a disadvantageous background.
Consequently, if all of the scores for a subgroup were displayed, a distribution of scores, not a single number, would be apparent. Further, the top-scoring end of a lower-performing subgroup’s distribution would likely overlap the low-scoring end of a higher-performing subgroup’s distribution. Using single numbers to compare subgroups of students obscures this. It also removes our focus from the individual student, which is where it should be. Individual student scores, not subgroup or school scores, should be the focus of our measurement and improvement efforts. This approach is more effective for several reasons. It recognizes the reality of individual differences: each person has different capacities in different areas, different circumstances and different levels of motivation. Since every individual has to grow in knowledge to make progress in life, the individual is an appropriate focus for improvement efforts. Annual polls have long shown that parents give higher ratings to their own schools than to schools in general, indicating that school ratings are subjective anyway. And for roughly the last 20 years, the California constitution has forbidden affirmative action in public education, so schools shouldn’t be judged on their ability to close achievement gaps between racial and ethnic groups.
Compressing many students’ scores into one number and relying on subgroup scores do not recognize these realities. Further, they may contribute to one-size-fits-all schooling. Such schooling has led many people to avoid regular public schools and turn to charter schools, to private schools (where school-wide standardized test scores are not used for comparisons) or to home-schooling. So the current use of measurement, which accountability advocates hoped would improve public education, has led to a decline in general support for public education. It is time for a change.
Phelps is a California educator and member of the Pasadena Unified School District school board.
The views expressed by Contributors are their own and are not the views of The Hill.