Education Research Report: The relative stringency of state standards

States vary widely in where they set their student proficiency standards in 4th and 8th grade reading and mathematics, according to a new report released today by the National Center for Education Statistics. The report compares proficiency standards of states using the National Assessment of Educational Progress (NAEP) as the common yardstick.

The report, "Mapping State Proficiency Standards onto NAEP Scales: 2005-2007," uses NAEP to provide context for understanding the relative stringency of state standards given that each state has its own assessment system and standards for proficiency. The study compared the range of state standards in both 2005 and 2007 and measured changes in the rigor of state proficiency standards when new state standards were set after key aspects of the state assessment system changed.

Other key findings include:

* The range of state standards was wide in the four comparisons made in the study: 4th and 8th grade reading and mathematics.

* Using NAEP achievement levels as a reference point for understanding the stringency of state standards, most state standards fell within the NAEP Basic achievement level range, except in 4th grade reading, where most were below NAEP's Basic level.

* Overall, only two states set standards within the NAEP Proficient achievement level.

Here's the complete Executive Summary:

Since 2003, the National Center for Education Statistics (NCES) has sponsored the development of a method for mapping each state’s standard for proficient performance onto a common scale—the achievement scale of the National Assessment of Educational Progress (NAEP). When states’ standards are placed onto the NAEP reading or mathematics scales, the level of achievement required for proficient performance in one state can then be compared with the level of achievement required in another state. This allows one to compare the standards for proficiency across states.

The mapping procedure offers an approximate way to assess the relative rigor of the states’ adequate yearly progress (AYP) standards established under the No Child Left Behind Act of 2001. Once mapped, the NAEP scale equivalent score representing the state’s proficiency standards can be compared to indicate the relative rigor of those standards. The term rigor as used here does not imply a judgment about state standards. Rather, it is intended to be descriptive of state-to-state variation in the location of the state standards on a common metric.
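
The summary does not spell out the computation, so what follows is only a rough sketch of the percentile-matching idea behind a mapping of this kind. It assumes that a state's standard is placed at the NAEP score reached or exceeded by the same share of students that the state reports as proficient. The function name and the scores are hypothetical, and sampling weights, NAEP-sampled schools and standard errors are not modeled.

```python
def naep_scale_equivalent(naep_scores, pct_proficient_on_state_test):
    """Return the NAEP score reached or exceeded by the same share of
    students that the state reports as proficient (illustrative only)."""
    if not naep_scores:
        raise ValueError("need at least one NAEP score")
    if not 0 <= pct_proficient_on_state_test <= 100:
        raise ValueError("percentage must be between 0 and 100")

    # Rank students from highest to lowest NAEP score.
    ranked = sorted(naep_scores, reverse=True)

    # Number of students who would count as proficient under the state standard,
    # clamped so that the index below is always valid.
    n_meeting = round(len(ranked) * pct_proficient_on_state_test / 100)
    n_meeting = max(1, min(len(ranked), n_meeting))

    # The lowest score among those students marks the mapped standard.
    return ranked[n_meeting - 1]


# Made-up NAEP scores for one state's sample (not data from the report).
scores = [200, 210, 220, 230, 240, 250, 260, 270, 280, 290]

strict_state = naep_scale_equivalent(scores, 30)   # 270
lenient_state = naep_scale_equivalent(scores, 70)  # 230
```

With the same underlying achievement, the state that reports 30 percent proficient maps to a higher NAEP score than the state that reports 70 percent, which is the sense in which the summary calls a standard more rigorous.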

This report presents mapping results using the 2005 and 2007 NAEP assessments in mathematics and reading for grades 4 and 8. The analyses conducted for this study addressed the following questions:

* How do states’ 2007 standards for proficient performance compare with each other when mapped on the NAEP scale?
* How do the 2007 NAEP scale equivalents for state standards compare with those estimated for 2005?
* Using the 2005 NAEP scale equivalent for state standards to define a state’s proficient level of performance on NAEP, do NAEP and that state’s assessment agree on the changes in the proportion of students meeting that state’s standard for proficiency from 2005 to 2007?

To address the first question, the 2007 NAEP scale equivalent of each state reading and mathematics proficiency standard for each grade was identified. The mapping procedure was applied to the test data of 48 states.[1] Key findings of the analysis presented in Section 3 of the report are:

* In 2007, as in 2003 and 2005, state standards for proficient performance in reading and mathematics (as measured on the NAEP scale) vary across states in terms of the levels of achievement required. For example, the distance separating the five states with the highest standards and the five states with the lowest standards in grade 4 reading was comparable to the difference between Basic and Proficient performance on NAEP.[2] The distance was as large in reading at grade 8 and as large in mathematics in both grades.
* In both reading and mathematics, the 29- to 30-point distance separating the five highest and the five lowest NAEP scale equivalents of state standards for proficient performance was nearly as large as the 35 points that represent approximately one standard deviation in student achievement on the NAEP scale.
* In grade 4 reading, 31 states set standards for proficiency (as measured on the NAEP scale) that were lower than the cut point for Basic performance on NAEP (208). In grade 8 reading, 15 states set standards that were lower than the cut point for Basic performance on NAEP (243).
* In grade 4 mathematics, seven states set standards for proficiency (as measured on the NAEP scale) that were lower than the cut point for Basic performance on NAEP (214). In grade 8 mathematics, eight states set standards that were lower than the cut point for Basic performance on NAEP (262).
* Most of the variation (approximately 70 percent) from state to state in the percentage of students scoring proficient or above on state tests can be explained by the variation in the level of difficulty of state standards for proficient performance. States with higher standards (as measured on the NAEP scale) had fewer students scoring proficient on state tests.
* The rigor of the state standards is not consistently associated with higher performance on NAEP. This association is measured by the squared correlation between the NAEP scale equivalent of the state standards and the percentages of students who scored at or above the NAEP Proficient level. In grade 4 reading and mathematics, the squared correlations are around .10 and statistically significant. In grade 8 reading and mathematics, the squared correlations are less than .07 and are not statistically significant.
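
The squared correlation in these findings can be read as the share of state-to-state variation in one measure that a straight-line relationship with the other accounts for; a value near .70 means most of the variation, and a value near .10 means little. A minimal sketch of that calculation follows. `squared_correlation` is a hypothetical helper, the numbers are invented for illustration and are not taken from the report, and no significance test is included.

```python
from statistics import mean


def squared_correlation(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mx, my = mean(xs), mean(ys)
    # Sums of cross-products and squares around the means.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Invented example: NAEP scale equivalents of five state standards and the
# percentage of students each state reports as proficient on its own test.
naep_equivalents = [195, 205, 215, 225, 235]
pct_proficient_state_test = [88, 80, 71, 66, 52]

share_explained = squared_correlation(naep_equivalents, pct_proficient_state_test)
```

Because the correlation is squared, the result carries no sign; the direction of the relationship (higher standards, fewer students rated proficient) has to be read from the data or from the unsquared correlation.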

To address the second question, the analyses focused on the consistency of mapping outcomes over time using both 2005 and 2007 assessments. Although NAEP did not change between 2005 and 2007, some states made changes in their state assessments in the same period, changes substantial enough that states indicated that their 2005 scores were not comparable to their 2007 scores. Other states indicated that their scores for those years are comparable. Comparisons between the 2005 and 2007 mappings in reading and mathematics at grades 4 and 8 were made separately for states that made changes in their testing systems and for those that made no such changes.[3] Key findings of the analysis presented in Section 4 are:

* In grade 4 reading, 12 of the 34 states with available data in both years indicated substantive changes in their assessments. Of those, eight showed significant differences between the 2005 and 2007 estimates of the NAEP scale equivalent of their state standards, half of which showed an increase and half a decrease.
* In grade 8 reading, 14 of the 38 states with available data in both years indicated substantive changes in their assessments. Of those, seven showed significant differences between the 2005 and 2007 estimates of the NAEP scale equivalent of their state standards; all seven showed lower 2007 estimates of the NAEP scale equivalents.
* In grade 4 mathematics, 14 of the 35 states with available data in both years indicated substantive changes in their assessments. Of those, 11 showed significant differences between the 2005 and 2007 estimates of the NAEP scale equivalent of their state standards: 6 states showed a decrease and 5 showed an increase.
* In grade 8 mathematics, 18 of the 39 states with available data in both years indicated substantive changes in their assessments. Of those, 12 showed significant differences between the 2005 and 2007 estimates of the NAEP scale equivalent of their state standards: 9 showed a decrease and 3 showed an increase.

For the states with no substantive changes in their state assessments in the same period, the analyses presented in Section 4 indicate that for the majority of states in the comparison sample (14 of 22 in grade 4 reading, 13 of 24 in grade 8 reading, 15 of 21 in grade 4 mathematics and 14 of 21 in grade 8 mathematics), the differences in the estimates of NAEP scale equivalents of their state standards were not statistically significant.

To address the third question, NAEP and state changes in achievement from 2005 to 2007 were compared. The percentage of students reported to be meeting the state standard in 2007 was compared with the percentage of NAEP students in 2007 scoring above the NAEP scale equivalent of the same state standard in 2005. The analysis was limited to states with (a) available data in both years and (b) no substantive changes in their state tests. The number of states included in the analyses ranged from 21 to 24, depending on the subject and grade. The expectation was that both the state assessments and NAEP would show the same changes in achievement between the two years. Statistically significant differences between NAEP and state measures of changes in achievement indicate that more progress was made on either the NAEP skill domain or the state-specific skill domain between 2005 and 2007. A more positive change on the state test indicates students gained more on the state-specific skill domain. For example, a focus in instruction on state-specific content might lead a state assessment to show more progress in achievement than NAEP. Similarly, a less positive change on the state test indicates students gained more on the NAEP skill domain. For example, a focus in instruction on NAEP content that is not a part of the state assessment might lead the state assessment to show less progress in achievement than NAEP. Key findings from Section 5 are:[4]

* In grade 4 reading, 11 of 22 states showed no statistically significant difference between NAEP and state assessment measures of changes in achievement; 5 states showed changes that are more positive than the changes measured by NAEP, and 6 states showed changes that are less positive than those measured by NAEP.
* In grade 8 reading, 9 of 24 states showed no statistically significant difference between NAEP and state assessment measures of achievement changes; 10 states showed changes that are more positive than the changes measured by NAEP, and 5 states showed changes that are less positive than those measured by NAEP.
* In grade 4 mathematics, 13 of 21 states showed no statistically significant difference between NAEP and state assessment measures of achievement changes; 5 states showed changes that are more positive than the changes measured by NAEP, and 3 states showed changes that are less positive than those measured by NAEP.
* In grade 8 mathematics, 9 of 21 states showed no statistically significant difference between NAEP and state assessment measures of achievement changes; 7 states showed changes that are more positive than the changes measured by NAEP, and 5 states showed changes that are less positive than those measured by NAEP.
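
The comparison behind these counts can be sketched as follows, under loud simplifying assumptions: the 2005 NAEP scale equivalent is treated as a fixed cut score, the NAEP-measured change is the change in the share of NAEP students at or above that cut, and the state-measured change is the change in the state's reported percent proficient. The function names and all numbers are hypothetical, and the statistical significance testing that the report relies on is left out entirely.

```python
def share_at_or_above(scores, cut):
    """Percentage of scores at or above a cut score."""
    if not scores:
        raise ValueError("need at least one score")
    return 100 * sum(s >= cut for s in scores) / len(scores)


def compare_changes(state_pct_2005, state_pct_2007,
                    naep_scores_2005, naep_scores_2007, cut_2005):
    """Return (state change, NAEP change, state minus NAEP) in percentage points."""
    state_change = state_pct_2007 - state_pct_2005
    naep_change = (share_at_or_above(naep_scores_2007, cut_2005)
                   - share_at_or_above(naep_scores_2005, cut_2005))
    return state_change, naep_change, state_change - naep_change


# Invented example: the state reports 60 percent proficient in 2005 and
# 90 percent in 2007; its 2005 standard maps to a NAEP score of 220.
state_change, naep_change, gap = compare_changes(
    60, 90,
    naep_scores_2005=[200, 210, 220, 230, 240],
    naep_scores_2007=[210, 220, 230, 240, 250],
    cut_2005=220,
)
# Here the state test shows a 30-point gain and NAEP a 20-point gain,
# so the state-measured change is 10 points more positive than NAEP's.
```

A positive gap corresponds to the "more positive than NAEP" category in the findings and a negative gap to the "less positive" category; whether a given gap counts as a real difference depends on the significance test, which this sketch does not attempt.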

In considering the results described above, the reader should note that state assessments and NAEP are designed for different, though related, purposes. State assessments and their associated proficiency standards are designed to provide pedagogical information about individual students to their parents and teachers, whereas NAEP is designed for summary assessment at an aggregate level. NAEP’s achievement levels are used to interpret the meaning of the NAEP scales. NCES has determined (as provided by NAEP’s authorizing legislation) that NAEP achievement levels should continue to be used on a trial basis and should be interpreted with caution.

In conclusion, these mapping analyses offer several important contributions. First, they allow each state to compare the stringency of its criteria for proficiency with that of other states. Second, mapping analyses inform states whether the rigor of their proficiency standards as represented by NAEP scale equivalents changed from 2005 to 2007. Significant differences in NAEP scale equivalents might reflect changes in state assessments and standards and/or other changes such as changes in policies or practices that occurred between the years. Finally, when key aspects of a state’s assessment or standards remained the same, these mapping analyses allow NAEP to corroborate state-reported changes in student achievement and provide states with an indicator of the construct validity and generalizability of their test results.

1: Test data for the District of Columbia, Nebraska, and Utah were not available to be included in the analysis. California does not test general mathematics in grade 8.

2: NAEP defines Proficient as competency over challenging subject matter, not grade-level performance. Basic is defined as partial mastery of the skills necessary for Proficient performance.

3: The 2005 mappings in this report will not necessarily match previously published results (U.S. Department of Education 2007). Methodological differences between the procedures used in the two analyses will generally cause empirical results to show small differences that are not large enough to change the whole-number scale value reported as the NAEP equivalent.

4: The differences between changes in achievement measured by NAEP and changes measured by the state assessment, and the NAEP scale equivalents, are based on the same data but are analyzed in different ways. As a result, statistically significant differences can be found in one and not the other because of the nonlinear relationship between scale scores and percentiles.

Thursday, October 29, 2009
