Many of the most commonly cited remedies for improving educational test scores—shrinking class sizes, boosting teacher salaries or hiring teachers with advanced degrees—actually have little power to predict Standards of Learning (SOL) pass rates, says a Longwood University economist.

What does seem to matter, according to a study by Dr. David Lehr, are economic and demographic variables related to a school’s students, including ethnicity and family income. The study analyzed 2008-09 SOL pass rates in reading and math for third-graders in nearly every Virginia elementary school at that time.

Lehr said the percentage of students eligible for free or reduced lunches and the percentage of African-American and Hispanic students in a given school are important predictors of SOL pass rates. This is the case even after controlling for other factors such as class sizes, levels of teacher education and teacher salaries, he said. He added that these and other variables often thought to affect test scores, including a school’s attendance rate, the number of days taught in the academic year and the percentage of teachers with doctorates, appear to have little predictive ability.

"This study should convince policymakers that there is no magic bullet for SOL pass rates," said Lehr, associate professor of economics. "If you think that raising salaries or lowering class size or hiring teachers who are more educated will be the key to substantially higher pass rates, this paper should give you pause."

The study, by Lehr and two co-authors, was an effort to develop an accurate forecasting model that policymakers could use to predict SOL pass rates. To Lehr’s knowledge, it was the first time that machine learning techniques, now often applied to "big data," have been used with data from K-12 education.

"We wanted to demonstrate to policy makers the utility of a data-driven approach to predicting SOL scores given different educational environments. If you redistrict or alter class sizes, for instance, our work can help policy makers understand the likely impact on school performance," said Lehr. "We weren’t interested in cause and effect; the methodology we use is not designed for that purpose. Our goal was to come up with a way to slice the data so that it would provide the best possible prediction of pass rates."

The study examined the importance of 22 school variables in predicting pass rates. For both math and reading, the most important predictors were the percentage of students eligible for free or reduced lunches and the percentage of African-American students (the higher those percentages, the lower the school’s pass rate, holding constant all other variables). The percentage of Hispanic students also was an important variable for predicting pass rates in both subject areas.

Lehr thinks the approach used to analyze the data could also be useful at the student level in high schools, where identifying at-risk students could help improve graduation rates. "The ability of these techniques to accurately identify at-risk students could allow educators to intervene with them proactively," he said.

"Big Data" techniques—used less commonly in education than in the business world—seek to "extract as much information as possible from data to maximize their predictive power," said Lehr. Complex algorithms are used to analyze large amounts of data.

The study analyzed data from 1,101 schools, or 99 percent of Virginia’s elementary schools at the time. The SOL scores from 2008-09 were the most recent data available when the study began.

Lehr will present the findings in a paper titled "Using Stochastic Gradient Boosting to Understand and Forecast Standards of Learning (SOL) Pass Rates of Elementary Schools in Virginia" at a meeting of the Southern Economic Association to be held Nov. 23-25 in Tampa. His co-authors are Dr. Melanie Marks, professor of economics at Longwood, who initiated the study and collected the data, and his son, David Lehr, an undergraduate at the University of Pennsylvania who provided expertise in statistical learning models.
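
For readers curious about the technique named in the paper’s title, the following is a minimal sketch of stochastic gradient boosting, not the authors’ code. It assumes squared-error loss and one-split trees ("stumps"), and it runs on synthetic data invented for illustration; the three made-up features, the function names and all parameter values are assumptions, not details from the study. Each round fits a stump to the current prediction errors using a random half of the schools, which is what makes the boosting "stochastic," and the error reduction credited to each feature gives a rough measure of its predictive importance.

```python
import random

def fit_stump(X, residuals, rows):
    """Best single-feature threshold split (largest squared-error reduction) on `rows`."""
    n = len(rows)
    total = sum(residuals[i] for i in rows)
    base = total * total / n
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for f in range(len(X[0])):
        order = sorted(rows, key=lambda i: X[i][f])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][f], X[order[k]][f]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - base
            if best is None or gain > best[0]:
                best = (gain, f, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_boosting(X, y, n_rounds=200, learning_rate=0.1, subsample=0.5, seed=0):
    """Boost stumps on residuals; each round sees only a random subsample of rows."""
    rng = random.Random(seed)
    n = len(y)
    init = sum(y) / n
    pred = [init] * n
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [y[i] - pred[i] for i in range(n)]
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        best = fit_stump(X, residuals, rows)
        if best is None:
            continue
        gain, f, thr, left, right = best
        stumps.append((f, thr, learning_rate * left, learning_rate * right))
        importance[f] += gain  # credit the error reduction to the split feature
        for i in range(n):
            pred[i] += learning_rate * (left if X[i][f] <= thr else right)
    total = sum(importance) or 1.0
    return init, stumps, [g / total for g in importance]

def predict(model, row):
    """Predicted pass rate for one school: baseline plus every stump's contribution."""
    init, stumps, _ = model
    return init + sum(l if row[f] <= thr else r for f, thr, l, r in stumps)

# Synthetic "schools" (NOT the study's data): feature 0 drives the pass rate
# strongly, feature 1 weakly, and feature 2 is pure noise.
rng = random.Random(42)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(300)]
y = [90 - 30 * a - 10 * b + rng.gauss(0, 2) for a, b, _ in X]

model = fit_boosting(X, y)
importance = model[2]
print("relative importance:", [round(v, 3) for v in importance])
print("predicted pass rate:", round(predict(model, [0.2, 0.5, 0.5]), 1))
```

A forecasting model of this kind predicts well without saying anything about cause and effect, which matches Lehr’s description of the study’s goal. In practice, analysts would more likely use an established library implementation than hand-written code like this.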
