Hugh Macartney, Duke University
A key aspect of economics involves understanding how inputs combine to produce output. While this is typically thought of in a manufacturing context, it is also applicable to education settings.
Establishing the connection between inputs and output in education should matter to policy makers and society writ large, as this knowledge is crucial for allocating scarce resources. The cost-effectiveness of various education inputs also bears directly on how well incentive and competition reforms work. These considerations are especially relevant given the large share of government spending devoted to education and the implications of education for a country’s future prosperity and civic engagement more broadly.
A useful way to measure education output is through standardized test scores. While critiques of such metrics are not without merit, the common scale is important for comparing students to each other and to themselves over time. Moreover, persuasive evidence suggests that improved test score performance is reflected in higher earnings later in life.
What about education inputs? There are many to consider, including the accumulated knowledge of students and their peers, the ability and effort of teachers, and the size of classes. In each case, we need to establish how the input causally affects test scores, which is no easy task. To understand why, let us focus specifically on class size, which has been the subject of a great deal of work by economics of education researchers.
Class Size Reduction and Unobserved Inputs
Intuitively, we might expect a smaller class to lead to higher test scores, all else being equal. However, even if that is the case, simply analyzing the correlation between test scores and class size is likely to be misleading, because unobserved determinants of test scores may also be correlated with class size. That is, among the many inputs that affect student learning, only some are observed by the researcher, and any relevant input that is unobserved can cause the correlation and the causal effect to diverge.
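To see why in a stylized way, consider the standard omitted variable bias logic. The notation below is illustrative and not from the article; it simply formalizes the point that an unobserved input correlated with class size drives a wedge between correlation and causation.

```latex
% A stylized omitted-variable illustration (notation is illustrative, not from the article).
% Suppose test scores are generated by
\[
  \text{score}_i \;=\; \alpha + \beta \, \text{size}_i + \gamma \, u_i + \varepsilon_i ,
\]
% where u_i is an unobserved input such as classroom disruption. Regressing scores on
% class size alone recovers, in large samples,
\[
  \beta \;+\; \gamma \, \frac{\operatorname{Cov}(\text{size}_i, u_i)}{\operatorname{Var}(\text{size}_i)} ,
\]
% which equals the causal effect \beta only if the unobserved input is uncorrelated with class size.
```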
A nice example of this is provided by Lazear (2001), which looks at the relationship between class size and unobserved student disruption in the classroom. Given that students have the potential to disrupt learning by acting out, teachers must allocate their time between actual teaching and providing discipline. As students become more disruptive, teachers devote more time to the latter, which results in less learning for all students in the classroom. Lazear uses an economic model to argue that administrators should create smaller classes if students are more likely to be disruptive. So, while smaller classes might result in higher test scores, they may also indicate elevated student disruption (which is associated with lower scores), making any analysis of correlation inconclusive with respect to causation.
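Lazear's mechanism can be sketched in one line. The following is a simplified rendering of the model, not its full treatment, with notation chosen for illustration.

```latex
% A stripped-down sketch of the mechanism in Lazear (2001); notation simplified for illustration.
% Suppose each of the n students in a class behaves at any given moment with probability p,
% independently, and learning occurs only when nobody is disrupting. Then each student learns
% with probability p^n, so expected learning in the class is proportional to
\[
  n \, p^{\,n} .
\]
% Because this falls off more quickly in n when p is low, schools facing more disruptive
% students (lower p) optimally choose smaller classes, linking small classes to lower-scoring
% classrooms even if smaller classes raise scores all else equal.
```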
Identifying the Causal Effect: Experiments and Quasi-Experiments
To make headway in identifying the causal effect of class size reduction, researchers have employed more sophisticated empirical techniques, made possible as student-level administrative data have become broadly available. One possibility is to carry out an experiment, which uncovers the causal effect by design: students are randomly assigned to smaller or larger classes. However, despite clear advantages, there are at least two downsides to such an approach. First, experiments are often implemented at small scale due to their cost, which makes it difficult to know whether the findings apply to the broader population. Second, teachers might alter their behavior if they know they are part of an experiment.
Sufficiently clever quasi-experiments, which exploit natural variation in class size, can overcome these disadvantages. An especially convincing approach is the regression discontinuity design, which uses a threshold or cutoff to make a sharp distinction in how otherwise similar subjects are treated. In the case of class size, it exploits the fact that maximum class size rules cause class size to change discontinuously even as school enrollment varies continuously.
For example, if maximum class sizes are set at 20 students, a school with between 21 and 40 first graders would have two first grade classes, while a school with between 41 and 60 would have three. As enrollment goes from 40 to 41, individual class size changes from 20 to 13 or 14. By calculating the difference in test scores (if any) between schools with enrollment just below 40 and just above it, researchers are able to obtain the causal effect of class size on test scores, under the assumption that unobserved determinants of test scores are similar for each type of school.
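A minimal simulation can make these mechanics concrete. The cutoff, the simulated score process, and all variable names below are illustrative assumptions, not estimates from actual data; real studies use local regressions with controls rather than simple means.

```python
import numpy as np

MAX_CLASS_SIZE = 20  # illustrative maximum class size rule

def num_classes(enrollment: int) -> int:
    """Number of classes required under the maximum class size rule."""
    return -(-enrollment // MAX_CLASS_SIZE)  # ceiling division

def avg_class_size(enrollment: int) -> float:
    """Average class size implied by the rule."""
    return enrollment / num_classes(enrollment)

# The rule creates a sharp drop in class size as enrollment crosses 40:
for e in (39, 40, 41, 42):
    print(e, num_classes(e), round(avg_class_size(e), 1))
# 39 -> 2 classes of ~19.5; 40 -> 2 classes of 20; 41 -> 3 classes of ~13.7

# A bare-bones RD-style contrast: mean test scores for schools just below vs. just above
# the cutoff, using simulated data with an assumed score process.
rng = np.random.default_rng(0)
enrollment = rng.integers(30, 51, size=2000)                    # hypothetical enrollments
class_size = np.array([avg_class_size(e) for e in enrollment])
scores = 60 - 0.4 * class_size + rng.normal(0, 5, size=2000)    # assumed score process

just_below = scores[(enrollment >= 36) & (enrollment <= 40)].mean()
just_above = scores[(enrollment >= 41) & (enrollment <= 45)].mean()
print("RD-style contrast (above minus below):", round(just_above - just_below, 2))
```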
Crucially, the assumption that schools on either side of the cutoff are otherwise similar does not need to be taken as an article of faith. It can be tested in a variety of ways. First, we can check to ensure that schools are equally likely to be observed on either side of the cutoff. Second, we can verify that the observed characteristics of schools are alike on either side of the cutoff, so that otherwise similar schools are being compared. Third, we can examine whether discontinuous jumps in test scores occur for enrollments other than the cutoff. This should be a very rare occurrence if the research design is sound.
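These checks can likewise be sketched on toy data. The dataset, the school characteristic, and the window widths below are made up purely for illustration of the three checks just described.

```python
import numpy as np

rng = np.random.default_rng(1)
CUTOFF = 40

# Hypothetical school-level data around the cutoff (all names and values illustrative).
enrollment = rng.integers(30, 51, size=2000)
pct_low_income = rng.normal(50, 10, size=2000)                     # an observed characteristic
scores = rng.normal(55, 5, size=2000) + 3 * (enrollment > CUTOFF)  # assumed jump at 40 only

below = enrollment <= CUTOFF
above = enrollment > CUTOFF

# 1. Density check: counts of schools just below vs. just above the cutoff should be similar
#    (no bunching from schools manipulating enrollment).
print("n just below:", np.sum((enrollment >= 36) & below),
      "n just above:", np.sum((enrollment <= 45) & above))

# 2. Balance check: observed characteristics should look alike on either side of the cutoff.
print("pct low income below/above:",
      round(pct_low_income[below].mean(), 1), round(pct_low_income[above].mean(), 1))

# 3. Placebo check: score jumps at non-cutoff enrollments (e.g., 35 or 45) should be near zero.
for placebo in (35, 45):
    lo = scores[(enrollment >= placebo - 4) & (enrollment <= placebo)].mean()
    hi = scores[(enrollment >= placebo + 1) & (enrollment <= placebo + 4)].mean()
    print(f"placebo jump at {placebo}:", round(hi - lo, 2))
```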
Beyond identifying the direct effect of smaller classes, researchers have also considered indirect effects. In equilibrium, reducing class size requires hiring more teachers, which can lower average teacher quality, particularly in large education systems. On the other hand, if public school quality rises as a result of smaller classes, some private school students may be drawn back into the public system at the margin, which could further increase public school quality.
In using economic theory and applied techniques like regression discontinuity design to evaluate education policies, society can begin to understand their effects—and in understanding them, we can make more informed choices about future education reforms.