Wednesday, October 13, 2021

My take on this year's Nobel.

Vivek Dehejia

2021 Nobel in Economics

The anticipation ahead of the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (to give the prize its full title) is always palpable for academic economists, especially for those of us who’ve had the privilege of working with a future Laureate as doctoral students. To add to the sense of occasion, the announcement of the Economics prize, by coincidence, happens to fall on Thanksgiving Day in Canada, a major national holiday. This year’s prize was awarded to David Card (a one-half share) and to Joshua Angrist and Guido Imbens (a one-quarter share each), for their pathbreaking work in allowing us to make credible causal claims when analyzing data.

In a sense, this year’s prize is the flip side of the 2019 prize, awarded to Michael Kremer, Indian-born Abhijit Banerjee, and Esther Duflo. That trio won, in good measure, for their work in popularizing “randomized controlled trials” (RCTs), long a staple in the natural sciences, in the realm of economics research. RCTs allow us to make causal claims by randomly assigning subjects to a “control” and a “treatment” group — the randomness ensuring that any difference between the two groups can plausibly be attributed to the treatment, rather than any unobserved differences; the theory being that such differences should average out when subjects are randomized.
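To see why randomization does the work described above, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the sample size, the "true effect" of 2.0, and the single unobserved trait are invented numbers, not drawn from any study.

```python
import random
from statistics import fmean


def run_trial(n=20000, true_effect=2.0, seed=1):
    """Simulate an RCT in which an unobserved trait also drives the outcome.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    outcomes = {True: [], False: []}
    traits = {True: [], False: []}
    for _ in range(n):
        trait = rng.gauss(0, 1)       # unobserved difference between subjects
        treated = rng.random() < 0.5  # coin-flip assignment, independent of the trait
        outcome = trait + (true_effect if treated else 0.0) + rng.gauss(0, 1)
        outcomes[treated].append(outcome)
        traits[treated].append(trait)
    # Difference in average outcomes between treatment and control groups.
    estimated_effect = fmean(outcomes[True]) - fmean(outcomes[False])
    # Difference in the average unobserved trait between the two groups.
    trait_gap = fmean(traits[True]) - fmean(traits[False])
    return estimated_effect, trait_gap


if __name__ == "__main__":
    effect, gap = run_trial()
    print(f"estimated effect: {effect:.2f}, gap in unobserved trait: {gap:.2f}")
```

Because assignment is a coin flip, the gap in the unobserved trait should come out close to zero, and the simple difference in average outcomes should land close to the assumed effect.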

But RCTs have some major limitations, which your columnist has discussed in detail in these pages (“The experimental turn in economics”, 30 January 2016). A key problem is “external validity”: can a finding in one context be replicated in another, very different, one? Equally important, because running an RCT is not always feasible, or even ethical, such an approach simply cannot address some of the “big” questions in economics. These perforce require that we analyze raw, non-randomized data and find some way to tease out causality, if it is present.

Recall at the outset that observing a statistical correlation between two variables in the data is not, in itself, evidence of a causal relationship. Take an example close to home. During the pandemic, my classes switched online, with a pre-recorded two-hour lecture followed by a “live” one-hour Q&A session. Attendance at the latter was highly recommended, but not mandatory. Uniformly, I observed that students who attended, and participated actively, in the discussion session performed better on the course. But is this because my discussion session allowed them to perform better? Flattering as that would be for any professor, it is equally plausible that those who were anyway going to do well chose to participate, which is what we call “reverse causality”. Or perhaps there were unobserved differences between those who participated and those who did not (say, access to high quality internet and the time to read, study, and discuss, rather than struggling with poor internet, work, and school) which could plausibly explain the correlation instead? In other words, does non-random selection, rather than causality, matter in this case?
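The classroom example can be mimicked with a small simulation. This is only a sketch under invented assumptions: a hypothetical "ability" score drives both grades and the decision to attend, and attendance itself is given no effect on grades at all.

```python
import random
from statistics import fmean


def naive_attendance_gap(n=20000, seed=0):
    """Grade gap between attendees and non-attendees when attendance does nothing.

    All numbers are hypothetical; this is not data from any real class.
    """
    rng = random.Random(seed)
    attended, skipped = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)                   # unobserved by the professor
        attends = ability + rng.gauss(0, 1) > 0     # stronger students attend more often
        grade = 70 + 5 * ability + rng.gauss(0, 5)  # note: no attendance term here
        (attended if attends else skipped).append(grade)
    return fmean(attended) - fmean(skipped)


if __name__ == "__main__":
    print(f"naive gap in average grade: {naive_attendance_gap():.1f} points")
```

Attendees come out several points ahead even though, by construction, the session has no effect. The correlation is produced entirely by who chooses to attend.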

Economics is filled with such situations, where a correlation in the data entices us to draw a causal inference. Doing so may be treacherous in the absence of randomization, which, as noted, is impossible to achieve in most real world situations. The genius of David Card, working with the late Alan Krueger, was to find a clever solution: look at a real world situation presenting as close to a natural experiment as the real world gives us. In this case, that meant two contiguous US states which were otherwise similar, and which shared a common labour market and general macroeconomic conditions, but one of which increased the minimum wage whilst the other did not. (Interested readers may find a detailed exposition in fine write-ups on this year’s prize by economist Alex Tabarrok in the Marginal Revolution blog and Tim Harford in the Financial Times.) Computing the “differences in differences”, before and after the change and across the two jurisdictions, allowed them to infer that any differential impact on employment was very likely driven by the policy change, not any unobserved differences.
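The “differences in differences” arithmetic is simple enough to write down. The sketch below uses made-up employment figures purely for illustration; they are not the numbers from Card and Krueger's study, and the function name is my own.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    change_treated = treated_after - treated_before
    change_control = control_after - control_before
    return change_treated - change_control


if __name__ == "__main__":
    # Hypothetical average employment per establishment (invented numbers).
    estimate = diff_in_diff(
        treated_before=20.0, treated_after=20.5,  # state that raised the minimum wage
        control_before=22.0, control_after=21.0,  # neighbouring state that did not
    )
    print(estimate)  # 0.5 - (-1.0) = 1.5
```

The subtraction only identifies the policy's effect if the two states would have moved in parallel without the change, which is why the choice of otherwise similar, neighbouring states matters so much.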

In a similar vein, research by Angrist and Imbens, again with Krueger, in a series of papers, studied important questions such as whether increased schooling increases earnings, an obvious situation where any assumption of a unidirectional causal link may be problematic. (For instance, brighter students may study more and also earn higher incomes, both because of higher intrinsic ability.) In one seminal paper, Angrist and Krueger asked whether compulsory schooling could increase wages, and found a brilliant source of as-good-as-random variation: given the oddities of the US school system, students born in late December would be one class ahead of those born in early January, and laws in some states allowed students to drop out at 16. The upshot is that there would be at least some otherwise almost identical students who got a year more of schooling for a purely random reason, and these students did indeed earn higher wages, making a causal claim tenable. (Again, check Tabarrok and Harford for more details.)
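The logic of using birth timing as a source of random variation can be sketched with simulated data. What follows is an illustration under loud assumptions, not Angrist and Krueger's data or their actual estimation procedure: the 10% return to a year of schooling, the ability effect, and the simplification that every student born before the cutoff gets exactly one extra year are all invented.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Slope from a simple regression of y on x (the naive comparison)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=40000, true_return=0.10, seed=2):
    """Return (naive estimate, birth-timing estimate) of the return to schooling."""
    rng = random.Random(seed)
    before_cutoff, schooling, log_wage = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)         # unobserved; raises schooling AND wages
        early = rng.random() < 0.5        # hypothetical birth-timing indicator
        years = 12 + (1.0 if early else 0.0) + ability + rng.gauss(0, 0.5)
        wage = true_return * years + 0.3 * ability + rng.gauss(0, 0.2)
        before_cutoff.append(early)
        schooling.append(years)
        log_wage.append(wage)
    naive = ols_slope(schooling, log_wage)
    # Compare the two birth groups: wage gap divided by schooling gap.
    wage_gap = (fmean([w for w, e in zip(log_wage, before_cutoff) if e])
                - fmean([w for w, e in zip(log_wage, before_cutoff) if not e]))
    school_gap = (fmean([s for s, e in zip(schooling, before_cutoff) if e])
                  - fmean([s for s, e in zip(schooling, before_cutoff) if not e]))
    return naive, wage_gap / school_gap


if __name__ == "__main__":
    naive, by_birth_timing = simulate()
    print(f"naive: {naive:.2f}, using birth timing: {by_birth_timing:.2f}")
```

In this toy setup the naive regression overstates the return to schooling, because ability pushes up both schooling and wages. The comparison across birth groups lands near the assumed 10%, because birth timing is unrelated to ability.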

The beauty of these contributions is that they were not founded on a complex and technical mathematical or statistical result that would be undecipherable to the layman, but on a simple and profound intuition of how randomization may be found even in our messy and non-random world, thus making causal inference tenable. Three cheers!

Vivek Dehejia is associate professor of economics and philosophy, Carleton University, Ottawa, Canada.
