ESSAYS ON INCOME INEQUALITY AND THE ENVIRONMENT

by

JOHN VOORHEIS

A DISSERTATION

Presented to the Department of Economics and the Graduate School of the University of Oregon in partial fulfillment of the requirements for the degree of Doctor of Philosophy

June 2016

DISSERTATION APPROVAL PAGE

Student: John Voorheis
Title: Essays on Income Inequality and the Environment

This dissertation has been accepted and approved in partial fulfillment of the requirements for the Doctor of Philosophy degree in the Department of Economics by:

Trudy Ann Cameron, Chair
Peter Lambert, Core Member
Caroline Weber, Core Member
Ronald Mitchell, Institutional Representative

and

Scott L. Pratt, Dean of the Graduate School

Original approval signatures are on file with the University of Oregon Graduate School.

Degree awarded June 2016

© 2016 John Voorheis

DISSERTATION ABSTRACT

John Voorheis
Doctor of Philosophy
Department of Economics
June 2016
Title: Essays on Income Inequality and the Environment

This dissertation considers two of the most pressing concerns of the current time, income inequality and exposure to pollution, and provides evidence that these two concerns may in fact be causally linked. In order to do this, I assemble novel datasets on income inequality and pollution exposure, and propose a strategy for causally identifying the effect of the former on the latter. In the first substantive chapter, I develop a new dataset on income inequality measured at the US state and metropolitan area level. I compare the trends in income inequality measured using different income definitions. In general, pre-tax, pre-transfer income inequality has increased in most states since 1980, but post-fiscal income inequality has seen slow or no growth since about 2000. I conduct inference on how income inequality has changed using a semi-parametric bootstrap method, and consider potential correlates with state-level income inequality.
I find that de-unionization is perhaps the most important factor driving rising inequality. In the second substantive chapter, I leverage satellite-derived remote sensing data on ground-level concentrations for two important pollutants (NOx and PM2.5) to measure the distribution of pollution exposure. I propose a dashboard approach to measuring environmental inequality and environmental justice, and apply several candidate measures to the satellite datasets. I find that environmental inequality has largely decreased since 1998, as has average exposure. I consider potential correlations between neighborhood demographics and the distribution of exposure, but find inconclusive results. In the third substantive chapter, I attempt to resolve this ambiguity by considering whether rising income inequality within metropolitan areas (the subject of the first chapter) might causally affect the distribution of exposure across people (the subject of the second). Using a simulated instrumental variables identification strategy designed to address potential endogeneity due to locational sorting, I find that income inequality decreases the average level of exposure, but increases environmental inequality. I argue this is consistent with the benefits of pollution reduction accruing to the most advantaged, and provide evidence that this may work through the political system: inequality increases the responsiveness of politicians to the environmental demands of the rich.
CURRICULUM VITAE

NAME OF AUTHOR: John Voorheis

GRADUATE AND UNDERGRADUATE SCHOOLS ATTENDED:
University of Oregon, Eugene, OR
Eastern Michigan University, Ypsilanti, MI

DEGREES AWARDED:
Doctor of Philosophy, Economics, 2016, University of Oregon
Master of Science, Economics, 2012, University of Oregon
Master of Arts, Economics, 2011, Eastern Michigan University
Bachelor of Science, Economics, 2009, Eastern Michigan University

AREAS OF SPECIAL INTEREST:
Income Inequality
Environmental Economics
Political Economy
Public Finance

GRANTS, AWARDS AND HONORS:
Young Scholar Grant, Washington Center For Equitable Growth, 2015
Edward G. Daniel Scholarship, University of Oregon, 2015
PhD Research Paper Award, University of Oregon, 2014
Kleinsorge Summer Research Fellowship, University of Oregon, 2013
Everett D. Monte Scholarship, University of Oregon, 2013
Dale Underwood Award, University of Oregon, 2012
Graduate Teaching Fellowship, University of Oregon, 2011-2016

ACKNOWLEDGEMENTS

I am grateful for the advice and guidance I’ve received from Trudy Ann Cameron, Peter Lambert and Caroline Weber. Many helpful comments from Joe Wyer made these papers much better. I have also benefitted from much useful feedback from participants at the University of Oregon Micro Group, the 2015 Winter School on Inequality and Social Welfare Theory, the 2015 AERE Annual Meeting, the 2015 Southern Economic Association Annual Meeting and the 2016 Society for Benefit Cost Analysis Annual Meeting. Any remaining errors are my own.
This research was made possible by a Major Research Instrumentation grant from the National Science Foundation, Office of Cyber Infrastructure, “MRI-R2: Acquisition of an Applied Computational Instrument for Scientific Synthesis (ACISS),” Grant #: OCI-0960354, and was directly supported by a grant from the Washington Center for Equitable Growth and by generous assistance from the Ray Mikesell Foundation at the University of Oregon.

To my father, James Voorheis.

TABLE OF CONTENTS

I. INTRODUCTION
II. STATE AND METROPOLITAN AREA INCOME INEQUALITY IN THE UNITED STATES: TRENDS AND DETERMINANTS
    Introduction
    Related Literature
    Data
    From Market Income to Haig-Simons Income
    Inference on Changes in Income Inequality
    Potential Explanations for Changing State-level Income Inequality
    Conclusion
III. TRENDS IN ENVIRONMENTAL INEQUALITY IN THE UNITED STATES: EVIDENCE FROM SATELLITE DATA
    Introduction
    Previous Literature
    Data and Institutional Details
    Quantifying Environmental Inequality and Environmental Justice
    Trends in Environmental Inequality and Environmental Justice
    Explaining the Distribution of Pollution Exposure
    Conclusion
IV. ENVIRONMENTAL JUSTICE VIEWED FROM OUTER SPACE: HOW DOES GROWING INCOME INEQUALITY AFFECT THE DISTRIBUTION OF POLLUTION EXPOSURE?
    Introduction
    Previous Literature
    Data and Empirical Strategy
    Results
    Mechanisms
    Conclusion
V. CONCLUSION

APPENDICES
A. MEASURING STATE INCOME INEQUALITY BY COMBINING CPS AND IRS DATA
B. STATE AND MSA INCOME INEQUALITY IN THE AMERICAN COMMUNITY SURVEY
C. DECOMPOSITION ANALYSIS OF CHANGES IN ENVIRONMENTAL INEQUALITY
D. ADDITIONAL TABLES AND FIGURES

REFERENCES CITED

LIST OF FIGURES

1. State Income Inequality, 1977–2014: Crosswalking from Market Income to Post-fiscal Income
2. State Income Inequality, 1977–2014: Crosswalking from Market Income to Post-fiscal Income, 4 largest States
3. State-level Redistributiveness, 1977-2014
4. Difference in Gini Coefficient, 1986–1993 and 1994–2014
5. Difference in top 1% Share, 1986–1993 and 1994–2014
6. Change in Lorenz Ordinates, 1994–2014, Pre-transfer and Post-transfer Income
7. Change in Lorenz Ordinates, 1994–2014, Post-tax and Post-fiscal Income
8. Determinants of Lorenz Ordinates, Selected Covariates
9. Annual Average PM2.5 and NOx Exposure, 2005
10. National Average NOx Exposure (in ppb), 2005-2011
11. Kolm-Pollak Index, PM2.5 and NOx
12. Atkinson Index, PM2.5 and NOx
13. Relative Lorenz Curves, NOx (2005-2011) and PM2.5 (1998-2014)
14. Absolute Lorenz Curves, NOx (2005-2011) and PM2.5 (1998-2014)
15. Generalized Lorenz Curves, NOx, 2005-2011
16. National Black-White Exposure Gap, PM2.5, 1998-2014 and NOx, 2005-2011
17. National Black-White Exposure Ratio, PM2.5, 1998-2014 and NOx, 2005-2011
18. National Black-White Exposure Gap, by Percentile (PM2.5 and NOx)
19. Average Annual NOx Exposure by Census Tract, 2005–2011
20. Initial MSA Income is Unrelated to Subsequent Changes in NOx Exposure
21. Actual Gini Coefficient as a function of Simulated Gini Instrument, for 265 MSAs, 2005–2011
22. Reduced Form Visualizations Showing the Effect of Simulated Income Inequality on Pollution Exposure
23. Illustration of how the estimated effect of Income Inequality on Absolute Environmental Inequality changes as the assumed value of κ, capturing Absolute Environmental Inequality aversion, increases
24. Effect of Income Inequality on Relative Environmental Inequality, varying Relative Environmental Inequality Aversion
25. Effect of an increase in income inequality on NOx exposure
26. Correlation Between League of Conservation Voter Scores and Ideology, US senators
27. Histogram of Adjusted LCV Scores by Party for US Senators, 1977-2014
28. State Gini Coefficients, 1997-2012: IRS Simulation vs. CPS Baseline
29. State Top 1% Shares, 1997-2012: IRS Simulation vs. CPS Baseline
30. State Gini Coefficient, 1997-2012: IRS Simulation vs. Frank (2009)
31. State Top 1% Shares, 1997-2012: IRS Simulation vs. Frank (2009)
32. State Gini Coefficient, 1997-2012: IRS Simulation vs. GB2 Simulation
33. State Top 1% Shares, 1997-2012: IRS Simulation vs. GB2 Simulation
34. Lorenz Curve Results, ACS (2005-2011)
35. State-level Gini Coefficient, Pre-transfer Income
36. State-level Gini Coefficient, Post-transfer Income
37. State-level Gini Coefficient, Post-tax Income
38. State-level Gini Coefficient, Post-fiscal Income
39. National Black-White PM2.5 Exposure Ratio (by Percentile), 1998-2014
40. National Black-White NOx Exposure Ratio (by Percentile), 2005-2011

LIST OF TABLES

1. Income Sources in the CPS by Year
2. Crosswalking from Market to Disposable Income
3. Lorenz Dominance Results
4. Determinants of State Income Inequality (Pre-transfer and Post-transfer)
5. Determinants of State Income Inequality (Post-tax and Post-fiscal)
6. Determinants of State Top 1% Share (Pre-transfer and Post-transfer)
7. Determinants of State Top 1% Share (Post-tax and Post-fiscal)
8. Determinants of State Top 1% Share, Frank (2009) Data
9. Quantile RIF Regression Results (NOx Exposure)
10. Quantile RIF Regression Results (PM2.5 Exposure)
11. Relative Lorenz RIF Regression Results (NOx Exposure)
12. Relative Lorenz RIF Regression Results (PM2.5 Exposure)
13. Generalized Lorenz RIF Regression Results (NOx Exposure)
14. Generalized Lorenz RIF Regression Results (PM2.5 Exposure)
15. Absolute Lorenz RIF Regression Results (NOx Exposure)
16. Absolute Lorenz RIF Regression Results (PM2.5 Exposure)
17. First Stage, key coefficient only
18. Effect of Income Inequality on Average NOx Exposure
19. Effect of Income Inequality on Black-White Exposure Gap
20. Effect of Income Inequality on Latino-White Exposure Gap
21. Effect of Income Inequality on Poor-Rich Exposure Gap
22. Effect of Income Inequality on Absolute Environmental Inequality
23. Effect of Income Inequality on Relative Environmental Inequality
24. Effect of Income Inequality on Average Black Exposure
25. Effect of Income Inequality on Average White Exposure
26. Effect of State Inequality on Senators’ LCV Scores
27. Effect of State Inequality on Senators’ LCV Scores, By Party
28. Alabama Total AGI and Number of Returns, by Size of AGI, 2012
29. Intermediate Calculations in the Pareto Interpolation Process
30. Selecting the Pareto Parameter
31. NOx Quantile Detailed Decomposition, 2005 vs. 2011
32. PM2.5 Quantile Detailed Decomposition, 2005 vs. 2011
33. NOx Relative Lorenz Curve Detailed Decomposition, 2005 vs. 2011
34. PM2.5 Relative Lorenz Curve Detailed Decomposition, 2005 vs. 2011
35. NOx Quantile Detailed Decomposition, Diverse vs. Non-diverse tracts
36. PM2.5 Quantile Detailed Decomposition, Diverse vs. Non-diverse tracts
37. NOx Generalized Lorenz Curve Detailed Decomposition, 2005 vs. 2011
38. NOx Absolute Lorenz Curve Detailed Decomposition, 2005 vs. 2011
39. PM2.5 Generalized Lorenz Curve Detailed Decomposition, 2005 vs. 2011
40. PM2.5 Absolute Lorenz Curve Detailed Decomposition, 2005 vs. 2011
41. Determinants of State Income Inequality (Pre-transfer and Post-transfer Gini), All Covariates
42. Determinants of State Income Inequality (Post-tax and Post-fiscal Gini), All Covariates
43. Effect of Income Inequality on Average Latino Exposure
44. Effect of Income Inequality on Average Poor Exposure
45. Effect of Income Inequality on Average Rich Exposure
CHAPTER I

INTRODUCTION

The dual challenges of rising income inequality and environmental degradation are among the most pressing considerations facing policymakers in the contemporary United States. In broad terms, my research examines the relationship between the distribution of income and the distribution of exposure to air pollution. The three substantive chapters provide new insights concerning income inequality and environmental inequality, and examine the relationship between income inequality and the distribution of pollution exposure. Chapter II addresses the measurement of spatially disaggregated income inequality (within states and metropolitan areas). It offers a novel dataset on income inequality, and new insights into the causes of rising income inequality. Chapter III considers the broad question of how to measure environmental inequality in a normatively sensible way, and applies these measurement tools to satellite remote sensing data on pollution exposure. Chapter IV leverages the measurement work in the first two substantive chapters to explore the causal effect of income inequality on environmental quality, and explores potential political economy avenues by which these effects may occur.

Expanding on this brief summary, Chapter II considers the quantification of income inequality measured at the US state and metropolitan area level, using income microdata from the Census Bureau. Measuring inequality using Census Bureau survey data can be challenging, chiefly because incomes are top-censored. To address this challenge, I adapt a semi-parametric multiple imputation method proposed by Jenkins et al. (2009) to the new task of measuring sub-national income inequality. Further, I extend this method to perform inference on changes in income inequality via semi-parametric bootstrapping.
Using this new semi-parametric bootstrap method, I find that income inequality in most states and metropolitan areas has increased since the mid-1990s, a result which stands in contrast to some literature on inequality measured with survey data. I then consider potential determinants of state income inequality. Consistent with a long line of literature starting with DiNardo, Fortin and Lemieux (1996), I find that the decline in union density is among the most important explanations (if not the most important) for the rise in inequality since the 1970s. In addition to these results, this paper produces a dataset of state and MSA income inequality over time. This dataset is additionally used in Chapter IV, as well as in several projects by myself and other researchers that are currently in progress.

Chapter III considers how to measure environmental inequality in an analogous and normatively sensible fashion, and how environmental inequality has evolved within the United States. I use two sources of data on the distribution of exposure to air pollution: satellite observations of NOx from NASA’s Aura satellite, and estimates of ground-level PM2.5 derived from several extant satellites. I find that average exposure to these particular pollutants has decreased nationally. Additionally, I find that the distribution of pollution exposure has become more equal since the late 1990s (using the PM2.5 data) and since 2005 (using the NOx data). Using re-centered influence function regressions, which identify how variations in individual census tract demographics affect national inequality, I consider how census tract demographic characteristics are related to environmental inequality. There is evidence that many of the dimensions of disadvantage highlighted in the environmental justice literature (chiefly dimensions of race and class) are correlated with environmental inequality.
Chapter IV builds on the results from the first two substantive chapters by exploring the causal effect of income inequality on the distribution of pollution exposure. I leverage the dataset developed in Chapter II on MSA-level income inequality as well as the dataset developed in Chapter III on the distribution of NOx exposure, and use a simulated instrumental variables strategy to achieve causal identification. I find that metropolitan area income inequality decreases average pollution exposure. There is also evidence that income inequality increases pollution exposure inequality as well as the gap in exposure between advantaged and disadvantaged groups. Together, these results imply that the most advantaged members of society are disproportionately reaping the benefits of pollution exposure reduction. I propose a political economy explanation for these results, showing in particular that income inequality appears to increase the environmentalism of US politicians, a result consistent with increasing responsiveness to the environmental demands of the rich.

The final chapter of this dissertation concludes with a summary of this research agenda as it currently stands, highlighting the contributions to existing literatures on income inequality, environmental justice and political economy. This final chapter details potential future extensions of the line of research outlined in this dissertation. It also summarizes several research projects currently underway that run parallel to the work in this dissertation.

CHAPTER II

STATE AND METROPOLITAN AREA INCOME INEQUALITY IN THE UNITED STATES: TRENDS AND DETERMINANTS

Introduction

Rising income inequality has become one of the most pressing concerns of the modern era. An extensive literature has attempted to explain why the distribution of income in the United States has become more unequal.
This literature has almost exclusively relied on national-level data, however, both as a matter of convenience (official national income distribution statistics are often made available by national statistical agencies) and as a matter of analytical preference (for many studies, countries are a natural unit of observation). There has been less work done on the longer-run trends in, and determinants of, income inequality measured at sub-national geographic scales.

There are several reasons to study income inequality at smaller geographic resolution. Inequality can generally be thought of as being composed of between-subgroup and within-subgroup inequality. In the United States, inequality between states has actually been decreasing, implying that within-state income inequality has been the driver of overall income inequality. This in turn suggests that in order to understand national trends in income inequality it is necessary to first understand what has been happening at the sub-national level. Additionally, a small but growing literature has begun to examine whether rising income inequality affects other outcomes of interest. These effects may be the results of individuals’ responses to rising inequality. Individuals are unlikely to be able to accurately perceive the national level of income inequality, but are more likely to be able to accurately perceive inequality in their own metropolitan area or state.¹

This paper is innovative along two dimensions: first by offering a consistent time series of inequality measures at the state and metropolitan area levels, using Metropolitan Statistical Area (MSA) definitions, over a relatively long time period (as early as 1968 to the present), and second by applying recently developed tools for dealing with top-coded Census Bureau data to sub-national geographic scales. Specifically, I apply the Generalized Beta II multiple imputation methodology proposed in Jenkins et al.
(2011), previously used only with national-level data, to the state and MSA level. I believe this dataset to be more complete and better constructed than previous efforts.² I leverage the large set of information about income sources and household composition available in Current Population Survey data to construct income inequality measures using five different definitions of income. These range from an Adjusted Gross Income concept that coincides with IRS tax return data to a broad post-fiscal household income definition that considers the effect of taxes, transfers and in-kind social spending, and that most closely aligns with the disposable income available for consumption. In addition to allowing the analysis in this paper, these data have already proven to be useful in studies where researchers may require measures of inequality at the state or MSA level.

¹ Gimpelson and Treisman (2015) provide some suggestive evidence for this disparity.
² One notable exception being the Frank (2009) state-level dataset, which utilizes IRS Statistics of Income data and Pareto interpolation from 1913 to 2005.

Methodologically, I utilize a number of strategies to analyze how income inequality has changed. I introduce a semi-parametric bootstrap technique for inference on inequality measures, which complements the Generalized Beta II multiple imputation process used to generate point estimates. I use this method throughout the analysis, first in pairwise comparison of inequality measures, and then in a semi-parametric extension of the Barrett et al. (2014) bootstrap test for Lorenz Dominance. I find not only that there is robust evidence of an increase in inequality, but also that this increase is not purely the result of increasing incomes at the top. Although the increase in income inequality is consistent with an increase in the share of the top 1%, there are also substantial changes within the bottom 99% of the distribution.
Specifically, the top quartile of the distribution finds itself relatively better off over the past decade, while the bottom 50-75% is unambiguously relatively worse off. I examine several possible explanations for this increase in inequality in subsequent regression analysis, and find evidence for associations between income inequality and both de-unionization and changes in top marginal tax rates, but less evidence for associations between inequality and changes in technology or human capital accumulation.

The paper proceeds as follows. Section 2 summarizes the related literature on local area income inequality as well as the recent literature on strategies for dealing with top-censored income microdata for the study of income distribution dynamics. Section 3 describes my strategy for constructing the state and MSA income inequality panel datasets, and discusses the dynamics of the resulting income inequality measures. Sections 4 and 5 perform inference concerning the changes in income inequality over the last decade, and use state-level data to analyze the determinants of these changes. Section 6 concludes with directions for future research, including suggested applications of the inequality datasets.

Related Literature

There is a substantial literature documenting a marked increase in income inequality in recent decades. It is almost too large to summarize here, though thorough earlier reviews exist (see Acemoglu (2002) or Atkinson et al. (2011)). The literature on income inequality in the United States can be categorized based on the underlying data source, the income-receiving unit, and the definition of income. Studies based on IRS tax return data (e.g. Piketty and Saez (2003)) use the tax unit as the unit of observation, and the income concept is market income (pre-tax, pre-transfer).
Studies based on Census Bureau survey data (from the Current Population Survey, the decennial Census and the American Community Survey) have tended to use the household as the unit of observation, and pre-tax, post-transfer incomes as the income concept.³ I will use Census Bureau survey data, so I highlight the most relevant literature using these data, as well as the relatively small literature dealing with income inequality within states or metropolitan areas.

³ Recently there has been some interest in more expansive income definitions (see, e.g. Armour et al. (2014)). I use the standard pre-tax post-transfer income measure in this study, but using more expansive income definitions might be a useful direction for further work.

Census Bureau Microdata

There are two main features of Census Bureau data which must be addressed when using these data to analyze income inequality: (1) these data do not measure capital gains (either realized or unrealized) as part of income, and (2) incomes above a certain threshold, which is different for each income source, are “topcoded” for privacy reasons. The topcoding procedures have changed significantly over the history of the CPS, but the basic process has been to replace the true reported income with a top-code value which anonymizes the top income earners in the sample. Both topcoding and the exclusion of capital gains should, other things equal, lead to lower estimates of income inequality than a “true” inequality measure where there is no topcoding and the inequality measure includes capital gains.

The exclusion of capital gains can be justified on definition-of-income grounds, as in Armour et al. (2014). Unrealized capital gains do not necessarily represent changes in wealth available for consumption, and hence a reasonable definition of income might exclude these gains. Topcoding presents a more pressing issue, however.
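The downward bias from topcoding can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not part of the analysis in this chapter: the incomes are hypothetical lognormal draws rather than CPS records, the 97th-percentile threshold is arbitrary, a single threshold is applied to total income (the CPS topcodes each income source separately), and the function names `gini`, `top_share` and `apply_topcode` are introduced here purely for illustration.

```python
import random


def gini(incomes):
    """Gini coefficient via the rank-based formula on sorted incomes."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def top_share(incomes, fraction=0.01):
    """Share of total income received by the top `fraction` of units."""
    xs = sorted(incomes)
    k = max(1, int(round(fraction * len(xs))))
    return sum(xs[-k:]) / sum(xs)


def apply_topcode(incomes, threshold):
    """Replace every income above `threshold` with the threshold itself."""
    return [min(x, threshold) for x in incomes]


rng = random.Random(0)
# Hypothetical "true" incomes (assumption: lognormal, parameters arbitrary).
true_incomes = [rng.lognormvariate(10.5, 0.9) for _ in range(20000)]

# Assumed topcode threshold: the 97th percentile of the simulated incomes.
threshold = sorted(true_incomes)[int(0.97 * len(true_incomes))]
censored = apply_topcode(true_incomes, threshold)

gini_true, gini_censored = gini(true_incomes), gini(censored)
share_true, share_censored = top_share(true_incomes), top_share(censored)

print(f"Gini:         true {gini_true:.3f}, topcoded {gini_censored:.3f}")
print(f"Top 1% share: true {share_true:.3f}, topcoded {share_censored:.3f}")
```

Because censoring from above pulls the largest incomes down toward the threshold while leaving the rest of the distribution unchanged, both the Gini coefficient and the top 1% share computed from the topcoded series are lower than their counterparts in the uncensored series.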
One solution, which is still occasionally used, is to simply eschew the use of income inequality measures that are sensitive to the upper tail of the income distribution (e.g. the Gini coefficient or top 1% share) and to use instead measures such as the 90-10 ratio.⁴ However, as Burkhauser et al. (2009) note, there is still substantial topcoding in the public-use microdata, even at the 90th percentile of the income distribution, so using the 90-10 ratio does not necessarily eliminate the top-coding problem.

Two approaches have been proposed for correcting for top-coded income data. The first involves collecting suitable cell means data for top incomes from the confidential CPS data, and imputing these cell means in place of topcoded income amounts. The second involves a multiple imputation approach wherein top-coded incomes are imputed as draws from a suitable distribution. The cell means approach was actually used by the Census Bureau in the public CPS data from 1996 to 2010,⁵ and Larrimore et al. (2008) extend this cell means series back to 1976. The Larrimore series is at least as good as the internal CPS data used by the Census Bureau for official analysis, but still probably understates inequality, since top incomes are still censored. The multiple imputation approach directly addresses the censoring by fitting a parametric model of the income distribution from the raw, topcoded data, and then simulates distributions semi-parametrically by drawing replacement values for topcoded incomes from the fitted model. Jenkins et al. (2011) offer a detailed description of the suggested process. This approach is the one I adopt in the analysis of local area income in this paper.

⁴ The 90-10 ratio is the ratio of income at the 90th percentile to income at the 10th percentile.
⁵ From 2011 onwards, the CPS has a new top-coding process wherein top incomes are randomly swapped within a cell, rather than replaced with the cell means.
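The logic of the multiple imputation approach can be sketched in a few lines. The sketch below is a deliberately simplified stand-in and not the procedure of Jenkins et al. (2011): it fits a one-parameter Pareto tail by censored-data maximum likelihood instead of a Generalized Beta II distribution, it uses hypothetical lognormal incomes, and the cutoff, the topcode threshold, the number of imputations and all function names are assumptions made for illustration.

```python
import math
import random


def gini(incomes):
    """Gini coefficient via the rank-based formula on sorted incomes."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def fit_pareto_tail(incomes, cutoff, topcode):
    """Censored-data MLE of the Pareto shape for incomes above `cutoff`.

    Incomes recorded at or above `topcode` are treated as right-censored.
    """
    uncensored = [x for x in incomes if cutoff < x < topcode]
    n_censored = sum(1 for x in incomes if x >= topcode)
    exposure = sum(math.log(x / cutoff) for x in uncensored)
    exposure += n_censored * math.log(topcode / cutoff)
    return len(uncensored) / exposure


def impute_once(incomes, topcode, alpha, rng):
    """Replace each topcoded income with a Pareto draw conditional on > topcode."""
    return [
        topcode * (1.0 - rng.random()) ** (-1.0 / alpha) if x >= topcode else x
        for x in incomes
    ]


rng = random.Random(42)
# Hypothetical incomes, then an assumed topcode at the 97th percentile.
true_incomes = [rng.lognormvariate(10.5, 0.9) for _ in range(20000)]
ranked = sorted(true_incomes)
topcode_value = ranked[int(0.97 * len(ranked))]
cutoff = ranked[int(0.90 * len(ranked))]  # assumed start of the Pareto tail
censored = [min(x, topcode_value) for x in true_incomes]

# Step 1: fit the tail model to the topcoded data.
alpha_hat = fit_pareto_tail(censored, cutoff, topcode_value)

# Step 2: create several completed datasets and compute the Gini on each.
imputed_ginis = [
    gini(impute_once(censored, topcode_value, alpha_hat, rng)) for _ in range(20)
]

# Step 3: combine; the multiple imputation point estimate is the mean.
mi_gini = sum(imputed_ginis) / len(imputed_ginis)

print(f"Pareto shape estimate: {alpha_hat:.2f}")
print(f"Gini, topcoded data:   {gini(censored):.3f}")
print(f"Gini, imputed (mean):  {mi_gini:.3f}")
print(f"Gini, true data:       {gini(true_incomes):.3f}")
```

The three steps (fit a model to the censored data, draw replacement values for the topcoded observations, and average the inequality measure across completed datasets) carry over to the Generalized Beta II case; only the fitted distribution and the way draws are generated from its upper tail change.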
More detail on my modifications of the multiple imputation approach can be found in section 2.3.

Local Area Income Inequality

The above discussion of the current debates and developments in the inequality literature has focused on national-level income inequality within the US. With notable exceptions, very little of the economics literature on inequality has discussed inequality at smaller geographic scales. Several early analyses of inequality at the state level do exist, for instance Bishop et al. (1991). Several other papers have examined inequality at smaller geographies as well. County-level analyses of trends in inequality have been conducted, including Moller et al. (2009) and Peters (2013). No thorough MSA-level analysis of income inequality has been done, at least to my knowledge.

Several papers have used state-level or local-level income inequality data as a way of analyzing how inequality might affect certain outcomes. Frank (2009) constructs a state-level panel of income inequality statistics to address a separate question (a time-series analysis of the inequality-growth relationship). Frank’s data are derived from the public-use version of the IRS data used by Piketty and Saez (2003), and extend from 1916 to 2005. Several papers have used CPS or Census/ACS data for similarly separate ends, including Mellor and Milyo (2002), who use CPS data to calculate MSA-level inequality in service of analyzing the Wilkinson hypothesis of a link between inequality and health status. A recent working paper, Daly and Wilson (2013), utilizes county-level inequality data to examine a similar relationship between inequality and health.

At least two recent papers have examined the determinants of inequality at the level of individual MSAs. Florida and Mellander (2013) utilize both inequality measured using the American Community Survey (ACS) and inequality measured using wage data from the Current Population Survey. Glaeser et al.
(2009) also use inequality measures calculated from ACS and Census data to examine the connection between home prices and inequality. Compared to these papers, however, I make several contributions. First, using the CPS, I construct an annual panel of inequality measures at the MSA (and state) levels.6 Second, this study is unique among state or local area inequality studies in that it directly addresses the topcoded nature of the CPS microdata via a multiple imputation process along the lines of Jenkins et al. (2011).

6 In Appendix B, I construct a similar dataset using ACS data.

Data

This paper seeks to add to this small but growing literature by examining income inequality at the sub-national level using CPS data in a way that addresses the potential pitfalls of the data (the right-censoring of incomes due to topcoding). To accomplish this, I adapt the Generalized Beta II multiple imputation approach of Jenkins et al. (2011) to a sub-national setting. Before carefully explaining the multiple imputation process, I present some institutional information about Census Bureau survey data. The focus here is on topcoding in the Current Population Survey, although there is also topcoding in the decennial Census and the American Community Survey.

Census Bureau Topcoding

The March Supplement to the Current Population Survey is an annual survey of around 50,000 households which is designed to produce a nationally representative sample of the US population. There has been some debate concerning the representativeness of any given year’s CPS sample for individual states or localities. The decennial Census and the ACS are designed to be representative for localities (e.g. the 1-year ACS public-use files are intended to be representative for geographic areas with populations of at least 100,000).
I proceed under the assumption that the CPS is reasonably close to representative, at least for sufficiently large MSAs and states.7

7 Census Bureau documentation suggests that the CPS sample is probably representative for the largest 100 MSAs.

The CPS March Supplement includes a number of questions about income, which have become more detailed in subsequent iterations of the survey. This disaggregated individual income information is then aggregated up to form individual, family and household income amounts for each relevant responding unit in the survey. From 1968 to 1975, the CPS included eight questions about personal income components. In 1976, this was expanded to eleven items, and after 1988, the CPS included 24 separate questions about income sources.8 The full list of income sources can be found in Table 1.

8 The more recent surveys merely divide broad categories of income (e.g. government transfers) into more disaggregated components. The various household income definitions I will use should be unaffected by the granularity.

TABLE 1. Income Sources in the CPS by Year

1968-1976
  Labor sources: Wages, Self Employment, Farm Income
  Non-labor sources: Social Security, Welfare, Government Programs, Interest, Dividends and Rents, Alimony, Contributions, Other

1976-1987
  Labor sources: Wages, Self Employment, Farm Income
  Non-labor sources: Social Security, Supplemental Security, Welfare, Interest, Dividends & Rental Income, Veterans & Workers Comp., Retirement, Other

1988-2012
  Labor sources: Wages, Self Employment, Farm Income
  Non-labor sources: Social Security, Supplemental Security, Welfare, Interest, Dividends, Rental Income, Alimony, Child Support, Unemployment, Veterans Benefits, Workers Comp., Retirement, Survivor Benefits, Disability Benefits, Educational Assistance, Financial Assistance, Other

Each of these income sources is subject to topcoding. In order to satisfy its mandate to protect individual privacy, the Census Bureau censors the raw reported amounts above a pre-defined threshold for each income source. There are two levels of topcoding: a “hard” topcode of the internal data, in which the raw survey results are replaced with a censored amount, and a second topcode for the public-use data, in which the internal data amount is replaced with a separate censored amount before the data are released for public use. The internal topcodes are usually, but not always, different from the public-use topcodes.

Until 1996, the Census Bureau simply replaced each topcoded income amount with the topcoding threshold. From 1996 to 2010, however, the Census Bureau instituted a more informative topcoding regime. They first divided the topcoded individuals into cells (gender-by-race-by-employment status) and then calculated a mean of all the individuals in each cell above the topcoding threshold (but below the hard-coded censoring point). These cell means are then substituted for each respective topcoded income amount. Compared to the previous regime, the cell means provide a better picture of the right tail of the income distribution, although a substantial amount of information is still suppressed. The Census Bureau’s change in topcoding policies after 1996 can lead to discontinuities in any calculations of distributional statistics. However, Larrimore et al. (2008) provide a series of cell means that is consistent with the official Census Bureau series, and extends back to 1976.

Starting with the 2011 March CPS, the Census Bureau has implemented a new topcoding method, in which topcoded incomes are randomly swapped across individuals within relatively narrow income bins. At the national level, this means that there should be very little difference between the internal and public-use data when these are used simply for the purpose of income inequality analysis.
The internal CPS data are topcoded at a higher threshold, so the bias due to topcoding is still present, albeit diminished substantially. The Census Bureau has additionally provided researchers with “swap files” for 1977–2010. I use these “swap files” to perform the same rank-bin swap process on the March CPS public-use data from 1977 to 2010. I then use this swap-file-modified public-use CPS microdata from 1977 to 2010 and the CPS public-use data from 2011 onwards to calculate measures of income inequality and perform inference on trends.9

9 A previous version of this paper used the Larrimore et al. (2008) cell means series and the GB2 imputation method described in what follows. The current version uses the same multiple imputation method, but with the swap-file-modified CPS microdata.

Generalized Beta II Imputation

Although the rank-proximity swap process reduces some of the problematic topcoding in the public-use CPS, it does not affect the internal “hard” topcodes, which are problematic for the study of income inequality.10 A large literature has emphasized that the rise in income inequality since 1970 has been driven primarily by gains at the very top of the income distribution. Inequality measured using censored top-income values in the Census Bureau survey data is likely to miss changes at the very top of the income distribution. The use of cell mean or rank-proximity swap replacement only partially addresses this censoring. One increasingly common method to address the remaining bias due to topcoding and under-reporting, which I utilize in this study, is to implement a multiple imputation approach. Jenkins et al. (2011) introduced this method for national-level income inequality, and I extend this approach to the state (and MSA) level.

10 Additionally, as noted by Diaz-Bazan (2015), there is reason to believe that incomes may be under-reported at the top of the income distribution in the March CPS.

The basic methodology for the multiple imputation process is as follows. First, a parametric distribution is fitted to the observed size-adjusted household income data. Second, partially synthetic income distributions are formed by taking draws from the fitted distribution and imputing these draws to the topcoded or potentially under-reported individual incomes. Third, distributional statistics are calculated using the partially synthetic data. This process is repeated n times, and the n point estimates are combined according to the rules proposed by Reiter (2003). I use n = 200 for most of my analysis, although preliminary results in small-scale simulations seem to imply that n = 100 or even n = 20 is sufficiently large.

In Jenkins et al. (2011), this process was conducted for each year of the CPS to produce national-level inequality statistics. However, since I am interested in sub-national-level inequality, there are slight modifications that must be made to the Jenkins et al. (2011) methodology. In the following, I describe the specifics of this modified process.

I merge the CPS swap-file with the public-use CPS microdata and then calculate household incomes from the resulting dataset.11 I then flag all topcoded and potentially under-reported individual income source amounts in the combined CPS sample from 1968 to 2014.12 As shown in Appendix A, using a cutoff at the 97.5th percentile allows me to reasonably approximate trends in top income shares and the Gini coefficient calculated using IRS tax return data. I adjust for household size by using an equivalence scale equal to the square root of the number of people in the household. Each individual in the household is then associated with this size-adjusted household income amount.
I assume that the distribution of size-adjusted household income is well approximated by a Generalized Beta distribution of the second kind (GB2).13 The four-parameter GB2 distribution has a probability density function

\[
f(y) = \frac{a\, y^{ap-1}}{b^{ap}\, \beta(p, q)\, \left(1 + \left(\frac{y}{b}\right)^{a}\right)^{p+q}}
\]

where $\beta(\cdot)$ is the Beta function, and $a$, $b$, $p$ and $q$ are positive parameters. Several well known distributions used in the income inequality literature (including the Singh-Maddala and Pareto distributions) are special cases of the GB2 distribution.

11 The CPS counts business losses as income, so there are some households which report negative total household incomes. I am utilizing some inequality measures (notably the Theil index) which are only defined for positive incomes. After aggregating to the household level, I truncate all non-positive household incomes to $0.01.

12 Not all geographic identifiers are available for all years. Some MSAs are identified starting in the 1968 March CPS, but states are not identified until 1977.

13 See McDonald (1984), Majumder and Chakravarty (1990), Wilfling (1996) and McDonald and Ransom (2008).

In the model fitting step of the imputation process, I estimate a GB2 distribution to fit the observed data via maximum likelihood.14 I am interested in inequality at sub-national levels, so it would be natural to modify the Jenkins et al. (2011) method to fit a distribution to local-level incomes, and then proceed with the imputation. However, in practice, there are often too few observations at the local level. Hence I fit a single distribution for each year using observations from the entire US in the swap-file-modified CPS microdata.15 I then perform the imputation for each MSA or state separately based on the estimated parameters from this fitted GB2 distribution. This method maximizes the number of top incomes employed to fit the GB2 distribution, and hence should yield a more accurate approximation of the true upper tail of the distribution.
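As a numerical illustration of the density above, the following Python sketch evaluates the GB2 density with the standard library only, checks that setting p = 1 reproduces the Singh-Maddala density, and confirms by a crude trapezoid rule that the density integrates to about one. The function names and the parameter values are illustrative assumptions; the estimation in this chapter was carried out with the R package GB2, which this sketch does not attempt to replicate.

```python
import math


def gb2_pdf(y, a, b, p, q):
    """GB2 density as written in the text, evaluated on the log scale."""
    if y <= 0:
        return 0.0
    # log of the Beta function via log-gamma
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    log_f = (math.log(a) + (a * p - 1.0) * math.log(y) - a * p * math.log(b)
             - log_beta - (p + q) * math.log1p((y / b) ** a))
    return math.exp(log_f)


def singh_maddala_pdf(y, a, b, q):
    """Singh-Maddala density, i.e. the GB2 with p = 1."""
    z = (y / b) ** a
    return (a * q / b) * (y / b) ** (a - 1.0) * (1.0 + z) ** (-(q + 1.0))


def trapezoid_mass(a, b, p, q, upper=50.0, steps=50_000):
    """Crude trapezoid check that the density integrates to roughly one."""
    h = upper / steps
    inner = sum(gb2_pdf(i * h, a, b, p, q) for i in range(1, steps))
    ends = gb2_pdf(0.0, a, b, p, q) + gb2_pdf(upper, a, b, p, q)
    return h * (inner + 0.5 * ends)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print(gb2_pdf(1.3, 2.5, 1.0, 1.0, 1.8), singh_maddala_pdf(1.3, 2.5, 1.0, 1.8))
    print(trapezoid_mass(3.0, 1.0, 1.5, 2.0))
```

Working on the log scale avoids overflow in the Beta function for large shape parameters, which is the main reason for writing the density this way rather than transcribing the formula directly.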
In the second step of the GB2 imputation process, I split the data for each year by MSA or state, and construct partially synthetic datasets for each geographic unit in each year. Non-topcoded households enter each partially synthetic dataset unchanged. For households with topcoded (or potentially under-reported) incomes, I replace the topcoded income with a draw from the fitted GB2 distribution. Specifically, each topcoded income $y_i$ is replaced by

\[
y_i^{*} = F^{-1}\left(u_i\left(F(y_i), 1\right)\right)
\]

where $F(\cdot)$ and $F^{-1}(\cdot)$ are the CDF and inverse CDF associated with the fitted GB2 distribution, and $u_i(a, b)$ is a draw from a uniform distribution with lower bound $a$ and upper bound $b$. I construct n = 200 such partially synthetic datasets for each MSA (or state) in each year.

14 I use the R package GB2 (https://cran.r-project.org/web/packages/GB2/index.html) to perform the maximum likelihood estimation.

15 Following Jenkins et al. (2011), I use only the top 70% of incomes in the CPS sample in each year in the estimation. This is intended to improve the fit at the right tail of the distribution.

The final step in the GB2 imputation process is to estimate income inequality measures for each partially synthetic dataset. I then follow the rules from Reiter (2003) for combining estimates from partially synthetic datasets. For each income inequality measure of interest, I produce a point estimate $q^{*}$ equal to the simple average of the estimates $q_i$ over the $n$ partially synthetic datasets (footnote 16 gives the variance of this estimate):

\[
q^{*} = \frac{1}{n} \sum_{i=1}^{n} q_i .
\]

I calculate several income inequality measures using this methodology.17 The two most important are the Gini coefficient and the top 1% share. The Gini coefficient is calculated as:

\[
G = \frac{1}{2 \mu_x n^{2}} \sum_{i=1}^{n} \sum_{i'=1}^{n} \left| x_i - x_{i'} \right|
\]

where $x_i$ is household $i$’s income and $\mu_x$ is mean income. The top 1% income share is the income accruing to the top 1% of the income distribution as a share of total income.
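The imputation and averaging steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the estimates in this chapter: it uses the Singh-Maddala special case of the GB2 (p = 1) because its CDF and inverse CDF have closed forms, and the incomes, topcode flags and parameter values are made up. The Gini coefficient is computed with the sorted-rank identity, which is algebraically equal to the double-sum formula in the text.

```python
import random


def sm_cdf(y, a, b, q):
    """Singh-Maddala CDF (GB2 with p = 1); assumed here for its closed form."""
    return 1.0 - (1.0 + (y / b) ** a) ** (-q)


def sm_inv_cdf(u, a, b, q):
    """Inverse of sm_cdf."""
    return b * ((1.0 - u) ** (-1.0 / q) - 1.0) ** (1.0 / a)


def impute_tail(y, a, b, q, rng):
    """Replace a topcoded income y by F^{-1}(u) with u uniform on [F(y), 1)."""
    lower = sm_cdf(y, a, b, q)
    u = rng.uniform(lower, 1.0)
    u = max(lower, min(u, 1.0 - 1e-12))  # keep u strictly below one
    return sm_inv_cdf(u, a, b, q)


def gini(values):
    """Gini coefficient via the sorted-rank identity."""
    xs = sorted(values)
    n = len(xs)
    ranked = sum((i + 1) * v for i, v in enumerate(xs))
    return 2.0 * ranked / (n * sum(xs)) - (n + 1.0) / n


def synthetic_gini_estimate(incomes, topcoded, a, b, q, n_imputations=200, seed=0):
    """Average the Gini over n partially synthetic datasets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_imputations):
        # Non-topcoded incomes enter unchanged; flagged incomes get a tail draw.
        synthetic = [impute_tail(y, a, b, q, rng) if flag else y
                     for y, flag in zip(incomes, topcoded)]
        estimates.append(gini(synthetic))
    return sum(estimates) / len(estimates), estimates


if __name__ == "__main__":
    # Hypothetical size-adjusted incomes; the last two are flagged as topcoded.
    incomes = [12000, 18000, 25000, 31000, 40000, 52000, 70000, 95000, 150000, 150000]
    flags = [False] * 8 + [True] * 2
    point, _ = synthetic_gini_estimate(incomes, flags, a=2.5, b=50000.0, q=1.5)
    print(point)
```

Because the uniform draw is bounded below by F(y), every imputed value is at least as large as the topcoded amount it replaces, which is what makes the synthetic data only partially synthetic.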
If incomes are arrayed in non-decreasing order $i = 1, \ldots, n$, and $k$ is the closest integer to $\frac{99n}{100}$, then the top 1% share is

\[
\mathrm{Top1Share} = \frac{\sum_{i=k}^{n} x_i}{\sum_{i=1}^{n} x_i}
\]

16 The variance of this point estimate is given by
\[
V^{*} = \frac{1}{n} \left( \frac{1}{n-1} \sum_{i=1}^{n} \left(q_i - q^{*}\right)^{2} + \sum_{i=1}^{n} v_i \right)
\]
where $v_i$ is the variance (calculated using asymptotic variance formulas) of the point estimate $q_i$ from the $i$th partially synthetic dataset.

17 In addition to the measures listed, I calculate the 90-10 ratio, Theil, Atkinson and Schutz indices, various Generalized Entropy indices (varying the α parameter from 0.5 to 2), the coefficient of variation, the 99-median and 95-median ratios, and the 80-20 ratio, as well as Kuznets ratios.

I also calculate a number of Lorenz ordinates for a variety of analyses. The $p$th ordinate of the Lorenz curve is

\[
L(p) = \frac{1}{\mu} \int_{0}^{F^{-1}(p)} x f(x)\, dx
\]

Note that the top 1% share can therefore also be expressed as a function of Lorenz ordinates:

\[
\mathrm{Top1Share} = 1 - L(0.99)
\]

I calculate these measures for all 50 states (and the District of Columbia). Unfortunately, not all MSAs have enough observations in the CPS to perform reliable income distribution analysis. Hence, I focus on MSAs which have at least fifty household observations in each year of the CPS. The observations from MSAs with fewer than fifty households are recoded as non-MSA households, so that they may still be used for the fitting of the GB2 distribution. Additionally, since the evolution of income inequality is of primary importance, households in MSAs which have large gaps in their time series are likewise recoded as non-MSA households. After these modifications, 177 MSAs remain in the dataset. The state-level and MSA-level inequality measures that I estimate here are available in an online data appendix.18 I also perform similar exercises, generating datasets of income inequality measures for all 50 states and 277 MSAs using microdata on income from the American Community Survey, which are available in the same online data appendix.
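The combining rule in footnote 16 and the relationship between the top 1% share and the Lorenz ordinate can be checked with a short Python sketch. The function names and the toy inputs are hypothetical. One indexing choice is an assumption of mine: the top share below sums the incomes ranked strictly above k, so that the identity Top1Share = 1 − L(0.99) holds exactly in a finite sample.

```python
def reiter_combine(estimates, variances):
    """Point estimate q* and variance V* for partially synthetic data.

    q* is the simple average of the estimates; V* follows footnote 16:
    (1/n) * (between-dataset sample variance + sum of within variances).
    """
    n = len(estimates)
    q_star = sum(estimates) / n
    between = sum((q - q_star) ** 2 for q in estimates) / (n - 1)
    v_star = (between + sum(variances)) / n
    return q_star, v_star


def lorenz_ordinate(incomes, pct):
    """Empirical Lorenz ordinate L(pct/100): income share of the bottom pct percent."""
    xs = sorted(incomes)
    k = (pct * len(xs) + 50) // 100  # closest integer to pct*n/100
    return sum(xs[:k]) / sum(xs)


def top1_share(incomes):
    """Share of total income held by households ranked above k (assumed convention)."""
    xs = sorted(incomes)
    k = (99 * len(xs) + 50) // 100  # closest integer to 99n/100
    return sum(xs[k:]) / sum(xs)


if __name__ == "__main__":
    toy_incomes = list(range(1, 201))  # hypothetical incomes 1, 2, ..., 200
    print(top1_share(toy_incomes), 1.0 - lorenz_ordinate(toy_incomes, 99))
    print(reiter_combine([0.40, 0.42, 0.44], [0.0001, 0.0001, 0.0001]))
```

Integer arithmetic is used for k to avoid floating-point surprises when rounding 99n/100.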
The specifics of the ACS-based analysis can be found in Appendix B.

18 Available at http://pages.uoregon.edu/jlv/state-and-msa-inequality.html

From Market Income to Haig-Simons Income

The GB2 imputation method addresses potential concerns about topcoding and under-reporting of income at the top of the distribution. Using inequality measures generated using this method, it is possible to examine how inequality has changed over the last several decades. Before doing so, however, it is necessary to define carefully the income concept to be used. The most conceptually attractive definition both i) treats income-receiving units equally (by adjusting for household size using an equivalence scale), and ii) defines income as the change in net worth that can be used for consumption (where this is usually referred to as “Haig-Simons income”). Measuring Haig-Simons income requires information about changes in wealth due to capital gains, which is unavailable in the Census Bureau surveys used in this study. Nonetheless, it is possible to use all of the information available in the March CPS to construct an income definition that is as close as possible to the Haig-Simons definition.

In the spirit of Armour et al. (2014), I will construct a “crosswalk” from the most commonly used income concept in inequality studies using tax return data (e.g. Piketty and Saez (2003)) to a broader measure of income that I will call “post-fiscal” income. Table 2 summarizes each step along the crosswalk. The crosswalk consists of five income definitions, arranged in order of their “closeness” to Haig-Simons income, where each step corresponds to an additional source of income or a change in income-receiving unit. The first step is tax-unit Adjusted Gross Income (AGI), which includes all “market income” received by a tax unit (the individuals included on a tax return).
The second step, pre-transfer income, changes the income-receiving unit to the household, and adjusts for household size by applying a square root equivalence scale.19 The third step, post-transfer income, adds cash transfers from the government. The fourth step, post-tax income, is post-transfer income after state and federal income taxes (including tax credits). The final step, post-fiscal income, adds the cash value of in-kind aid to post-tax income. The contrast between post-fiscal and pre-transfer income incorporates not just how market income has evolved, but also how state interventions to reduce income inequality have evolved, in the form of changes to the tax and transfer system and, increasingly, the provision of in-kind benefits to households in the bottom half of the income distribution.

19 I divide each household’s income by the square root of the number of members in the household and assign this size-adjusted household income to each individual in the household.

TABLE 2. Crosswalking from Market to Disposable Income

Income concept          Definition
AGI Income              Tax unit adjusted gross income
Pre-transfer Income     Pre-tax household income (wages + investment)
Post-transfer Income    Pre-transfer income plus cash transfers
Post-tax Income         Post-transfer income less taxes and tax credits
Post-fiscal Income      Post-tax income plus the monetary value of noncash benefits

Notes: Cash transfers include TANF/AFDC, Social Security, Disability (SSDI or SSI) and unemployment benefits. Non-cash benefits include SNAP (food stamps), the value of public housing and rental subsidies, the value of home heating subsidies, the value of government-provided school lunch, the fungible value of Medicaid and Medicare benefits and the value of WIC benefits.

To visualize how recent trends in inequality vary across states and across income definitions, I present several graphs.
First, I visualize the trends for all states and all five income definitions simultaneously as a “spiderweb graph” in Figure 1 for the Gini coefficient and the top 1% share. The central tendency across all 50 states for each income definition is emphasized to permit easy visualization of the average trend in inequality across states. As expected, there are large differences in the level of income inequality when using different income measures. Income inequality is highest for the AGI definition of income, and lowest for post-fiscal income. Especially in the period after about 2000, trends in income inequality vary by income definition as well. Income inequality measured by the Gini coefficient using AGI income is on average increasing (albeit at a slower rate after about 2000 than the run-up in inequality before 2000). There is a noticeable decline in the post-fiscal Gini coefficient after 2000. This is also true of the post-fiscal top 1% share, with a particularly sharp drop occurring right after 2000.20

FIGURE 1. State Income Inequality, 1977–2014: Crosswalking from Market Income to Post-fiscal Income
[Two panels: State Gini Coefficient, 1977−2014, and State Top 1% Share, 1977−2014. Horizontal axis: year. Series by income definition: AGI, postfiscal, posttax, posttrans, pretrans.]

Since the “spiderweb” graphs by design abstract from trends in any individual state, it can be instructive to examine trends in a few states individually. Figure 2 illustrates the Gini and top 1% share income inequality crosswalks from AGI market income to post-fiscal income for the four largest states.
For these large states, the pattern seen in the central tendency across all states is even more readily apparent: measured using market income, income inequality has increased almost monotonically since 1977, but measured using a broader, post-fiscal definition, inequality has actually decreased since 2000.

20 This may be due in part to the rising value of rental subsidies and Medicare/Medicaid given that both health care costs and rents have been rising over this period.

Additionally, it is clear that the primary drivers of the difference between market and post-fiscal income vary depending on the type of inequality measure used. If inequality is measured by the top 1% share, household size is relatively unimportant, but the opposite is true for inequality measured by the Gini.

FIGURE 2. State Income Inequality, 1977–2014: Crosswalking from Market Income to Post-fiscal Income, 4 largest States
[Two sets of panels, each with one panel per state: California, Illinois, New York, Pennsylvania. Horizontal axis: year. Vertical axis label as printed: State Gini. Panel title as printed for both sets: State Gini Coefficient, 1976−2013. Series by income definition: AGI, postfiscal, posttax, posttrans, pretrans.]

The difference in the level of income inequality when using pre-transfer versus post-fiscal income is one way of quantifying the degree to which government intervention is redistributive (Kakwani (1977)). Examination of this Kakwani-style measure of redistributiveness across states and over time can shed light on the degree to which different states have adjusted policy in response to rising income inequality.
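To make the redistributiveness measure concrete, the following Python sketch walks a handful of made-up households through the household-level steps of the crosswalk (pre-transfer, post-transfer, post-tax, post-fiscal), applies the square root equivalence scale, assigns the size-adjusted amount to every household member, and takes the difference between the pre-transfer and post-fiscal Gini coefficients. All household records, field names and function names are hypothetical, and the tax-unit AGI step is omitted because it requires a different income-receiving unit.

```python
import math

# Hypothetical households: (size, market income, cash transfers, taxes, in-kind value)
HOUSEHOLDS = [
    (1, 8000, 7000, 0, 4000),
    (4, 30000, 5000, 1000, 6000),
    (2, 60000, 0, 9000, 0),
    (3, 120000, 0, 30000, 0),
    (2, 400000, 0, 130000, 0),
]


def gini(values):
    """Gini coefficient via the sorted-rank identity."""
    xs = sorted(values)
    n = len(xs)
    ranked = sum((i + 1) * v for i, v in enumerate(xs))
    return 2.0 * ranked / (n * sum(xs)) - (n + 1.0) / n


def person_level(concept):
    """Size-adjusted income under one income concept, one entry per person."""
    out = []
    for size, market, transfers, taxes, in_kind in HOUSEHOLDS:
        amounts = {
            "pre_transfer": market,
            "post_transfer": market + transfers,
            "post_tax": market + transfers - taxes,
            "post_fiscal": market + transfers - taxes + in_kind,
        }
        # Square root equivalence scale, assigned to each household member.
        out.extend([amounts[concept] / math.sqrt(size)] * size)
    return out


def redistributiveness():
    """Pre-transfer Gini minus post-fiscal Gini."""
    return gini(person_level("pre_transfer")) - gini(person_level("post_fiscal"))


if __name__ == "__main__":
    for name in ("pre_transfer", "post_transfer", "post_tax", "post_fiscal"):
        print(name, round(gini(person_level(name)), 4))
    print("redistributiveness", round(redistributiveness(), 4))
```

In this toy example transfers and in-kind aid go to the lowest incomes and average tax rates rise with income, so the difference is positive; with real data the sign and size are an empirical matter.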
Figure 3 summarizes the state-level trends in redistributiveness measured as the difference between the Gini coefficient using pre-transfer income and the Gini coefficient using post-fiscal income. It appears to be the case that, on average, redistributiveness has increased since 2000, especially in the period of time corresponding to the Great Recession of 2007-2010.

FIGURE 3. State-level Redistributiveness, 1977-2014
[One panel, titled as printed: State−level Redistributiveness, 1986−2013. Horizontal axis: year. Vertical axis: Redistributiveness.]

Inference on Changes in Income Inequality

Having generated a dataset measuring point estimates of state-level and MSA-level income inequality as accurately as is possible with public-use data, the logical next step is to analyze how inequality has been changing over time, and what might be driving these changes. Towards that end, I pursue two lines of inquiry. First, I can perform inference on scalar inequality measures and Lorenz ordinates. The scalar metrics and different Lorenz ordinates are sensitive to changes in different parts of the distribution, and hence comparing the results of these tests between these different inequality measures can provide qualitative information on how inequality has been changing. A second line of inquiry uses the bootstrap method of Barrett et al. (2014) to test directly the hypothesis of Lorenz dominance. I can then make direct welfare statements based on this inference.

I will use bootstrap methods to make inferences based on the data. This is a partial departure from much of the survey-based literature, which has relied on inference using asymptotic variance formulas. There is reason to believe, however, that this type of inference may lead to incorrect conclusions.
Flaichaire and Davidson (2007) and Brzezinski (2013) note that bootstrap procedures have lower nominal type I error probabilities than inference using asymptotic variance formulas for the Theil coefficient and top income shares respectively, and Mills and Zandvakili (1997) show that the same is true for the Gini. Additionally, in simulation experiments, it appears that the hypothesis tests using asymptotic variance formulas for Lorenz curves have extremely low power for the sort of hypotheses I will be interested in testing, relative to bootstrap-based inference.

Mills and Zandvakili (1997) develop a simple bootstrap method for conducting inference on the Gini coefficient. To apply this method to topcoded and potentially under-reported data, I extend the semi-parametric bootstrap technique of Flaichaire and Davidson (2007). For clarity, I first describe the simple Mills and Zandvakili (1997) method, and then describe my extension of the semi-parametric technique for conducting inference on changes in inequality in the presence of topcoded or potentially under-reported data.

The main object of interest is the difference between an inequality metric $\theta$ calculated using samples from two populations: $\phi = \theta_2 - \theta_1$.21 In the simple bootstrap method, one would bootstrap resample (sample with replacement) from the samples of the two populations in the data, calculate the inequality measures $\theta_1$, $\theta_2$ using the bootstrap samples, and calculate the difference $\phi_j^{*}$. After performing this $n$ times and collecting the bootstrap replicates $\Phi = \{\phi_j^{*}\}_{j=1}^{n}$, inference can then be done by constructing confidence intervals $\left(\Phi_{\alpha/2}, \Phi_{1-\alpha/2}\right)$, where subscripts refer to percentiles of the distribution of bootstrap replicates. Bootstrap p-values can also be calculated, e.g. for the null hypothesis of no change, as

\[
p^{*} = \frac{1}{n} \sum_{j=1}^{n} 1\left( \left(\phi_j^{*} - \phi\right)^{2} \geq \phi^{2} \right)
\]

where $\phi$ is the full sample (“true”) estimate of the difference in inequality.

21 I will be conducting inference where these two populations are a single state in two different years, but this method could also be used to test a hypothesis of no difference in inequality between two different states in a given year.

To adapt this to our environment, I move to semi-parametric bootstrapping.22 For each time period, I partition the data into topcoded and non-topcoded observations. The non-topcoded observations are bootstrap resampled as in the simple case. The topcoded observations are imputed as in the GB2 multiple imputation process used to generate point estimates. That is, for each bootstrap replication, each topcoded observation $y_i$ is replaced by a draw from the tail of the fitted GB2 distribution. In practice, for a fitted distribution $\hat{F}(y)$, I take a uniform draw $u_i \in \left[\hat{F}(y_i), 1\right)$, and then replace $y_i$ with $\hat{F}^{-1}(u_i)$. I then estimate $\theta_j$ from the full non-topcoded sample and the topcoded draws, and $\theta_j^{*}$ from the bootstrap sample of the non-topcoded observations and the topcoded draws. From these I obtain $\phi_j^{*}$, $\phi_j$ as before. After repeating this process $n$ times, I can then calculate a point estimate of the difference in inequality as $\hat{\phi} = \frac{1}{n} \sum_{j=1}^{n} \phi_j$, as in the multiple imputation case. The bootstrap confidence interval is the same as before, and the two-sided p-value for a null of no change is

\[
p^{*} = \frac{1}{n} \sum_{j=1}^{n} 1\left( \left(\phi_j^{*} - \hat{\phi}\right)^{2} \geq \hat{\phi}^{2} \right)
\]

or, for a one-sided p-value for the null hypothesis of no change against the alternative that the change is negative,

\[
p^{*} = \frac{1}{n} \sum_{j=1}^{n} 1\left( \left(\phi_j^{*} - \hat{\phi}\right) \leq \hat{\phi} \right)
\]

22 This semi-parametric method is similar to Flaichaire and Davidson (2007), differing in the parametric distribution used (the Generalized Beta II distribution).

With this semi-parametric bootstrap method, I perform inference on the same scalar inequality metrics for which I have produced point estimates in the previous section.
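The simple (fully non-parametric) version of the bootstrap described above can be sketched as follows. This is a minimal illustration with simulated lognormal samples, not the semi-parametric procedure used in this chapter: the GB2 tail imputation of topcoded observations is left out, and the function names, sample sizes and number of replications are assumptions. The percentile interval and the two p-value formulas follow the expressions in the text.

```python
import random


def gini(values):
    """Gini coefficient via the sorted-rank identity."""
    xs = sorted(values)
    n = len(xs)
    ranked = sum((i + 1) * v for i, v in enumerate(xs))
    return 2.0 * ranked / (n * sum(xs)) - (n + 1.0) / n


def percentile(sorted_values, prob):
    """Simple empirical percentile of an already sorted list."""
    idx = min(len(sorted_values) - 1, max(0, int(prob * len(sorted_values))))
    return sorted_values[idx]


def bootstrap_gini_change(sample1, sample2, n_boot=500, alpha=0.05, seed=0):
    """Bootstrap inference on phi = Gini(sample2) - Gini(sample1)."""
    rng = random.Random(seed)
    phi = gini(sample2) - gini(sample1)
    replicates = []
    for _ in range(n_boot):
        # Resample each population with replacement and recompute the difference.
        boot1 = rng.choices(sample1, k=len(sample1))
        boot2 = rng.choices(sample2, k=len(sample2))
        replicates.append(gini(boot2) - gini(boot1))
    replicates.sort()
    ci = (percentile(replicates, alpha / 2), percentile(replicates, 1 - alpha / 2))
    # Two-sided p-value for the null of no change.
    p_two_sided = sum((r - phi) ** 2 >= phi ** 2 for r in replicates) / n_boot
    # One-sided p-value against the alternative that the change is negative.
    p_one_sided = sum((r - phi) <= phi for r in replicates) / n_boot
    return phi, ci, p_two_sided, p_one_sided


if __name__ == "__main__":
    rng = random.Random(42)
    year1 = [rng.lognormvariate(10.5, 0.6) for _ in range(300)]  # simulated incomes
    year2 = [rng.lognormvariate(10.5, 0.8) for _ in range(300)]
    print(bootstrap_gini_change(year1, year2))
```

Centering the replicates on the full-sample estimate before comparing them with it is what imposes the null of no change.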
Performing inference in this case involves many more hypothesis tests than are feasible to display individually, given that the focus is on state-level and MSA-level inequality. I summarize the results by the proportion of states or MSAs for which I can reject the null hypothesis of no change in inequality, and show individual results for only the most populous states and MSAs.

To analyze trends in income inequality, I will perform tests of the null hypothesis that inequality did not change over a specific period of time. Most but not all previous studies utilizing Census microdata have shown little to no increase in inequality since the mid-1990s at the national level. Studies utilizing tax return data find that inequality has increased dramatically, although in general at a slower rate than in the 1980s. Further, these studies tend to find that the increase in inequality is driven by an increasing share of income accruing to the richest 1%. In light of this, I will consider two questions: whether income inequality increased from 1986 through 1993, and whether income inequality has increased from 1994 onward.23 Due to a change in the way the Current Population Survey was administered, estimates of income inequality from before 1994 are not easily comparable with estimates after 1994, and hence dividing the sample at this discontinuity has an intuitive appeal.

23 For the pre-1993 test, I choose 1986 as the starting year, since this is the earliest year in which the post-fiscal income inequality series can be constructed.

Figures 4 and 5 summarize inference using CPS microdata for two scalar inequality measures, the Gini coefficient and the top 1%’s share of income, for the periods 1986–1993 and 1994–2014 at the state level. These figures show the 95% confidence interval around the change in the inequality measure in question for the eight most populous states.
These largest states all have positive point estimates for the change in inequality by either measure in both time periods. For about half of all states, the change in the Gini coefficient is not statistically different from zero in either time period. On the other hand, the change in the top 1% share is statistically insignificant for almost all states from 1994 to 2014 and for most states from 1986 to 1993. There is some interesting heterogeneity across the income definitions: notably, the confidence intervals around the differences in post-tax income inequality are substantially smaller than they are for the other income concepts.

FIGURE 4. Difference in Gini Coefficient, 1986–1993 and 1994–2014
[Figure: two panels, “Change in Gini Coefficient, 1986−1993 (8 Most Populous States)” and “Change in Gini Coefficient, 1994−2014 (8 Most Populous States)”. Each panel plots the change for New York, Pennsylvania, Ohio, Illinois, North Carolina, Florida, Texas and California by income concept (Pre-transfer, Post-transfer, Post-tax, Post-fiscal).]

Given the properties of these two income inequality measures, these results suggest that increases in income inequality between 1994 and 2014 might be driven by changes within the bottom 99% of the income distribution. Although there is less evidence for a top-incomes-driven change in income inequality, it is difficult to disentangle why there is a failure to reject the null hypothesis of no change for the top 1% share. This may be because (a) coverage of top incomes in the CPS is inadequate, or (b) top income shares did not increase.

FIGURE 5. Difference in Top 1% Share, 1986–1993 and 1994–2014
[Figure: two panels, “Change in Top 1% Share, 1986−1993 (8 Most Populous States)” and “Change in Top 1% Share, 1994−2014 (8 Most Populous States)”. Each panel plots the change for the same eight states by income concept (Pre-transfer, Post-transfer, Post-tax, Post-fiscal).]

In an effort to further characterize the changes in income distribution, I next consider inference concerning Lorenz curves, which allow for the identification of the regions of the income distribution driving these changes in inequality. Recall that the ordinates of a Lorenz curve

$$L(p) = \frac{1}{\mu} \int_{0}^{F^{-1}(p)} x f(x) \, dx$$

can be interpreted as the cumulative share of income accruing to the bottom $100p$ percent of the income distribution. The change in specific Lorenz ordinates then describes the change in income shares of the bottom $100p$ percent of the income distribution. Note that

$$\frac{dL(p)}{dp} = \frac{y}{\mu}$$

where $y = F^{-1}(p)$. Thus for two populations with income distributions subscripted as 1 and 2 for convenience,

$$\frac{d\left( L_1(p) - L_2(p) \right)}{dp} = \frac{y_1}{\mu_1} - \frac{y_2}{\mu_2}$$

which is to say, the slope of the difference in Lorenz ordinates graphed in $(p, L_1(p) - L_2(p))$ space describes the change in the $p$th percentile’s relative income. This can be thought of as how much the $p$th percentile is relatively better off.
A slope change can then be interpreted as delineating the portion of the income distribution which is relatively better off under distribution 1 from the portion which is relatively better off under distribution 2. Consider a case in which changes in income inequality are driven solely by increases in the top 1%’s incomes while the bottom 99% stays the same. In this case, $L_1(p) - L_2(p) < 0 \; \forall p \leq 0.99$ and $\frac{d(L_1(p) - L_2(p))}{dp} < 0 \; \forall p \leq 0.99$. On the other hand, if $L_1(p) - L_2(p) < 0 \; \forall p \leq 0.99$ but $\exists p \leq 0.99$ s.t. $\frac{d(L_1(p) - L_2(p))}{dp} > 0$ (that is, the first condition holds but the second does not), then this is evidence for both rising top incomes and changes in inequality within the bottom 99%. If $L_1(p) - L_2(p)$ has only one local minimum, this further suggests that there is income polarization within the bottom 99%, with the top of the bottom 99% seeing relative improvements, while the bottom of the bottom 99% is relatively worse off.

Figures 6 and 7 inventory the evidence concerning changes in Lorenz curves at the state level for the four income concepts of interest over the period 1994–2014. Each figure visualizes the point estimate of the change in Lorenz ordinate, as well as 95% bootstrap confidence intervals (obtained via the semi-parametric bootstrap process described above). For almost all states, the Lorenz curve for the income distribution in 2014 is lower than the Lorenz curve for 1994, although the changes at any specific ordinate are not necessarily statistically different from zero.24 Further, the change in the Lorenz curve ordinates is negative and decreasing with $p$ until around the 80th percentile for many states, at which point the change in the Lorenz curve ordinates begins to increase (although it remains negative). In general, the Lorenz curves for the different income definitions exhibit largely similar changes over the period 1994–2014. These patterns in the change in Lorenz ordinates over the income distribution are consistent with both an increase in top incomes and increasing income inequality within the bottom 99%.

FIGURE 6. Change in Lorenz Ordinates, 1994–2014, Pre-transfer and Post-transfer Income
[Figure: two panels, “Change in Lorenz Ordinates, 1994−2014, Pre−transfer Income” and “Change in Lorenz Ordinates, 1994−2014, Post−transfer Income”. Each panel is a grid of state-level plots (AK through WY) with Percentile of Income on the horizontal axis and Change in Lorenz Ordinates on the vertical axis.]
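The change in Lorenz ordinates discussed above, and the percentile at which its slope changes sign, can be computed directly from two samples. The sketch below is illustrative only: it uses synthetic lognormal incomes rather than the CPS data, ignores topcoding and sampling weights, and the names `lorenz`, `y_1994` and `y_2014` are hypothetical.

```python
import numpy as np

def lorenz(y, p):
    """Empirical Lorenz ordinates L(p): income share of the bottom 100p percent."""
    ys = np.sort(np.asarray(y, dtype=float))
    cum_share = np.concatenate([[0.0], np.cumsum(ys)]) / ys.sum()
    pop_share = np.linspace(0.0, 1.0, ys.size + 1)
    return np.interp(p, pop_share, cum_share)

p = np.linspace(0.0, 1.0, 101)

# Synthetic incomes: the later distribution is more dispersed.
rng = np.random.default_rng(0)
y_1994 = rng.lognormal(mean=10.0, sigma=0.6, size=5000)
y_2014 = rng.lognormal(mean=10.0, sigma=0.8, size=5000)

# Change in Lorenz ordinates (later minus earlier) at each percentile.
diff = lorenz(y_2014, p) - lorenz(y_1994, p)

# Numerical slope of the difference: its sign tells which percentiles gained
# or lost relative income (income divided by the mean) between the two years.
slope = np.gradient(diff, p)

# Percentile at which the difference is most negative, i.e. where the slope
# switches from negative to positive.
turning_point = p[np.argmin(diff)]
```

Percentiles below `turning_point` lost relative income between the two synthetic years, while percentiles above it gained.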
I have established that there have been statistically significant increases in income inequality since the 1990’s, especially within the bottom 99% of the income distribution. I next examine Lorenz dominance, which gives normative content to the changes in Lorenz curves shown previously. Distribution 1 is said to “Lorenz dominate” distribution 2 if

$$L_1(p) \geq L_2(p) \; \forall p \in (0, 1) \quad \text{and} \quad \exists p \in (0, 1) \text{ s.t. } L_1(p) > L_2(p)$$

I extend the bootstrap test for Lorenz dominance suggested by Barrett et al. (2014) to accommodate topcoding by using the semi-parametric approach used previously. I then perform this Lorenz dominance test for each state using the four household income concepts.

24 Oregon is the notable exception to this general trend.

FIGURE 7. Change in Lorenz Ordinates, 1994–2014, Post-tax and Post-fiscal Income
[Figure: two panels of state-level plots (AK through WY), one for post-tax income and one for post-fiscal income, with Percentile of Income on the horizontal axis and Change in Lorenz Ordinates on the vertical axis.]

I first review the Barrett et al. (2014) method, and then describe my extension. Consider two distributions (numbered 1 and 2 for convenience) with associated Lorenz curves $L_1$, $L_2$. Define $\phi(p) = L_2(p) - L_1(p)$, the functional $I(\phi) = \int_0^1 \phi(p) \, \mathbf{1}\left( \phi(p) > 0 \right) dp$ and $T_n = \frac{n_1 n_2}{n_1 + n_2}$.
To test the null hypothesis

$$H_0^1 : L_2(p) \leq L_1(p) \quad \forall p \in [0, 1] \tag{2.1}$$

i.e. that distribution 2 does not Lorenz dominate distribution 1, I perform $m$ bootstrap replications, using the same semi-parametric scheme as above: I calculate $\hat{\phi}^*_i(p)$ from a sample composed of bootstrap resampled observations for non-topcoded observations and GB2 draws for the topcoded observations, and $\hat{\phi}^M_i(p)$ from a sample composed of the full non-topcoded sample and draws from the GB2 distribution imputed for the topcoded observations. I then calculate $\hat{\phi}(p) = \frac{1}{m} \sum_{i=1}^{m} \hat{\phi}^M_i(p)$. Thus, $\hat{\phi}(p)$ is just the GB2 multiple imputation estimate of $L_2(p) - L_1(p)$. Finally, I construct a one-sided bootstrap p-value from the bootstrap replications

$$\hat{p}_1 = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\left( T_n I\left( \hat{\phi}^*_i(p) - \hat{\phi}(p) \right) > T_n I\left( \hat{\phi}(p) \right) \right)$$

A test of the hypothesis in equation 2.1 can be conducted based on the rule “reject if $\hat{p}_1 < \alpha$”. This is equivalent to a test of weak Lorenz dominance. It is straightforward to test the opposite hypothesis

$$H_0^2 : L_2(p) \geq L_1(p) \quad \forall p \in [0, 1] \tag{2.2}$$

by reversing the order of the two distributions and constructing the bootstrap p-value $\hat{p}_2$. A test of strong Lorenz dominance can then be conducted by examining both bootstrapped p-values. I conclude that distribution 2 strongly Lorenz dominates distribution 1 if both $\hat{p}_1 < \alpha$ and $\hat{p}_2 > \alpha$.

It is most straightforward to conduct a test for the Lorenz dominance of an early distribution over a later distribution (e.g. when comparing 2014 to 1994, the 1994 distribution plays the role of distribution 2, and the distribution in 2014 takes the place of distribution 1). Table 3 summarizes these results for the eight largest states and each of the four income concepts. The table summarizes the two p-values which can be used jointly to test for strong Lorenz dominance: if $p_1 < \alpha$ and $p_2 > \alpha$, the conclusion is that the income distribution in 1994 Lorenz-dominates the distribution in 2014.
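The two-p-value procedure described above can be illustrated with a simplified sketch. It assumes no topcoding, so a plain nonparametric bootstrap replaces the semi-parametric scheme and the sample estimate replaces the multiple imputation average; it also evaluates the Lorenz curves on a fixed grid and uses the scaling $T_n$ as defined in the text. The function names and synthetic data are hypothetical.

```python
import numpy as np

GRID = np.linspace(0.0, 1.0, 101)

def lorenz(y, p=GRID):
    """Empirical Lorenz ordinates on a grid of population shares."""
    ys = np.sort(np.asarray(y, dtype=float))
    cum_share = np.concatenate([[0.0], np.cumsum(ys)]) / ys.sum()
    return np.interp(p, np.linspace(0.0, 1.0, ys.size + 1), cum_share)

def positive_area(phi, p=GRID):
    """I(phi): integral of phi(p) * 1(phi(p) > 0) dp, by the trapezoid rule."""
    pos = np.where(phi > 0, phi, 0.0)
    return float(np.sum((pos[1:] + pos[:-1]) * np.diff(p)) / 2.0)

def dominance_pvalue(y1, y2, m=200, seed=0):
    """Bootstrap p-value for H0: L2(p) <= L1(p) for all p."""
    rng = np.random.default_rng(seed)
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    t_n = y1.size * y2.size / (y1.size + y2.size)
    phi_hat = lorenz(y2) - lorenz(y1)
    stat = t_n * positive_area(phi_hat)
    exceed = 0
    for _ in range(m):
        b1 = rng.choice(y1, size=y1.size, replace=True)
        b2 = rng.choice(y2, size=y2.size, replace=True)
        phi_star = lorenz(b2) - lorenz(b1)
        # Recentre the bootstrap curve at the sample estimate before applying I.
        exceed += t_n * positive_area(phi_star - phi_hat) > stat
    return exceed / m

# Synthetic example: distribution "b" is far more equal than distribution "a".
rng = np.random.default_rng(1)
y_a = rng.lognormal(mean=10.0, sigma=1.0, size=1000)
y_b = rng.lognormal(mean=10.0, sigma=0.3, size=1000)

p1 = dominance_pvalue(y1=y_a, y2=y_b)  # H0: L_b <= L_a everywhere
p2 = dominance_pvalue(y1=y_b, y2=y_a)  # roles reversed: H0: L_a <= L_b everywhere
alpha = 0.05
b_dominates_a = (p1 < alpha) and (p2 > alpha)
```

With the roles assigned this way, `y_b` plays distribution 2, so a small `p1` together with a large `p2` indicates that the more equal distribution Lorenz dominates.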
I conclude that this Lorenz dominance has occurred for many states, although the number of states where this occurs diminishes with the level of redistributiveness encapsulated in the income definition. Assuming α = 0.05, Lorenz dominance holds for 29 states using pre-transfer income, and for 31 states using post-transfer income. However, Lorenz dominance holds for only 23 states using post-tax income, and for only 20 states using post-fiscal income. This is the strongest evidence yet that income inequality may not have risen as much if taxes and transfers (including in-kind programs) are taken into account. This result is in line with Armour et al. (2014).

TABLE 3. Lorenz Dominance Results

                       Pre-Transfer         Post-Transfer
State                  p1       p2          p1       p2
California             0.006    1           0.002    0.130
Florida                0.136    0.090       0.048    1
Illinois               0.014    0.070       0.008    0.998
New York               0.098    0.076       0.016    1
North Carolina         0.002    0.052       0        1
Ohio                   0.002    0.070       0.006    1
Pennsylvania           0.012    0.094       0.016    1
Texas                  0.004    1           0.014    0.054
Lorenz Dominance:      29 States            31 States

                       Post-tax             Post-Fiscal
State                  p1       p2          p1       p2
California             0.004    0.074       0.006    0.090
Florida                0.106    1           0.074    1
Illinois               0.002    0.998       0        0.996
New York               0.012    0.998       0        1
North Carolina         0.002    0.998       0.006    1
Ohio                   0.012    1           0.016    1
Pennsylvania           0.002    1           0.002    1
Texas                  0.020    0.056       0.036    0.048
Lorenz Dominance:      23 States            20 States

Potential Explanations for Changing State-level Income Inequality

I have demonstrated that there have been significant changes in State and MSA-level income inequality concentrated in the bottom 99% of the income distribution. Although I do not find strong direct evidence of top-incomes-driven inequality changes, I cannot rule these out. At any rate, such changes are not incompatible with the effects I observe in the lower part of the distribution. To complete the examination of state-level income inequality, I examine potential explanations for the rising income inequality observed since the 1990’s.
Previous studies of inequality have suggested a number of factors that might account for the increase in inequality observed in recent decades. The evidence presented so far suggests that rising income inequality has been driven both by rising top incomes and by changes within the bottom 99%. To examine each of these factors, I will perform a “horse-race” type analysis, using fixed effects panel regressions to compare the relative influence of five common explanations for rising income inequality. These five candidate explanations are 1) unionization rates, 2) minimum wages, 3) top marginal tax rates, 4) human capital attainment and 5) technological advancement. The first two factors can be measured directly.25 To capture state variation in top tax rates, I include both the overall marginal capital gains tax rate and the overall marginal income tax rate in subsequent regressions. I capture human capital attainment by the fraction of a state’s population with at least a bachelor’s degree, and technological advancement by the number of patents granted per capita.

I first consider how these five potential factors are related to two scalar measures of inequality (the Gini coefficient and top 1% share) and then move to the use of ordinates of the Lorenz curve as dependent variables. As in Bishop et al. (1991), changes in the slope of the size of the effect of a variable on the Lorenz share with respect to the cumulative proportion $p$ of the population can be interpreted as the marginal effect on the income share (non-cumulative) of the $p$th percentile of the population.26

Other things equal, I expect that unionization rates and the real value of minimum wages will affect primarily the bottom 99% of the distribution, while technological advancement and top marginal tax rates will affect primarily top incomes. Educational attainment may affect both parts of the distribution. Note that there is no ex ante reason to expect that each of these factors might affect income inequality identically for each income concept. In particular, note that the broader income definitions (e.g. post-fiscal income) incorporate the effects of policies which may be designed in response to the various factors’ effects on market income.

25 I measure the minimum wage as the binding statutory minimum wage in a state, deflated by the CPI-U price index.

26 This is because the Lorenz curve is a sum of infinitesimal income shares, and therefore the effect on cumulative income shares can be expressed as a sum of the effects on infinitesimal income shares.

In the current setting, causal identification is difficult, especially given free labor mobility between states. Neither unionization rates nor human capital attainment has been subject to the types of exogenous discrete policy variation necessary for difference-in-difference or regression discontinuity methods, and obviously exogenous instruments are not readily available. To overcome potential simultaneity problems, the best available course of action is to control for unobserved heterogeneity across states and over time to the fullest extent possible via fixed effects, and to compare results across a number of models and specifications. If most of the estimated effects lie in a relatively narrow band, this can be taken as suggestive evidence as to the true effect size.
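The fixed-effects strategy described above can be illustrated with a small least-squares dummy variable regression. This is a sketch on synthetic data, not the estimation code behind the reported results: the panel dimensions, the variable names (`union`, `min_wage`, `ineq`), the coefficient values and the function `two_way_fe` are all hypothetical, and no clustered standard errors or additional controls are included.

```python
import numpy as np

def two_way_fe(y, X, unit, time):
    """Slope estimates from a regression with unit and time fixed effects (LSDV)."""
    units, u_idx = np.unique(unit, return_inverse=True)
    times, t_idx = np.unique(time, return_inverse=True)
    d_unit = np.eye(units.size)[u_idx]        # one dummy per state
    d_time = np.eye(times.size)[t_idx][:, 1:]  # drop one year to avoid collinearity
    Z = np.column_stack([X, d_unit, d_time])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[: X.shape[1]]

# Synthetic state-by-year panel (50 states, 20 years).
rng = np.random.default_rng(0)
n_states, n_years = 50, 20
state = np.repeat(np.arange(n_states), n_years)
year = np.tile(np.arange(n_years), n_states)

state_effect = rng.normal(size=n_states)
year_effect = rng.normal(size=n_years)

# The regressor is correlated with the state effect, so pooled OLS would be biased.
union = state_effect[state] + rng.normal(size=state.size)
min_wage = rng.normal(size=state.size)

true_beta = np.array([-0.3, 0.0])
ineq = (state_effect[state] + year_effect[year]
        + true_beta[0] * union + true_beta[1] * min_wage
        + rng.normal(scale=0.05, size=state.size))

beta_hat = two_way_fe(ineq, np.column_stack([union, min_wage]), state, year)
```

State-specific linear or quadratic trends could be added in the same way, by appending the state dummies multiplied by the year (and the squared year) to the design matrix.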
The baseline horse-race model compares the influence of these five factors in a model including State and year fixed effects:

$$\text{Ineq}_{it} = \alpha_i + \alpha_t + \beta_1 \text{Union}_{it} + \beta_2 \text{MinWage}_{it} + \beta_3 \text{Tax}_{it} + \beta_4 \text{Educ}_{it} + \beta_5 \text{Patents}_{it} + \gamma X_{it} + \varepsilon_{it} \tag{2.3}$$

To modify the above to allow for more flexibility in absorbing time-varying heterogeneity, I can allow for State-specific linear trends and/or quadratic trends, as in:

$$\text{Ineq}_{it} = \alpha_i + \alpha_t + \theta_{1,i} t + \beta_1 \text{Union}_{it} + \beta_2 \text{MinWage}_{it} + \beta_3 \text{Tax}_{it} + \beta_4 \text{Educ}_{it} + \beta_5 \text{Patents}_{it} + \gamma X_{it} + \varepsilon_{it} \tag{2.4}$$

$$\text{Ineq}_{it} = \alpha_i + \alpha_t + \theta_{1,i} t + \theta_{2,i} t^2 + \beta_1 \text{Union}_{it} + \beta_2 \text{MinWage}_{it} + \beta_3 \text{Tax}_{it} + \beta_4 \text{Educ}_{it} + \beta_5 \text{Patents}_{it} + \gamma X_{it} + \varepsilon_{it} \tag{2.5}$$

In each model, $X_{it}$ is a vector of other potential time-varying confounding factors which might be related to inequality (including population density, government spending per capita, demographic characteristics, changes in household size and composition, industry composition, real state personal income per capita, the state unemployment rate and age composition of the state).

Tables 4 and 5 report results from regressions estimated using the Gini coefficient as the dependent variable, for each of the four different income concepts. In line with expectations, the unionization rate has a statistically significant and negative impact on the Gini coefficient for three of the four income concepts, suggesting that unionization reduces income inequality. The exception to this trend is post-fiscal income inequality, with which neither unionization nor any of the other determinants of interest has any statistically significant relationship. Top marginal tax rates also appear to have a substantial impact on changes in income inequality, although interestingly top marginal capital gains rates rather than income tax rates appear to drive this.27 The effects of the other determinants have the expected signs for most income concepts.
The level of the minimum wage reduces inequality while human capital attainment and technological advancement increase inequality, although these estimated effects are not statistically different from zero. Unionization appears to have a larger effect on inequality measured using pre-transfer or post-transfer income concepts, which is consistent with unionization primarily affecting wage income. Top marginal tax rates affect both pre-tax and post-tax income inequality, which suggests that the effect of tax rates works through the elasticity of taxable income rather than the mechanical effect of redistributive progressive taxes.

27 This is interesting given that capital gains are not included in any of the income definitions. Changes in capital gains tax rates may indirectly affect dividend income, however, which is included in the income definitions.

TABLE 4. Determinants of State Income Inequality (Pre-transfer and Post-transfer)

Dependent variable:            Pre-transfer Gini                    Post-transfer Gini
                            (1)         (2)         (3)         (4)         (5)         (6)
Union Coverage            −0.231***   −0.323***   −0.327***   −0.183      −0.320**    −0.338**
                          (0.085)     (0.108)     (0.124)     (0.121)     (0.142)     (0.163)
Minimum Wage              −0.0001     −0.001      −0.002       0.0004     −0.001      −0.001
                          (0.002)     (0.002)     (0.002)     (0.003)     (0.002)     (0.003)
Capital Gains Tax Rate    −0.006***   −0.007**    −0.008*     −0.008***   −0.008**    −0.009
                          (0.002)     (0.003)     (0.004)     (0.002)     (0.004)     (0.006)
% College Educated         0.077       0.130       0.0001      0.163       0.183       0.062
                          (0.144)     (0.169)     (0.178)     (0.188)     (0.234)     (0.247)
Patents per cap.           0.013       0.024       0.015       0.021       0.028       0.013
                          (0.011)     (0.015)     (0.023)     (0.014)     (0.020)     (0.028)
Linear Trends?             No          Yes         Yes         No          Yes         Yes
Quad. Trends?              No          No          Yes         No          No          Yes
Observations               1,000       1,000       1,000       1,000       1,000       1,000

Note: *p<0.1; **p<0.05; ***p<0.01. All models include State and Year fixed effects. Other variables included in regressions but omitted from this table: top marginal income tax rate, real personal income per capita, the mean and standard deviation of years of education, % employed in manufacturing, population density, R&D spending per capita, % black, % Latino, median age, divorce rate, unemployment rate, % over 55, % under 25, % non-citizens, % foreign born, government expenditures per capita.

TABLE 5. Determinants of State Income Inequality (Post-tax and Post-fiscal)

Dependent variable:            Post-tax Gini                        Post-fiscal Gini
                            (1)         (2)         (3)         (4)         (5)         (6)
Union Coverage            −0.160**    −0.220**    −0.195*     −0.115       0.009       0.001
                          (0.080)     (0.097)     (0.113)     (0.128)     (0.145)     (0.173)
Minimum Wage               0.001       0.00002    −0.001      −0.003       0.001      −0.0003
                          (0.002)     (0.002)     (0.002)     (0.002)     (0.003)     (0.004)
Capital Gains Tax Rate    −0.006***   −0.006**    −0.007*     −0.005      −0.003      −0.005
                          (0.002)     (0.003)     (0.004)     (0.003)     (0.006)     (0.007)
% College Educated         0.084       0.082      −0.014      −0.147      −0.056      −0.142
                          (0.130)     (0.157)     (0.168)     (0.198)     (0.271)     (0.300)
Patents per cap.           0.012       0.021       0.016       0.004       0.040*      0.043
                          (0.010)     (0.014)     (0.020)     (0.016)     (0.023)     (0.037)
Linear Trends?             No          Yes         Yes         No          Yes         Yes
Quad. Trends?              No          No          Yes         No          No          Yes
Observations               1,000       1,000       1,000       1,000       1,000       1,000

Note: *p<0.1; **p<0.05; ***p<0.01. All models include State and Year fixed effects. For more details see Table 4.

Tables 6 and 7 report analogous results using the top 1%’s share as a dependent variable. When examining how the five factors of interest are related to state-level income inequality when measured by the top 1%’s share of income, largely similar patterns of estimated effects obtain in terms of signs, although not necessarily significance. Of the five potential factors outlined above, only the effect of tax rates has an individually statistically significant effect on the top 1%’s share (although not for all specifications).

TABLE 6. Determinants of State Top 1% Share (Pre-transfer and Post-transfer)

Dependent variable:            Pre-transfer Top 1% Share            Post-transfer Top 1% Share
                            (1)         (2)         (3)         (4)         (5)         (6)
Union Coverage            −0.106      −0.155      −0.206      −0.109      −0.180      −0.248*
                          (0.080)     (0.110)     (0.134)     (0.099)     (0.123)     (0.150)
Minimum Wage              −0.001      −0.001      −0.001      −0.0002     −0.0002      0.0003
                          (0.002)     (0.002)     (0.003)     (0.002)     (0.003)     (0.003)
Capital Gains Tax Rate    −0.007***   −0.006      −0.007      −0.009***   −0.006      −0.008
                          (0.002)     (0.005)     (0.006)     (0.002)     (0.005)     (0.006)
% College Educated        −0.034       0.085      −0.011      −0.013       0.052      −0.031
                          (0.181)     (0.230)     (0.248)     (0.201)     (0.251)     (0.270)
Patents per cap.           0.010       0.025       0.020       0.012       0.026       0.019
                          (0.013)     (0.017)     (0.030)     (0.014)     (0.018)     (0.032)
Linear Trends?             No          No          No          No          No          No
Quad. Trends?              No          No          No          No          No          No
Observations               1,000       1,000       1,000       1,000       1,000       1,000

Note: *p<0.1; **p<0.05; ***p<0.01. All models include State and Year fixed effects. For more details see Table 4.

TABLE 7. Determinants of State Top 1% Share (Post-tax and Post-fiscal)

Dependent variable:            Post-tax Top 1% Share                Post-fiscal Top 1% Share
                            (1)         (2)         (3)         (4)         (5)         (6)
Union Coverage            −0.067      −0.096      −0.119      −0.144**    −0.141**    −0.153*
                          (0.061)     (0.082)     (0.099)     (0.068)     (0.071)     (0.083)
Minimum Wage               0.0003      0.0003     −0.0003     −0.001      −0.001      −0.001
                          (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)
Capital Gains Tax Rate    −0.005***   −0.004      −0.006      −0.002      −0.001      −0.002
                          (0.002)     (0.004)     (0.004)     (0.001)     (0.001)     (0.002)
% College Educated        −0.041       0.035      −0.042       0.025       0.015      −0.012
                          (0.140)     (0.175)     (0.191)     (0.125)     (0.158)     (0.167)
Patents per cap.           0.007       0.017       0.012      −0.001       0.007      −0.009
                          (0.010)     (0.013)     (0.022)     (0.008)     (0.010)     (0.015)
Linear Trends?             No          Yes         Yes         No          Yes         Yes
Quad. Trends?              No          No          Yes         No          No          Yes
Observations               1,000       1,000       1,000       1,000       1,000       1,000

Note: *p<0.1; **p<0.05; ***p<0.01. All models include State and Year fixed effects. For more details see Table 4.

As noted, although the Generalized Beta II multiple imputation method used to estimate income inequality attempts to address an important weakness in the Current Population Survey, the underlying microdata ultimately represent only a sample of top incomes. As a robustness check for the analysis, I estimate regressions using state-level estimates of the top 1% share from Frank (2009). Table 8 summarizes these results. Notably, the estimated effects of all five major factors are qualitatively similar to the effects reported in Table 6. In addition, perhaps because the Frank (2009) data contain the whole population of top incomes, estimated effects are actually more precise here. In particular, note that at least for some specifications, rates of unionization have a statistically significant effect on top income shares, as does, interestingly, the real minimum wage. Top marginal tax rates exhibit similarly sized effects as in Table 6, although the estimates are less precise than those for the effects of unionization and minimum wages.

TABLE 8. Determinants of State Top 1% Share, Frank (2009) Data

Dependent variable:         (1)         (2)         (3)
Union Coverage            −0.075      −0.172***   −0.041
                          (0.064)     (0.057)     (0.071)
Minimum Wage              −0.005      −0.009***   −0.012***
                          (0.003)     (0.003)     (0.004)
Capital Gains Tax Rate    −0.002*     −0.001      −0.001
                          (0.001)     (0.001)     (0.001)
% College Educated         0.085      −0.046      −0.031
                          (0.133)     (0.102)     (0.113)
Patents per cap.          −0.005       0.005       0.020
                          (0.014)     (0.014)     (0.014)
Linear Trends?             No          Yes         Yes
Quad. Trends?              No          No          Yes
Observations               1,000       1,000       1,000

Note: *p<0.1; **p<0.05; ***p<0.01. All models include State and Year fixed effects. For more details see Table 4.

To further explore how these five factors might affect income inequality at different points along the income distribution, I estimate regressions using the Lorenz ordinates as dependent variables. Doing so allows me to examine which factors might be closely related to within-bottom-99% inequality, as opposed to those which are primarily driving top incomes. Positive coefficients suggest a reduction in income inequality (a positive coefficient implies an increase in the cumulative income going to incomes below a given percentile) and negative coefficients imply an increase in income inequality. As in the previous Lorenz curve inference, examining how estimated effects change across the distribution has implications about the relevance of the factor in question for top income inequality versus bottom 99% income inequality. Suppose that $E(p; X)$ is the “effect curve” for variable $X$, describing the estimated effect of $X$ on the Lorenz curve at percentile $p$. Then if $E(p; X) < 0 \; \forall p$ or $E(p; X) > 0 \; \forall p$, variable $X$ has an effect primarily on top incomes. On the other hand, if there is a slope change in the effect curve, this implies that variable $X$ has an effect primarily on within-99% inequality.

Figure 8 shows the effect of unionization, capital gains tax rates and patents per capita on Lorenz ordinates for the four income concepts. For each concept save post-fiscal income, unionization has its greatest effect around the 80th percentile of the income distribution, suggesting an effect on income inequality within the bottom 99%. Top marginal tax rates have increasing effects across the distribution (at least for pre-tax inequality), suggesting an effect primarily on top income shares.
Similarly, technological advancement, proxied by patents granted per capita, has an increasing and negative effect on Lorenz ordinates. This suggests that technology increases top incomes, although these effects are not statistically significant. It is clear, however, that this is a borderline case: 95% confidence intervals contain only small areas greater than zero.

Of these five potential factors of interest, it can be argued that unionization is the most important because it has the most robust statistical impact on income inequality. However, this may be primarily affecting income inequality within the bottom 99% of the income distribution. Top marginal tax rates are also important, and are probably responsible for changes in top incomes. Interestingly, variations in human capital acquisition across states appear to have little to no effect on state-level changes in income inequality.

FIGURE 8. Determinants of Lorenz Ordinates, Selected Covariates
[Figure: four panels, “Effect of Unionization on Lorenz Ordinates, by Income Concept”, “Effect of Capital Gains Tax Rates on Lorenz Ordinates, by Income Concept”, “Effect of Patents Per Capita on Lorenz Ordinates, by Income Concept” and “Effect of Real Minimum Wage on Lorenz Ordinates, by Income Concept”. Each panel plots the Estimated Effect against the Lorenz Curve Ordinate for the postfiscal, posttax, posttrans and pretrans income concepts.]

Conclusion

This study has generated a new dataset consisting of inequality measures at the State and MSA level using four different income concepts.
The analysis of income inequality measured using these different income concepts in this paper sheds new light on the trends in income inequality within sub-national entities. This analysis also deepens our understanding of which factors may be the driving forces behind rising income inequality. Using semi-parametric multiple imputation and bootstrap methods, I show that income inequality has increased substantially over the last two decades, and note that this change has coincided with both rising top incomes and increasing inequality within the bottom 99% of the distribution.

One important way forward in the study of state-level and MSA-level income inequality is to attempt to bridge the gap between state-level income inequality datasets such as Frank (2009), which use tax return data, and the survey-based approach taken here (using CPS microdata). Given matched survey and administrative data, it may be possible to retain the benefits of both the survey-based approach (more information about household structure and non-taxable income sources) and the tax return data (the full universe of top incomes), while potentially incorporating better administrative data on program participation for in-kind transfers.28 Fully utilizing all available information on incomes is important for deepening our understanding of the dynamics of income inequality.

28 Leveraging administrative records on transfer payments and participation in in-kind programs will require linking these administrative records to Census survey records. This will require access to, and the support of, official statistical agencies.

In addition to the analysis presented in this paper on potential determinants of state-level inequality, there are a number of potential applications for this data. Several papers have already made use of this data, including Voorheis (2016), which examines the effect of income inequality on political polarization, Voorheis et al.
(2015), which examines the effect of income inequality on carbon emissions, as well as Chapter IV of this dissertation, which uses the MSA-level income inequality dataset to examine the connection between income inequality and environmental justice.

Income inequality is an important and much-discussed topic of obvious relevance to both researchers and policymakers. This study has created a novel and hopefully quite useful dataset on state-level and MSA-level inequality that corrects for the censoring in the underlying Census Bureau microdata. This dataset allows for inferences concerning the trend in inequality since the 1990’s, about which there is some controversy in the literature. I find that for most states and MSAs, there has been a statistically significant increase in inequality (either in terms of Lorenz dominance or pairwise comparison of inequality statistics). This increase in inequality is driven by both top incomes (i.e. an increase in the top 1% share) and by increasing disparities in income within the bottom 99%. This inequality increase seems to be driven by a decrease in workers’ bargaining power (via union density). In contrast, human capital attainment and technological advancement do not appear to explain the observed changes in income inequality within the bottom 99%. Insofar as greater inequality is generally considered a poor outcome, one policy implication is that laws strengthening the institutional position of unions may be an effective tool for decreasing income inequality as a complement to the redistributive effects of the tax and transfer system.

CHAPTER III

TRENDS IN ENVIRONMENTAL INEQUALITY IN THE UNITED STATES: EVIDENCE FROM SATELLITE DATA

Introduction

Concerns about air quality and its negative health and ecological impacts are widespread, both in developed countries and, increasingly, in developing countries.
A wide literature, spawned from the environmental justice movement, has documented differences across sub-groups in average exposure to pollutants and toxic chemicals. However, little is known about what these differences imply for the overall inequality in the size distribution of pollution exposure. This paper adds to the stock of knowledge about environmental inequality by proposing a dashboard approach,¹ combining recently developed theoretical tools for measuring environmental inequality with conventional environmental justice measures. I also examine potential explanations for the variation in this environmental inequality over time.

¹ I borrow the “dashboard” terminology from the multi-dimensional poverty literature, e.g. Alkire and Foster (2011).

To examine trends in, and determinants of, environmental inequality, I utilize two datasets of satellite-derived observations of ground-level pollution exposure. These datasets provide fine-grained information about exposure to two important pollutants: nitrogen oxides (NOx) and particulate matter smaller than 2.5 microns (PM2.5). These two pollutants are relevant to human health. In addition to the direct health impact of exposure to NOx and PM2.5, they are both highly correlated with other pollutants (e.g. ozone) and as such can be taken as an index of overall air quality. These satellite-derived datasets have been studied widely in other disciplines (e.g. in the atmospheric science literature); however, they have not yet found their way into the economics literature. These data provide several advantages over conventional tropospheric air quality data derived from ground-level air quality monitors or generated from air quality models using data on emissions from point sources. Using these two sources of data, I describe trends in environmental inequality for the entire United States, as well as for states and major metropolitan areas within the United States.
I consider two different ways of considering the degree to which the distribution of pollution exposure is “unequal.” The first follows from Sheriff and Maguire (2014) and defines “environmental inequality” in terms of commonly used inequality measures adapted from the income distribution literature. The second follows from the environmental justice literature and quantifies “environmental justice” as the difference in exposure between advantaged and disadvantaged subgroups. These two concepts can be viewed as capturing “vertical equity” and “horizontal equity.”²

I examine how traditional environmental justice concerns relate to environmental inequality (rather than simply average environmental quality) by examining how census-tract-level demographic characteristics correlate with different measures of environmental inequality. To accomplish this, I adopt a re-centered influence function (RIF) regression approach. RIF regression is a general estimation strategy for examining how individual characteristics affect a summary statistic of a distribution. RIF regressions have been used mostly in the study of income or wage distributions (e.g. Dube (2013)) and I extend the use of RIF regressions to the study of pollution exposure distributions.³

² To capture horizontal equity it would be best to compare exposure between groups conditional on income levels. Doing so is difficult with the group demographic data available at the Census tract level, however.

³ In Appendix C, I further use an extension of Oaxaca-Blinder style decompositions to examine to what degree observable demographics explain differences in exposure between advantaged and disadvantaged census tracts across the exposure distribution, and to what degree demographic characteristics appear to have contributed to changes in exposure across the distribution of exposure over time.

I obtain several key results.
First, I confirm that the average level of pollution exposure has decreased since 1998, a result consistent with other studies of trends in environmental quality. Second, I examine how the measures of environmental inequality and environmental justice comprising my “dashboard” have changed over time. I find that most measures in the dashboard are declining over time, with heterogeneity across the two pollutants of interest. In general, decreases in exposure inequality for NOx have occurred through decreased exposure at the top of the distribution, while decreases in exposure inequality for PM2.5 have come via changes at the bottom of the distribution.

By considering more than just average pollution exposure in examining the correlation between demographic factors and pollution exposure (using RIF regressions), I can expand upon the usual types of empirical results in the environmental justice literature. I find that the African American proportion of the population of a census tract is positively related to the level of pollution in very polluted tracts (increasing inequality), but the relationship is negative at the median (reducing inequality), and insignificant for less polluted tracts. On net, however, the African American proportion of a census tract increases national environmental inequality.

The rest of the paper proceeds as follows. First I briefly discuss the related literature on environmental justice and inequality. I then describe the data to be used in the analysis, and the process for assigning pollution exposure levels to each census tract. I define the various measures of environmental inequality in the “dashboard”, and investigate trends in exposure inequality, contrasting these with trends in average exposure alone. Finally, I present results from re-centered influence function regressions, and conclude with directions for future work.
Previous Literature

There is a large literature, scattered across disciplines, that has attempted to quantify the degree to which environmental harms might be felt disproportionately by disadvantaged communities. This literature has arisen at least in part as a response to activists from the Environmental Justice movement. In fact, this literature takes its name and often its terminology from these same political advocates, although in practice most environmental justice scholarship is concerned with documenting environmental disparities rather than making normative claims about alternative policies to address these disparities.

One early study by an advocacy group that stimulated the subsequent environmental justice literature is the so-called UCC study (Chavis and Lee (1987)), which noted a correlation between the locations of toxic waste sites and local ethnic minority populations. Subsequent analysis (e.g. Bryant and Mohai (1992)) of toxic waste sites has confirmed that the African-American population share in a neighborhood is sometimes an important correlate of the probability that a toxic waste site will be located in a neighborhood. Other dimensions of disadvantage are also important correlates of toxic waste siting. Despite this consensus, there remains some disagreement as to whether this correlation implies racist siting policies by firms or local authorities. Several studies (e.g. Been and Gupta (1997) and Wolverton (2009)) have persuasively shown that this correlation may merely capture the subsequent hedonic general equilibrium effects: toxic waste sites depress local property values, which leads to an inflow of poor, disadvantaged individuals and an outflow of richer, advantaged individuals. This phenomenon is often termed “coming to the nuisance.” Among others, Mohai et al. (2009) and Brulle and Pellow (2006) ably summarize this vast literature.
The simple proximity of toxic waste sites to disadvantaged communities was the primary early concern for the environmental justice literature, but this proximity itself merely implies exposure without actually measuring it. Measuring the actual disproportionate exposure to airborne and waterborne toxics and pollutants (and the negative health impacts of these exposures) has also been a large concern. However, the measurement of ambient pollution exposure is much more complicated than measuring proximity to the point locations of fixed toxic waste sites. Studies of disparate pollution exposure have utilized air pollution models derived from Toxic Release Inventory (TRI) or National Air Toxic Assessment (NATA) data (e.g. Morello-Frosch and Jesdale (2006), Zwickl and Moser (2015)) or data from the US Environmental Protection Agency’s (EPA) network of air quality monitoring stations. At least one paper, by Clark et al. (2014), has used satellite data on pollution exposure to document spatially disparate exposure levels. This paper, like almost all others in this literature, has considered only a single cross section of data, and thus has not been able to consider how these disparate impacts might be changing over time.

The literature has achieved consensus that in any given cross section at a point in time, disadvantaged communities are exposed to disproportionately high levels of environmental hazards. However, there is not yet a consensus on how best to summarize or quantify the extent of these disparities for use either in policy analysis or for the comparison of trends in the distribution of environmental hazards over time. This confusion about measurement continues, despite the fact that, at least since Executive Order 12898, issued in 1994, US government agencies have been required to take environmental justice concerns into account when enacting or changing regulations.
Several approaches to measuring environmental inequality (and environmental injustice) have been advanced in the literature. The simplest possible measure of environmental justice/injustice is of course just the difference in average exposure between advantaged and disadvantaged populations; Clark et al. (2014) use this approach. To address the possibility that there may be differential exposure across the distribution, Boyce et al. (2016) have proposed comparing quantiles of the race-specific exposure distributions. Another approach, used sporadically by Zwickl et al. (2014), Zwickl and Moser (2015) and Boyce and Voirnovytskyy (2010) among others, has been to import measurement tools from the income distribution literature to describe environmental inequality. This approach has been extended and formalized by Maguire and Sheriff (2011) and Sheriff and Maguire (2014), who adapt several commonly used income inequality measures to produce normatively sensible conclusions. The important difference between income and environmental hazards is that environmental hazards are “bads”. Unmodified income inequality measures calculated using pollution exposure data may therefore produce ethically perverse conclusions about policy: someone with high income is highly advantaged, while someone with high pollution exposure is highly disadvantaged.

In this paper, I advance the literature on environmental inequality and environmental justice along several dimensions. First, I extend the use of satellite data first used by Clark et al. (2014) to study environmental justice beyond a single cross-section. This allows me to describe not just disparities in exposure across groups at a point in time, but also how these disparities have been changing. Second, I propose a dashboard approach to environmental justice analysis by cataloging and categorizing different measures which quantify disparities in pollution exposure.
I apply these measures both to document how environmental inequality has been changing over time, and also to reveal potential correlates of these changes in environmental inequality.

Data and Institutional Details

Any examination of the distribution of pollution exposure across space and across sociodemographic groups requires measurements of ground-level pollutant concentrations at a relatively fine spatial scale. Ideally, I would like to observe the actual exposure of each individual as they go about their day. Such a dataset is infeasible without a universal monitoring regime, a prospect that seems unlikely outside of a George Orwell novel. As a next best alternative, I will estimate pollution concentrations at the level of US census tracts.⁴ I can then estimate measures of inequality and environmental justice using these concentrations, weighted by tract population. These measures of inequality can be interpreted as the inequality across individuals in exposure.

There are two limitations that come with using tract-level average pollution concentrations to estimate pollution exposure inequality or environmental justice measures. First, weighting by tract population essentially assigns the tract-average exposure to each individual residing in the census tract. This is equivalent to assuming no within-tract inequality in pollution exposure. Second, using ground-level concentrations as a measure of exposure ignores potential adaptation behavior on the part of individuals. Adaptation behavior may further be related to sociodemographic characteristics such as income, especially if adaptation requires purchasing expensive equipment. If the likelihood of engaging in adaptation is positively related to, e.g., income, then tract-level average concentrations will over-estimate the exposure of rich individuals, and hence underestimate the gap in exposure between the rich and the poor.
Thus each of these limitations suggests that using tract-level average concentrations to calculate exposure inequality and environmental justice measures will produce a lower bound estimate.

⁴ Due to the limitation in spatial coverage of the satellite data, I am only able to calculate tract-level concentrations for the contiguous United States.

This paper capitalizes on two novel sources of data on pollution exposure derived from remote sensing satellite observations. The first dataset, described in detail by Lamsal et al. (2008), infers ground-level NOx concentrations using a chemical air transport model and observations of the tropospheric vertical column densities of NOx. The second dataset, described in detail by van Donkelaar et al. (2016), infers ground-level PM2.5 concentrations from aerosol optical depth observations from multiple satellite sources.⁵ Each of these datasets is available at relatively fine geographic resolution: the NOx data are on a (0.1 × 0.1)-degree grid, while the PM2.5 data are on a (0.01 × 0.01)-degree grid (this corresponds to roughly 10 km and 1 km at the equator, respectively). Both datasets provide observations for most of the globe, spanning approximately 70 degrees S through 70 degrees N. These datasets provide substantially more comprehensive spatial coverage than measurements of ground-level exposure from unevenly distributed fixed monitors. By way of comparison, note that there are about 500 NOx monitors in the EPA’s monitoring network, while there are 100,000 uniformly distributed fixed geographic grid points within the contiguous United States in the NOx satellite data.

To infer person-level exposure to NOx and PM2.5 respectively, it is necessary to link information on the detailed spatial variation in remotely sensed ground-level pollutant concentrations with information about where people are located.
In the absence of any more fine-grained information about the spatial distribution of households, I assume that each person in a census tract is exposed to the average ground-level pollutant concentration at the population-weighted centroid of her census tract. For each satellite data source in each year, I interpolate over the fixed grid to the centroid of each census tract using inverse distance weighting (IDW). I use all gridpoints within a 10 km radius of the centroid of a census tract in calculating the IDW estimate of tract-level exposure.⁶ Using this IDW interpolation, I obtain annual average concentrations for PM2.5 for each year from 1998-2014, and for NOx from 2005-2011.

⁵ These datasets are provided for public use by the Atmospheric Composition Analysis Group at Dalhousie University, and can be accessed at http://fizz.phys.dal.ca/~atmos/martin/

The spatial distribution of person-level exposure to PM2.5 and NOx can be visualized using a choropleth map, as in Figure 9, which visualizes the data on exposure to PM2.5 and NOx for 2005 for the contiguous United States. At this point in time, it is clear that pollution exposure is concentrated in urban areas, although elevated ambient exposure levels are also present in large parts of the rural eastern US. These higher exposures may be in part due to this region’s relatively heavy reliance on coal-fired power plants for electricity generation. Coal-fired power plants are a major source of emissions of chemical precursors to NOx and PM2.5.

On average, pollution exposure has been declining over the sample periods for the two satellite-derived pollution exposure datasets. Figure 10 shows the population-weighted annual average exposure to PM2.5 and NOx for the contiguous US. The PM2.5 data span the period 1998–2014, while the NOx data span 2005–2011. NOx and PM2.5 exposure have both decreased markedly over these periods.
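The inverse distance weighting step described in this section can be sketched as follows. This is a minimal illustration, not the code used to build the dataset. The function names, the haversine distance, and the distance exponent `power=2.0` are all assumptions, since the text specifies only the 10 km radius.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = (np.radians(v) for v in (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def idw_at_centroid(grid_lon, grid_lat, grid_value, lon0, lat0,
                    radius_km=10.0, power=2.0):
    """IDW estimate at a tract centroid (lon0, lat0).

    Uses every gridpoint within radius_km of the centroid. The exponent
    `power` is a hypothetical choice; the source does not report one.
    Returns NaN when no gridpoint falls inside the radius.
    """
    grid_lon = np.asarray(grid_lon, dtype=float)
    grid_lat = np.asarray(grid_lat, dtype=float)
    grid_value = np.asarray(grid_value, dtype=float)

    dist = haversine_km(grid_lon, grid_lat, lon0, lat0)
    inside = dist <= radius_km
    if not inside.any():
        return float("nan")

    dist, value = dist[inside], grid_value[inside]
    on_point = dist < 1e-9  # centroid coincides with a gridpoint
    if on_point.any():
        return float(value[on_point].mean())

    weight = 1.0 / dist ** power
    return float(np.sum(weight * value) / np.sum(weight))
```

With four gridpoints equidistant from the centroid, the estimate reduces to their simple average, which is a convenient sanity check on any implementation of this step.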
Both datasets suggest that decreases in exposure to pollution were most pronounced in the period before 2008. After 2008, average exposure to both NOx and PM2.5 continued to decline, but at a slower rate than in the previous period.

⁶ Because of the differing resolution of the gridded data, this translates to using the 4 nearest gridpoints for the NOx data, and the 100 nearest gridpoints for the PM2.5 data.

FIGURE 9. Annual Average PM2.5 and NOx Exposure, 2005

FIGURE 10. National Average NOx Exposure (in ppb), 2005-2011 [Panels: National Average NOx Exposure; National Average PM2.5 Exposure. Horizontal axis: year.]

Quantifying Environmental Inequality and Environmental Justice

Using two independent satellite-derived datasets, it is clear that average pollution exposure has decreased markedly since the 1990s. There are a number of ways, however, in which this reduction in average pollution exposure could have occurred. The uneven distribution of these pollution exposure reductions is of particular interest. Any effort to explore the changes in the distribution of pollution exposure that have accompanied the average improvements in environmental quality requires some method for quantifying pollution exposure inequality, and normative tools to rank distributions according to some social evaluation function. Rather than seeking to identify a single measure which summarizes the distribution of pollution exposure across and within different groups, and its relation to other dimensions of disadvantage, I propose a “dashboard” consisting of several different measures which together can be used for distributional policy evaluation. Considering the whole dashboard provides a transparent view of how the distribution of pollution exposure is evolving that is resistant to cherry-picking individual measures to provide ex-post justification for preferred policy.
This dashboard considers two ways of thinking about the distribution of pollution exposure. The first, which I term “environmental inequality”, can be viewed as a vertical equity concept. Environmental inequality measures summarize the size distribution of pollution exposure while preserving anonymity of the exposed: the identities of the exposed individuals do not matter, only their exposure levels. The second type, which I term “environmental justice”, can be viewed as a horizontal equity concept, and summarizes the distribution of pollution exposure across demographic categories.

The measurement of environmental inequality can adapt some of the methods used in the literature on the measurement of income inequality. However, it is not immediately obvious that one can simply calculate income inequality measures using pollutant concentration and obtain ethically sensible measures of environmental inequality. Pollution exposure, unlike income, is a “bad”, which means that “advantage” in pollution exposure runs in the opposite direction from “advantage” in income. Thus the bottom 10% of the pollution exposure distribution is actually the most advantaged segment of society along an environmental dimension, whereas the bottom 10% of the income distribution is the most disadvantaged segment of society along the income dimension. One apparently straightforward way to transform the distribution of a “bad” into a good, suggested by Sheriff and Maguire (2014), is to “reverse the sign”, that is, to use the negative values of pollutant concentrations when calculating measures of environmental inequality. This has the appeal of imposing the intuitively “correct” (or ethically sensible) ordering. Individuals experiencing the largest absolute value of pollutant concentration are now at the bottom of the distribution, and those who experience the lowest absolute value of pollution exposure are at the top.
The environmental inequality components of the “dashboard” consist of three variants of the Lorenz curve, and two scalar environmental inequality measures (the Atkinson index and the Kolm-Pollak index). First, I consider the relative Lorenz curve

$$L(p) = \frac{1}{\mu} \int_{-\infty}^{F^{-1}(p)} x f(x) \, dx$$

Each ordinate $L(p)$ can be estimated with discrete data by the Kovacevic and Binder (1997) estimator:

$$\hat{L}(p) = \frac{1}{\hat{N} \hat{\mu}} \sum_i w_i y_i I\left( y_i \le F^{-1}(p) \right)$$

where the $w_i$ are weights, $\hat{N} = \sum_i w_i$, and $\hat{\mu}$ is the weighted mean of the outcome. Pollution concentrations are expressed as negative quantities, so the Lorenz curve will lie above the line of perfect equality, contrary to the “usual” case of the income-based Lorenz curve. For any two distributions 1 and 2, Lorenz dominance of distribution 2 over distribution 1 requires

$$L_2(p) \le L_1(p) \quad \forall p \in [0, 1]$$

For any two equal-mean distributions, relative Lorenz dominance has normative content: a relative Lorenz dominating distribution would be preferred by every concave social welfare function, as shown by Maguire and Sheriff (2011). Even for distributions with unequal means, the relative Lorenz curve can still provide useful information about the fairness of the distributions, although it no longer generates a unanimous ordering of distributions by all concave social welfare functions.

Second, I consider the Generalized Lorenz curve, which is just the relative Lorenz curve scaled by the mean of the outcome distribution (in this case, mean pollution exposure):

$$GL(p) = \mu L(p) = \int_{-\infty}^{F^{-1}(p)} x f(x) \, dx$$

Generalized Lorenz ordinates can be estimated by a modification of the Kovacevic and Binder (1997) estimator:

$$\widehat{GL}(p) = \hat{\mu} \hat{L}(p) = \frac{1}{\hat{N}} \sum_i w_i y_i I\left( y_i \le F^{-1}(p) \right)$$

The Generalized Lorenz dominance condition for pollution is the same as the income generalized Lorenz dominance criterion from Shorrocks (1983):

$$GL_2(p) \ge GL_1(p) \quad \forall p \in [0, 1]$$

Generalized Lorenz dominance takes into account the outcome distribution as well as the average level of the outcome.
The normative content of generalized Lorenz dominance is more general than the relative Lorenz case: any concave social welfare function will prefer the Generalized Lorenz dominant distribution, regardless of the means of either of the distributions.⁷

⁷ The transformation of the pollution exposure data ensures that this result, proved by Shorrocks (1983) for income, will hold for pollution exposure.

I also consider the absolute Lorenz curve proposed by Moyes (1987), which captures the absolute cumulative gap in exposure between the overall average exposure $\mu$ and the average exposure of the bottom $p$ percent of the population. The absolute Lorenz curve can be expressed as

$$AL(p) = \int_{-\infty}^{F^{-1}(p)} (x - \mu) f(x) \, dx$$

However, note that the absolute Lorenz curve can be expressed in terms of either the relative Lorenz or generalized Lorenz curves:

$$AL(p) = \mu \left( L(p) - p \right)$$

or

$$AL(p) = GL(p) - \mu p$$

Hence it is possible to modify the Kovacevic and Binder (1997) estimator for the absolute Lorenz case:

$$\widehat{AL}(p) = \frac{1}{\hat{N}} \sum_i w_i y_i I\left( y_i \le F^{-1}(p) \right) - \hat{\mu} p$$

where $\hat{\mu}$ is the mean exposure, weighted by tract population. Absolute Lorenz dominance requires

$$AL_2(p) \ge AL_1(p) \quad \forall p \in [0, 1]$$

Absolute and relative Lorenz curves measure the two “unequal inequalities” concepts of Kolm (1976). The absolute Lorenz curve captures an absolute environmental inequality concept summarized by its translation invariance property: if every individual’s pollution exposure increases by a constant amount, the absolute Lorenz ordinates are unchanged. In contrast, the relative Lorenz curve captures a relative inequality concept summarized by its scale invariance property: if every individual’s pollution exposure increased by a constant proportion, the relative Lorenz ordinates would be unchanged. None of these variants of the Lorenz curve guarantees a complete ordering of distributions. If any pair of these Lorenz curves cross, then no normative conclusion is possible.
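As a rough illustration of the three Lorenz curve estimators and their dominance conditions, the sketch below computes weighted ordinates from sign-reversed concentrations. The function names, the dictionary keys, the plug-in weighted quantile used for $F^{-1}(p)$, and the numerical tolerance are assumptions made for this example, not the implementation used in the chapter.

```python
import numpy as np


def lorenz_curves(concentration, weights, p_grid):
    """Relative, generalized and absolute Lorenz ordinates at each p.

    Concentrations are sign-reversed (y = -x) so that the most exposed
    tracts sit at the bottom of the distribution. F^{-1}(p) is replaced
    by a simple weighted empirical quantile (an assumption).
    """
    y = -np.asarray(concentration, dtype=float)
    w = np.asarray(weights, dtype=float)
    p_grid = np.asarray(p_grid, dtype=float)

    order = np.argsort(y)
    y, w = y[order], w[order]
    cum_w = np.cumsum(w)
    n_hat = cum_w[-1]                  # N-hat: sum of weights
    mu_hat = np.sum(w * y) / n_hat     # weighted mean of the outcome
    cum_share = cum_w / n_hat

    gl = np.empty_like(p_grid)
    for k, p in enumerate(p_grid):
        idx = min(int(np.searchsorted(cum_share, p, side="left")), len(y) - 1)
        below = y <= y[idx]            # indicator I(y_i <= F^{-1}(p))
        gl[k] = np.sum(w[below] * y[below]) / n_hat

    return {"relative": gl / mu_hat,
            "generalized": gl,
            "absolute": gl - mu_hat * p_grid}


def dominates(curves_2, curves_1, kind, tol=1e-12):
    """True if distribution 2 dominates distribution 1 for the given curve."""
    c2, c1 = curves_2[kind], curves_1[kind]
    if kind == "relative":
        return bool(np.all(c2 <= c1 + tol))   # L2(p) <= L1(p) for all p
    return bool(np.all(c2 >= c1 - tol))       # GL2 >= GL1 or AL2 >= AL1
```

For example, with equal weights the concentrations (2, 2.4, 2.6, 3) dominate (1, 2, 3, 4) under all three criteria on the grid p = 0.25, 0.5, 0.75, 1: both have mean 2.5, but the first is less dispersed. Because the indicator form includes every tied observation, distributions with many ties call for a finer treatment of the quantile than this sketch provides.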
In income distribution studies, scalar indices (e.g. the Gini coefficient) are often used to induce a complete ordering of distributions. However, it is not clear that many of the most common inequality measures used in the income distribution literature are directly applicable to a case where the distribution concerns a bad (or a transformation of a bad, in this case). Sheriff and Maguire (2014) show that a transformation of the Atkinson index induces a complete and sensible ordering of pollution exposure distributions. The transformed Atkinson index is

$$I_A = \left( \frac{1}{N} \sum_i \left( \frac{x_i}{\bar{x}} \right)^{1+\alpha} \right)^{\frac{1}{1+\alpha}} - 1$$

where $\alpha$ is an inequality aversion parameter. I modify this formula so that I can weight by tract populations $w_i$:

$$\hat{I}_A = \left( \frac{1}{\hat{N}} \sum_i w_i \left( \frac{x_i}{\bar{x}} \right)^{1+\alpha} \right)^{\frac{1}{1+\alpha}} - 1$$

where $\hat{N} = \sum_i w_i$, and $\bar{x}$ is the weighted mean of the outcome distribution. The Atkinson index has a normative interpretation. It is derived from a specific welfare function, so it directly ranks distributions (higher Atkinson indices are less preferred). Additionally, the transformed Atkinson index has a cardinal interpretation. The Atkinson index of environmental inequality can be interpreted as the percent increase in the average pollution exposure necessary to maintain a constant level of welfare, if that higher average pollution exposure level were to be distributed equally. The Atkinson index, like the relative Lorenz curve, embeds a relative inequality concept that corresponds to scale invariance: it aggregates ratios rather than gaps.

Sheriff and Maguire (2014) show that the Kolm-Pollak index is a suitable absolute inequality measure. The Kolm-Pollak index, like the absolute Lorenz curve, embeds an absolute inequality concept corresponding to translation invariance. The Kolm-Pollak index is defined as

$$KI(x) = -\frac{1}{\kappa} \ln \left[ \frac{1}{N} \sum_{i=1}^{N} e^{-\kappa (x_i - \mu_x)} \right], \quad \kappa < 0 \tag{3.1}$$

where $\kappa$ can be interpreted as an environmental inequality aversion parameter for the associated social welfare function.
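The two scalar indices can be sketched directly from the formulas above. This is an illustrative implementation under stated assumptions: the function names are hypothetical, the population-weighted version of the Kolm-Pollak index is an assumed analogue of the weighted Atkinson formula (equation 3.1 is written without weights), and the parameter values in the usage note are arbitrary.

```python
import numpy as np


def atkinson_index(x, w, alpha):
    """Weighted transformed Atkinson index for a 'bad' such as pollution.

    Computes ((1/N) * sum_i w_i * (x_i / x_bar)^(1+alpha))^(1/(1+alpha)) - 1,
    with N the sum of the weights and x_bar the weighted mean.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    x_bar = np.average(x, weights=w)
    mean_power = np.average((x / x_bar) ** (1.0 + alpha), weights=w)
    return float(mean_power ** (1.0 / (1.0 + alpha)) - 1.0)


def kolm_pollak_index(x, w, kappa):
    """Kolm-Pollak index with kappa < 0.

    Replaces the simple average in equation (3.1) with a weighted
    average; that weighting is an assumption of this sketch.
    """
    if kappa >= 0:
        raise ValueError("kappa must be negative")
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    mu = np.average(x, weights=w)
    mean_exp = np.average(np.exp(-kappa * (x - mu)), weights=w)
    return float(-np.log(mean_exp) / kappa)
```

Calling `atkinson_index(x, w, alpha=1.0)` on `x` and on `3 * x` returns the same value (scale invariance), while `kolm_pollak_index(x, w, kappa=-1.0)` is unchanged when a constant is added to every element of `x` (translation invariance). Both indices are zero when exposure is identical across tracts.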
In addition to these five vertical equity (environmental inequality) measures, I include two measures of horizontal equity (environmental justice) in the “dashboard.” These measures capture differences in exposure between demographic subgroups. I follow the environmental justice literature and focus on differences in exposure by race and ethnicity, specifically on the difference in exposure between non-Latino whites and African-Americans. The first of these horizontal equity measures is a simple comparison of averages, calculating the difference in average exposure between subgroups. The second compares the distributions across subgroups by calculating the differences in exposure at percentiles of the race-specific exposure distribution (i.e. comparing the 10th percentile of the African-American exposure distribution to the 10th percentile of the white exposure distribution).

Trends in Environmental Inequality and Environmental Justice

Having defined this dashboard of measures to capture environmental inequality and environmental justice, it is possible to examine how environmental inequality has changed over time, utilizing the new satellite data on PM2.5 and NOx exposure that have not been widely studied by environmental economists. This use of ambient air pollution as an environmental disamenity in the study of environmental justice is in contrast to some earlier literature focusing on proximity to toxic sites. Concerns about discriminatory siting are less important when studying the distribution of pollution exposure.⁸ As with the calculation of average exposure levels, it is necessary to match ground-level pollutant concentrations with census tract population data to calculate the various environmental inequality and environmental justice measures.
To fill in the gaps between the 2000 Census and the American Community Survey, I linearly interpolate between decennial censuses to estimate each census tract’s population and racial demographics from 2001-2004, and use weighted averages of overlapping ACS 5-year file estimates for the period 2005-2014. I calculate each of the components of the dashboard for the contiguous US using the tract-level pollution exposure and population data.

⁸ Note however that there are steep gradients in exposure near freeways, as shown by Currie and Walker (2011), which combined with the routing of freeways through minority or poor neighborhoods could be seen as an analogue to discriminatory siting of toxic facilities.

Trends in the scalar environmental inequality measures are summarized in Figure 11 for the Kolm-Pollak index and Figure 12 for the Atkinson index. Absolute environmental inequality has generally seen a downward trend for both types of pollutants over the relevant sample, although decreases in the Kolm-Pollak index appear to have slowed in the period after 2007 or so. Relative environmental inequality, on the other hand, has a less clear overall trend. NOx exposure is consistently more unequally distributed than PM2.5 according to the Atkinson index, but both pollutants seem to behave in a roughly stationary way over time.

FIGURE 11. Kolm-Pollak Index, PM2.5 and NOx [Panels: National PM2.5 Exposure Inequality (Kolm-Pollak Index); Absolute NOx Exposure Inequality (Kolm-Pollak Index). Horizontal axis: year.]

Examining the three variants of the Lorenz curve can shed some light on which parts of the pollution exposure distribution may be driving the scalar inequality trends seen for the Kolm-Pollak and Atkinson indexes. Select relative Lorenz curves are shown for the period 2005-2011 for NOx and for PM2.5 from 1998-2014 in Figure 13.
Relative Lorenz curves provide ambiguous inference about environmental inequality, since the relative Lorenz curves for, e.g., 2005 and 2011 cross at least once. Note that, unlike the relative Lorenz curves for NOx, the largest changes in the PM2.5 relative Lorenz curve seem to be concentrated in the more advantaged (less exposed) part of the exposure distribution.

FIGURE 12. Atkinson Index, PM2.5 and NOx [Panels: National PM2.5 Exposure Inequality (Atkinson Index); Relative NOx Exposure Inequality (Atkinson Index). Horizontal axis: year.]

On the other hand, the point estimates of the absolute and generalized Lorenz curves are less ambiguous. Figure 14 visualizes the absolute Lorenz curves calculated for NOx from 2005-2011, and for PM2.5 from 1998-2014. Absolute Lorenz curves for NOx exposure are higher in 2011 than in 2005 throughout the exposure distribution, and absolute Lorenz curves for PM2.5 are similarly higher in 2014 than in 1998. Similar results hold for generalized Lorenz curves, summarized in Figure 15. These results are suggestive of absolute and generalized Lorenz dominance for both NOx and PM2.5.

FIGURE 13. Relative Lorenz Curves, NOx (2005-2011) and PM2.5 (1998-2014) [Panels: Lorenz Curves, 2005-2011, NOx; Lorenz Curves, 1998-2014, PM2.5. Horizontal axis: Cumulative Proportion of People. Vertical axis: Cumulative Proportion of Exposure.]

FIGURE 14. Absolute Lorenz Curves, NOx (2005-2011) and PM2.5 (1998-2014) [Panels: Absolute Lorenz Curves, 2005-2011, NOx; Absolute Lorenz Curves, 1998-2014, PM2.5. Horizontal axis: Cumulative Proportion of People. Vertical axis: Cumulative Exposure Gap.]

FIGURE 15. Generalized Lorenz Curves, NOx, 2005-2011 [Panels: Generalized Lorenz Curves, 2005-2011, NOx; Generalized Lorenz Curves, 1998-2014, PM2.5. Horizontal axis: Cumulative Proportion of People. Vertical axis: Cumulative Average Exposure.]

The trends in the horizontal equity (environmental justice) components of the dashboard show similar evidence of increasing equality over time. Figure 16 shows the trends in the gap between the average exposure of African-Americans and whites for PM2.5 and NOx. For both pollutants, this gap has shrunk considerably over the duration of the sample, falling by a factor of almost two for PM2.5. Note that since the two pollutants are measured in different units, the two black-white gaps are not directly comparable. Black-white ratios are unit-free and hence directly comparable, however, and are shown in Figure 17. Note also that the time periods of the two datasets differ, with no NOx observations before 2005. In the overlapping period of time (2005-2011), the black-white gap for NOx has a clear downward trend while the trend for PM2.5 is less clear.

There are, of course, a number of ways in which the gap in exposure between two subgroups may have decreased, as is the case for both NOx and PM2.5 exposure from the beginning to the end of the relevant time periods.
In the case of NOx, the reduction in the gap in exposure between blacks and whites has occurred because average black exposure declined faster than average white exposure. Average black exposure to NOx declined from 2.89 ppb in 2005 to 1.71 ppb in 2011 (a decline of 1.17 ppb), while average white exposure declined from 2.27 ppb in 2005 to 1.35 ppb in 2011 (a decline of only 0.92 ppb). A similar trend is evident in the decline in the black-white gap in PM2.5 exposure. Average black exposure declined from 20.73 µg/m3 in 1998 to 9.75 µg/m3 in 2014 (a decline of 10.97 µg/m3), while average white exposure to PM2.5 declined from 18.08 µg/m3 in 1998 to 8.83 µg/m3 in 2014 (a decline of 9.25 µg/m3). Thus the decline in the black-white gap for both pollutants represents not just absolute but also relative improvements for the disadvantaged group.

FIGURE 16. National Black-White Exposure Gap, PM2.5, 1998-2014 and NOx, 2005-2011
[Figure: two panels plotting the black-white gap in average exposure against year. Panel titles: National Average PM2.5 Black-White Gap; National Average Black-White NOx Exposure Gap.]

One way to examine how this reduction in the average difference in exposure across racial groups has played out across the exposure distribution is to examine how the gap in exposure between blacks and whites has evolved at specific percentiles of the exposure distribution. Figure 18 shows the black-white percentile gap curves for PM2.5 from 1998-2014 and for NOx from 2005-2011. For PM2.5, a notable feature of the percentile gap curves in each year is that the gap in exposure is actually larger in the less exposed part of the exposure distribution. This may reflect in part the differing rural/urban population distributions across the two groups. The percentile gap curve has flattened over the period 1998-2014, largely due to the decline in the gap at low levels of exposure.
Environmental justice with respect to NOx appears to have evolved in roughly the opposite manner, however. The black-white exposure gap is much larger at the highly exposed end of the distribution, and the flattening of the percentile gap curve has largely occurred through reductions in the gap at the upper percentiles.

FIGURE 17. National Black-White Exposure Ratio, PM2.5, 1998-2014 and NOx, 2005-2011
[Figure: two panels plotting the black-white exposure ratio against year. Panel titles: National Average Black-White PM2.5 Exposure Ratio; National Average Black-White NOx Exposure Ratio.]

The evidence on whether the distribution of pollution exposure has become more unequal is not consistent across the measures in the dashboard. One way of synthesizing the evidence presented thus far is to examine the common features of the measures which show relatively unambiguous evidence of decreasing inequality, in contrast to those measures for which such evidence is ambiguous. The Kolm-Pollak index, the absolute Lorenz curve and the gap in average exposure between blacks and whites all point towards declining exposure inequality, while the Atkinson index, relative Lorenz curve and black-white exposure ratio all show somewhat ambiguous trends over time.9 Each of the former group of measures aggregates, in one way or another, differences in absolute exposure, while the latter group aggregates relative exposure. Thus, the trends in inequality in these two groups of measures in the dashboard are consistent with more or less proportional declines in pollution exposure across the pollution exposure distribution. This, in turn, implies greater absolute improvements (declines in exposure) for disadvantaged populations.

9 The generalized Lorenz curve also shows declining inequality, although this is driven by declines in overall average exposure.

FIGURE 18. National Black-White Exposure Gap, by Percentile (PM2.5 and NOx)
[Figure: two sets of small-multiple panels, one per year, plotting the black-white exposure gap against percentile. Panel titles: Black/White PM2.5 Exposure Gap, by Percentile (1998-2014); Black/White NOx Exposure Gap, by Percentile (2005-2011).]

In general, the above patterns point toward the conclusion that average exposure to NOx and PM2.5 has decreased, and so has the inequality in this exposure. These results are less clear for the relative measures of environmental inequality than for measures capturing absolute environmental inequality. The two pollutants have very different cross-sectional distributions, and their exposure distributions have evolved in different ways. Nonetheless, the most robust finding is that not only has exposure to these two pollutants been decreasing over time, it is also likely becoming more equally distributed.

Explaining the Distribution of Pollution Exposure

The trends in the distribution of pollution exposure at the national level point to both a decrease in average pollution exposure and a decrease in pollution exposure inequality. There are a number of potential reasons for this trend. In the context of the environmental justice literature, it is perhaps most interesting to consider how these changes may be related to the underlying demographics of the relevant geographic units (here, census tracts).
I will examine the relationship between tract-level demographics and the distribution of pollution exposure by utilizing re-centered influence function (RIF) regressions to describe the variation in a functional of the national pollution exposure distribution as a function of tract-level demographics.10

10 Additionally, RIF regressions can be used to perform a decomposition analysis of the change in exposure over time, and of the difference in exposure between diverse and non-diverse census tracts, as in Appendix C.

Re-centered Influence Function Regressions

The RIF regression method is a way to estimate how variation in individual characteristics might affect a functional of the entire distribution. This method has been used most extensively in the income distribution literature, where the distributional functional is often the quantile function. In the quantile case, the RIF regression can be interpreted as an unconditional quantile regression (Firpo et al. (2009)). Essentially any functional of the distribution in question can be used in an RIF regression. Essama-Nssah and Lambert (2012) catalog the RIFs for most commonly used distributional functionals, including the Lorenz and Generalized Lorenz curve ordinates.11 I use RIF regressions to attempt to explain how census tract demographics may be related to the observed trends in environmental inequality noted above. Re-centered influence function regressions have become an increasingly popular way of estimating distributional effects, both in the conventional wage or income distribution setting (e.g. Zhu (2016), Essama-Nssah and Lambert (2016), Dube (2013)) and in the study of health inequality (e.g. Heckley et al. (2016), Gaskin et al. (2015)), but have not been used in the study of environmental inequality.

11 Essama-Nssah and Lambert (2012) also provide an expression for the standard Atkinson index, but not for the transformed Atkinson index from Sheriff and Maguire (2014).
To review, the influence function is an analytic device commonly used in robust statistics. For the distribution $F$ of an outcome variable $y$ and a distributional functional $\nu$, the influence function $IF(y, \nu)$ describes how each individual's observed $y$ affects $\nu(F(y))$. Mathematically, the influence function is the directional derivative of the functional from the observed distribution towards a distribution with all probability weight at the observed outcome $y$. The re-centered influence function adds the functional back into the influence function:

\[ RIF(y, \nu) = IF(y, \nu) + \nu(F(y)). \]

Assuming that the conditional expectation of the RIF is a linear function of some observable demographic characteristics $X$, so that

\[ E\left(RIF(y, \nu) \mid X\right) = \beta X, \]

the parameters $\beta$ can be estimated via OLS.

Estimating these RIF regressions can be seen as an extension of a common technique used in the environmental justice literature. In this technique (used, e.g., by Morello-Frosch et al. (2002) and Clark et al. (2014)), some measure of environmental hazard in a neighborhood or census tract (e.g. the number of toxic sites or air pollutant concentrations) is regressed on neighborhood characteristics. Note that the RIF for the mean is

\[ RIF(y, \mu) = y, \]

so that the usual linear regression $E(y \mid X) = \beta X$ is a special case of the RIF regression. Thus the RIF regression method I propose in fact nests the commonly used regressions in the environmental justice literature, while allowing for functionals other than the mean as outcomes. I will consider four such functionals: the quantile function, and ordinates of each of the three Lorenz curve variants (the relative, generalized and absolute Lorenz curves). Following Essama-Nssah and Lambert (2012), the RIF for the $p$th quantile is

\[ RIF(y, Q(p)) = \begin{cases} \hat{Q}(p) + \dfrac{p}{f(\hat{Q}(p))} & y > \hat{Q}(p) \\[1ex] \hat{Q}(p) - \dfrac{1-p}{f(\hat{Q}(p))} & y < \hat{Q}(p) \end{cases} \]

where $\hat{Q}(p)$ is the empirical quantile at $p$ and $f$ is the density of $y$.
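To make the quantile case concrete, the following Python sketch evaluates the quantile RIF for each observation. It is only an illustration under stated assumptions: the function name quantile_rif is hypothetical, the empirical quantile q_hat and a density estimate f_hat at that quantile are supplied by the caller (in practice the density would typically come from a kernel estimator), and observations exactly equal to the quantile are assigned to the lower branch, a case the expression above leaves open.

```python
from typing import List, Sequence


def quantile_rif(
    y: Sequence[float],
    p: float,
    q_hat: float,
    f_hat: float,
) -> List[float]:
    """Re-centered influence function of the p-th quantile.

    q_hat is the empirical p-th quantile and f_hat is an estimate of the
    density of y at q_hat; both are assumed to be computed elsewhere.
    """
    upper = q_hat + p / f_hat          # value for observations above q_hat
    lower = q_hat - (1.0 - p) / f_hat  # value for observations at or below q_hat
    return [upper if obs > q_hat else lower for obs in y]


if __name__ == "__main__":
    # Made-up exposures with median 2.5 and an assumed density of 0.25 there.
    values = quantile_rif([1.0, 2.0, 3.0, 4.0], p=0.5, q_hat=2.5, f_hat=0.25)
    print(values)
    print(sum(values) / len(values))  # the RIF averages back to the quantile here
```

The resulting values would then serve as the dependent variable in an ordinary least squares regression on tract characteristics. When a share p of observations lies at or below q_hat, as in this example, the RIF values average back to q_hat, which is what re-centering is meant to achieve.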
The RIF for the $p$th Lorenz curve ordinate is

\[ RIF(y, L(p)) = \begin{cases} \dfrac{y - (1-p)\hat{Q}(p)}{\mu_y} - \hat{L}(p)\dfrac{y}{\mu_y} & y < \hat{Q}(p) \\[1ex] \dfrac{p\hat{Q}(p)}{\mu_y} - \hat{L}(p)\dfrac{y}{\mu_y} & y \geq \hat{Q}(p) \end{cases} \]

where $\hat{L}(p)$ is the $p$th empirical Lorenz ordinate. Similarly, the RIF for the $p$th Generalized Lorenz ordinate is

\[ RIF(y, GL(p)) = \begin{cases} y - (1-p)\hat{Q}(p) & y < \hat{Q}(p) \\ p\hat{Q}(p) & y \geq \hat{Q}(p) \end{cases} \]

Finally, absolute inequality in pollution exposure, and not just the relative inequality captured by the relative Lorenz and Generalized Lorenz curves, is of interest. A suitable way of capturing this is the absolute Lorenz curve due to Moyes (1987). The Absolute Lorenz curve is defined as

\[ AL(p) = GL(p) - p\mu. \]

Because the re-centered influence function is a linear operator, I can use the previous expressions (plus the fact that $RIF(y, \mu) = y$) to define a re-centered influence function for the absolute Lorenz curve:

\[ RIF(y, AL(p)) = \begin{cases} y - (1-p)\hat{Q}(p) - py & y < \hat{Q}(p) \\ p\hat{Q}(p) - py & y \geq \hat{Q}(p) \end{cases} \]

Or, more compactly,

\[ RIF(y, AL(p)) = RIF(y, GL(p)) - py. \]

Firpo et al. (2009) show that in the case of the quantile function, the estimated coefficients of an RIF regression can be interpreted as "unconditional partial effects", namely the effect of a small location shift in the distribution of $X$ on the functional $\nu(F(y))$. This logic directly extends to the Lorenz curve cases. The unconditional partial effects estimated from an RIF regression allow for the examination of how demographic characteristics of census tracts are related to the distributional statistics of interest, but they do not directly speak to how demographics might be driving the observed trends in environmental inequality above.12

12 However, the RIF regression method can be used to perform a decomposition analysis that can address this concern. This decomposition, performed in Appendix C, allows for an examination of the extent to which demographics have shaped the change in environmental inequality over time.

RIF Results

I first report the results of RIF regressions estimated using pooled data from 2005 through 2011. I supplement the data on tract-level NOx and PM2.5 exposure with tract-level demographic data from the ACS 5-year summary files. The ACS provides data on racial and ethnic composition, median income, poverty status, age distribution, employment and educational status at the census tract level. However, these estimates are only released as 5-year average summary files (2005-2009, etc.). To back out yearly estimates of the sociodemographic variables, I take a weighted average of all the 5-year file estimates that contain a given year.13 So, for instance, the estimate for 2005 is merely the 2005-2009 5-year file estimate, but the estimate for 2007 is a weighted average of the estimates from the 2005-2009, 2006-2010, and 2007-2011 5-year files.14 The environmental justice literature suggests that variables capturing various types of "disadvantage" will be of interest. I will highlight in particular the proportion of the census tract that is African-American, the proportion Latino, the proportion under the poverty line, the proportion with only a high school degree and the proportion with less than a high school degree. My empirical strategy is to estimate a fixed-effects version of the RIF regression outlined above:

\[ RIF_{i,t} = \delta_i + \delta_t + \gamma_1 \text{Black}_{i,t} + \gamma_2 \text{Latino}_{i,t} + \gamma_3 \text{Poverty}_{i,t} + \gamma_4 \text{HS}_{i,t} + X_{i,t}\beta + e_{i,t} \]

where $\delta_i$ and $\delta_t$ are census tract and year fixed effects, and $X_{i,t}$ is a vector of other sociodemographic variables.15 I repeat this for four different distributional functionals: quantiles, relative Lorenz ordinates, Absolute Lorenz ordinates and Generalized Lorenz ordinates. For each functional, I will estimate separate regressions for each $p \in \{0.05, 0.1, \ldots, 0.95\}$.
Note that this linear and additively separable specification may be subject to multicollinearity if the relevant degrees of disadvantage are correlated.16 However, among the independent variables, there are no pairwise correlations above 0.85 in absolute value in the estimating sample.

13 The weights are linearly decreasing in the difference between the middle year of the 5-year file and the year I want an estimate for, placing more weight on files that contain more years that are "closer" to the year in question.

14 This approach is admittedly imperfect, and may induce measurement error. Other alternatives to matching 5-year ACS files to annual data are similarly imperfect, which highlights the limits of the data being used in this study.

15 See Table 9 for a full list of covariates.

Tables 9 and 10 summarize the results for RIF regressions estimated using quantiles as the distributional statistics of interest for NOx exposure and PM2.5 exposure respectively. There is some evidence for a correlation between the minority population of census tracts and exposure: tracts with larger black and Hispanic populations are more likely to be exposed to higher amounts of NOx and PM2.5. There is interesting heterogeneity; the proportion of a tract's population that is black has statistically significant and positive effects on exposure mostly in less-exposed tracts, while the opposite is true for Hispanic populations. Median tract income appears to be positively related to exposure at the more exposed end of the exposure distribution, although this may be a function of the fact that high income tracts are often found in central cities (with concomitant higher levels of exposure). The unconditional quantile partial effect of a tract's population proportion black is actually negative at the top of the exposure distribution. This implies that tracts with higher concentrations of African Americans are actually less exposed, at least at the highly exposed end of the exposure distribution.
The effect of educational attainment within tracts is highly varied across the distribution; tracts with larger high school dropout populations are more exposed at the bottom of the exposure distribution, while the high school-only population is related to higher exposure near the top of the distribution.

16 One possible way to test for the appropriateness of the specification would be to more fully saturate the model with interaction terms.

TABLE 9. Quantile RIF Regression Results (NOx Exposure)

Dependent variable: NOx Quantiles

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            0.124∗∗∗    0.149∗∗∗    −0.065      −0.620∗∗∗   −2.141∗∗∗
                 (0.041)     (0.046)     (0.073)     (0.153)     (0.403)
Latino           −0.064      −0.052      0.245∗∗     0.029       −1.353∗∗
                 (0.052)     (0.061)     (0.106)     (0.227)     (0.573)
Poverty          −0.121∗∗∗   −0.077      −0.245∗∗∗   0.161       0.956∗∗
                 (0.043)     (0.048)     (0.079)     (0.165)     (0.388)
UR               −0.447∗∗∗   −0.290∗∗∗   0.192∗∗     1.066∗∗∗    −1.299∗∗
                 (0.055)     (0.056)     (0.093)     (0.190)     (0.507)
HS Only          −0.028      0.002       0.178∗      −0.419∗     1.425∗∗∗
                 (0.052)     (0.062)     (0.104)     (0.221)     (0.495)
Less than HS     0.256∗∗∗    0.351∗∗∗    0.050       −1.176∗∗∗   0.499
                 (0.061)     (0.071)     (0.118)     (0.245)     (0.553)
Log Med. Income  −0.006      0.021       0.115∗∗∗    −0.058      0.403∗∗∗
                 (0.015)     (0.018)     (0.030)     (0.061)     (0.141)
Observations     470,569     470,569     470,569     470,569     470,569

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. All models include census tract and year fixed effects. Negative values can be interpreted as increasing inequality. Other variables included in regressions but omitted from this table: linguistic isolation, median age, median home value, % between 20-24, % between 25-44, % between 45-64, % between 5-19, % aged 65+, % Asian/Pacific Islander, labor force participation rate, % native, % other race, % more than bachelor's, % unaffordable rent (30%+ of HH income), tract Gini, % some college, % commuting by bike/walk, % commuting by car, % commuting by transit, tract population, % unmarried parents, % veteran.

TABLE 10. Quantile RIF Regression Results (PM2.5 Exposure)

Dependent variable: PM2.5 Quantiles

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            0.403∗∗∗    0.531∗∗∗    −0.002      −0.663∗∗∗   −0.286
                 (0.128)     (0.131)     (0.112)     (0.176)     (0.445)
Latino           0.265       −0.125      −0.042      0.522∗∗     1.823∗∗∗
                 (0.239)     (0.218)     (0.169)     (0.258)     (0.603)
Poverty          −0.203      0.392∗∗∗    0.298∗∗     −0.056      −1.379∗∗∗
                 (0.151)     (0.146)     (0.120)     (0.179)     (0.444)
UR               −0.064      0.054       −0.130      −1.614∗∗∗   −1.530∗∗∗
                 (0.188)     (0.172)     (0.139)     (0.210)     (0.529)
HS Only          0.159       −0.004      −0.131      0.001       0.227
                 (0.205)     (0.203)     (0.157)     (0.241)     (0.572)
Less than HS     0.263       0.297       0.288       0.430       0.014
                 (0.237)     (0.226)     (0.178)     (0.274)     (0.662)
Log Med. Income  −0.159∗∗∗   −0.139∗∗    0.044       0.484∗∗∗    1.099∗∗∗
                 (0.059)     (0.056)     (0.044)     (0.067)     (0.165)
Observations     615,630     615,630     615,630     615,630     615,630

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Negative values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

Tables 11 and 12 summarize the results of RIF regressions estimated using the Lorenz curve ordinates as the distributional functionals of interest. Due to the multiplication of raw NOx exposure by −1 in calculating the Lorenz curve ordinates, the Lorenz curve will lie strictly above the line of perfect equality. This means that positive coefficient estimates imply an increase in environmental inequality in conjunction with an increase in the regressor in question, whereas negative coefficients would imply greater equality in exposure.
The most immediate result is that across the exposure distribution, racial composition of census tracts has a large and significant effect on inequality. Increases in the black proportion of a census tract are positively related to the cumulative share of pollution exposure up through the 25th percentile, but negatively related to cumulative exposure past the median. This implies that racial composition is inequality-enhancing for more advantaged (less exposed) tracts, i.e. a more racially diverse population is associated with greater disparities in exposure among less-exposed tracts. The effect of median income is, as expected, negative throughout most of the pollution exposure distribution, implying that higher income tracts exhibit lower cumulative pollution shares. High poverty areas are strongly correlated with more unequal pollution exposure: the effect of the proportion of the population in poverty implies an increase in environmental inequality across the whole pollution exposure distribution. There is some evidence for a correlation between educational attainment and environmental inequality: the effect of the proportion of the population with less than a high school degree is negative and significant through most of the pollution exposure distribution.

TABLE 11. Relative Lorenz RIF Regression Results (NOx Exposure)

Dependent variable: Relative Lorenz Ordinate

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            0.120∗∗∗    0.019       −0.021∗∗    −0.011∗∗    −0.005∗∗∗
                 (0.017)     (0.016)     (0.010)     (0.005)     (0.002)
Latino           0.015       −0.039      0.016       0.007       0.0003
                 (0.040)     (0.031)     (0.017)     (0.007)     (0.003)
Poverty          0.068∗∗∗    0.078∗∗∗    0.053∗∗∗    0.019∗∗∗    0.007∗∗∗
                 (0.021)     (0.018)     (0.011)     (0.005)     (0.002)
UR               −0.096∗∗∗   −0.048∗∗    0.058∗∗∗    0.049∗∗∗    0.022∗∗∗
                 (0.025)     (0.022)     (0.014)     (0.006)     (0.002)
HS Only          0.025       0.010       0.005       0.001       0.001
                 (0.029)     (0.024)     (0.015)     (0.006)     (0.002)
Less than HS     −0.058∗     −0.083∗∗∗   −0.108∗∗∗   −0.049∗∗∗   −0.012∗∗∗
                 (0.034)     (0.028)     (0.017)     (0.007)     (0.003)
Log Med. Income  −0.015∗     −0.021∗∗∗   −0.006      −0.0004     0.001
                 (0.009)     (0.007)     (0.004)     (0.002)     (0.001)
Observations     470,569     470,569     470,569     470,569     470,569

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Positive values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

TABLE 12. Relative Lorenz RIF Regression Results (PM2.5 Exposure)

Dependent variable: Relative Lorenz Ordinate

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            −0.004      −0.016∗∗    −0.026∗∗∗   −0.018∗∗∗   −0.006∗∗∗
                 (0.004)     (0.006)     (0.005)     (0.003)     (0.001)
Latino           0.011∗∗     0.017∗∗     0.012       0.003       −0.002
                 (0.005)     (0.008)     (0.008)     (0.005)     (0.003)
Poverty          −0.017∗∗∗   −0.029∗∗∗   −0.015∗∗∗   −0.001      0.003
                 (0.004)     (0.006)     (0.005)     (0.004)     (0.002)
UR               0.010∗∗     −0.013∗     −0.026∗∗∗   −0.010∗∗    −0.001
                 (0.005)     (0.008)     (0.007)     (0.004)     (0.002)
HS Only          0.008       0.009       −0.001      −0.005      −0.005∗∗
                 (0.005)     (0.008)     (0.007)     (0.005)     (0.002)
Less than HS     −0.008      −0.007      −0.004      −0.006      −0.001
                 (0.006)     (0.009)     (0.008)     (0.005)     (0.003)
Log Med. Income  0.008∗∗∗    0.016∗∗∗    0.016∗∗∗    0.009∗∗∗    0.002∗∗∗
                 (0.001)     (0.002)     (0.002)     (0.001)     (0.001)
Observations     615,630     615,630     615,630     615,630     615,630

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Positive values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

Tables 13 and 14 summarize the results of RIF regressions estimated using the Generalized Lorenz ordinates as the distributional functionals of interest. The same inference applies here as in the unconditional quantile partial effects case. Negative coefficients imply movements towards more environmental inequality, positive coefficients towards less inequality. The black and Hispanic proportions of population increase inequality throughout the pollution exposure distribution. The Hispanic effect reaches a maximum around the median of the distribution, while the black effect is largest for the most polluted tracts. Poverty seems to have similar effects as in the Lorenz curve RIF regressions, increasing exposure inequality near the middle-top of the pollution distribution. Likewise, median income seems to decrease inequality throughout the distribution, suggesting a significantly negative income-exposure relationship, which was also visible in the relative Lorenz RIF results.

TABLE 13. Generalized Lorenz RIF Regression Results (NOx Exposure)

Dependent variable: Generalized Lorenz Ordinate

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            −0.230∗∗∗   −0.052      0.014       −0.010      −0.022
                 (0.039)     (0.049)     (0.050)     (0.046)     (0.044)
Latino           −0.042      0.044       −0.071      −0.064      −0.054
                 (0.093)     (0.097)     (0.094)     (0.089)     (0.086)
Poverty          −0.161∗∗∗   −0.209∗∗∗   −0.195∗∗∗   −0.151∗∗∗   −0.137∗∗∗
                 (0.049)     (0.056)     (0.057)     (0.053)     (0.051)
UR               0.210∗∗∗    0.148∗∗     −0.019      0.016       0.074
                 (0.058)     (0.067)     (0.067)     (0.063)     (0.061)
HS Only          −0.061      −0.045      −0.051      −0.051      −0.054
                 (0.065)     (0.074)     (0.074)     (0.070)     (0.067)
Less than HS     0.187∗∗     0.302∗∗∗    0.421∗∗∗    0.359∗∗∗    0.308∗∗∗
                 (0.079)     (0.088)     (0.087)     (0.082)     (0.079)
Log Med. Income  0.029       0.040∗      0.013       0.002       0.0003
                 (0.020)     (0.022)     (0.022)     (0.021)     (0.020)
Observations     470,569     470,569     470,569     470,569     470,569

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Negative values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

TABLE 14. Generalized Lorenz RIF Regression Results (PM2.5 Exposure)

Dependent variable: Generalized Lorenz Ordinate

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            0.046       0.163∗∗     0.273∗∗∗    0.192∗∗∗    0.075
                 (0.044)     (0.075)     (0.082)     (0.073)     (0.065)
Latino           −0.178∗∗∗   −0.310∗∗∗   −0.374∗∗∗   −0.370∗∗∗   −0.373∗∗∗
                 (0.061)     (0.107)     (0.129)     (0.124)     (0.114)
Poverty          0.204∗∗∗    0.356∗∗∗    0.257∗∗∗    0.153∗      0.127∗
                 (0.047)     (0.078)     (0.088)     (0.083)     (0.077)
UR               −0.016      0.337∗∗∗    0.622∗∗∗    0.596∗∗∗    0.578∗∗∗
                 (0.058)     (0.093)     (0.104)     (0.098)     (0.090)
HS Only          −0.095∗     −0.122      −0.046      −0.023      −0.028
                 (0.057)     (0.099)     (0.117)     (0.110)     (0.100)
Less than HS     0.044       −0.006      −0.097      −0.127      −0.210∗
                 (0.066)     (0.114)     (0.132)     (0.124)     (0.113)
Log Med. Income  −0.117∗∗∗   −0.248∗∗∗   −0.314∗∗∗   −0.293∗∗∗   −0.258∗∗∗
                 (0.017)     (0.028)     (0.033)     (0.031)     (0.029)
Observations     615,630     615,630     615,630     615,630     615,630

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Negative values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

The effects of tract-level demographic characteristics on the final functional, the Absolute Lorenz curve, are summarized in Tables 15 and 16. Once again, racial composition is correlated with environmental inequality, although with heterogeneity across pollutants. Tract African-American populations are associated with higher absolute inequality at the more exposed end of the distribution for NOx, but are associated with lower absolute inequality for PM2.5. Tract Hispanic population is associated with higher absolute inequality for both pollutants, but this association is only statistically significant for PM2.5. Interestingly, median tract income is actually associated with lower absolute inequality for NOx, but with higher absolute exposure inequality across the distribution for PM2.5.

TABLE 15.
Absolute Lorenz RIF Regression Results (NOx Exposure)

Dependent variable: Absolute Lorenz Ordinate

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            −0.227∗∗∗   −0.044      0.030       0.014       0.007
                 (0.036)     (0.040)     (0.029)     (0.015)     (0.006)
Latino           −0.037      0.058       −0.044      −0.023      −0.005
                 (0.086)     (0.077)     (0.052)     (0.025)     (0.010)
Poverty          −0.148∗∗∗   −0.177∗∗∗   −0.131∗∗∗   −0.056∗∗∗   −0.023∗∗∗
                 (0.045)     (0.045)     (0.033)     (0.016)     (0.006)
UR               0.198∗∗∗    0.119∗∗     −0.077∗∗    −0.072∗∗∗   −0.031∗∗∗
                 (0.053)     (0.053)     (0.039)     (0.020)     (0.008)
HS Only          −0.056      −0.032      −0.024      −0.011      −0.005
                 (0.060)     (0.059)     (0.043)     (0.021)     (0.008)
Less than HS     0.158∗∗     0.228∗∗∗    0.275∗∗∗    0.139∗∗∗    0.044∗∗∗
                 (0.073)     (0.070)     (0.050)     (0.025)     (0.010)
Log Med. Income  0.029       0.040∗∗     0.012       0.001       −0.001
                 (0.019)     (0.018)     (0.013)     (0.006)     (0.002)
Observations     470,569     470,569     470,569     470,569     470,569

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Negative values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

TABLE 16. Absolute Lorenz RIF Regression Results (PM2.5 Exposure)

Dependent variable: Absolute Lorenz Ordinate

                 p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
                 (1)         (2)         (3)         (4)         (5)
Black            0.045       0.160∗∗     0.268∗∗∗    0.185∗∗∗    0.066∗∗∗
                 (0.041)     (0.065)     (0.058)     (0.035)     (0.017)
Latino           −0.137∗∗    −0.206∗∗    −0.167∗     −0.060      0.0001
                 (0.056)     (0.090)     (0.088)     (0.060)     (0.033)
Poverty          0.188∗∗∗    0.314∗∗∗    0.174∗∗∗    0.028       −0.023
                 (0.044)     (0.067)     (0.061)     (0.040)     (0.021)
UR               −0.076      0.188∗∗     0.323∗∗∗    0.148∗∗∗    0.040
                 (0.054)     (0.081)     (0.072)     (0.048)     (0.026)
HS Only          −0.086      −0.100      −0.001      0.043       0.052∗
                 (0.053)     (0.085)     (0.082)     (0.054)     (0.027)
Less than HS     0.067       0.051       0.017       0.044       −0.004
                 (0.062)     (0.097)     (0.092)     (0.060)     (0.031)
Log Med. Income  −0.093∗∗∗   −0.186∗∗∗   −0.190∗∗∗   −0.107∗∗∗   −0.035∗∗∗
                 (0.015)     (0.024)     (0.023)     (0.015)     (0.008)
Observations     615,630     615,630     615,630     615,630     615,630

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Negative values can be interpreted as increasing inequality. All models include census tract and year fixed effects. For more details see Table 9.

These results together shed additional light on the interconnectedness of the concepts of environmental justice and environmental inequality: it appears that racial differences and poverty are important correlates of environmental inequality. However, these results are at best only potentially interesting correlations, and cannot claim to have uncovered the causes of environmental inequality. In particular, I cannot speak to whether any correlation between the racial composition of census tracts and exposure inequality necessarily represents environmental racism, although the generally more robust correlations between income variables (median income and poverty) and exposure inequality suggest that the level and distribution of income may be important for explaining changes in the exposure distribution, a possibility I take up more formally in Chapter IV of this dissertation.

Conclusion

This paper proposes a "dashboard" approach to the consideration of distributional concerns in environmental policy analysis. Rather than choosing a single summary statistic (e.g. the black-white exposure gap, or the Atkinson index) as a social evaluation function when analyzing the (environmental) distributional effects of policy, looking at several indicators can be a more fruitful approach that more fully illustrates how policy is affecting the distribution of exposure to harmful pollutants.
I propose two ways of thinking about the degree to which the distribution of pollution exposure is "unequal": a horizontal equity concept, where the primary concern is the inequality in exposure across subgroups, and a vertical equity concept, where the primary concern is the degree of inequality across the whole population. I apply scalar measures of each equity concept (e.g. the Atkinson index for vertical equity, and the average black-white exposure gap for horizontal equity), in addition to measures that allow for the examination of the whole distribution (black-white gaps by percentile for horizontal equity, and Lorenz curves for vertical equity). I apply this dashboard of measures of environmental inequality and environmental justice to novel data on pollution exposure derived from remote sensing observations of ground-level NOx and PM2.5 concentrations. By matching these remote sensing data with census data on the distribution of population, I am able to measure not just average exposure, but also all of the environmental inequality and environmental justice measures, on an annual basis over the period 1998-2014 (only 2005-2011 for NOx exposure). Average exposure to both pollutants has decreased markedly over time, as have most measures of environmental inequality and environmental justice. I use a re-centered influence function regression estimation strategy to isolate how individual tract-level demographic characteristics affect measures of environmental inequality as a whole.
I find that many characteristics that correlate with disadvantage, such as poverty rates, education levels and racial minority populations, have statistically significant associations with environmental inequality. Notably, the patterns of sign and significance often differ between pollutants, suggesting that the interaction between the distribution of people and the production and fate-and-transport of pollution may depend in part on the chemical properties of the pollutants in question. This largely descriptive exercise has not, by design, involved the identification of causal factors that might be related to the observed trends in environmental quality, environmental inequality and environmental justice. The final substantive chapter of this dissertation takes up this challenge by synthesizing the first substantive chapter’s discussion of trends in income inequality with this chapter’s discussion of environmental inequality, and considers whether income inequality within metropolitan areas might have a causal effect on changes in the distribution of pollution exposure.

CHAPTER IV

ENVIRONMENTAL JUSTICE VIEWED FROM OUTER SPACE: HOW DOES GROWING INCOME INEQUALITY AFFECT THE DISTRIBUTION OF POLLUTION EXPOSURE?

Introduction

Income inequality has increased substantially in the last several decades, both within the United States as a whole (Piketty and Saez (2003)) and within individual US states and metropolitan areas (Frank (2009), and Chapter II of this dissertation). This fact has spawned a contentious debate, both within and outside of academia. Much of this debate has concerned itself with the causes of this increase in income inequality. Substantially less time and effort has been expended on considering the potential effects of rising inequality. In this paper, I will examine how increases in income inequality might affect the distribution of environmental disamenities, specifically, exposure to ground-level nitrogen oxides (NOx).
Exposure to NOx itself, and to the ozone and smog generated when ground-level NOx interacts with sunlight and volatile organic compounds, is a major health hazard, contributing to as many as 10,000 excess deaths per year in the United States (Caiazzo et al. (2013)). Concern about NOx has become particularly salient in light of the recent Volkswagen emissions scandal, wherein software on several car models was designed to circumvent emissions testing regimes in the US and Europe, resulting in thousands of tons of excess NOx emissions.

Using data on ground-level NOx concentrations inferred from remote sensing observations by NASA’s Aura satellite, I am able to measure pollution exposure at a fine geographic resolution. I combine these data with information about income distributions in US metropolitan statistical areas (MSAs) to examine the causal effect of changes in within-MSA income inequality on the distribution of NOx exposure within MSAs. There is likely to be some joint endogeneity between income inequality and pollution exposure, operating through migration between metropolitan areas. To identify a causal effect in the presence of this endogeneity, I employ an instrumental variables approach, using a version of the simulated instrument introduced by Boustan et al. (2013). I construct an instrument for income inequality by simulating counterfactual MSA-level income distributions which are independent of changes in the MSA-level distribution of pollution exposure, thereby eliminating the potential endogeneity bias due to locational sorting.

Using this empirical strategy, I find that increases in metropolitan area income inequality lead to decreases in the average level of NOx exposure within MSAs. However, increases in income inequality are also associated with an increase in environmental inequality.
I conclude that the decrease in average exposure caused by income inequality disproportionately benefits the most-advantaged, although the least-advantaged are still better off in absolute terms. I then consider to what extent the political system might serve as a potential mechanism for the effect of inequality on pollution exposure. Specifically, I examine how income inequality is related to the pro-environmental voting records of legislators, as measured by their League of Conservation Voters (LCV) score. I find that increases in income inequality within a US senator’s state lead to an increase in the LCV scores of Democratic senators, but have no discernible effect on the LCV scores of Republican senators.

The remainder of the paper proceeds as follows. I review the relevant literature on environmental justice and income inequality. I then offer a conceptual model for my analysis, and describe the data sources I use. Finally, I introduce my identification strategy, and present estimation results. I discuss potential mechanisms, and examine how the effect of income inequality on the distribution of environmental amenities may work through the political system. I conclude with some ideas for potential extensions and directions for future research.

Previous Literature

Environmental Justice

The earliest environmental justice literature was motivated by the identification of differences in the levels of pollution exposure across discrete socio-economic categories. There are many capable reviews of this literature, including a recent volume collected by Banzhaf (2012) and surveys including Mohai et al. (2009) and Brulle and Pellow (2006). The central claim in this literature is that minority and poorer households are disproportionately more likely to be exposed to pollution, compared to white and richer households.
In practice, papers in this literature are often able to establish only that areas near toxic emitting firms and areas with higher measured levels of pollution tend to be less white, to be less educated and/or to have lower median income than less-polluted areas.

There are slightly fewer papers that deal explicitly with environmental inequality rather than environmental injustice. By environmental inequality I am referring to a measure capturing the entire distribution of the level of exposure to pollution within an area.[1] This contrasts with the concept of environmental injustice, which refers to differences in subgroup means. This distinction is not entirely standard across the broad environmental injustice literature (e.g. Downey (2007)). Maguire and Sheriff (2011) and Sheriff and Maguire (2014) provide a direct adaptation, from the income inequality literature, of tools to rank distributions of environmental amenities or disamenities. They suggest that normatively based scalar indices should be used for studying the environmental justice effects of a given policy. In a similar paper, although in a different context, Harper et al. (2013) apply some of the tools from the literature on the measurement of income inequality to the measurement of health outcomes, again in the context of policy analysis.

[1] For the purposes of this paper, environmental inequality will be captured by a scalar index, although it is possible to measure environmental inequality using, e.g., Lorenz curves without sacrificing normative content.

Income Inequality and Environmental Quality

Scholars in environmental economics have long sought to model the simple relationship between incomes and pollutant emissions. This relationship is often embedded within considerations of the Environmental Kuznets Curve (EKC), the idea being that pollution levels first rise as countries develop (i.e.
as average incomes increase), and then, at some point, begin to decline, as demand for environmental amenities increases. Although the EKC model concerns only average incomes, and not the full distribution of income, it has spawned a few papers that examine how income inequality, as distinct from just average incomes, might affect environmental quality. Berthe and Elie (2015) summarize this small but growing literature, and attempt to synthesize its findings by suggesting a theoretical structure which can explain all of the potential pathways through which income inequality might affect environmental quality. This literature has not yet reached any consensus on whether there is a relationship between income inequality and environmental quality, let alone established the sign of this relationship or whether it is causal.

Boyce (1994) is perhaps the first to consider this relationship, followed by Scruggs (1998), Ravallion et al. (2000) and others.[2] Papers in this literature share a common framework.

[2] A non-exhaustive list includes Torras and Boyce (1998), Heerink et al. (2001), Magnani (2000) and Neumayer (2004). More-recent papers in this literature include Zwickl and Moser (2015) and Baek and Gweisah (2013).

Consider a propensity-to-emit function (PEF) which describes the amount of emissions embedded in each household’s consumption activity at any given level of household income, a function that I assume is non-decreasing over some range of income.[3] To consider how income inequality might affect environmental quality, it is necessary only to know the concavity of the PEF. If the function is concave, so that the marginal propensity to emit is decreasing in income, then Pigou-Dalton transfers (mean-preserving progressive transfers of income) may actually increase average emissions levels.
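The role of the PEF's curvature can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the square-root and quadratic functional forms, the two-household economy, and the names `average_emissions`, `concave_pef` and `convex_pef` are hypothetical and are not taken from the literature cited here. It applies a single Pigou-Dalton transfer (the mean income is unchanged) and compares average emissions before and after.

```python
import math


def average_emissions(incomes, pef):
    """Average emissions when each household emits pef(income)."""
    return sum(pef(y) for y in incomes) / len(incomes)


def concave_pef(income):
    """Hypothetical PEF with a decreasing marginal propensity to emit."""
    return math.sqrt(income)


def convex_pef(income):
    """Hypothetical PEF with an increasing marginal propensity to emit."""
    return income ** 2


# Hypothetical two-household economy; 20 units move from rich to poor,
# so the mean income (60) is preserved.
before = [20.0, 100.0]
after = [40.0, 80.0]

for name, pef in [("concave", concave_pef), ("convex", convex_pef)]:
    change = average_emissions(after, pef) - average_emissions(before, pef)
    direction = "rise" if change > 0 else "fall"
    print(f"{name} PEF: average emissions {direction} after the transfer")
```

Under these assumed forms the progressive transfer raises average emissions when the PEF is concave and lowers them when it is convex, which is Jensen's inequality applied to the income distribution.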
Conversely, if the PEF is convex, so that the marginal propensity to emit is increasing in income, Pigou-Dalton income transfers will reduce average emissions levels. In other words, increases in income inequality may account for environmental degradation if the PEF is convex, while decreases in income inequality may account for environmental degradation if the PEF is concave. There is, however, no obvious reason to expect, a priori, that the propensity to emit function need necessarily be either convex or concave.

No consensus has emerged from the resulting empirical literature, although patterns have sometimes been observed. Cross-country regressions generally suggest that rising income inequality increases environmental degradation (e.g. Drabo (2011)). Basing the analysis simply on within-country variation, however, has produced a wide array of estimates. There appear to be rather different relationships for developing or middle income countries (e.g. China, in the case of Golley and Meng (2012)) than for developed countries (e.g. Sweden, in the case of Brännlund and Ghalwash (2008)). In particular, the weight of the evidence suggests that the propensity to emit function may be convex for developing countries, but concave for developed nations. This implies that developing countries may be able to reduce emissions and enhance equity simultaneously, whereas developed countries face a trade-off between equity and environmental quality.

[3] In a recent working paper, Levinson and O’Brien (2015) introduce the related concept of an environmental Engel curve.

There are two difficulties faced by earlier papers in this literature. First, the observability of environmental quality is often incomplete, relying on irregularly spaced ground monitors. Second, there is potential endogeneity between environmental quality and the distribution of income.
No paper in this literature has proposed a credible strategy to recover estimates of the causal effect of income inequality on the environment in the presence of this potential endogeneity. I address these issues by (1) utilizing satellite data that provide information about air quality in unmonitored areas, and (2) using an instrument for inequality that addresses endogeneity due to locational sorting.

Data and Empirical Strategy

My goal is to capture the impact of income inequality on some functional of the pollution exposure distribution. As with many questions of distributional measurement, the choice of the proper geographic scope is important. My analysis will be performed at the level of metropolitan statistical areas (MSAs) in the United States. Metropolitan areas are typically the geographic context in which the health effects of exposure to non-uniformly mixing pollutants like NOx are most acute. The differences in these exposures within an individual metropolitan area contribute to the environmental inequality I measure. Metropolitan areas are also sufficiently large to allow for estimation of income inequality measures using publicly available Census Bureau data at an annual frequency. Thus MSAs are a natural choice as the geographic setting for this analysis.

To examine the relationship between income inequality and the distribution of pollution exposure, I utilize two novel data sources. I use data on NOx concentration levels from the Aura satellite to calculate both metropolitan-area average exposure levels and environmental inequality, which I capture via two alternative measures (functionals of the pollution exposure distribution). I supplement these pollution exposure data with metropolitan area income inequality measures from Chapter II of this dissertation, and socio-demographic variables calculated from the American Community Survey (ACS) 1-year files.
The Aura satellite was launched in 2004 with a mission to measure the atmospheric composition both for the troposphere (i.e. at ground level) and for the stratosphere (i.e. the ozone layer). The Ozone Monitoring Instrument (OMI), carried by the Aura satellite, provides comprehensive global observations of the tropospheric vertical column density of NOx on a fixed grid. Vertical column densities do not correspond with ground-level concentrations of NOx on a one-for-one basis. In general, it is necessary to infer the ground-level concentrations via the use of ground-level observations from monitoring stations and a chemical air-transport model. I use the SP GC v1.01 dataset provided by the Air Composition Analysis Group (ACAG) at Dalhousie University, which is estimated using the GEOS-CHEM chemical transport model.[4] More information about these data is available in Lamsal et al. (2008) and Lamsal et al. (2010).[5] Figure 19 summarizes the national distribution of NOx concentrations in the form of choropleth maps.[6]

The ACAG data provide yearly average NOx concentrations on a 0.1×0.1-degree grid for North America. To measure the person-level distribution of pollution exposure, I geographically interpolate the ACAG data to the census-tract level by inverse distance weighting.[7] These annual estimates of NOx concentrations at the census-tract level are then used to calculate metropolitan area average exposure and the necessary environmental inequality measures, which are functionals of the NOx exposure distribution. The final estimating sample includes observations for 265 MSAs, for the period from 2005 to 2011.

[4] A previous version of this paper used an alternate approach to inferring ground-level concentrations by comparing the distribution of ground-level observations from monitoring stations with the distribution of vertical column densities.
[5] These data are available from the ACAG website at: http://fizz.phys.dal.ca/~atmos/martin/
[6] More detailed choropleths, including metropolitan area maps, are available in an online appendix accessible at http://pages.uoregon.edu/jlv/jmp_appendix.html. An additional set of interactive maps visualizing the distribution of NOx exposure in 2005 is available at http://pages.uoregon.edu/jlv/interactive_NOX_maps.html
[7] Inverse distance weighting interpolates to unobserved locations $x_i$ by calculating the weighted average of observations at nearby locations ($x_j$). The weights are calculated as $w_j = 1 / d(x_i, x_j)^2$, where $d(\cdot)$ is the Euclidean distance operator.

FIGURE 19. Average Annual NOx Exposure by Census Tract, 2005–2011

Metropolitan area income inequality is measured by the well-known Gini coefficient, estimated from income microdata provided in the 1-year American Community Survey (ACS) files. I use the Gini coefficient estimates from Chapter II of this dissertation, which address topcoding by adopting a multiple imputation approach, simulating the censored right tail of the income distribution as following a Generalized Beta II distribution. Income inequality estimates are calculated for all MSAs identified in the public-use ACS microdata files (265 in total). I supplement my two main variables with several demographic measures including median income, race, industry characteristics, poverty, and transportation patterns.

Measuring Environmental Inequality

Most previous research focuses on the average level of pollution exposure within a jurisdiction, generally calculated as the population-weighted mean of ground monitor readings. I extend this by examining not just the average (i.e. first moment of the distribution), but also functionals of the entire distribution of pollution exposure within each MSA.[8] I define two types of measures which summarize the pollution exposure distribution.
The first type I describe as measures of “environmental inequality.” These measures summarize the marginal distribution of pollution exposure, across the population, without considering individual attributes other than the level of exposure. The second type I term measures of “environmental justice.” These summarize the joint distribution of pollution exposure and other demographic characteristics, likewise across the population.

[8] While I will not be using the higher moments (e.g. variance, skewness) of the distribution per se, the measures I use can be thought of as capturing the same sort of information about the shape and dispersion of exposure distributions.

Measures of environmental inequality thus summarize just the unconditional distribution of pollution exposure, without considering subgroup differences. Sheriff and Maguire (2014) describe how to adapt common measures of inequality to a case where the distribution of interest is a “bad” (as for NOx exposure). If inequality measures are interpreted purely as a description of the spread of a distribution, then this distinction is largely irrelevant. However, normative interpretations of these inequality measures are important for analyzing the tradeoff between equity and efficiency when making environmental policy decisions, and these measures are not symmetric across the good/bad distinction. I adopt the approach of Sheriff and Maguire (2014) and modify commonly used inequality measures so that it is possible to rank distributions in an ethically sensible fashion.

In contrast, measures of environmental justice are defined in terms of the differences in average pollution exposure between subgroups, where the subgroups correspond to conventional notions of advantage and disadvantage. I will consider differences in exposures across racial lines: specifically, the difference in pollution exposures between African-Americans and whites, as well as the difference in exposures between Latinos and whites.
I will consider differences in exposures across income levels as well, defined as the difference in average pollution exposure between the top quintile of the income distribution and the bottom quintile of the income distribution.

I will capture environmental inequality by two types of scalar inequality measures, each of which induces a complete ordering of pollution exposure distributions.[9] I will divide these measures into two groups: relative inequality measures and absolute inequality measures. The substantive difference in these measures lies in their invariance properties. A relative inequality measure $I_R(x)$, for a vector $x$ characterizing the empirical pollution exposure distribution, satisfies the property of scale invariance:

\[ \forall x : I_R(x) = I_R(kx), \quad k > 0 \tag{4.1} \]

In contrast, an absolute inequality measure, $I_A(x)$, satisfies the property of translation invariance:

\[ \forall x : I_A(x) = I_A(x + k) \tag{4.2} \]

where $k$ is any vector, in the domain of $x$, whose entries are all equal.

[9] Environmental inequality can also be captured using a Lorenz curve (or variations thereof). However, Lorenz curves induce only a partial ordering of pollution exposure distributions, since the Lorenz dominance criterion requires that the curves do not cross.

There has been some controversy over which of these two measures more accurately captures moral intuitions about inequality in general. Following Kolm (1976), the key distinction between the two types of measures lies in how the worst-off (most-exposed) individuals are treated. Consider the two inequality-preserving transformations above. A proportional increase in pollution exposure as in equation (4.1) will in fact result in much larger changes for the most exposed individual. An equiproportional increase in pollution of, e.g., 20% will by definition leave relative inequality unchanged, but will increase absolute inequality.
Likewise, an equal-sized increase in pollution will decrease relative inequality, but leave absolute inequality unchanged. Of the two measures, a strong case can be made that absolute inequality measures may be more appropriate when considering the distribution of pollution exposures. Absolute differences in exposure have negative health effects, while relative differences in exposure may not necessarily correspond to substantive health disparities. Thus, the absolute measures will respect the spirit of the environmental justice movement’s concept of equity (defined in terms of absolute differences in exposure), unlike the relative measures.

I use the Atkinson index, as modified by Sheriff and Maguire (2014), to quantify relative inequality:

\[ AI(x) = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{x_i}{\mu_x} \right)^{1-\alpha} \right]^{\frac{1}{1-\alpha}} - 1, \quad \alpha \le 0 \tag{4.3} \]

Here, $\alpha$ is an environmental inequality aversion parameter in the associated social welfare function (SWF). As $\alpha \to 0$, the associated SWF becomes increasingly utilitarian, and as $\alpha \to -\infty$ the implied SWF becomes increasingly Rawlsian. Likewise, I use the (modified) Kolm-Pollak index to quantify absolute inequality:

\[ KI(x) = -\frac{1}{\kappa} \ln \left[ \frac{1}{N} \sum_{i=1}^{N} e^{-\kappa (x_i - \mu_x)} \right], \quad \kappa < 0 \tag{4.4} \]

where $\kappa$ can be interpreted as an environmental inequality aversion parameter for the associated social welfare function. Symmetrically with the $\alpha$ parameter in the previous case, as $\kappa \to 0$ the SWF becomes increasingly utilitarian, and as $\kappa \to -\infty$, it becomes increasingly Rawlsian.

There are two reasons to believe that the estimates of these measures of environmental justice and environmental inequality should be regarded as lower bound estimates. First, by construction and due to data limitations, I assume that all census tract residents have the same level of exposure, which is equivalent to assuming that within-tract inequality is zero. Second, I use the actual ground-level concentration of NOx as a measure of exposure.
However, individuals may be able to engage in averting behavior to avoid exposure to a given ambient concentration level. If the ability to engage in averting behavior is related to the degree of advantage, then the estimates of environmental justice and environmental inequality based on concentration levels will overestimate exposure by advantaged groups and hence underestimate the gap in exposure between advantaged and disadvantaged groups.

IV Approach

I assume, as a baseline, that the relationship between a functional of an MSA’s pollution exposure distribution $\nu$ and an MSA’s income inequality, measured by the Gini coefficient, can be modeled as

\[ \nu(\text{Pollution})_{i,t} = \beta_1 \text{Gini}_{i,t} + \beta_2 X_{i,t} + u_{i,t} \tag{4.5} \]
\[ u_{i,t} = \alpha_i + e_{i,t} \tag{4.6} \]

where $\alpha_i$ is the MSA-specific fixed component of the error term, and $e_{i,t}$ is white noise. To recover the effect of income inequality on the pollution exposure distribution, I can estimate a model in first-difference form:[10]

\[ \Delta \nu(\text{Pollution})_{i,t} = \beta_1 \Delta \text{Gini}_{i,t} + \beta_2 \Delta X_{i,t} + \Delta u_{i,t} \tag{4.7} \]

Alternatively, I could estimate equation (4.5) directly using a fixed-effects specification. This baseline model does not absorb time-varying unobserved heterogeneity. I address this heterogeneity by allowing for MSA-specific linear trends:

\[ \nu(\text{Pollution})_{i,t} = \beta_1 \text{Gini}_{i,t} + \beta_2 X_{i,t} + \alpha_i t + u_{i,t} \tag{4.8} \]

which can be estimated either in a first-difference or fixed-effects specification. Even after allowing for time-varying heterogeneity, however, none of these estimates of $\beta_1$ is sufficient on its own to establish a causal effect.

[10] This is the preferred OLS specification in Boustan et al. (2013).

When attempting to investigate the relationship between inequality in the distribution of income and inequality in the distribution of pollution exposure, there are two significant challenges to identifying a causal effect. First, there may be long-run reverse causality, as implied by the literature on the intergenerational transmission of inequality (for example Currie (2011)).
Increases in environmental inequality might lead to increases in future income inequality, although only on intergenerational time scales. Second, on shorter time scales, there may be endogenous locational sorting between metropolitan areas, where sorting differs systematically across the income distribution. For example, if rich households disproportionately migrate out of metropolitan areas with high average levels of pollution exposure, or high inequality in pollution exposures, then there is the potential for reverse causality. At the other end of the income distribution, poorer households may disproportionately migrate into metropolitan areas with higher levels of pollution (or alternatively with less equitable distributions of pollution exposure), which may change the income distribution in these areas.[11] These two potential migration flows have conflicting effects on the income distribution: if rich households migrate out, this will decrease the Gini coefficient of income inequality; if poor households migrate in, however, this will increase it.[12] If both types of flows occur, the effect on the Gini coefficient is ambiguous.

[11] These are related to the concept of “coming to the nuisance”, wherein environmental disamenities depress local rents, which in turn attracts lower income households.
[12] To see this, consider the following toy example. Suppose there are 4 people in a city, with incomes of 10, 25, 50 and 150. The Gini coefficient in the city is 0.473. If the richest person leaves, the Gini coefficient drops to 0.314. If, on the other hand, a new poor person (with income 10) moves in, the Gini coefficient increases to 0.52.

To address these concerns about endogeneity, I adopt a simulated instrumental variables strategy. Specifically, I construct an instrument for income inequality that definitionally rules out any between-MSA sorting by “freezing” the MSA-level income distribution in an initial year and simulating the evolution of counterfactual income distributions. These counterfactual income distributions are constructed by allowing each decile of each MSA’s income distribution to follow the growth trends observed for the corresponding deciles of the national income distribution. Using this instrument makes it possible to identify the causal effect of income inequality on the distribution of NOx exposure by cutting off any potential variation due to between-MSA sorting.

This instrument is very similar to the instrument used in Boustan et al. (2013). This type of identification strategy has also been used in other settings, as in Enamorado et al. (2014), who examine the causal effect of income inequality on crime in Mexican municipalities. Simulated instruments of this class are examples of “Bartik-style” instruments, which have been used in a wide variety of settings. As Baum-Snow and Ferreira (2015) note, the ability of this strategy to identify the causal effect of income inequality hinges on the assumption that the initial level of income inequality within metropolitan areas is independent of changes in the distribution of pollution exposure, except through its effect on the actual trends in MSA-level income inequality. One test of this identifying assumption would then be to examine the correlation between initial income inequality and subsequent changes in the distribution of NOx exposure. Figure 20 visualizes the relationship between initial income inequality and changes in average exposure. The slope of the line of best fit through this scatterplot is not statistically different from zero, which can be taken as evidence that the identifying assumptions hold. Essentially identical results hold for all the other functionals of the NOx exposure distribution used as outcome variables.
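The toy example in footnote 12 can be checked with a few lines of code. The sketch below assumes the Gini coefficient is computed as the mean absolute difference between all pairs of incomes divided by twice the mean income (the population formula, with no small-sample correction); the footnote does not state which formula was used, but this one is consistent with the figures it reports. The function name `gini` is illustrative.

```python
from itertools import product


def gini(incomes):
    """Gini coefficient: mean absolute pairwise difference over twice the mean."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_abs_diff = sum(abs(a - b) for a, b in product(incomes, repeat=2))
    return total_abs_diff / (2 * n * n * mean)


city = [10, 25, 50, 150]

print(round(gini(city), 3))           # baseline city
print(round(gini([10, 25, 50]), 3))   # the richest person leaves
print(round(gini([10] + city), 3))    # a new person with income 10 arrives
```

Out-migration at the top and in-migration at the bottom move the coefficient in opposite directions, which is why the net effect of sorting on measured inequality cannot be signed in advance.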
FIGURE 20. Initial MSA Income is Unrelated to Subsequent Changes in NOx Exposure
[Scatterplot, “Initial Inequality vs. Changes in Exposure”: Initial Gini Coefficient on the horizontal axis, Change in Average NOx Exposure on the vertical axis.]

To construct the instrument, I use the same microdata on incomes from the ACS 1-year files that I used to calculate the measures of MSA-level income inequality (using the Generalized Beta II multiple imputation method described in Chapter II of this dissertation).[13] For each MSA identified in the ACS microdata, I calculate the mean income of each decile of the MSA income distribution in the first year of the sample (2005). I also calculate the means of each decile of the national distribution for each year, and calculate the growth rates for each of these decile means from 2006 to 2011. I leave each MSA’s information out of calculations of national decile average income levels and trends when constructing that MSA’s simulated counterfactual income distribution. This eliminates the possibility that large MSAs, e.g. New York, may be disproportionately driving national level income trends.[14]

[13] The income concept used here is pre-tax, post-transfer household income, adjusted for household size by applying an equivalence scale equal to the square root of household size.
[14] Previous versions of this paper used the same national decile income levels and trends for each MSA. There is no qualitative difference in results between the two versions of the instrument, although the “leave-one-out” estimates are more precise.

I then use these national decile-mean growth rates to construct synthetic income distributions for each MSA in each year in the period 2006-2011. I assign each MSA decile in 2005 to its matching national decile in 2005. I calculate the instrument by assuming that every MSA decile grew at the same rate as the corresponding national decile mean for each year in 2006-2011.
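The construction just described can be sketched in code. This is a simplified illustration rather than the procedure actually used: the function names and all input numbers are hypothetical, the leave-one-out national decile means are assumed to have been computed already, and the Gini coefficient is calculated from ten decile means only, which ignores inequality within deciles.

```python
def gini_from_group_means(means):
    """Gini coefficient of a distribution summarized by equal-sized group means."""
    n = len(means)
    mu = sum(means) / n
    abs_diffs = sum(abs(a - b) for a in means for b in means)
    return abs_diffs / (2 * n * n * mu)


def simulated_gini_path(msa_base_means, national_means_by_year, base_year):
    """Grow base-year MSA decile means at national decile growth rates.

    msa_base_means: ten decile means for one MSA in the base year.
    national_means_by_year: {year: ten national decile means}, assumed to
        exclude the MSA in question (the leave-one-out step).
    Returns {year: simulated Gini coefficient}.
    """
    base_national = national_means_by_year[base_year]
    path = {}
    for year in sorted(national_means_by_year):
        national = national_means_by_year[year]
        # Cumulative national growth since the base year, decile by decile,
        # applied to the frozen base-year MSA decile means.
        simulated = [
            msa_mean * nat / nat_base
            for msa_mean, nat, nat_base in zip(msa_base_means, national, base_national)
        ]
        path[year] = gini_from_group_means(simulated)
    return path


# Hypothetical inputs: an MSA's 2005 decile means and national decile means.
msa_2005 = [10.0 * k for k in range(1, 11)]
national = {
    2005: [float(k) for k in range(1, 11)],
    2006: [1.1 * k for k in range(1, 11)],            # uniform 10% growth
    2007: [float(k) for k in range(1, 10)] + [20.0],  # top decile doubles
}

for year, value in simulated_gini_path(msa_2005, national, 2005).items():
    print(year, round(value, 3))
```

In this hypothetical input, uniform national growth leaves the simulated Gini at its base-year value, while growth concentrated in the top national decile raises it. Because the MSA's own later income data never enter the calculation, households moving between MSAs after the base year cannot affect the simulated series.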
In other words, I “freeze” the individual MSA income distribution in 2005, and then simulate future counterfactual distributions based only on nationwide trends. Once I have these sets of simulated decile means, I calculate a Gini coefficient using the simulated decile means for each year between 2006 and 2011.

With these simulated Gini coefficients as my instrumental variable, I can estimate the relationship between income inequality and environmental inequality using two-stage least squares methods. The baseline model can be estimated in first-differences as

First stage:
\[ \Delta \text{Gini}_{i,t} = \alpha + \gamma_1 \Delta \text{SynthGini}_{i,t} + \delta \Delta X_{i,t} + v_{i,t} \tag{4.9} \]
Second stage:
\[ \Delta \text{EnvIneq}_{i,t} = \alpha + \beta \Delta \widehat{\text{Gini}}_{i,t} + \Gamma \Delta X_{i,t} + \epsilon_{i,t} \tag{4.10} \]

To account for potential time-varying heterogeneity, as mentioned above, I additionally estimate models that include MSA-specific trends $\alpha_i$:

First stage:
\[ \Delta \text{Gini}_{i,t} = \alpha_i + \gamma_1 \Delta \text{SynthGini}_{i,t} + \delta \Delta X_{i,t} + v_{i,t} \tag{4.11} \]
Second stage:
\[ \Delta \text{EnvIneq}_{i,t} = \alpha_i + \beta \Delta \widehat{\text{Gini}}_{i,t} + \Gamma \Delta X_{i,t} + \epsilon_{i,t} \tag{4.12} \]

where in either case $X_{i,t}$ is a vector of exogenous sociodemographic covariates that enter into both the first and second stages. These controls include median income, race, industry characteristics (proportion of employment in manufacturing), poverty, and transportation patterns (such as the percent of the population that commutes by car, and the average commute time). Like fixed-effects models, these first difference models will absorb any permanent features (terrain, climate, etc.) which will affect between-MSA variation in environmental inequality. The remaining variation in pollution exposure can be attributed to socio-demographic factors and economic activity.

Figure 21 and Table 17 summarize the first-stage results of the model. As Figure 21 shows, my simulated instrument for MSA-level income inequality is highly correlated with actual MSA-level income inequality.
The first column of Table 17 reports first-stage results for a model without MSA-specific linear trends, while the second column reports first-stage results for a model with MSA-specific linear trends. The estimated coefficient associated with the instrument is positive and greater than unity (1.302 and 1.600, respectively), and the F-statistics for the two first-stage specifications are 70.5 and 55.34, respectively. These F-statistics are well above the rule-of-thumb value of 10, which suggests that a weak first stage is not a problem. Together with the construction of the instrument, this supports interpreting the resulting estimates of β as causal. Additionally, Figure 22 summarizes the reduced-form effect of the simulated instrument on the outcomes of interest (average NOx exposure, the black-white gap, and the Kolm-Pollak and Atkinson indexes).

FIGURE 21. Actual Gini Coefficient as a function of Simulated Gini Instrument, for 265 MSAs, 2005–2011
[Figure panels: “First Stage (Simulated vs. Actual Gini)”, x-axis: Simulated Gini Coefficient, y-axis: Actual Gini Coefficient; “First Stage”, x-axis: Change in Simulated Gini Coefficient, y-axis: Change in Actual Gini Coefficient.]

TABLE 17. First Stage, key coefficient only

Dependent variable: MSA-Level Gini
                                 (1)          (2)
Simulated Gini                   1.302***     1.600***
                                 (0.175)      (0.263)
MSA-specific Linear Trend?       No           Yes
F-stat                           70.5         55.342
Observations                     1,578        1,578

Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors allow for MSA clustering. Other control variables not shown: homeownership rates, % linguistically isolated, mean commute time, mean # of bedrooms in housing stock, mean household size, % Latino, % black, % in school, % native, % Asian, % other race, % with a high school education, % with some college, % with a bachelor’s degree, % with postgraduate education, % female, unemployment rate, % non-English speakers, average Duncan Socioeconomic Index, % who commute by car, median age, median home value, % employed in 16 broad NAICS categories.

FIGURE 22. Reduced Form Visualizations Showing the Effect of Simulated Income Inequality on Pollution Exposure
[Figure panels, each with x-axis: Change in Simulated Gini Coefficient. “Reduced Form Effect on Average NOx”, y-axis: Change in Mean NOx Exposure; “Reduced Form Effect on Black-White Exposure Gap”, y-axis: Change in Black-White Exposure Gap; “Reduced Form Effect on Atkinson Index”, y-axis: Change in Environmental Atkinson Index; “Reduced Form Effect on Kolm-Pollak Index”, y-axis: Change in Kolm-Pollak Index.]

Results

I will consider three main sets of results. The previous literature has largely discussed the effect of income inequality on environmental degradation in terms of average exposure or emissions. Thus, to ensure comparability with previous results, I begin by examining the effect of MSA-level income inequality on average exposure to NOx. Second, I consider whether increases in MSA-level income inequality affect measures of environmental injustice (which I define as the disparity in average exposure across subgroups).
Third, I consider whether increases in income inequality affect measures of environmental inequality, which I define in terms of a functional of the distribution of pollution exposure. Examination of these last two relationships constitutes one of the main innovations in this research.

Table 18 reports the estimates of the key coefficient measuring the effect of income inequality on the average level of pollution exposure within a metropolitan area.15 The top panel in the table shows results from an IV model using the simulated Gini coefficient described above as an instrument for actual income inequality. For comparison, the bottom panel shows results from a naive OLS regression of average NOx exposure on the MSA-level Gini coefficient, ignoring potential endogeneity. The first column reports results for a model with no other covariates. The second column includes the full set of time-varying controls (see Table 17 for a full list). Finally, the third column reports results from a model with all time-varying controls and MSA-specific linear trends (equivalent to an MSA fixed effect in first differences). All models allow for MSA-level clustering in the standard errors.

15. For this and other specifications, coefficient estimates for additional control variables are not presented. Full tables of all parameter estimates are available in supplemental material.

TABLE 18. Effect of Income Inequality on Average NOx Exposure

                          (1)          (2)          (3)
IV Results:
Gini                      −3.772***    −5.698***    −6.405***
                          (1.346)      (1.681)      (2.111)
OLS Results:
Gini                      −0.330       −0.503       −0.507
                          (0.359)      (0.364)      (0.426)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

For specifications both with and without MSA-specific linear trends, the sign of the effect of income inequality on pollution exposure is negative, implying that rising income inequality decreases average exposure.16 These estimated effects are statistically significantly different from zero and quantitatively important. To put this in perspective, the average cumulative change in NOx exposure across metropolitan areas from 2005 to 2011 is −0.582 ppb. According to the results in the third column of Table 18, a one-Gini-point increase in income inequality (i.e., a 0.01 increase in the Gini coefficient) reduces average NOx exposure by about 0.064 parts per billion (ppb), which is approximately 11% of the average cumulative change in NOx concentration in my sample (from 2005 to 2011). A one-standard-deviation change in inequality (approximately 0.035) would decrease NOx concentrations by 0.22 ppb, which can account for approximately 38.5% of the average cumulative change in exposure over the study period.

16. There is a relatively large difference in the estimated coefficients between a model with no control variables and a model with the full set of control variables. This difference appears to be driven by population density and total population.

These results are consistent with some, but not all, of the literature on the environmental effects of income inequality. Ravallion et al. (2000) and Scruggs (1998), among others, find that greater income inequality decreases the average level of emissions or pollutant concentrations. However, a recent study by Zwickl and Moser (2015), which utilizes variation within regions of the United States over time as I do here, finds that income inequality, when measured by the Gini coefficient, increases average pollution exposure. Where my results differ from the previous literature, however, they have two distinct advantages that arguably make them more credible.
First, unique in this literature, my results directly address potential time-varying endogeneity due to household locational sorting. Second, my data on pollution exposure, derived from satellite observations, uniformly cover areas that are not directly observed by EPA monitoring stations. Measuring pollution exposure from space thus allows for more accurate measurement of the distribution of exposure.

Income inequality’s effect on the pollution exposure distribution is unlikely to be fully summarized by changes in the first moment of the exposure distribution, however. To move beyond the mean, I first examine how increasing income inequality within metropolitan areas affects measures of “environmental injustice,” which capture the differences in average exposure across sub-groups that are traditionally assumed to experience different degrees of advantage or disadvantage. Table 19 presents results from models using within-MSA differences in average exposure between African-Americans and whites as a dependent variable. The structure of Table 19 mirrors that of Table 18, with progressively richer sets of controls from left to right (no controls in the first column, control variables in the second, and control variables and MSA-specific trends in the third).

Regardless of whether the models include MSA-specific trends, greater income inequality increases the difference in exposure between blacks and whites. This pattern is also broadly apparent when examining the effect of increasing income inequality on the Latino-white exposure differences in Table 20 and the rich-poor exposure gap in Table 21. The effects of increasing income inequality on black-white differences are larger in terms of effect size, and are statistically different from zero in almost all models, while the effects of inequality on the rich-poor and Latino-white exposure gaps are smaller and somewhat less precisely estimated.

TABLE 19. Effect of Income Inequality on Black-White Exposure Gap

                          (1)          (2)          (3)
IV Results:
Gini                      1.000***     1.091***     1.024**
                          (0.320)      (0.382)      (0.455)
OLS Results:
Gini                      0.096        0.064        0.067
                          (0.069)      (0.075)      (0.089)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

TABLE 20. Effect of Income Inequality on Latino-White Exposure Gap

                          (1)          (2)          (3)
IV Results:
Gini                      0.595***     0.549**      0.684**
                          (0.217)      (0.228)      (0.285)
OLS Results:
Gini                      0.030        0.014        0.006
                          (0.054)      (0.058)      (0.069)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

TABLE 21. Effect of Income Inequality on Poor-Rich Exposure Gap

                          (1)          (2)          (3)
IV Results:
Gini                      0.478***     0.462**      0.488**
                          (0.183)      (0.194)      (0.239)
OLS Results:
Gini                      0.048        0.027        0.033
                          (0.038)      (0.041)      (0.052)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

As an alternative way of moving beyond the mean, I also consider whether rising income inequality affects the distribution of NOx exposure as summarized by measures of environmental inequality. Table 22 reports the results of models using my preferred measure of absolute environmental inequality, the Kolm-Pollak index. I select an environmental inequality aversion parameter κ = 0.5, which corresponds to relatively utilitarian social preferences. The Kolm-Pollak index is sensitive to the absolute differences in exposure between highly exposed and minimally exposed individuals, and is ethically similar to the environmental justice measures used previously in this paper. It is perhaps unsurprising that, consistent with the environmental justice results above, I find that greater income inequality increases absolute environmental inequality across all models.

TABLE 22. Effect of Income Inequality on Absolute Environmental Inequality

                          (1)          (2)          (3)
IV Results:
Gini                      0.283***     0.203**      0.150*
                          (0.098)      (0.092)      (0.090)
OLS Results:
Gini                      0.013        0.002        0.003
                          (0.012)      (0.012)      (0.013)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

The choice of an absolute environmental inequality aversion parameter, κ, that corresponds to relatively utilitarian preferences is conservative in that it produces the smallest estimated effect size. To illustrate the influence of this parameter choice on the results, I allow for more Rawlsian preferences by allowing the absolute environmental inequality aversion parameter to vary from κ = 0.5 to κ = 5. The size of the effect of income inequality on the Kolm-Pollak index of absolute environmental inequality is monotonically increasing in κ over this range.17 These results are summarized in Figure 23. Intuitively, as social preferences become more Rawlsian (as κ → ∞), these preferences are more sensitive to the fate of the most exposed individual.

FIGURE 23. Illustration of how the estimated effect of Income Inequality on Absolute Environmental Inequality changes as the assumed value of κ, capturing Absolute Environmental Inequality aversion, increases
[Figure panel: “Effect of Income Inequality on Kolm-Pollak Index”; x-axis: Assumed Absolute Environmental Inequality Aversion Parameter; y-axis: Estimated IV Effect Size.]

Table 23 reports results from models that use, alternatively, the Atkinson index of relative environmental inequality as a dependent variable. I find that increasing income inequality universally increases relative environmental inequality. This is unsurprising given the previous results on the effect of income inequality on the Kolm-Pollak index.
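The roles of the aversion parameter and of the two invariance properties can be made concrete with a short numerical sketch. The formulas below are assumptions of this sketch rather than a restatement of the dissertation’s definitions, which are given in an earlier chapter: the Kolm-Pollak index is written in a common form for a “bad” (the equally-distributed-equivalent exposure minus mean exposure), and the Atkinson index is the textbook income-style version, which may differ from the adaptation to pollution used in the estimates. The exposure values are hypothetical.

```python
import math


def kolm_pollak_index(exposures, kappa):
    """Absolute inequality for a 'bad': equally-distributed-equivalent (EDE) minus mean.

    Assumed form: EDE = (1/kappa) * ln(mean(exp(kappa * x))), with kappa > 0.
    """
    n = len(exposures)
    mean = sum(exposures) / n
    peak = max(exposures)
    # Compute the log-mean-exp relative to the maximum for numerical stability.
    log_mean_exp = math.log(sum(math.exp(kappa * (x - peak)) for x in exposures) / n)
    ede = peak + log_mean_exp / kappa
    return ede - mean


def atkinson_index(values, epsilon):
    """Textbook Atkinson index, 1 - EDE/mean, for 0 <= epsilon and epsilon != 1."""
    n = len(values)
    mean = sum(values) / n
    ede = (sum(v ** (1.0 - epsilon) for v in values) / n) ** (1.0 / (1.0 - epsilon))
    return 1.0 - ede / mean


exposure = [1.0, 2.0, 4.0, 7.0]            # hypothetical exposures
shifted = [x + 10.0 for x in exposure]     # everyone exposed to 10 more units
scaled = [x * 3.0 for x in exposure]       # everyone's exposure tripled

kp_low = kolm_pollak_index(exposure, 0.5)  # relatively utilitarian
kp_high = kolm_pollak_index(exposure, 5.0) # closer to Rawlsian
```

Under these assumed forms, adding the same amount to every exposure leaves the Kolm-Pollak index unchanged (translation invariance), multiplying every exposure by the same factor leaves the Atkinson index unchanged (scale invariance), and raising κ moves the equally-distributed-equivalent exposure toward that of the most exposed individual, so the index grows with κ.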
Recall that absolute environmental inequality indexes are translation-invariant and relative inequality indexes are scale-invariant. Hence, if increasing income inequality reduces average exposure in a manner that increases the differences between highly and minimally exposed individuals, then the ratio between the highly and minimally exposed individuals will also increase (so the Atkinson index would be expected to increase). These results use a relative inequality aversion parameter of α = 0.5, which again corresponds to relatively utilitarian social preferences. Paralleling the previous analysis, selecting larger values of α (more Rawlsian preferences) results in larger effect sizes, as shown in Figure 24.

17. This pattern continues for larger values of κ, although around κ = 30 the effect seems to level off.

TABLE 23. Effect of Income Inequality on Relative Environmental Inequality

                          (1)          (2)          (3)
IV Results:
Gini                      0.101***     0.106***     0.100***
                          (0.020)      (0.023)      (0.027)
OLS Results:
Gini                      0.008**      0.007*       0.006
                          (0.004)      (0.004)      (0.005)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

FIGURE 24. Effect of Income Inequality on Relative Environmental Inequality, varying Relative Environmental Inequality Aversion
[Figure panel: “Effect of Income Inequality on Atkinson Index”; x-axis: Assumed Relative Environmental Inequality Aversion Parameter; y-axis: Estimated IV Effect Size.]

Income inequality has a complex effect on environmental quality: it decreases the average level of NOx exposure within metropolitan areas, but it also causes this exposure to be more unequally distributed, measured either in terms of environmental justice (across race/class lines) or in terms of an environmental inequality index. This implies that the benefits of pollution reduction induced by greater income inequality accrue disproportionately to the most-advantaged. This does not necessarily mean, however, that income inequality makes the most-disadvantaged worse off in an absolute sense. Table 24 shows results of models using the within-MSA average exposure among African-Americans as a dependent variable. Increased income inequality decreases pollution exposure among African-Americans, across different specifications and for different sub-samples. Similarly, Table 25 summarizes results for models using average white exposure as a dependent variable. Rising income inequality decreases white exposure, and the estimated effect is larger than is the effect on average black exposure.

TABLE 24. Effect of Income Inequality on Average Black Exposure

                          (1)          (2)          (3)
IV Results:
Gini                      −3.311**     −4.951***    −5.680***
                          (1.436)      (1.759)      (2.190)
OLS Results:
Gini                      −0.243       −0.427       −0.423
                          (0.384)      (0.387)      (0.452)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

TABLE 25. Effect of Income Inequality on Average White Exposure

                          (1)          (2)          (3)
IV Results:
Gini                      −4.311***    −6.041***    −6.704***
                          (1.328)      (1.671)      (2.109)
OLS Results:
Gini                      −0.339       −0.491       −0.490
                          (0.355)      (0.360)      (0.422)
Observations              1,578        1,578        1,578
First Stage F             86.22        70.5         55.34
Control Variables?        No           Yes          Yes
MSA-specific Trend?       No           No           Yes

Notes: See Table 17 for further details.

Figure 25 illustrates how a five-Gini-point increase in income inequality (roughly the size of the observed change in the aggregate national Gini coefficient between 1990 and 2010) would affect a hypothetical distribution of NOx exposure.18 I take as given the estimated effect sizes in the second columns of Tables 24 and 25 as the race-specific marginal effects of income inequality on pollution exposure. I assume for simplicity that African-Americans and whites each experience a decrease in pollution exposure equal to the race-specific marginal effect multiplied by a five-Gini-point change in inequality. The overall result is depicted by the distributional changes between the top and bottom panels of Figure 25. An increase in income inequality decreases the average exposure of both groups in absolute terms, but increases the difference in average exposures across the two groups.

18. The hypothetical pollution distributions are generated by simulating distributions from a Generalized Beta distribution fitted to the aggregate national data in 2005 for whites and blacks separately.

FIGURE 25. Effect of an increase in income inequality on NOx exposure.
[Figure panels: “baseline” and “counterfactual” density plots by race (black, white), titled “Predicted effect of an increase in income inequality on distributions of NOx exposures for blacks and whites”; x-axis: NOx Exposure; y-axis: density.]
This illustrates the predicted effect of a 5 Gini-point increase in inequality (approximately the size of the increase from 1990 to 2010 nationally) on the distribution of NOx exposure among blacks and whites. I observe that (1) average exposure decreases for both groups and (2) the gap between the two groups increases in the counterfactual higher-income-inequality scenario. The baseline scenario is the smoothed national distribution of NOx exposure in 2005; the counterfactual assumes that the baseline distributions are “shifted” by the marginal effect (column 2 in Tables 24 and 25) for a 5 Gini-point change in inequality.

Mechanisms

Across a variety of model specifications, and using a number of different measurement tools for summarizing the NOx exposure distribution in each of 265 different MSAs, I have shown that increasing income inequality tends to reduce the average level of pollution exposure, but also increases exposure inequality and the differences in exposure between advantaged and disadvantaged subgroups.
My findings are inferred from reduced-form models, so they cannot directly address the underlying mechanisms by which these results might occur (i.e., the pathways through which changes in the income distribution might affect the pollution exposure distribution). I can, nonetheless, appeal to some other studies to suggest potential mechanisms for the effects I observe. There are several potential pathways from income inequality to environmental quality, including (1) differential demands for clean air across the income distribution, (2) residential or industrial sorting within MSAs, and (3) the political process working through legislator ideology or legislator voting behavior.

As noted, the Environmental Kuznets curve literature provides some insight as to how increasing income inequality might affect the level of pollution exposure. One can think of this in two ways. First, suppose that the NOx emission intensity of a household’s consumption bundle can be summarized by a “propensity to emit” function (PEF). Over the range of incomes in the United States, this function is assumed to be non-decreasing in income. If the PEF is concave in income, so that the marginal propensity to emit is decreasing with income, then an immediate conclusion is that regressive income transfers should decrease emissions. The marginal decrease in emissions by a poor household at one end of the regressive transfer is larger in absolute terms than the marginal increase in emissions by the rich household at the other end of the transfer. Although there is not much direct evidence on the concavity of the PEF, Levinson and O’Brien (2015) use US expenditure data to show that the closely related Environmental Engel curve (the pollution embedded in consumption along a household’s income expansion path) is concave in household income.
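A small numerical example may help fix the concavity argument. The sketch below is purely illustrative: the square-root “propensity to emit” function, the three household incomes and the size of the transfer are all hypothetical choices made for this example, not estimates from the dissertation or from Levinson and O’Brien (2015).

```python
import math


def propensity_to_emit(income):
    """Hypothetical PEF: non-decreasing and concave in income (square root)."""
    return math.sqrt(income)


def total_emissions(incomes):
    """Aggregate emissions implied by the hypothetical PEF."""
    return sum(propensity_to_emit(y) for y in incomes)


def regressive_transfer(incomes, poor, rich, amount):
    """Move `amount` of income from household `poor` to household `rich`."""
    out = list(incomes)
    out[poor] -= amount
    out[rich] += amount
    return out


before = [20_000.0, 50_000.0, 150_000.0]   # hypothetical household incomes
after = regressive_transfer(before, poor=0, rich=2, amount=5_000.0)

emissions_before = total_emissions(before)
emissions_after = total_emissions(after)
poor_change = propensity_to_emit(after[0]) - propensity_to_emit(before[0])
rich_change = propensity_to_emit(after[2]) - propensity_to_emit(before[2])
```

With any concave PEF of this kind, the fall in the poorer household’s emissions outweighs the rise in the richer household’s emissions, so total emissions decline even though total income is unchanged. The size of the decline depends entirely on the assumed functional form.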
A second potential pathway from increased income inequality to decreased average pollution exposure (and also increased environmental inequality) might occur through residential sorting within MSAs.19 If increases in income inequality are associated with differential rates of migration towards relatively more polluted areas by disadvantaged groups, perhaps due to property market dynamics, this might increase my measures of environmental inequality, despite decreasing pollution exposure on average. Matlack and Vigdor (2008) show, for example, that increases in income inequality within metropolitan areas lead to higher rents and home prices at the bottom of the home price distribution. This effect may induce poorer and more-disadvantaged households to “come to the nuisance” in search of cheaper housing (see also Banzhaf (2012)).

A final mechanism for the relationship between income inequality and environmental quality might work through the political system. This mechanism, however, is specific to the institutional arrangement of US politics, and may not generalize to other countries. Within the US political system, the party of the left (the Democrats) is generally considered to also be the party of the environment. In other research, Voorheis et al. (2015) investigate the effect of within-state income inequality on the average ideological position of political parties within state legislatures. Perhaps the most striking result in Voorheis et al. (2015) is that increasing income inequality moves state Democratic parties further to the left. Another small but growing literature in political science has suggested that income inequality may be connected to what is termed “unequal democracy”. As an exemplar, Bartels (2009) shows that US Senators are more responsive to the political opinions of rich constituents than to those of poor constituents.
This suggests a potential mechanism for the relationship between income inequality and pollution exposure observed in the present study. This political mechanism can be summarized as a two-stage process. First, if income inequality increases the responsiveness of legislators to rich constituents, and demand for clean air is increasing in income, then this may induce legislators to support more environmentally friendly legislation. Second, an increased likelihood of enacting or enforcing environmental regulation would be expected to decrease average pollution exposure.

19. My identification strategy addresses potential reverse causality from pollution exposure to income inequality due to between-MSA sorting. Within-MSA residential sorting poses no such endogeneity problems, since the Gini coefficient satisfies the property of anonymity. Within-MSA locational sorting will not affect the Gini coefficient for that MSA.

Inequality and LCV Scores

I can directly examine the first stage of the political mechanism proposed in the previous section by modeling how increasing income inequality affects the environmental voting record of elected legislators. However, this makes it necessary to change the geographic scale of analysis to account for the specifics of the US political system. Most environmental regulation occurs at the Federal level, making members of the US House and Senate the obvious legislators to analyze. For reasons of data availability, I will further restrict my attention to just the US Senate.20 To measure the environmental voting record of US senators, I will use the League of Conservation Voters’ National Environmental Scorecard.21 The LCV has published the scorecard annually since 1970. The scorecard gives each federal legislator (senator or member of the House of Representatives) a score between 0 and 100, based on their votes on a list of bills selected by the LCV as being important for the environment and natural resources.
A score of 100 represents a voting record in a given year where a legislator voted in agreement with the LCV on all bills, while a score of 0 represents a voting record where a legislator always voted against the LCV’s policy position. The LCV gives a separate score in each year a senator serves, so there can be non-trivial within-senator variation that can be exploited.22 Since there are different sets of bills that are voted on in each year, I follow the method proposed by Groseclose et al. (1999) for scaling nominal LCV scores to ensure comparability over time.

20. Income inequality data for US House districts is available only in the decennial Censuses or in the ACS 5-year files, while Chapter II of this dissertation produces State-level inequality measures annually for the period 1977-2014.

21. The nominal, unadjusted LCV scorecard data can be accessed via the LCV website, at: http://scorecard.lcv.org/

These scores are correlated with the ideology of legislators. The most commonly used data on the ideology of senators are the DW-NOMINATE23 scores, which place legislators on a scale from -1 (most left-wing) to 1 (most right-wing). Figure 26 summarizes the correlations between LCV scores and DW-NOMINATE scores. As expected, there is party polarization along the environmentalism dimension. To see this further, Figure 27 plots the histogram of LCV scores by party for each Senator serving between 1977 and 2014. Although the Republican party is substantially more unified on environmental issues than is the Democratic party, as can be seen by the pronounced peak in Republican LCV scores near zero, there is still substantial heterogeneity within both parties.

I will analyze the effect of state-level income inequality on these adjusted LCV scores for all senators who served during the period 1977-2014. State-level income inequality data is available over the same time period from Chapter II of this dissertation.
Unlike the MSA-level data used above, these state-level inequality measures are calculated using the Current Population Survey, and address Census Bureau censoring and potential under-reporting by modeling the right tail of the income distribution as following a Generalized Beta II distribution. I supplement these two variables of interest (State-level income inequality and LCV scores) with state-level demographic and economic information from the Current Population Survey and the Bureau of Economic Analysis’ national income accounts.

22. This within-legislator variation is in contrast to ideal-point estimates of legislator ideology which assume that legislators do not change positions over time.

23. Dynamically-Weighted Nominal Three-step Estimation

FIGURE 26. Correlation Between League of Conservation Voter Scores and Ideology, US senators
[Figure panel (a): “LCV Scores vs. DW-NOMINATE Scores, US Senate 1977-2014”; x-axis: Average DW-NOMINATE Score; y-axis: Average Adjusted LCV Score.]
Nominal League of Conservation Voter (LCV) scores represent the percentage of votes cast on environmental legislation by a senator that agree with the LCV position. The adjusted LCV Scores shown here rescale each year’s Nominal scores to make them comparable over time, and hence can be less than 0 or greater than 100. DW-NOMINATE scores represent the “ideal point” of a senator on a latent ideology scale ranging from -1 (most liberal) to 1 (most conservative).

FIGURE 27. Histogram of Adjusted LCV Scores by Party for US Senators, 1977-2014
[Figure panels by party (D, R): “Histogram of Adjusted LCV Scores by Party”; x-axis: Adjusted LCV Score; y-axis: count.]

I assume that each senator is potentially responsive to changes in income inequality within the state that he or she represents. As with the previous analysis of the effect of MSA-level inequality on environmental quality, and similar to Voorheis et al. (2015), there is potential endogeneity between income inequality and the outcome of interest (in this case, the environmental voting record of senators). In addition to the possibility of locational sorting by a senator’s constituents, the examination of the effect of inequality is potentially complicated by an additional source of endogeneity bias stemming from the distributional consequences of environmental policy.
To address these potential sources of endogeneity bias, I implement an instrumental variables identification strategy similar to that used in the earlier sections of this paper. As before, I require an instrument for state-level income inequality that is uncorrelated with unobserved determinants of changes in a senator’s voting record, but is correlated with the actual state-level income inequality experienced by that senator. I propose a version of the simulated inequality instrument used in the previous analysis. In the state-level setting, I construct the instrument by freezing each state’s income distribution in 1976 (the first year in which the state-level income inequality data are available), and simulating future counterfactual state income distributions based on nationwide trends in decile income growth. This instrument is constructed identically to the instrument used in Voorheis et al. (2015), with the sole difference being an earlier starting year.

I estimate the effect of income inequality on a senator’s LCV score using a two-stage least squares model in first differences:

\[ \text{First stage:}\quad \Delta \mathit{Gini}_{i,t} = \delta_i + \gamma \Delta \mathit{SynthGini}_{i,t} + \theta \Delta X_{i,t} + \nu_{i,t} \tag{4.13} \]

\[ \text{Second stage:}\quad \Delta \mathit{Score}_{i,t} = \alpha_i + \beta \Delta \widehat{\mathit{Gini}}_{i,t} + \Gamma \Delta X_{i,t} + \varepsilon_{i,t} \tag{4.14} \]

where $i$ now indexes senators, $t$ indexes time in years, $\alpha_i$ and $\delta_i$ are senator-specific linear trends, and $X_{i,t}$ is a vector of time-varying controls, including the proportion of the population that is black or Hispanic, the median age, median household income, poverty rates and educational attainment in the state each senator represents.

Table 26 summarizes the basic results from this estimation. The first two columns report the full-sample effect of income inequality on senators’ LCV scores without and with senator-specific linear trends, respectively. The final two columns report the effects (again without and then with senator-specific linear trends) for just the sub-sample of senators serving after 2004.
This latter subsample corresponds to the time period in the previous MSA-level results describing the effect of income inequality on the distribution of NOx exposure. Across all specifications and all subsets of the data considered, however, I find that rising income inequality seems to have a positive effect on a senator’s environmental voting record. Using results with the senator-specific trends, a one-Gini-point increase in income inequality would increase a senator’s adjusted LCV score by 1.6 points (i.e. this greater degree of income inequality would increase the proportion of times a senator agrees with the LCV by 1.6 percentage points).

TABLE 26. Effect of State Inequality on Senators’ LCV Scores

Dependent variable: Adjusted LCV Score

                           1977–2014                  2005–2014
                           (1)          (2)           (3)          (4)
Gini                       163.324∗∗∗   160.154∗∗     281.250∗     278.416
                           (60.356)     (64.949)      (158.769)    (193.750)
Control Variables?         Yes          Yes           Yes          Yes
Senator-specific Trend?    No           Yes           No           Yes
N                          3,356        3,356         742          742

Notes: ∗∗∗Significant at the 1 percent level. ∗∗Significant at the 5 percent level. ∗Significant at the 10 percent level. See Table 1 for other variables not shown here.

Politics within the US Senate has become increasingly one-dimensional over the period of the estimating sample: over 80% of roll call votes can be correctly classified based only on a senator’s DW-NOMINATE score, which captures conservative-liberal differences (see McCarty et al. (1997)). As politics has become more polarized, it is likely that the effect of income inequality on senators’ environmental voting records might also be polarized along party lines. Table 27 summarizes the effect of income inequality on the environmental voting records of senators, stratified by political party.[24] The first two columns report results for the subsample of Democratic senators (without and with senator-specific trends), while the final two columns report results for Republican senators. The effect of income inequality on the environmental voting record of Democratic senators is statistically significant, positive, and larger than the full sample results. On the other hand, the effect on the environmental voting record of Republican senators is negative, substantially smaller in absolute value, and very imprecisely estimated.

[24] There were 5 senators without formal party affiliation who were elected during this period. I exclude these senators from the party-stratification analysis.

TABLE 27. Effect of State Inequality on Senators’ LCV Scores, By Party

Dependent variable: Adjusted LCV Score

                           Democrats, 1977–2014       Republicans, 1977–2014
                           (1)          (2)           (3)          (4)
Gini                       499.237∗∗∗   495.279∗∗∗    −105.548∗    −102.157
                           (138.356)    (149.892)     (56.372)     (62.321)
Control Variables?         Yes          Yes           Yes          Yes
Senator-specific Trend?    No           Yes           No           Yes
N                          1,708        1,708         1,597        1,597

Notes: ∗∗∗Significant at the 1 percent level. ∗∗Significant at the 5 percent level. ∗Significant at the 10 percent level. See Table 1 for other variables not shown here.

These results imply that increasing income inequality improves the environmental voting record of Democratic senators, but has no discernible effect on Republican senators. If demand for environmental quality is increasing in income, this is consistent with the concept of “unequal democracy”, wherein the positions of elected officials are more closely aligned with the preferences of the rich than with those of the poor or middle class. In this case, an increase in income inequality, and hence an increase in the local bargaining power of the affluent, moves Democratic senators towards the typical position of the affluent in terms of more environmentally friendly legislation. The differential effect of inequality on Democratic and Republican senators is likely the result of differing underlying levels of environmentalism, or a different income-environmentalism gradient among party elites. This suggests that the political system may be a plausible pathway through which increases in income inequality affect the distribution of environmental amenities.
Note that the previous result, that the effects of income inequality on the distribution of NOx exposure were stronger for the subsample when Democrats controlled federal institutions, is consistent with the Democratic-party-specific political mechanism implied in these state-level results. To reiterate, however, these results are context-specific, and may not describe how changes in income inequality affect environmental policy at other levels of government (e.g. the restrictiveness of state regulations or local land use ordinances).

Conclusion

Viewing environmental justice from outer space, that is, using satellite remote-sensing data to infer the distribution of ground-level exposure to harmful pollutants like NOx, provides a viable way forward for the measurement and analysis of environmental justice and inequality. Leveraging these remotely sensed data to study environmental justice and environmental inequality is one important contribution of this study. The near-universal coverage and fine geographic resolution of the satellite data used in this study are far superior to the incomplete coverage (and the endogenous placement) of ground-monitor data so often used in previous studies.

I have shown that there is consistent and robust evidence that rising income inequality decreases the level of average NOx exposure within metropolitan areas. Increasing income inequality also appears to increase the differences in exposure between advantaged and disadvantaged groups, and increases measures of environmental inequality. This pattern implies that the benefits of pollution reduction tend to disproportionately accrue to the most advantaged. However, income inequality still decreases the absolute level of exposure among the most disadvantaged, whether disadvantage is defined in terms of class, race or degree of exposure.
These results hold for a variety of specifications, and for various normative assumptions underlying the empirical measurement of environmental inequality (such as the assumed degree of environmental inequality aversion in the social welfare function).

I offer some evidence that the effect of increasing income inequality on the distribution of environmental amenities may work through the political system. Using a state-level simulated instrument strategy similar to that employed elsewhere in this paper for MSAs, I use state-level data to identify the causal effect of income inequality on the environmental voting record of US senators. I find that income inequality increases the LCV scores of senators, and that this effect appears to be concentrated among Democratic senators.

NOx exposure is a major public health concern in the United States and Europe, contributing to thousands of premature deaths. The fact documented here, that rising income inequality appears to mitigate this exposure to a measurable extent, suggests that there may be unanticipated local effects of inequality that have not previously been recognized. These reductions in exposure are not equally shared, because greater income inequality increases the difference in exposure between advantaged and disadvantaged groups. These results add nuance to the discussion of the effects of increasing inequality in the United States. One particularly important implication of this research is that the effect of inequality on the distribution of pollution exposure may lead to a further propagation of inequality into the future, due to the widening gap in exposure between children born to rich and poor parents.

There is considerable further work to be done on this topic. In particular, I have shown an effect only for one pollutant. Using satellite data and/or monitoring data to examine the relationship for other important pollutants, including particulate matter and ozone, may be instructive.
It will also be relevant to examine the relationship between inequality and pollution exposure for other countries and regions, including the EU, India and China. Further investigating the possible mechanisms behind the observed reduced-form relationship between income inequality and the environment is another important direction for research.

CHAPTER V

CONCLUSION

In the three substantive chapters of this dissertation, I have shown how rising income inequality may be an important driver of improvements in environmental quality and, more sinisterly, inequality in the distribution of environmental disamenities. These results are highly relevant in the current policy environment, in which rising income inequality and the environmental challenges of climate change and the health impacts of air pollution are salient topics. These chapters add to several important literatures across sub-fields within economics, introduce new data and modelling tools, and extend identification and inference strategies in ways that will benefit future researchers.

Chapter II adds to the stock of knowledge about one of the most pressing concerns of the current time: the growth in income inequality. The new datasets of income inequality measures at the state and metropolitan area level represent an extension of the literature on using survey data to measure income inequality, specifically by extending the method of using the Generalized Beta II distribution to address topcoding and under-reporting first proposed by Burkhauser et al. (2011). Using these data, I also contribute to understanding the determinants of rising inequality, identifying declining unionization and reductions in top marginal tax rates as the most important among potential explanations. I find less compelling evidence for explanations relying on changes in human capital or skill-biased technological change. This bolsters claims by, among others, DiNardo et al. (1996) and Piketty et al.
(2014) that the recent rise in inequality may be the result of policy, such as changes in tax rates and legislation designed to reduce the bargaining power of unions.

Chapter III provides two main contributions to the literature on environmental inequality and environmental justice. First, I leverage satellite-derived remote sensing data to measure the distribution of exposure to two important pollutants. Although the use of satellite data to measure pollution exposure is common in other fields, it has not yet become common in the environmental economics literature (this chapter and Chapter IV represent the first use of these data in economics to my knowledge). Satellite data provide substantially better spatial coverage than data derived from ground monitoring stations, and provide a better picture of actual exposure than conventional air quality models, since the satellite data capture ground-level concentrations deriving from all sources, whereas most air quality models only capture exposure to pollution from stationary emissions sources. In addition to bringing this potentially quite valuable data source into the economics literature, I also contribute to the literature on measuring environmental inequality. I propose a dashboard approach to measuring environmental inequality, combining the vertical equity approach of Sheriff and Maguire (2014) with the horizontal equity approach of the environmental justice literature (e.g. Mohai et al. (2009)). I use the satellite data to calculate the various environmental inequality measures that compose this “dashboard” for the entire United States annually over the period 1998-2014.

Chapter IV combines the datasets introduced in Chapters II and III to examine whether rising income inequality within metropolitan areas might be related to changes in the distribution of pollution exposure.
Identifying the causal effect of inequality on an outcome such as pollution requires care, as locational sorting may bias naive OLS estimates that do not account for reverse causality. I extend an approach to causal inference on the effects of income inequality first proposed in Boustan et al. (2013), in which I simulate an instrument for MSA-level income inequality using national level trends in income growth at deciles of the income distribution. This approach is an important tool in the small but growing literature on the causal effects of rising income inequality on other outcomes of interest. Using the new datasets produced in the previous chapters with the causal identification approach presents a way forward for the literature on the environmental effects of income inequality, which, since Boyce (1994), has been plagued by insufficient or incomplete data and a lack of clean identification. I also examine a potential political economy explanation for the environmental effects of inequality. This adds to the political economy literature on the connection between income inequality and the political process (e.g. McCarty et al. (1997), Bartels (2009)), and connects this political economy literature to the literature on environmental inequality and the environmental effects of income inequality.

The results presented in this dissertation represent the core of a broader research agenda examining interrelations among income inequality, the environment and the political process. One important project running parallel to this dissertation is Voorheis et al. (2015), an examination of the effect of income inequality on political polarization in US state legislatures. Using the data from Chapter II and a similar identification strategy to that used in Chapter IV, we find that income inequality increases political polarization by inducing the replacement of moderate Democratic legislators with more conservative Republican legislators.
We provide suggestive evidence that campaign contributions may serve as a potential mechanism for this effect, by allowing the effect of income inequality on polarization to vary systematically in state-years in which pre-Citizens United caps on independent campaign expenditures were in effect. I further examine the effect of income inequality on the environment in Voorheis (2016), which presents evidence that income inequality might lead to a reduction in carbon emissions, a result consistent with Chapter IV.

APPENDIX A

MEASURING STATE INCOME INEQUALITY BY COMBINING CPS AND IRS DATA

Introduction

We have income information from two sources: public use microdata from the Current Population Survey, and data on the number of tax returns and adjusted gross income by income band. The first substantive chapter of this dissertation uses the former to estimate measures of state-level income inequality, while Frank (2009) uses the latter. Each data source has drawbacks. The CPS data may incorrectly capture top incomes for two reasons: (1) top earners may systematically under-report (or differentially non-respond), and (2) the Census Bureau censors incomes above a certain threshold (“topcoding”). Both of these will bias estimates of income inequality downwards. The IRS data, on the other hand, contain no information on the income of non-filers. Estimates of inequality from IRS data will likely overstate the level of inequality, especially when inequality is captured by top income shares.

One obvious way to improve on estimates of inequality using each of the above data sources individually is to combine information on top incomes from tax return data with information on the rest of the income distribution using survey data.
Atkinson (2007) and Alvaredo (2011) suggest an approach for the Gini coefficient based on the approximation that, for top income share $S$, the Gini coefficient is approximately $G^{*}(1 - S) + S$, where $G^{*}$ is the Gini coefficient estimated for the non-top incomes. Diaz-Bazan (2015) shows that it is possible to estimate the population Gini coefficient by combining estimates of the Gini coefficient computed using conditional distributions estimated from survey and tax data.

We propose an alternative, but similar, approach to this problem which improves on the previous methods for combining survey and tax data in two dimensions. This approach relies on simulating top incomes rather than relying on an approximation based on asymptotic behavior. This means that in practice this method can be used to estimate any income inequality measure, not just the Gini coefficient. Additionally, as in Flachaire and Davidson (2007), it is possible to perform inference via semi-parametric bootstrapping and calculate point estimates of inequality measures simultaneously.

I proceed as follows: I describe the data in detail and the steps necessary to make income information in the CPS comparable to income information in the IRS data. I describe the proposed method for combining information from the two series, as well as an approach to selecting the optimal cutoff percentile. I then present estimates of inequality using this method for US states, and compare these estimates to estimates calculated using only information from CPS data.

Methodology

In order to combine the information from the two data sources, I need to first accomplish three tasks. First, I need to ensure that the income concepts in the two data sources are identical to ensure comparability. Second, I must identify the point at which top incomes may be censored or underreported in the survey data.
Finally, I need to translate information on top incomes from the tax return data back to the survey data in order to estimate income inequality.

Microdata from the March Current Population Survey provide information on income from a variety of sources for each member of a household. These sources include earnings (wages, small business income, farm income), unearned income (rent, interest, dividends) as well as taxable and non-taxable transfer income (unemployment insurance, social security and AFDC/TANF). The summary data available from the IRS report “Adjusted Gross Income” (broad income), which is the sum of all taxable gross income less a set of pre-defined allowances.

In order to incorporate information from both sources, it is necessary to transform the CPS income data to conform with the IRS AGI definition. I do this in two steps. I first form “tax units” from the CPS households as follows. I first identify all dependents in each year’s sample (all children under 18 and all students currently in school under the age of 25). I then assign these children to the head of household (or to their parent if their parent is not the head). I then define all non-dependents in a household as either married, single or head of household filers depending on their marital status and whether they have children. Finally, I define all dependents with total personal income above a threshold ($3,000) as dependent taxpayers. For each of the tax units I form, I simulate AGI using NBER’s TAXSIM, using the income information in the CPS. Since the CPS contains essentially no information on “above-the-line” deductions, this may overestimate AGI.

Next, I determine the point past which I believe CPS income data are either censored (topcoded) or under-reported; this will determine the cutpoint at which I will use IRS information. Heuristically, I search for a point at which the CPS distribution and the IRS distribution “match up”.
For each percentile $p \in \{0.9, 0.901, 0.902, \ldots, 0.999\}$, I calculate the threshold income at this percentile in the CPS AGI distribution, and estimate the percentile $p^{P}$ at which the CPS threshold income falls in the IRS AGI distribution using Pareto interpolation. I choose the cutpoint, past which I use only IRS information, as the percentile $p$ that minimizes $|p - p^{P}|$.

Finally, I combine the information from the survey and tax return data to estimate measures of inequality by simulating incomes above the optimally chosen percentile $p$. For each state and year, I estimate the Pareto parameter $\alpha$ using IRS data, again via Pareto interpolation. I then simulate $n$ partially synthetic income distributions. For each replication, I keep all the CPS tax units below the cutoff $p$, and replace all tax units above $p$ with random draws from a Pareto distribution with shape parameter $\alpha$ and location parameter equal to the threshold income at $p$. I calculate income inequality measures $\nu(F_i)$ for each partially synthetic distribution. The resulting estimate of inequality, considering information from both the survey and tax return data, is

$$\hat{\nu} = \frac{1}{n} \sum_{i=1}^{n} \nu(F_i)$$

Data and Results

I will compare estimates of two measures of income inequality (the Gini coefficient and the top 1%’s share of income) using the IRS Pareto imputation against two baselines. I first compare inequality measures estimated via the IRS Pareto imputation with estimates calculated using just the CPS data with no corrections for topcoding, and second with the estimates from Frank (2009), which use just the IRS data. The readily available IRS SOI summary data used in the Pareto imputation span 1997-2012, a relatively short period. Data for years before 1997 were published in the SOI Bulletin, although they are not made freely available in machine readable form by the IRS.
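The simulation-and-averaging step described in the Methodology section above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the code used to produce the estimates reported here: it ignores CPS sampling weights, takes the Pareto shape parameter as given rather than estimating it from IRS data, and the function names (`gini`, `top_share`, `pareto_imputed_estimate`) are hypothetical.

```python
import numpy as np

def gini(x):
    """Gini coefficient of an unweighted sample (sorted-rank formula)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def top_share(x, top=0.01):
    """Share of total income held by the top `top` fraction of units."""
    x = np.sort(np.asarray(x, dtype=float))
    k = max(1, int(round(top * x.size)))
    return x[-k:].sum() / x.sum()

def pareto_imputed_estimate(incomes, cutoff_pct, alpha, measure,
                            n_reps=100, seed=0):
    """Average an inequality measure over partially synthetic samples.

    Units above the cutoff percentile are replaced by Pareto draws with
    shape `alpha` and location equal to the threshold income at the cutoff.
    """
    rng = np.random.default_rng(seed)
    incomes = np.asarray(incomes, dtype=float)
    threshold = np.quantile(incomes, cutoff_pct)
    below = incomes[incomes <= threshold]   # kept as observed
    n_top = incomes.size - below.size       # number of units to replace
    estimates = []
    for _ in range(n_reps):
        # Inverse-CDF Pareto draw; 1 - u lies in (0, 1], so no division by zero.
        u = rng.random(n_top)
        tail = threshold * (1.0 - u) ** (-1.0 / alpha)
        estimates.append(measure(np.concatenate([below, tail])))
    return float(np.mean(estimates))
```

Because `measure` is passed in as a function, the same routine works for the Gini coefficient, the top 1% share, or any other inequality index, which is the flexibility the simulation approach is meant to provide.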
Figure 28 shows the estimates of the Gini coefficient estimated with the IRS Pareto simulation method (in blue) and with just CPS microdata (with no correction for topcoding, in red). For all states, the Gini coefficient estimated via the Pareto simulation process is higher than the Gini estimated using just CPS data. There is a wide degree of heterogeneity in the degree to which the Pareto simulation estimates diverge from the CPS-only estimates, however. Unsurprisingly, states which are relatively richer on average (e.g. Connecticut, New York) exhibit the widest differential between the two estimates, while states which are poorer on average (e.g. Mississippi) exhibit only small differences. For many (though not all) states, the trends in the Gini coefficient are largely similar for the two measures.

FIGURE 28. State Gini Coefficients, 1997-2012: IRS Simulation vs. CPS Baseline

[Figure: one panel per state; x-axis: year; y-axis: Gini Index; series: CPS, IRS Pareto.]

Figure 29 shows the estimates of inequality measured by the top 1%’s share of income, again comparing estimates using the Pareto simulation to estimates using only CPS data. As with the Gini coefficient, using the Pareto simulation method substantially increases estimates of the top 1% share relative to the CPS-only baseline. Unlike the Gini coefficient case, however, the Pareto simulation estimates deviate from the CPS baseline not just in the level of inequality, but also in trends over time for many states. This is especially true in the last few years of the sample, where most states exhibit increasing top 1% shares from 2009-2012 using the Pareto simulation method, while top 1% shares are flat or falling in the CPS baseline.

FIGURE 29. State Top 1% Shares, 1997-2012: IRS Simulation vs. CPS Baseline

[Figure: one panel per state; x-axis: year; y-axis: Top 1% Share; series: CPS, IRS Pareto.]

Figures 30 and 31 compare IRS simulation method estimates of the Gini coefficient and top 1% share, respectively, with estimates from Frank (2009), which are estimated using only IRS data.[1] The Pareto simulation estimates of the top 1% share are much closer to the Frank estimates than they are to the CPS baseline. The largest deviations occur between 2004 and 2008 for most states, where the Frank estimates of the top 1% share are much higher than the IRS simulation results. The Frank estimates of the Gini coefficient deviate much more considerably from the Pareto simulation results, however, both in level and trends. Overall, as expected, the IRS simulation method seems to produce estimates that are a compromise between the IRS and CPS estimates. It is notable that the IRS simulation method can roughly match the level and trends from the updated state inequality dataset from Frank (2009).

[1] This dataset has been updated through 2013, and is available at http://www.shsu.edu/eco_mwf/inequality.html

FIGURE 30. State Gini Coefficient, 1997-2012: IRS Simulation vs. Frank (2009)

[Figure: one panel per state; x-axis: year; y-axis: Gini Index; series: Frank (IRS), IRS Pareto Imputation.]

FIGURE 31. State Top 1% Shares, 1997-2012: IRS Simulation vs. Frank (2009)

[Figure: one panel per state; x-axis: year; y-axis: Top 1% Share; series: Frank (IRS), IRS Pareto Imputation.]

The IRS simulation method is only feasible for circumstances where the income definitions in survey data can be aligned to the IRS definition, and where summary data on tax returns and aggregate AGI by income group are available. In cases where the desired income concept is not tax unit AGI, such as equivalized household income, or a broad income concept that includes in-kind transfers, it is not clear how to integrate information on the tail of the AGI distribution into an estimate of inequality in another distribution. Nonetheless, concerns about under-reporting and censoring in the survey data are still salient in these cases. One way to address these concerns is to use a semi-parametric method similar to the Pareto simulation method described above using only survey data.
I will use a specific version of this survey-data-only approach to estimate state inequality and compare it to the Pareto simulation results produced above to judge the quality of the survey-data-only approach.

One specific way to address potentially downward biased top incomes in the survey data is to use a multiple imputation approach. In this approach, a Generalized Beta II distribution is fitted to the national income distribution from the CPS microdata in each year. A cutoff is selected, past which incomes are believed to be censored or underreported. To maintain comparability with the Pareto simulation estimates above, I will use the cutoffs identified above using the “lining up” method. The two data sources line up, on average, at about the 97.5th percentile. This, and the fact that it is only possible to use the “lining up” method for AGI income, supports the use of the 97.5th percentile as a cutoff for non-AGI income definitions (as in Chapter II of this dissertation). Then, as in the Pareto simulation method above, $N$ partially synthetic datasets are formed; each partially synthetic dataset is formed by concatenating all incomes below the cutoff with $n_1$ draws from the fitted GB2 distribution, where $n_1$ is the number of incomes above the cutoff. Inequality measures $\nu(F_i)$ are calculated from each partially synthetic dataset, and the resulting estimate of inequality is again

$$\hat{\nu} = \frac{1}{N} \sum_{i=1}^{N} \nu(F_i)$$

Figure 32 compares the Pareto simulation estimates of the Gini coefficient with estimates produced using the GB2 approach using only CPS data. Remarkably, the GB2 simulation method produces estimates that very closely match the level and trends in inequality estimated using the Pareto simulation approach, but it does so using only information available in the CPS. The one notable exception appears to be Wyoming, for which the GB2 method spectacularly fails.
This is likely due to the combination of low population and extreme inequality in the right tail of the income distribution that is unique to Wyoming.[2] Figure 33 compares Pareto simulation estimates of the top 1% share to the GB2 simulation estimates. Here the GB2 estimates are not quite as impressively matched. For many states, the GB2 estimates are relatively close to the Pareto simulation estimates, although not nearly as close as the Gini estimates are. Nonetheless, the GB2 estimates are a substantial improvement on the baseline CPS estimates with no correction for topcoding or underreporting.

[2] In 2012, the average income of the top 1% in Wyoming was $4,844,205, the highest in the nation, despite the fact that Wyoming has the smallest population of any state.

FIGURE 32. State Gini Coefficient, 1997-2012: IRS Simulation vs. GB2 Simulation

[Figure: one panel per state; x-axis: year; y-axis: Gini Index; series: GB2, IRS Pareto.]

FIGURE 33. State Top 1% Shares, 1997-2012: IRS Simulation vs. GB2 Simulation

[Figure: one panel per state; x-axis: year; y-axis: Top 1% Share; series: GB2, IRS Pareto.]

Pareto Interpolation

In order to estimate the Pareto parameter $\alpha$ using IRS summary data, I use Pareto interpolation. I describe the process using a single state (Alabama) in a single year (2012) by way of example. This process and notation are adapted from Sommeiller and Price (2014). Table 28 shows the information available in Historical Table 2 in the Statistics of Income Bulletin for Alabama. Pareto interpolation allows us to calculate the threshold income at a given percentile $p$, and to estimate the Pareto shape parameter $\alpha$ past this threshold. To do the interpolation, I need to proceed in steps. First, define $s_i$ as the lower bound of each income bin $i$. Next, define $N_i^{*}$ and $Y_i^{*}$ as the cumulative sums of Returns ($N_j$) and AGI ($Y_j$), from the highest bin down to bin $i$:

$$N_i^{*} = \sum_{j=i}^{9} N_j \qquad\qquad Y_i^{*} = \sum_{j=i}^{9} Y_j$$

TABLE 28. Alabama Total AGI and Number of Returns, by Size of AGI, 2012

Income Bin                  Number of Returns    AGI
1 under 10,000              340,890              −116,087,000
10,000 under 25,000         575,210              9,742,551,000
25,000 under 50,000         491,320              17,622,052,000
50,000 under 75,000         254,210              15,629,937,000
75,000 under 100,000        159,820              13,821,966,000
100,000 under 200,000       182,240              24,071,292,000
200,000 under 500,000       37,950               10,701,664,000
500,000 under 1,000,000     6,280                4,271,645,000
1,000,000 or more           2,970                8,525,047,000

Define $y_i = Y_i^{*} / N_i^{*}$.
Then I can define $b_i = y_i / s_i$, and hence the Pareto shape parameter is $\alpha_i = b_i / (b_i - 1)$. Let $p_i = N^*_i / N^*_1$ (the share of all returns with AGI at or above $s_i$) and $k_i = s_i \, p_i^{1/\alpha_i}$. I can finally derive an expression for the threshold income value at percentile $p$:

$$TI(p) = \frac{k_i}{(1 - p)^{1/\alpha_i}}$$

Or, equivalently, if I know a threshold income, the corresponding percentile is

$$p = 1 - \left( \frac{TI(p)}{k_i} \right)^{-\alpha_i}$$

This process yields estimates of $TI(p)$, $p$, and $\alpha_i$ using information from each income bin. The final step in the interpolation process is to select which income bin's estimates are to be used. To do this, I select the income bin for which $|p - p_i|$ is minimized (in other words, the bin whose cumulative share of returns with higher incomes is closest to the share of returns above percentile $p$; because $p_i$ is measured from the top of the distribution, the distances reported in Table 30 are consistent with comparing $p_i$ to $1 - p$).

I illustrate how this Pareto interpolation method can be used to improve estimates of inequality. Suppose that I know that incomes above the 99th percentile are understated in the CPS data. For Alabama in 2012, the 99th percentile of AGI in the CPS data is $215,527.10. To improve these estimates, I would like to replace incomes above this threshold with draws from the Pareto distribution implied by the IRS data. To do this, I first calculate the percentile at which the CPS cutoff income falls in the IRS AGI distribution, $p^P_i$, and then select the income bin for which $|p^P_i - p_i|$ is minimized. The Pareto parameter $\alpha_i$ for the selected bin can then be used to simulate top incomes. Table 29 summarizes the intermediate calculations of $y_i$, $b_i$, $\alpha_i$, $p_i$, and $k_i$, and Table 30 summarizes the calculation of the cutoff percentiles $p^P_i$, the choice of income bin, and the Pareto parameter selected. In this example the distance is smallest (0.003) for the 200,000 under 500,000 bin, so $\alpha_i = 1.671$ is selected.

TABLE 29. Intermediate Calculations in the Pareto Interpolation Process

Income Bin                          y_i             b_i    α_i     p_i           k_i
1 under 10,000               50,841.380   5,084,137.000  1.000       1         0.010
10,000 under 25,000          61,044.540           6.104  1.196   0.834     8,589.864
25,000 under 50,000          83,401.870           3.336  1.428   0.553    16,518.010
50,000 under 75,000         119,697.200           2.394  1.717   0.314    25,459.060
75,000 under 100,000        157,713.600           2.103  1.907   0.190    31,373.360
100,000 under 200,000       207,329.400           2.073  1.932   0.112    32,177.260
200,000 under 500,000       497,846.500           2.489  1.671   0.023    20,944.210
500,000 under 1,000,000   1,383,426.000           2.767  1.566   0.005    15,885.180
1,000,000 or more         2,870,386.000           2.870  1.535   0.001    14,123.570

TABLE 30. Selecting the Pareto Parameter

Income Bin                 p^P_i   |p^P_i − p_i|     α_i
1 under 10,000             1.000           1.000   1.000
10,000 under 25,000        0.979           0.813   1.196
25,000 under 50,000        0.974           0.528   1.428
50,000 under 75,000        0.974           0.288   1.717
75,000 under 100,000       0.975           0.164   1.907
100,000 under 200,000      0.975           0.086   1.932
200,000 under 500,000      0.980           0.003   1.671
500,000 under 1,000,000    0.983           0.012   1.566
1,000,000 or more          0.985           0.014   1.535

Conclusion

Previous estimates of income inequality within US states have relied on data from either the Current Population Survey or IRS tax returns. However, each data source has conceptual shortcomings. I have described an approach that incorporates information from both sources in a way that addresses the shortcomings while preserving the advantages. This method produces estimates of income inequality which are generally much larger than the naive results estimated from CPS data, while generally somewhat smaller than estimates using just IRS data. Additionally, I show that I can closely match the level and trends in the Gini coefficient estimated using information from both the IRS and the CPS using a semi-parametric method that uses only CPS data.
This suggests that CPS-based estimates of income inequality using income definitions other than tax unit Adjusted Gross Income should be regarded with significantly less skepticism than that with which they are often greeted.

APPENDIX B

STATE AND MSA INCOME INEQUALITY IN THE AMERICAN COMMUNITY SURVEY

As noted earlier, the issues presented by topcoding are most severe in the Current Population Survey. However, topcoding is not limited to the CPS. The decennial Census and the American Community Survey1 both topcode incomes, although at a much higher threshold than in the CPS. The Census/ACS also has much less disaggregated income information, collecting data on only eight income sources. Each income source has a state-specific topcode in the public use data equal to the 99.5th percentile of that income source's distribution in a given state. Additionally, there is a hard topcode of $999,999 in the internal Census data. For the 1990 and 2000 decennial Censuses, and for the ACS from 2005-2011, the public use data imputes topcoded incomes with state-specific cell means calculated in a similar manner to the CPS cell means, although the cells are state-based rather than demographic-based.

Applying the GB2 multiple imputation methodology to Census and ACS data is then relatively straightforward. It should be noted, however, that there are slight differences in sample design between the CPS and Census/ACS that must be addressed to make the two series comparable.2 I calculate the same scalar inequality metrics and Lorenz ordinates for the Census and ACS microdata as those produced from the CPS microdata.

1The ACS can be thought of as a successor to the decennial Census, since it replaces the "long form" survey after 2000. The ACS is smaller, providing 1% samples of the US each year, compared to the 5% sample available in the public use long form Census data.

2The chief difference here is that the ACS/Census and CPS have different coverage of individuals in group quarters. Following I dropped observations from the ACS and Census coded as living in institutions, old age homes or prisons.

Figure 34 visualizes the point estimates and bootstrap confidence intervals for the change in Lorenz curve ordinates from 2005-2011 using ACS. I see that for almost all states, the size of the change in Lorenz ordinates reaches a local minimum around the middle of the income distribution (generally between the 50th and 75th percentiles). This is very strong evidence against a pure "top incomes" story. It appears that not just the top 1%, but much of the top half of the distribution, is relatively better off. This also implies that the changes in inequality since 2005 are characterized not just by income gains among the top 1%, but by the fact that the bottom half of the distribution is relatively worse off than it was a decade ago. This important fact would be missed by analyses that focus solely on the share of income going to the top 1% of the distribution.

FIGURE 34. Lorenz Curve Results, ACS (2005-2011)

APPENDIX C

DECOMPOSITION ANALYSIS OF CHANGES IN ENVIRONMENTAL INEQUALITY

RIF regressions can be used to decompose the changes between two different distributions. Suppose that I want to analyze the difference in some functional $\nu$ between two distributions $A$ and $B$. Firpo et al. (2011) show that the overall change in the functional $\nu_B - \nu_A$ can be decomposed as

$$\nu_B - \nu_A = \sum_{k=1}^{K} \bar{X}_{B,k} \left( \hat{\beta}_{B,k} - \hat{\beta}_{A,k} \right) + \sum_{k=1}^{K} \hat{\beta}_{A,k} \left( \bar{X}_{B,k} - \bar{X}_{A,k} \right)$$

where $\bar{X}_{i,k}$ is the mean of the $k$th demographic covariate in distribution $i \in \{A, B\}$, and the $\hat{\beta}_{i,k}$ are the parameter estimates from an RIF regression using distribution $i$ data.
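The two sums in this decomposition can be illustrated with a short numerical sketch. All inputs below are hypothetical numbers chosen for illustration, not estimates from the dissertation, and the function name `decompose` is an assumption. The sketch also assumes the regression constant is treated as a covariate with mean 1, so that each $\nu_i$ equals $\sum_k \bar{X}_{i,k} \hat{\beta}_{i,k}$.

```python
def decompose(x_bar_a, x_bar_b, beta_a, beta_b):
    """Split nu_B - nu_A into the two sums of the decomposition.

    Each argument is a sequence indexed by covariate k. Returns the first
    sum (differences in coefficients, weighted by B means), the second sum
    (differences in means, weighted by A coefficients), and the per-covariate
    terms of the second sum.
    """
    first_sum = sum(xb * (bb - ba)
                    for xb, ba, bb in zip(x_bar_b, beta_a, beta_b))
    second_terms = [ba * (xb - xa)
                    for xa, xb, ba in zip(x_bar_a, x_bar_b, beta_a)]
    return first_sum, sum(second_terms), second_terms


# Hypothetical inputs: a constant plus two covariate shares.
x_bar_a = [1.0, 0.20, 0.10]
x_bar_b = [1.0, 0.25, 0.12]
beta_a = [10.0, 4.0, -2.0]
beta_b = [9.0, 5.0, -1.0]

first_sum, second_sum, second_terms = decompose(x_bar_a, x_bar_b, beta_a, beta_b)

# The two sums add up to the total change in the functional.
nu_a = sum(x * b for x, b in zip(x_bar_a, beta_a))
nu_b = sum(x * b for x, b in zip(x_bar_b, beta_b))
print(first_sum, second_sum, nu_b - nu_a)
```

Under these assumptions the two sums add up exactly to $\nu_B - \nu_A$, and the per-covariate terms of the second sum show how much each covariate's change in mean contributes.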
Borrowing terminology from Oaxaca-Blinder wage decompositions, the first term in the decomposition is the "structure effect," where some part of the change is due to a change in parameters, and the second is the "composition effect," where some part is due to a change in variables. In the aggregate, these two effects can be viewed as the unexplained and explained variation in the change in the distributional functional of interest. Additionally, each element of the aggregate composition effect can be examined individually (this is often termed a "detailed decomposition"). The structure effect captures all of the variation in environmental inequality that is not due to changes in observable demographic characteristics. The estimated structure effect in a decomposition of the change in environmental inequality in 2005 vs. 2011 can then be thought of as an upper bound on the effect of environmental policy that applies to all census tracts.

I decompose the difference in each of the distributional functionals of interest between 2005 and 2011, providing a detailed decomposition of the composition effect, which describes how changes in each sociodemographic variable contribute to the overall change in the functional.1 The aggregate structure effect then captures all variation that is not explained by sociodemographic and economic factors. One unexplained factor that I cannot separately identify is the aggregate influence of environmental policies (chiefly, the continuing regulations associated with the Clean Air Act). The structure effect can be interpreted as a rough upper bound of the aggregate effect of environmental policy on environmental inequality.

Tables 31 and 32 summarize the results of an Oaxaca-Blinder style decomposition using the quantile points as distributional functionals for NOx exposure and PM2.5 exposure respectively.
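When the functional is a quantile point, the dependent variable in an RIF regression is the recentered influence function of that quantile. The sketch below assumes the standard form $RIF(y; q_\tau) = q_\tau + (\tau - 1\{y \le q_\tau\}) / f(q_\tau)$. The Gaussian kernel, the normal-reference bandwidth, the sample-quantile convention, and the exposure values are all illustrative assumptions, not necessarily the choices made for the tables here.

```python
import math
import statistics


def rif_quantile(values, tau):
    """Recentered influence function of the tau-th quantile, per observation."""
    ordered = sorted(values)
    n = len(ordered)
    # Sample quantile: smallest value with at least tau * n observations
    # at or below it.
    q = ordered[max(math.ceil(tau * n) - 1, 0)]
    # Gaussian kernel density estimate at q, with the normal-reference
    # rule-of-thumb bandwidth 1.06 * sd * n^(-1/5).
    h = 1.06 * statistics.stdev(values) * n ** (-0.2)
    density = sum(math.exp(-0.5 * ((v - q) / h) ** 2) for v in values)
    density /= n * h * math.sqrt(2.0 * math.pi)
    return [q + (tau - (1.0 if v <= q else 0.0)) / density for v in values]


# Hypothetical tract-level exposures.
exposure = [float(v) for v in range(1, 11)]
rif = rif_quantile(exposure, 0.5)

# The RIF takes one value below the quantile and another above it, and its
# sample mean recovers the quantile itself.
print(sorted(set(round(r, 6) for r in rif)), statistics.mean(rif))
```

Regressing these RIF values on tract covariates gives coefficients of the kind that enter the decomposition, which is why the quantile points can be decomposed in the same way as a mean.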
The aggregate composition effect is positive and statistically significant at conventional levels for the entire pollution exposure distribution, while the aggregate structure effect is negative throughout the distribution (and larger at each quantile). Aggregate sociodemographic changes appear to be responsible for a net increase in exposure at any given quantile, but this effect is swamped by the effect of policy (the structure effect).

Tables 33 and 34 decompose the changes in Lorenz ordinates for the two pollutants.2 It is possible to glean information about the contribution of changes in demographics from the detailed decomposition terms. I can see that race and ethnicity correlate positively with measured inequality according to the Lorenz curve, with proportionally larger effects on the cumulative share of pollution exposure for the most polluted tracts. The black share of a tract's population contributes positive changes in Lorenz curves across the exposure distribution for both PM2.5 and NOx. The opposite is true for the Hispanic share of the population, suggesting that the proportion African-American contributes to rising environmental inequality, while the proportion Hispanic contributes to falling environmental inequality.

1The detailed decomposition of the structure effect, as noted by Firpo et al. (2011), has no clear interpretation in most cases.

2Tables 37, 38, 39 and 40 repeat the exercise for generalized and absolute Lorenz curves.

As noted in chapters III and IV of this dissertation, the environmental justice literature highlights the degree to which black and Latino populations are exposed to excess pollution relative to non-Hispanic whites. I can perform decomposition exercises to further illuminate this disparity, and its connection to environmental inequality. Following Fowlie et al. (2012), I divide census tracts into "diverse" tracts (where the proportion of African-Americans and Latinos exceeds 30%) and non-diverse tracts.
I have observations of annual average PM2.5 and NOx exposure (and demographic characteristics) for all tracts for each year from 2005-2011. This means I can use a panel data extension of the RIF-based decomposition above. Following Firpo et al. (2011), I allow each census tract to have a tract-specific fixed effect $\theta_i$, and assume that the return to the fixed effect is 1 for the non-diverse tracts (now distribution $A$ in my notation), and $\sigma$ for the diverse tracts (now distribution $B$). The change in any functional $\nu$ can be decomposed as

$$\nu_B - \nu_A = \sum_{k=1}^{K} \bar{X}_{B,k} \left( \hat{\beta}_{B,k} - \hat{\beta}_{A,k} \right) + \theta_B (\sigma - 1) + \sum_{k=1}^{K} \hat{\beta}_{A,k} \left( \bar{X}_{B,k} - \bar{X}_{A,k} \right) + \left( \theta_B - \theta_A \right)$$

where there are two new terms: $\theta_B (\sigma - 1)$ is the structure effect of the fixed effect, and $(\theta_B - \theta_A)$ is the composition effect of the fixed effects.

Tables 35 and 36 summarize decompositions of the difference in quantile points between diverse and non-diverse census tracts for NOx and PM2.5 respectively. I find the first evidence of education effects: for heavily polluted tracts, the composition effect of the less than high school educated proportion is negative and statistically significant. Poverty and the unemployment rate both contribute positively to the diverse vs. non-diverse gap for highly exposed tracts, but negatively for lightly exposed tracts.

TABLE 31. NOx Quantile Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   0.575***    0.854***    1.958***    2.578***    4.908***
                        (0.030)     (0.038)     (0.055)     (0.071)     (0.170)
aggregate structure     -0.672***   -0.992***   -1.085***   -2.059***   -3.503***
                        (0.046)     (0.048)     (0.066)     (0.102)     (0.191)
black                   0.000***    0.001***    0.000       -0.005***   -0.008***
                        (0.000)     (0.000)     (0.000)     (0.001)     (0.001)
highschool              -0.008***   -0.017***   -0.036***   -0.046***   -0.034***
                        (0.001)     (0.001)     (0.002)     (0.003)     (0.005)
incometoneeds50to99     -0.003***   -0.005***   -0.009***   -0.021***   -0.042***
                        (0.000)     (0.001)     (0.001)     (0.001)     (0.003)
latino                  -0.021***   -0.023***   -0.041***   -0.017***   0.002
                        (0.001)     (0.001)     (0.003)     (0.003)     (0.009)
lessthanhs              -0.001***   0.000*      0.001       -0.003***   0.007***
                        (0.000)     (0.000)     (0.000)     (0.001)     (0.002)
medianinc               0.011***    0.021***    0.039***    0.032***    -0.037***
                        (0.001)     (0.001)     (0.002)     (0.002)     (0.005)
unemployment            0.006***    0.027***    0.062***    0.071***    0.087***
                        (0.001)     (0.001)     (0.002)     (0.003)     (0.008)

Estimates show the contribution to the change in quantile points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 32. PM2.5 Quantile Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   3.043***    1.664***    1.779***    2.420***    4.940***
                        (0.280)     (0.140)     (0.169)     (0.113)     (0.235)
aggregate structure     -2.910***   -2.863***   -3.983***   -3.463***   -2.650***
                        (0.391)     (0.165)     (0.161)     (0.152)     (0.167)
black                   -0.010***   -0.010***   -0.017***   -0.004***   0.004***
                        (0.002)     (0.002)     (0.003)     (0.001)     (0.001)
highschool              -0.057***   -0.039***   -0.062***   -0.029***   0.009
                        (0.008)     (0.004)     (0.006)     (0.004)     (0.008)
incometoneeds50to99     -0.021***   -0.021***   -0.036***   -0.024***   -0.020***
                        (0.004)     (0.002)     (0.004)     (0.002)     (0.004)
latino                  -0.129***   -0.179***   -0.131***   -0.022***   0.082***
                        (0.013)     (0.008)     (0.009)     (0.005)     (0.012)
lessthanhs              -0.032***   -0.026***   -0.013***   -0.006***   -0.017***
                        (0.005)     (0.004)     (0.002)     (0.001)     (0.003)
medianinc               0.085***    0.040***    0.036***    -0.007***   -0.065***
                        (0.007)     (0.003)     (0.004)     (0.002)     (0.006)
unemployment            0.060***    0.097***    0.189***    0.131***    0.209***
                        (0.009)     (0.005)     (0.009)     (0.006)     (0.011)

Estimates show the contribution to the change in quantile points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 33. NOx Relative Lorenz Curve Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   0.001       -0.006*     0.038***    0.107***    0.197***
                        (0.002)     (0.003)     (0.007)     (0.009)     (0.008)
aggregate structure     -0.006***   -0.023***   -0.031***   -0.023**    -0.036***
                        (0.001)     (0.003)     (0.006)     (0.010)     (0.008)
black                   0.000       0.000***    0.000***    0.000*      0.000***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
highschool              0.000       0.000       -0.002***   -0.004***   -0.004***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
incometoneeds50to99     0.000***    0.000***    0.000***    0.001***    0.000
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
latino                  0.000***    -0.001***   -0.005***   -0.005***   -0.002***
                        (0.000)     (0.000)     (0.000)     (0.001)     (0.000)
lessthanhs              0.000***    0.000***    0.000*      0.000***    0.000***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
medianinc               0.000***    0.001***    0.005***    0.009***    0.008***
                        (0.000)     (0.000)     (0.000)     (0.001)     (0.000)
unemployment            -0.001***   0.000***    0.003***    0.008***    0.011***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)

Estimates show the contribution to the change in RLC points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 34. PM2.5 Relative Lorenz Curve Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   0.019***    0.038***    0.014***    -0.010***   0.003
                        (0.002)     (0.004)     (0.005)     (0.003)     (0.002)
aggregate structure     0.004**     -0.008**    -0.020***   -0.024***   -0.014***
                        (0.002)     (0.004)     (0.004)     (0.003)     (0.002)
black                   0.000**     0.000***    0.000***    0.000***    0.000***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
highschool              0.000*      -0.001***   -0.001***   -0.001***   -0.001***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
incometoneeds50to99     0.000       0.000**     0.000***    0.000***    0.000***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
latino                  -0.001***   -0.004***   -0.005***   -0.004***   -0.003***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
lessthanhs              0.000***    -0.001***   -0.001***   0.000***    0.000***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
medianinc               0.001***    0.002***    0.002***    0.002***    0.001***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
unemployment            0.000**     0.000       0.000*      0.000**     0.001***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)

Estimates show the contribution to the change in RLC points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 35. NOx Quantile Detailed Decomposition, Diverse vs.
Non-diverse tracts

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   -0.117***   -0.174***   0.000       0.528***    1.193***
                        (0.024)     (0.025)     (0.033)     (0.064)     (0.136)
aggregate structure     0.167       -0.096      0.597       -2.655***   -6.035***
                        (0.265)     (0.279)     (0.427)     (0.949)     (1.599)
black                   -0.192***   -0.201***   -0.025      0.454***    0.928***
                        (0.019)     (0.019)     (0.026)     (0.052)     (0.097)
highschool              -0.001***   0.000       0.000       -0.001      0.001
                        (0.000)     (0.001)     (0.001)     (0.001)     (0.003)
incometoneeds5099       0.010**     0.005       0.007       0.046***    -0.037*
                        (0.004)     (0.004)     (0.007)     (0.009)     (0.020)
latino                  0.056***    0.009       -0.008      0.033       0.336***
                        (0.018)     (0.019)     (0.025)     (0.050)     (0.103)
lessthanhs              -0.060***   -0.085***   -0.077***   0.028       0.210***
                        (0.010)     (0.010)     (0.014)     (0.024)     (0.049)
medianinc               -0.007*     0.012***    0.013*      -0.112***   -0.276***
                        (0.004)     (0.004)     (0.007)     (0.013)     (0.032)
unemployment            0.047***    0.026***    0.014***    -0.035***   -0.028**
                        (0.003)     (0.004)     (0.005)     (0.008)     (0.014)

Estimates show the contribution to the change in quantile points between Diverse vs. Non-diverse tracts. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 36. PM2.5 Quantile Detailed Decomposition, Diverse vs.
Non-diverse tracts

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   -0.486***   -0.106      0.190***    1.491***    2.042***
                        (0.088)     (0.117)     (0.064)     (0.170)     (0.225)
aggregate structure     2.869**     4.826***    -1.206      -5.141***   -3.520
                        (1.284)     (1.007)     (0.840)     (1.769)     (2.648)
black                   -0.168***   -0.201**    -0.135***   0.572***    1.291***
                        (0.052)     (0.089)     (0.050)     (0.126)     (0.165)
highschool              -0.005**    -0.002      -0.002      0.004       0.000
                        (0.002)     (0.003)     (0.001)     (0.004)     (0.005)
incometoneeds5099       -0.030*     -0.031*     0.003       0.042       0.151***
                        (0.018)     (0.019)     (0.011)     (0.027)     (0.031)
latino                  -0.274***   0.257***    0.315***    0.647***    0.233
                        (0.086)     (0.084)     (0.042)     (0.115)     (0.153)
lessthanhs              -0.042      -0.320***   -0.254***   -0.416***   -0.074
                        (0.042)     (0.048)     (0.024)     (0.065)     (0.082)
medianinc               -0.026      -0.093***   -0.094***   -0.150***   0.148***
                        (0.021)     (0.022)     (0.014)     (0.032)     (0.043)
unemployment            0.008       0.127***    0.092***    0.184***    0.053**
                        (0.014)     (0.014)     (0.008)     (0.023)     (0.026)

Estimates show the contribution to the change in quantile points between Diverse vs. Non-diverse tracts. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 37. NOx Generalized Lorenz Curve Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   0.045***    0.140***    0.490***    1.071***    1.644***
                        (0.002)     (0.006)     (0.014)     (0.025)     (0.033)
aggregate structure     -0.050***   -0.178***   -0.438***   -0.813***   -1.201***
                        (0.003)     (0.008)     (0.020)     (0.033)     (0.042)
black                   0.000**     0.000***    0.000***    0.000***    -0.002***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
highschool              -0.001***   -0.002***   -0.009***   -0.019***   -0.026***
                        (0.000)     (0.000)     (0.000)     (0.001)     (0.001)
incometoneeds50to99     0.000***    -0.001***   -0.003***   -0.006***   -0.010***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.001)
latino                  -0.001***   -0.004***   -0.013***   -0.020***   -0.022***
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.002)
lessthanhs              0.000***    0.000***    0.000       0.000       0.000
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
medianinc               0.001***    0.003***    0.011***    0.021***    0.023***
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.001)
unemployment            0.000       0.002***    0.014***    0.031***    0.044***
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.001)

Estimates show the contribution to the change in GLC points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 38. NOx Absolute Lorenz Curve Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   -0.143***   -0.330***   -0.450***   -0.339***   -0.048***
                        (0.003)     (0.008)     (0.014)     (0.015)     (0.011)
aggregate structure     0.105***    0.210***    0.338***    0.350***    0.195***
                        (0.005)     (0.012)     (0.022)     (0.026)     (0.024)
black                   0.000***    0.000***    0.001***    0.000***    -0.001***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
highschool              0.002***    0.004***    0.005***    0.001*      -0.002***
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.001)
incometoneeds50to99     0.001***    0.003***    0.005***    0.005***    0.003***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
latino                  0.002***    0.003***    0.000       0.000       0.003**
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.001)
lessthanhs              0.000***    0.000***    0.000**     -0.001***   -0.001***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
medianinc               -0.001***   -0.001***   0.004***    0.010***    0.010***
                        (0.000)     (0.000)     (0.000)     (0.001)     (0.001)
unemployment            -0.004***   -0.007***   -0.005***   0.003***    0.010***
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.001)

Estimates show the contribution to the change in ALC points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 39. PM2.5 Generalized Lorenz Curve Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   0.340***    0.857***    1.237***    1.736***    2.363***
                        (0.031)     (0.062)     (0.093)     (0.101)     (0.115)
aggregate structure     -0.117***   -0.621***   -1.506***   -2.411***   -2.867***
                        (0.022)     (0.059)     (0.094)     (0.108)     (0.109)
black                   -0.001***   -0.003***   -0.006***   -0.009***   -0.009***
                        (0.000)     (0.000)     (0.001)     (0.001)     (0.001)
highschool              -0.003***   -0.012***   -0.025***   -0.033***   -0.037***
                        (0.001)     (0.002)     (0.002)     (0.003)     (0.004)
incometoneeds50to99     -0.001***   -0.005***   -0.013***   -0.019***   -0.023***
                        (0.000)     (0.001)     (0.001)     (0.002)     (0.002)
latino                  -0.014***   -0.052***   -0.091***   -0.107***   -0.104***
                        (0.001)     (0.003)     (0.005)     (0.005)     (0.006)
lessthanhs              -0.002***   -0.008***   -0.013***   -0.014***   -0.016***
                        (0.000)     (0.001)     (0.002)     (0.002)     (0.002)
medianinc               0.008***    0.020***    0.029***    0.032***    0.028***
                        (0.001)     (0.001)     (0.002)     (0.002)     (0.003)
unemployment            0.005***    0.022***    0.059***    0.097***    0.127***
                        (0.001)     (0.002)     (0.003)     (0.004)     (0.005)

Estimates show the contribution to the change in GLC points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

TABLE 40. PM2.5 Absolute Lorenz Curve Detailed Decomposition, 2005 vs.
2011

                        p=0.1       p=0.25      p=0.5       p=0.75      p=0.9
aggregate composition   0.068***    0.178***    -0.123***   -0.304***   -0.085***
                        (0.024)     (0.042)     (0.042)     (0.029)     (0.016)
aggregate structure     0.196***    0.162***    0.060       -0.062**    -0.048**
                        (0.019)     (0.038)     (0.045)     (0.027)     (0.019)
black                   0.000***    0.000***    -0.002***   -0.002***   -0.001***
                        (0.000)     (0.000)     (0.000)     (0.000)     (0.000)
highschool              0.000       -0.004***   -0.008***   -0.008***   -0.007***
                        (0.001)     (0.001)     (0.001)     (0.001)     (0.001)
incometoneeds50to99     0.001***    0.001       0.000       -0.001**    -0.001***
                        (0.000)     (0.001)     (0.001)     (0.000)     (0.000)
latino                  -0.004***   -0.029***   -0.046***   -0.039***   -0.023***
                        (0.001)     (0.002)     (0.002)     (0.002)     (0.001)
lessthanhs              -0.001***   -0.004***   -0.004***   -0.001***   -0.001***
                        (0.000)     (0.001)     (0.001)     (0.000)     (0.000)
medianinc               0.006***    0.015***    0.018***    0.016***    0.009***
                        (0.000)     (0.001)     (0.001)     (0.001)     (0.001)
unemployment            -0.009***   -0.013***   -0.011***   -0.008***   0.002**
                        (0.001)     (0.001)     (0.002)     (0.001)     (0.001)

Estimates show the contribution to the change in ALC points from 2005-2011. Bootstrapped standard errors shown in parentheses. Other sociodemographic variables omitted.

APPENDIX D

ADDITIONAL TABLES AND FIGURES

Chapter II Miscellaneous Tables and Figures

FIGURE 35. State-level Gini Coefficient, Pre-transfer Income
[Figure: state maps for 1986, 1994, 2000 and 2012; axes: long, lat; color scale: Gini; title: State Pre-tax, Pre-transfer Income Inequality, 1986-2012]

FIGURE 36. State-level Gini Coefficient, Post-transfer Income
[Figure: state maps for 1986, 1994, 2000 and 2012; axes: long, lat; color scale: Gini; title: State Pre-tax, Post-transfer Income Inequality, 1986-2012]

FIGURE 37. State-level Gini Coefficient, Post-tax Income
[Figure: state maps for 1986, 1994, 2000 and 2012; axes: long, lat; color scale: Gini; title: State Post-Tax Income Inequality, 1986-2012]

FIGURE 38.
State-level Gini Coefficient, Post-fiscal Income
[Figure: state maps for 1986, 1994, 2000 and 2012; axes: long, lat; color scale: Gini; title: State Post-fiscal Income Inequality, 1986-2012]

TABLE 41. Determinants of State Income Inequality (Pre-transfer and Post-transfer Gini), All Covariates

Dependent variable:
                      (1)         (2)         (3)         (4)         (5)         (6)
union cov             -0.231***   -0.323***   -0.327***   -0.183      -0.320**    -0.338**
                      (0.085)     (0.108)     (0.124)     (0.121)     (0.142)     (0.163)
state minwage         -0.0001     -0.001      -0.002      0.0004      -0.001      -0.001
                      (0.002)     (0.002)     (0.002)     (0.003)     (0.002)     (0.003)
Total rate capgains   -0.006***   -0.007**    -0.008*     -0.008***   -0.008**    -0.009
                      (0.002)     (0.003)     (0.004)     (0.002)     (0.004)     (0.006)
Total rate wages      0.003       0.004       0.004       0.006*      0.005       0.004
                      (0.003)     (0.004)     (0.005)     (0.004)     (0.005)     (0.006)
UR                    0.004***    0.003**     0.002       0.004**     0.004*      0.002
                      (0.001)     (0.001)     (0.002)     (0.002)     (0.002)     (0.003)
Real PersIncPC        0.00000     0.00000     0.00000     0.00000     0.00000     0.00000*
                      (0.00000)   (0.00000)   (0.00000)   (0.00000)   (0.00000)   (0.00000)
college prop          0.077       0.130       0.0001      0.163       0.183       0.062
                      (0.144)     (0.169)     (0.178)     (0.188)     (0.234)     (0.247)
std educ              0.009       0.012       0.011       0.017       0.022       0.016
                      (0.017)     (0.023)     (0.021)     (0.022)     (0.030)     (0.025)
mean educ             -0.021      -0.032      -0.013      -0.005      -0.007      0.002
                      (0.022)     (0.026)     (0.030)     (0.028)     (0.036)     (0.039)
manufacturing         -0.035      0.044       -0.073      0.022       0.090       -0.063
                      (0.070)     (0.111)     (0.132)     (0.090)     (0.144)     (0.170)
popdens               -0.0001     -0.0002     -0.001      -0.0001     0.00004     -0.001
                      (0.0001)    (0.0005)    (0.001)     (0.0001)    (0.0005)    (0.001)
RD percap             -0.042**    -0.125***   -0.048      -0.034      -0.159***   -0.055
                      (0.018)     (0.039)     (0.062)     (0.023)     (0.047)     (0.070)
total patents PC      0.013       0.024       0.015       0.021       0.028       0.013
                      (0.011)     (0.015)     (0.023)     (0.014)     (0.020)     (0.028)
GovtPC                -7.169      -15.692**   -9.810      -4.184      -13.262     -9.147
                      (4.772)     (7.443)     (8.851)     (6.066)     (9.290)     (10.275)
black                 0.108**     0.121**     0.097       0.099       0.105       0.115
                      (0.054)     (0.061)     (0.073)     (0.069)     (0.083)     (0.095)
latino                0.087       0.085       0.099       0.198***    0.149       0.160
                      (0.064)     (0.091)     (0.099)     (0.076)     (0.119)     (0.129)
age                   0.001       0.0001      -0.001      0.002       0.001       -0.001
                      (0.001)     (0.002)     (0.001)     (0.002)     (0.002)     (0.002)
married               0.158       0.098       -0.036      0.328***    0.240*      0.082
                      (0.102)     (0.115)     (0.121)     (0.126)     (0.144)     (0.147)
divorced              0.181       0.057       -0.115      0.159       0.023       -0.156
                      (0.166)     (0.172)     (0.152)     (0.218)     (0.228)     (0.197)
over55                0.038       0.098       0.040       -0.100      0.024       -0.017
                      (0.085)     (0.099)     (0.116)     (0.116)     (0.130)     (0.149)
under25               0.120       -0.017      -0.177      0.162       0.019       -0.171
                      (0.135)     (0.147)     (0.127)     (0.175)     (0.184)     (0.143)
noncitizen            0.260*      -0.003      -0.064      0.255       0.069       -0.037
                      (0.140)     (0.205)     (0.243)     (0.155)     (0.258)     (0.285)
nativeborn            0.143       -0.011      -0.006      0.226       0.088       0.076
                      (0.127)     (0.155)     (0.178)     (0.155)     (0.193)     (0.221)
Linear Trends?        No          Yes         Yes         No          Yes         Yes
Quad. Trends?         No          No          Yes         No          No          Yes
Observations          1,000       1,000       1,000       1,000       1,000       1,000

Note: *p<0.1; **p<0.05; ***p<0.01. All models include State and Year fixed effects.

TABLE 42. Determinants of State Income Inequality (Post-tax and Post-fiscal Gini), All Covariates

Dependent variable:
                      (1)         (2)         (3)         (4)         (5)         (6)
union cov             -0.160**    -0.220**    -0.195*     -0.115      0.009       0.001
                      (0.080)     (0.097)     (0.113)     (0.128)     (0.145)     (0.173)
state minwage         0.001       0.00002     -0.001      -0.003      0.001       -0.0003
                      (0.002)     (0.002)     (0.002)     (0.002)     (0.003)     (0.004)
Total rate capgains   -0.006***   -0.006**    -0.007*     -0.005      -0.003      -0.005
                      (0.002)     (0.003)     (0.004)     (0.003)     (0.006)     (0.007)
Total rate wages      0.004*      0.004       0.004       0.004       0.004       0.005
                      (0.002)     (0.004)     (0.005)     (0.004)     (0.007)     (0.009)
UR                    0.003***    0.003**     0.003       0.001       0.001       -0.0004
                      (0.001)     (0.001)     (0.002)     (0.002)     (0.002)     (0.003)
Real PersIncPC        0.00000     0.00000     0.00000     0.00000     0.00000     0.00000
                      (0.00000)   (0.00000)   (0.00000)   (0.00000)   (0.00000)   (0.00000)
college prop          0.084       0.082       -0.014      -0.147      -0.056      -0.142
                      (0.130)     (0.157)     (0.168)     (0.198)     (0.271)     (0.300)
std educ              0.006       0.011       0.009       -0.018      -0.027      -0.027
                      (0.015)     (0.020)     (0.018)     (0.022)     (0.027)     (0.031)
mean educ             -0.015      -0.016      -0.003      0.031       0.015       0.026
                      (0.020)     (0.025)     (0.028)     (0.029)     (0.038)     (0.044)
manufacturing         0.006       0.044       -0.054      0.192*      0.106       -0.006
                      (0.062)     (0.099)     (0.117)     (0.098)     (0.164)     (0.182)
popdens               -0.0001     0.00001     -0.0004     -0.0003**   0.0001      0.00002
                      (0.0001)    (0.0003)    (0.0004)    (0.0001)    (0.001)     (0.001)
RD percap             -0.025*     -0.116***   -0.031      -0.054*     0.052       -0.051
                      (0.015)     (0.032)     (0.054)     (0.033)     (0.075)     (0.115)
total patents PC      0.012       0.021       0.016       0.004       0.040*      0.043
                      (0.010)     (0.014)     (0.020)     (0.016)     (0.023)     (0.037)
GovtPC                -3.729      -11.331*    -6.775      4.831       -1.080      0.248
                      (4.034)     (6.566)     (7.622)     (8.164)     (9.778)     (13.843)
black                 0.110**     0.118**     0.098       0.124       0.135       0.085
                      (0.049)     (0.057)     (0.067)     (0.139)     (0.149)     (0.173)
latino                0.102**     0.111       0.115       0.066       0.298*      0.326*
                      (0.052)     (0.085)     (0.091)     (0.111)     (0.160)     (0.172)
age                   0.001       0.0003      -0.001      0.003       0.002       0.001
                      (0.001)     (0.001)     (0.001)     (0.002)     (0.002)     (0.003)
married               0.184**     0.129       0.021       -0.063      -0.053      -0.185
                      (0.083)     (0.097)     (0.101)     (0.201)     (0.213)     (0.247)
divorced              0.162       0.067       -0.081      -0.009      -0.075      -0.217
                      (0.148)     (0.156)     (0.137)     (0.199)     (0.215)     (0.229)
over55                -0.059      0.020       -0.007      -0.218*     -0.176      -0.270
                      (0.079)     (0.088)     (0.107)     (0.121)     (0.148)     (0.178)
under25               0.110       0.001       -0.137      -0.112      -0.181      -0.330
                      (0.121)     (0.130)     (0.109)     (0.217)     (0.242)     (0.281)
noncitizen            0.191*      0.001       -0.027      0.115       -0.815**    -0.787*
                      (0.103)     (0.179)     (0.210)     (0.182)     (0.354)     (0.414)
nativeborn            0.124       0.028       0.053       0.032       -0.247      -0.187
                      (0.109)     (0.134)     (0.149)     (0.146)     (0.203)     (0.247)
Linear Trends?        No          Yes         Yes         No          Yes         Yes
Quad. Trends?         No          No          Yes         No          No          Yes
Observations          1,000       1,000       1,000       1,000       1,000       1,000

Chapter III Miscellaneous Tables and Figures

FIGURE 39. National Black-White PM2.5 Exposure Ratio (by Percentile), 1998-2014
[Figure: one panel per year, 1998 through 2014; x-axis: percentile; y-axis: Black/White PM2.5 Exposure Ratio; title: Black/White PM2.5 Exposure Ratio, by Percentile]

FIGURE 40.
National Black-White NOx Exposure Ratio (by Percentile), 2005-2011
[Figure: one panel per year, 2005 through 2011; x-axis: percentile; y-axis: Black/White Exposure Ratio; title: Black/White NOx Exposure Ratio, by Percentile]

Chapter IV Miscellaneous Tables and Figures

TABLE 43. Effect of Income Inequality on Average Latino Exposure

                      (1)         (2)         (3)
IV Results:
Gini                  -3.717***   -5.493***   -6.020***
                      (1.388)     (1.702)     (2.114)
OLS Results:
Gini                  -0.309      -0.477      -0.484
                      (0.373)     (0.377)     (0.445)
Observations          1,578       1,578       1,578
First Stage F         86.22       70.5        55.34
Control Variables?    No          Yes         Yes
MSA-specific Trend?   No          No          Yes

Notes: See Table 17 for further details

TABLE 44. Effect of Income Inequality on Average Poor Exposure

                      (1)         (2)         (3)
IV Results:
Gini                  -3.696***   -5.460***   -6.117***
                      (1.374)     (1.701)     (2.131)
OLS Results:
Gini                  -0.287      -0.465      -0.461
                      (0.368)     (0.373)     (0.437)
Observations          1,578       1,578       1,578
First Stage F         86.22       70.5        55.34
Control Variables?    No          Yes         Yes
MSA-specific Trend?   No          No          Yes

Notes: See Table 17 for further details

TABLE 45. Effect of Income Inequality on Average Rich Exposure

                      (1)         (2)         (3)
IV Results:
Gini                  -4.174***   -5.922***   -6.605***
                      (1.341)     (1.683)     (2.121)
OLS Results:
Gini                  -0.335      -0.491      -0.494
                      (0.359)     (0.366)     (0.428)
Observations          1,578       1,578       1,578
First Stage F         86.22       70.5        55.34
Control Variables?    No          Yes         Yes
MSA-specific Trend?   No          No          Yes

Notes: See Table 17 for further details

REFERENCES CITED

Acemoglu, D. (2002). Technical change, inequality, and the labor market. Journal of Economic Literature, 3(1).

Alkire, S. and Foster, J. (2011). Counting and multidimensional poverty measurement. Journal of Public Economics, 95.

Alvaredo, F. (2011). A note on the relationship between top income shares and the gini coefficient. Economics Letters, 110(3):274–277.

Armour, P., Burkhauser, R., and Larrimore, J. (2014). Levels and trends in united states income and its distribution: A crosswalk from market income towards a comprehensive haig-simons income measure. Southern Economic Journal, 81(2).

Atkinson, A. (2007). Measuring top incomes: Methodological issues. In Atkinson, A. and Piketty, T., editors, Top Incomes over the Twentieth Century: A Contrast Between Continental European and English-Speaking Countries. Oxford University Press.

Atkinson, A., Piketty, T., and Saez, E. (2011). Top incomes in the long run of history. Journal of Economic Literature, 49(1953):3–71.

Baek, J. and Gweisah, G. (2013). Does income inequality harm the environment?: Empirical evidence from the United States. Energy Policy, 62:1434–1437.

Banzhaf, S., editor (2012). The Political Economy of Environmental Justice. Stanford University Press.

Barrett, G., Donald, S., and Bhattacharya, D. (2014). Consistent nonparametric tests for lorenz dominance. Journal of Business & Economic Statistics, 32(1).

Bartels, L. (2009). Economic inequality and political representation. In Jacobs, L. and King, D., editors, The Unsustainable American State. Oxford University Press, Oxford.

Baum-Snow, N. and Ferreira, F. (2015). Causal inference in urban economics. In Duranton, G., Henderson, V., and Strange, W., editors, Handbook of Regional and Urban Economics, vol. 5A. Elsevier North-Holland.

Been, V. and Gupta, F. (1997). Coming to the nuisance or going to the barrios - a longitudinal analysis of environmental justice claims. Ecology Law Quarterly, 24.

Berthe, A. and Elie, L. (2015). Mechanisms explaining the impact of economic inequality on environmental deterioration. Ecological Economics, 116.

Bishop, J., Formby, J., and Smith, W. J. (1991). Lorenz dominance and welfare: Changes in the us distribution of income, 1967-1987. The Review of Economics and Statistics, 73(1).

Boustan, L., Ferreira, F., Winkler, H., and Zolt, E. (2013). The Effect of Rising Income Inequality on Taxation and Public Expenditures: Evidence from US Municipalities and School Districts, 1970-2000. Review of Economics and Statistics, 95(October):1291–1302.

Boyce, J. and Voirnovytskyy, M. (2010). Economic inequality and environmental quality: evidence of pollution shifting in Russia. Working Paper.

Boyce, J., Zwickl, K., and Ash, M. (2016). Measuring environmental inequality. Ecological Economics, 124.

Boyce, J. K. (1994). Inequality as a Cause of Environmental Degradation. Ecological Economics, 11(3).

Brännlund, R. and Ghalwash, T. (2008). The income-pollution relationship and the role of income distribution: An analysis of Swedish household data. Resource and Energy Economics, 30:369–387.

Brulle, R. J. and Pellow, D. N. (2006). Environmental justice: human health and environmental inequalities. Annual review of public health, 27(102):103–124.

Bryant, B. and Mohai, P. (1992). Environmental racism: reviewing the evidence. In Bryant, B. and Mohai, P., editors, Race and the Incidence of Environmental Hazards: A Time for Discourse, pages 163–76. Westview.

Brzezinski, M. (2013). Asymptotic and bootstrap inference for top income shares. Economics Letters, 120(1).

Burkhauser, R., Feng, S., and Jenkins, S. (2009). Using the p90/10 index to measure us inequality trends with current population survey data: A view from inside the census bureau vaults. Review of Income and Wealth, 55(1).

Burkhauser, R., Feng, S., Jenkins, S., and Larrimore, J. (2011). Estimating trends in us income inequality using the current population survey: The importance of controlling for censoring. Journal of Economic Inequality, 9(1):373–415.

Caiazzo, F., Ashok, A., Waitz, I. A., Yim, S. H., and Barrett, S. R. (2013). Air pollution and early deaths in the united states. part i: Quantifying the impact of major sectors in 2005. Atmospheric Environment, 79.

Chavis, B. and Lee, C. (1987). Toxic wastes and race in the united states.
Technical report, United Church Christ. 176 Clark, L. P., Millet, D. B., and Marshall, J. D. (2014). National Patterns in Environmental Injustice and Inequality: Outdoor NO2 Air Pollution in the United States. PloS one, 9(4). Currie, J. (2011). Inequality at Birth: Some Causes and Consequences. American Economics Review, 101(3). Currie, J. and Walker, R. (2011). Traffic congestion and infant health: Evidence from e-zpass. American Economic Journal: Applied Economics, 3(1):65–90. Daly, M. and Wilson, D. (2013). Inequality and mortality: New evidence from u.s. county panel data. Working Paper. Diaz-Bazan, T. (2015). Measuring inequality from top to bottom. Working Paper. DiNardo, J., Fortin, N., and Lemieux, T. (1996). Labor market institutions and the distribution of wages, 1973-1992: A semiparametric approach. Econometrica, 64(5). Downey, L. (2007). US Metropolitan Area Variation in Environmental Inequality Outcomes. Urban Studies, 44(5-6):953–977. Drabo, A. (2011). Impact of income inequality on health: Does environment quality matter? Environment and Planning A, 43:146–165. Dube, A. (2013). Minimum wages and the distribution of family incomes. Working Paper. Enamorado, T., Lopez-Calva, L.-F., Rodriquez-Castelan, C., and Winkler, H. (2014). Income Inequality and Violent Crime: Evidence from Mexico’s Drug War. Working Paper. Essama-Nssah, B. and Lambert, P. J. (2012). Influence functions for policy impact analysis. In Bishop, J. A. and Salas, R., editors, Inequality, Mobility and Segregation: Essays in Honor of Jacques Silber. Emerald Group. Essama-Nssah, B. and Lambert, P. J. (2016). Counterfactual decomposition of pro-poorness using influence functions. Journal of Human Development and Capabilities, 17(1):74–92. Firpo, S., Fortin, N., and Lemieux, T. (2009). Unconditional quantile regressions. Econometrica, 77(3):953–973. Firpo, S., Fortin, N., and Lemieux, T. (2011). Decomposition Methods In Economics. In Card, D. 
and Ashenfelter, O., editors, Handbook of Labor Economics. Flaichaire, E. and Davidson, R. (2007). Asymptotic and bootstrap inference for inequality and poverty measures. Journal of Econometrics, 141. Florida, R. and Mellander, C. (2013). The Geography of Inequality. Working Paper. 177 Fowlie, M., Holland, S., and Mansur, E. (2012). What do emissions markets deliver and to whom? evidence from southern california’s nox trading program. American Economic Review, 102. Frank, M. W. (2009). Inequality and Growth in the United States: Evidence From a New State-Level Panel of Income Inequality Measures. Economic Inquiry, 47(1):55–68. Gaskin, D. J., Zare, H., Haider, A. H., and LaVeist, T. A. (2015). The quality of surgical and pneumonia care in minority-serving and racially integrated hospitals. Health Services Research. Gimpelson, V. and Treisman, D. (2015). Misperceiving inequality. Working Paper. Glaeser, E. L., Resseger, M., and Tobio, K. (2009). Inequality in Cities. Journal of Regional Science, 49(4):617–646. Golley, J. and Meng, X. (2012). Income inequality and carbon dioxide emissions: The case of Chinese urban households. Energy Economics, 34(6):1864–1872. Groseclose, T., Levitt, S., and Snyder, J. (1999). Comparing interest group scores across time and chambers: Adjusted ada scores for the u.s. congress. American Political Science Review, 93. Harper, S., Ruder, E., Roman, H. a., Geggel, A., Nweke, O., Payne-Sturges, D., and Levy, J. I. (2013). Using inequality measures to incorporate environmental justice into regulatory analyses. International journal of environmental research and public health, 10(9):4039–59. Heckley, G., Gerdtham, U.-G., and Kjellsson, G. (2016). A general method for decomposing the causes of socioeconomic inequality in health. Journal of Health Economics, 48. Heerink, N., Mulatu, a., and Bulte, E. (2001). Income inequality and the environment: Aggregation bias in environmental Kuznets curves. Ecological Economics, 38(3):359–367. 
Jenkins, S., Burkhauser, R., Feng, S., and Larrimore, J. (2011). Measuring inequality using censored data: a multiple-imputation approach to estimation andinference. Journal of the Royal Statistical Society: Series A (Statistics in Society), 174(1):63–81. Kakwani, N. C. (1977). Measurement of tax progressivity: An international comparison. The Economic Journal, 87(345):71–80. Kolm, S.-C. (1976). Unequal inequalities. I. Journal of Economic Theory, 12(3):416–442. 178 Kovacevic, M. and Binder, D. (1997). Variance estimation for measures of income inequality and polarization - the estimating equations approach. Journal of Official Statistics, 13(1). Lamsal, L., Martin, R., van Donkelaar, A., Boersma, E. C. R., Dirksen, R., Luo, C., and Wang, Y. (2010). Indirect validation of tropospheric nitrogen dioxide retrieved from the omi satellite instrument: Insight into the seasonal variation of nitrogen oxides at northern mid-latitudes. Journal of Geophysical Research, 115. Lamsal, L., Martin, R., van Donkelaar, A., Steinbacher, M., Celarier, E., Bucsela, E., Dunlea, E., and Pinto, J. (2008). Ground-level nitrogen dioxide concentrations inferred from the satellite-borne ozone monitoring instrument (omi). Journal of Geophysical Research, 113. Larrimore, J., Burkhauser, R., Feng, S., and Zayatz, L. (2008). Consistent cell means for topcoded incomes in the public use march cps (1976-2007). Journal of Economic and Social Measurement, 33:89–128. Levinson, A. and O’Brien, J. (2015). Environmental Engels Curves. Working Paper. Magnani, E. (2000). The environmental Kuznets Curve, environmental protection policy and income distribution. Ecological Economics, 32(3):431–443. Maguire, K. and Sheriff, G. (2011). Comparing distributions of environmental outcomes for regulatory environmental justice analysis. International journal of environmental research and public health, 8(5):1707–26. Majumder, A. and Chakravarty, S. R. (1990). 
Distribution of personal income: Development of a new model and its application to u.s. income data. Journal of Applied Econometrics, 5(2):189–196. Matlack, J. L. and Vigdor, J. L. (2008). Do rising tides lift all prices? income inequality and housing affordability. Journal of Housing Economics, 17(3):212 – 224. McCarty, N., Poole, K., and Rosenthal, H. (1997). Income Redistribution and the Realignment of American Politics. AEI Press, Washington, DC. McDonald, J. B. (1984). Some generalized functions for the size distribution of income. Econometrica, 52(3):647–663. McDonald, J. B. and Ransom, M. (2008). The generalized beta distribution as a model for the distribution of income: Estimation of related measures of inequality. In Chotikapanich, D., editor, Modeling Income Distributions and Lorenz Curves, pages 147–166. Springer New York, New York, NY. 179 Mellor, J. and Milyo, J. (2002). Income inequality and health status in the united states: Evidence from the current population survey. The Journal of Human Resources, 37(3):510–539. Mills, J. and Zandvakili, S. (1997). Statistical inference via bootstrapping for measures of inequality. Journal of Applied Econometrics, pages 133–150. Mohai, P., Pellow, D., and Roberts, J. T. (2009). Environmental Justice. Annual Review of Environment and Resources, 34:405–430. Moller, S., Alderson, A., and Nielsen, F. (2009). Changing Patterns of Income Inequality in US Counties, 197020001. American Journal of Sociology, 114(4):1037–1101. Morello-Frosch, R. and Jesdale, B. M. (2006). Separate and unequal: residential segregation and estimated cancer risks associated with ambient air toxics in U.S. metropolitan areas. Environmental health perspectives, 114(3):386–393. Morello-frosch, R., Jr, M. P., Porras, C., and Sadd, J. (2002). Environmental Justice and Regional Inequality in Southern California : Implications for Future Research. 110(April):149–154. Moyes, P. (1987). A new concept of lorenz domination. 
Economics Letters, 23:203–207. Neumayer, E. (2004). The environment, left-wing political orientation and ecological economics. Ecological Economics, 51:167–175. Peters, D. J. (2013). American income inequality across economic and geographic space, 1970-2010. Social Science Research. Piketty, T. and Saez, E. (2003). Income inequality in the United States, 1913-1998. Quaterly Journal of Economics, 118(1). Piketty, T., Saez, E., and Stantcheva, S. (2014). Optimal taxation of top labor incomes: A tale of three elasticities. American Economic Journal: Economic Policy, 6(1):230–71. Ravallion, M., Heil, M., and Jalan, J. (2000). Carbon emissions and income inequality. Oxford Economic Papers, 52:651–669. Reiter, J. (2003). Inference for partially synthetic, public use microdata sets. Survey Methodology, 29. Scruggs, L. A. (1998). Political and economic inequality and the environment. Ecological Economics, 26:259–275. Sheriff, G. and Maguire, K. (2014). Ranking Distributions of Environmental Outcomes. Working Paper. Shorrocks, A. (1983). Ranking income distributions. Economica, 50. 180 Sommeiller, E. and Price, M. (2014). The increasingly unequal states of america income inequality by state, 1917 to 2011. Torras, M. and Boyce, J. K. (1998). Income, inequality, and pollution: a reassessment of the environmental Kuznets Curve. Ecological Economics, 25(2):147–160. van Donkelaar, A., Martin, R., Brauer, M., Hsu, N. C., Kahn, R., Levy, R., Lyapustin, A., Sayer, A., and Winker, D. (2016). Global estimates of fine particulate matter using a combined geophysical-statistical method with information from satellites, models, and monitors. Environmental Science And Technology. Voorheis, J. (2016). Income inequality and carbon emissions: Evidence from state-level data. Working Paper. Voorheis, J., McCarty, N., and Shor, B. (2015). Unequal incomes, ideology and gridlock: How rising inequality increases political polarization. Working Paper. Wilfling, B. (1996). 
Lorenz ordering of generalized beta-ii income distributions. Journal of Econometrics, 71(12):381 – 388. Wolverton, A. (2009). Effects of Socio-Economic and Input-Related Factors on Polluting Plants’ Location Decisions. The B.E. Journal of Economic Analysis & Policy, 9(1):1–32. Zhu, R. (2016). Wage differentials between urban residents and rural migrants in urban china during 20022007: A distributional analysis. China Economic Review, 37:2 – 14. Special Issue on Human Capital, Labor Markets, and Migration. Zwickl, K., Ash, M., and Boyce, J. (2014). Regional variation in environmental inequality: Industrial air toxics exposure in US cities. Working Paper. Zwickl, K. and Moser, M. (2015). Informal environmental regulation of industrial air pollution: Does neighborhood inequality matter? Working Paper. 181
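As an illustration of the quantity plotted in Figures 39 and 40, the following is a minimal sketch of a Black/White exposure ratio by percentile. It assumes the ratio is taken between the same percentile of each group's own exposure distribution, and it uses unweighted quantiles; the function name, the toy exposure values, and the absence of population weights are all assumptions for illustration, not the procedure or data used in the dissertation.

```python
import numpy as np


def exposure_ratio_by_percentile(group_a, group_b, percentiles):
    """Ratio of group A's to group B's exposure at each percentile.

    Each group's quantiles are computed from its own exposure
    distribution (unweighted here, which is a simplifying assumption).
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    q = np.asarray(percentiles, dtype=float)
    return np.quantile(a, q) / np.quantile(b, q)


# Hypothetical exposure values, for illustration only.
black = [10.0, 12.0, 14.0, 16.0, 18.0]
white = [8.0, 10.0, 12.0, 14.0, 16.0]

ratios = exposure_ratio_by_percentile(black, white, [0.25, 0.50, 0.75])
print(ratios)
```

A ratio above 1 at a given percentile means that the Black exposure distribution lies above the White distribution at that point. Repeating the calculation year by year would give one curve per panel.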