Sept. 19, 2006

On Tracking Inequality

Paul Krugman

Now that rising income inequality has become a big political issue, people are throwing around a lot of numbers. Some of these numbers are reliable, others aren’t. But how are readers to tell the difference?

Well, one thing that might help is knowing where the standard sources are.

The first port of call is data from the Census. Census numbers are based on the Current Population Survey, a questionnaire filled out by a sample of Americans, then extrapolated to the nation as a whole. For historical comparisons, go to Historical Income Tables, http://www.census.gov/hhes/www/income/histinc/histinctb.html.

Data there are gathered under several categories: households (people living together), families (they have to be related), and individuals. (Formal definitions at http://www.census.gov/population/www/cps/cpsdef.html.) As of now, only the household data have been updated to 2005, which is why I recently turned to Table H-13 – Educational Attainment of Householder – to show that most Americans with a college education have lost ground in recent years. http://www.census.gov/hhes/www/income/histinc/h13.html

The Census data are the key source for assessing how most Americans are doing. However, they do a poor job of tracking incomes at the very top, for two reasons. First, because Census data are based on a limited sample, not the whole population, they’re unreliable in tracking the income of small groups – and the really rich are a small group, who just happen to bulk large in the economy. Second, the questionnaire is “top-coded”: if the individual interviewed has earnings higher than $999,999, those earnings are recorded simply as $999,999. Since a lot of income growth in the last few decades has taken place among people with multimillion-dollar incomes, the Census data miss an important part of the story. In particular, what you won’t learn from Census data is the extent to which rising inequality is a story, not about the top 20 or even the top five percent of the population, but about the top one and the top 0.1 percent.
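To see why top-coding matters, here is a minimal sketch in Python. The sample of 1,000 earners and their incomes are invented purely for illustration, and the helper names top_code and top_share are my own, not anything the Census uses. The only figure taken from the description above is the $999,999 cap.

```python
TOP_CODE = 999_999  # the recording cap described above


def top_code(earnings, cap=TOP_CODE):
    """Record any earnings above the cap as the cap itself."""
    return [min(e, cap) for e in earnings]


def top_share(earnings, fraction):
    """Share of total earnings going to the highest-earning `fraction` of people."""
    ranked = sorted(earnings, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return sum(ranked[:n_top]) / sum(ranked)


# Hypothetical sample (assumption, not real data):
# 990 people earn $50,000, 9 earn $300,000, and 1 earns $20 million.
true_earnings = [50_000] * 990 + [300_000] * 9 + [20_000_000]
recorded_earnings = top_code(true_earnings)

print(f"true top 0.1% share:     {top_share(true_earnings, 0.001):.1%}")
print(f"recorded top 0.1% share: {top_share(recorded_earnings, 0.001):.1%}")
print(f"true top 20% share:      {top_share(true_earnings, 0.20):.1%}")
print(f"recorded top 20% share:  {top_share(recorded_earnings, 0.20):.1%}")
```

In this made-up sample the cap erases most of the top earner's income, so the recorded share of the top 0.1 percent falls far below the true one, and the top 20 percent's share is understated as well. The made-up sample says nothing about actual magnitudes; it only shows the direction of the bias.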

Fortunately, there’s another source of information: income tax returns, which aren’t top-coded. Tax return data are especially useful if you want to look at long-term trends going back before 1947, which is when the Current Population Survey data begin; high-income Americans have been paying income taxes since 1913. The I.R.S. does its own analyses of these data, and the Congressional Budget Office produces reports based on a merge of Census and I.R.S. data, but the most convenient and comprehensive analyses come from Thomas Piketty, at the École Normale Supérieure in Paris, and Emmanuel Saez at UC Berkeley. Their latest data set is at Prof. Saez’s Berkeley home page, http://elsa.berkeley.edu/~saez/TabFig2004.xls (Excel file).

There are other sources, too – which I’ll explain when I use them. You see, I’ve decided to institute a new policy. On inequality, and in fact on many matters economic, it’s all too common to have numbers – some from unknown sources – flying in all directions. The issues are hard enough without confusion about where the numbers come from. So from now on I’m going to post sources for the numbers in each column on TimesSelect, with links where possible (it usually is). Basically, this is the same thing I do when filing my columns; I always provide sources and links to my copy editors. But now I guess my explanations will have to be grammatical! Anyway, I hope that other economic commentators will follow the same practice, which is easy in this Internet age, and will save all of us a lot of confusion.