How Not to Do Income-Disparity Statistics

I am a statistics snob. It unfortunately means that I end up sounding like a cynic most of the time, because I am naturally skeptical about every statistic I hear. One gets used to the fact that most stats you see are poorly measured, poorly presented, poorly collected, or poorly contexted. I actually play a game with my kids (because I want them to be shunned as sad, cynical people as well) that I call “what could be wrong with that statistic.” In this game, they have to come up with reasons that the claimed implication of some statistic is misleading because of some detail that the person showing the chart hasn’t mentioned (not necessarily nefariously; most users of statistics simply don’t understand).

But mostly, bad statistics are harmless. I have it on good authority that 85% of all statistics are made up, including that one, and another 12.223% are presented with false precision, including that one. As a result, the only statistic that anyone believes completely is the one they are citing themselves. So, normally, I just roll my eyes and move on.

Some statistics, though, deserve special scrutiny because they are widely distributed or re-distributed, have dramatic implications, and are associated with a draconian prescription for action. I saw one of these recently, and it is reproduced below (the original source is Ray Dalio, who really ought to know better, although I got it from John Mauldin’s Thoughts from the Frontline).

Now, Mr. Dalio is not the first person to lament how the rich are getting richer and the poor are getting poorer, or some version of the socialist lament. Thomas Piketty wrote an entire book based on bad statistics and baseless assertions, after all. I don’t have time to tackle an entire book, and anyway such a work automatically attracts its own swarm of critics. But Mr. Dalio is widely respected/feared, and as such a simple chart from him carries the anti-capitalist message a lot further.[1]

I quickly identified at least four problems with this chart. One of them is just persnickety: the axis obviously should be in log scale, since we care about the percentage deviation and not the dollar deviation. But that is relatively minor. Here are three others:

  1. I suspect that over the time frame covered by this chart, the average age of the people in the top group has increased relative to the average age of the people in the bottom group. In any income distribution, the top end tends to be more populated with older people than the bottom end, since younger people tend to start out being lower-paid. Ergo, the bottom rung consists of both young people and older people who haven’t advanced, while the top rung is mostly older people who have. Since society as a whole is older now than it was in the 1970s, it is likely that the average age of the top earners has risen by more than the average age of the bottom earners. But that means the comparison has changed, since the people at the top have now had more time to earn, relative to the bottom rung, than they had before. Dalio lessens this effect a little bit by choosing 35-to-64-year-olds, so new graduates are not in the mix, but the point is valid.
  2. If your point is that the super-wealthy are even more super-wealthier than they were before, that the CEO makes a bigger multiple of the line worker’s salary than before, then the 40th percentile versus 60th percentile would be a bad way to measure it. So I assume that is not Dalio’s point, but rather that there is generally greater dispersion in real earnings than there was before. If that is the argument, then you don’t really want the 40th versus the 60th percentile either. You want the bottom 40% versus the top 40%, except for the top 1%. That’s because the bottom of the distribution is bounded by zero (actually by something above zero, since this chart only shows “earners”) and the top of the distribution has no bound. As a result, the upper end can be significantly impacted by the length of the upper tail. So if the top 1%, which used to be centi-millionaires, are now centi-billionaires, that will make the entire top 40% line move higher…which isn’t fair if the argument is that the top group (but not the tippy-top group, which we all agree are in a category by themselves) is improving its lot more than the bottom group. As with point 1, this will tend to exaggerate the spread. I don’t know how much, but I know the direction.
  3. This one is the most insidious because it will occur to almost nobody except for an inflation geek. The chart shows “real household income,” which is nominal income (in current dollars) deflated by a price index (presumably CPI). Here is the issue: is it fair to use the same price index to deflate the incomes of the top 40% as we use to deflate the income of the bottom 40%? I would argue that it isn’t, because they have different consumption baskets (and more and more different, as you go higher and higher up the income ladder). If the folks at the top are making more money, but their cost of living is also going up faster, then using the average cost of living increase to deflate both baskets will exaggerate how much better the high-earners are doing than the low-earners. This is potentially a very large effect over this long a time frame. Consider just two categories: food, and shelter. The weights in the CPI tell us that on average, Americans spend about 13% of their income on food and 33% on shelter (these percentages of course shift over time; these are current weights). I suspect that very low earners spend a higher proportion of their budget on food than 13%…probably also more than 33% on shelter, but I suspect that their expenditures are more heavily-weighted towards food than 1:3. But food prices in real terms (deflated by the CPI) are basically unchanged over the last 50 years, while real shelter prices are up about 37%. So, if I am right about the relative expenditure weights of low-earners compared to high-earners, the ‘high-earner’ food/shelter consumption basket has risen by more than the ‘low-earner’ food/shelter consumption basket. Moreover, I think that there are a lot of categories that low-earners essentially consume zero of, or very small amounts of, which have risen in price substantially. Tuition springs to mind. Below I show a chart of CPI-Food, CPI-Shelter, and CPI-College Tuition and Fees, deflated by the general CPI in each case.
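To put rough numbers on the food/shelter comparison in point 3, here is a minimal sketch. The real price changes (food about flat, shelter up about 37%) and the average CPI weights (13% food, 33% shelter) are the figures quoted above. The low-earner and high-earner weights are hypothetical, picked only to show the direction of the effect, and the simple fixed-weight average is an assumption about how one might combine the two categories, not a description of how the CPI is actually built.

```python
# Real (CPI-deflated) price changes over roughly 50 years, as quoted in the text:
# food is about flat, shelter is up about 37%.
REAL_CHANGE = {"food": 0.00, "shelter": 0.37}


def basket_real_change(weights):
    """Fixed-weight average real price change of a food/shelter-only basket."""
    total = sum(weights.values())
    return sum(weights[k] / total * REAL_CHANGE[k] for k in weights)


# Average CPI weights quoted in the text (13% food, 33% shelter).
average = {"food": 13, "shelter": 33}

# HYPOTHETICAL weights, not measured data: low earners tilt toward food,
# high earners tilt toward shelter.
low_earner = {"food": 25, "shelter": 35}
high_earner = {"food": 8, "shelter": 35}

for name, w in [("average", average),
                ("low earner", low_earner),
                ("high earner", high_earner)]:
    print(f"{name}: real basket price change {basket_real_change(w):.1%}")
```

Under these assumed weights the high-earner basket rises faster in real terms than the average basket, and the low-earner basket rises more slowly, which is the direction argued in point 3. The size of the gap depends entirely on the weights, which would have to come from actual expenditure data.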

The point being that if you look only at incomes, then you are getting an impression from Dalio’s chart – even if my objections #1 and #2 are unimportant – that the lifestyles of the top 40% are improving by lots more than the lifestyles of the bottom 40%. But there is an implicit assumption that these two groups consume the same things, or that the prices of their respective lifestyles are changing similarly. I think that would be a hard argument to make. What should happen to this chart, then, is that each of these lines should be deflated by a price index appropriate to that group. We would find that the lines, again, would be closer together.
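Objection #2 points in the same direction, and that can be checked with a simulated distribution. The sketch below assumes the chart's lines are group averages. The income sample is hypothetical (the lognormal shape, its parameters, and the 10x stretch of the top 1% are all arbitrary choices for illustration), so only the direction of the result means anything, not the magnitude.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# HYPOTHETICAL income sample; the distribution and its parameters are assumptions.
incomes = sorted(random.lognormvariate(11, 0.7) for _ in range(10_000))


def group_means(sorted_incomes):
    """Return (bottom 40% mean, top 40% mean, top 40% mean excluding the top 1%)."""
    n = len(sorted_incomes)
    lo_cut = n * 40 // 100    # end of the bottom 40%
    hi_cut = n * 60 // 100    # start of the top 40%
    top1_cut = n * 99 // 100  # start of the top 1%
    bottom40 = mean(sorted_incomes[:lo_cut])
    top40 = mean(sorted_incomes[hi_cut:])
    top40_ex_top1 = mean(sorted_incomes[hi_cut:top1_cut])
    return bottom40, top40, top40_ex_top1


# Same population, except that the top 1% now earns 10x as much (arbitrary factor).
top1_start = len(incomes) * 99 // 100
stretched = incomes[:top1_start] + [x * 10 for x in incomes[top1_start:]]

before = group_means(incomes)
after = group_means(stretched)

print("bottom 40% mean, after/before:", round(after[0] / before[0], 3))
print("top 40% mean, after/before:", round(after[1] / before[1], 3))
print("top 40% ex-top-1% mean, after/before:", round(after[2] / before[2], 3))
```

Nothing changed for 99% of this simulated population, yet the plain top-40% average moves higher while the version that excludes the top 1% does not move at all. That is why the tail should be stripped out if the claim is about the broad top group rather than the tippy-top.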

None of these objections means that there isn’t a growing disparity between the haves and the have-nots in our country. My point is simply that the disparity, and moreover the change in the disparity, is almost certainly less than it is generally purported to be with the weakly-assembled statistics we are presented with.

[1] Mr. Mauldin gamely tried to object, but the best he could do was say that capitalists aren’t good at figuring out how to share the wealth. Of course, this isn’t a function of capitalists. The people who decide how to distribute the wealth in capitalism are the consumers, who vote with their dollars. Bill Gates is not uber-rich because he decided to keep hundreds of billions of dollars away from the huddling masses; he is uber-rich because consumers decided to pay hundreds of billions of dollars for what he provided.

  1. Ron Wooten
    June 10, 2019 at 4:25 pm

    Nicely done and presented.

  2. Mike Myers
    June 10, 2019 at 5:58 pm

    I haven’t met many people who are 45 years of age and younger who will let the finer points of statistical analysis change their already made up minds. As far as they’re concerned, corporations are bad, rich people are not paying enough taxes and government needs to be expanded to police corporations, tax the rich and to punish and isolate those of us who do not share their values. There are plenty of politicians who happily facilitate their viewpoint, knowing that it is an open door to cronyism and favoritism when applying the laws of the land.

    We of the Boomer generation did a terrible job when it comes to monitoring what our children were taught in school and by whom. We also failed to instill the merits of personal responsibility and deferred gratification as ingredients to future prosperity. Unfortunately for our children and subsequent generations, the price for this lack of oversight will be paid by them, in heavy terms.

  3. June 11, 2019 at 1:08 pm

    Thanks for both comments!

  4. Margo
    June 11, 2019 at 7:26 pm

    Thanks for doing this.  I am not anywhere as expert as you, but I thought when I read the article that I should go back and relook at the charts, but it was like two in the morning, so…
