This is the summer break and I’m publishing old essays written when the audience of this newsletter was still very small. This post was originally published on April 5, 2022.
In a previous post, I briefly mentioned the suggestion made by the philosopher Paul Weithman about a possible Rawlsian account of the populist vote. I want to explore this possibility in more detail here. Weithman’s suggestion relies on the importance of the concept of chain-connectedness in Rawls’s characterization of the difference principle. Let me explain this concept first.
Rawls’s difference principle (DP) says that economic inequalities are justified as long as they are to the advantage of the worst-off segment of the population, as measured in terms of primary social goods owned. The most well-known defense of the DP is made in the context of the original position. Under a veil of ignorance that hides from them their identities as well as their social position, rational individuals would settle on principles of justice according to which institutions are just if and only if (i) “each person is to have an equal right to the most extensive basic liberty compatible with a similar liberty for others” and (ii) “social and economic inequalities are to be arranged so that they are both (a) reasonably expected to be to everyone’s advantage, and (b) attached to positions and offices open to all”.[1] As has been emphasized many times over the last fifty years, the DP, when formulated in the context of the original position, is formally equivalent to the maximin criterion of choice under uncertainty. This formal relationship between the DP and the maximin has been the source of extensive criticism, especially from economists: it is unclear why rational individuals as Rawls characterizes them would choose according to the maximin criterion.
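To see what the maximin criterion amounts to, and why economists find it contentious, here is a minimal sketch in Python. The three arrangements, their labels, and their numbers are invented for illustration and appear nowhere in Rawls. The comparison with an average-payoff rule assumes equal chances of ending up in each position, which is one common reading of the economists’ objection, not the only one.

```python
# Hypothetical index of primary goods for three social positions
# (worst-off first) under three illustrative basic structures.
# All numbers are made up for the sake of the example.
arrangements = {
    "strict_equality": [10, 10, 10],
    "moderate_inequality": [12, 15, 25],
    "high_inequality": [8, 20, 40],
}

def maximin_choice(options):
    """Pick the arrangement whose worst position is as well off as possible."""
    return max(options, key=lambda name: min(options[name]))

def average_choice(options):
    """Pick the arrangement with the highest average payoff,
    assuming an equal chance of occupying each position."""
    return max(options, key=lambda name: sum(options[name]) / len(options[name]))

maximin_pick = maximin_choice(arrangements)
average_pick = average_choice(arrangements)
print(maximin_pick, average_pick)
```

With these made-up numbers, the maximin rule picks the moderately unequal arrangement because its worst position (12) beats the worst position of the two alternatives (10 and 8), while the average rule picks the most unequal arrangement.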
Whatever we may think of this defense of the DP,[2] Rawls actually offers another argument in its support. This argument is not based on a notion of impartiality grounded in epistemic ignorance, but rather on the economist’s fetish concept of Pareto-efficiency. It is put forward in sections 12 and 13 of A Theory of Justice. It is best understood by starting with another, more compact formulation that Rawls gives of his two principles:
“All social values – liberty and opportunity, income and wealth, and the bases of self-respect – are to be distributed equally unless an unequal distribution of any, or all, of these values is to everyone’s advantage” (p. 62, my emphasis).
Rawls thus starts with the presumption that, prima facie, only an equal distribution of social values (as measured by primary social goods) is justified. So imagine a society divided into two segments of the population with exactly the same endowment in primary social goods. We suppose that these two segments are engaged in a cooperative venture such that any additional primary good to be distributed is the result of reciprocal relationships. From this perspective, Rawls argues that only changes in the arrangement of the basic structure that are conducive to Pareto improvements are acceptable; reciprocity in the cooperative venture forbids changes that favor one segment of the population while being detrimental to the other. An efficient basic structure is a structure that is Pareto optimal, i.e., one in which “it is impossible to change the rules, to redefine the scheme of rights and duties, so as to raise the expectations of any representative man (at least one) without at the same time lowering the expectations of (at least one) some other representative man” (p. 70). How, then, to choose among Pareto-optimal arrangements? Starting from a situation of perfect equality, the DP says that any rise in the expectations of one segment must also raise the expectations of the other segment by some amount, however small. In other words, the DP is equivalent to the claim that, starting from a situation of equality, only strict Pareto improvements are compatible with the requirement of reciprocity. As Rawls demonstrates graphically, this implies social indifference curves made of a vertical and a horizontal segment meeting at right angles on the 45° line in the space representing the endowment of primary social goods of each segment.
In this space, if we draw a curve representing the expectations of the less-favored segment as a function of the expectations of the most-favored segment (the “contribution curve”), the DP selects the point of this curve that is tangent to the horizontal segment of an indifference curve. Past this point, any increase in the expectations of the most-favored segment implies a decrease in the expectations of the less-favored segment. In this configuration, the DP is nothing but the strong Pareto criterion applied in a context of initial equality – probably a less contentious characterization of the DP than the maximin version, at least from the economist’s perspective.
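The two-segment case can be sketched numerically. The following Python snippet assumes a purely hypothetical contribution curve (a concave quadratic starting from equality at 10 units each); the function `less_favored` and all numbers are illustrative assumptions, not Rawls’s. It locates the point the DP selects and checks that the same point maximizes the minimum of the two expectations.

```python
def less_favored(x1):
    """Hypothetical contribution curve: expectations of the less-favored
    segment as a function of the expectations x1 of the most-favored one."""
    gain = x1 - 10  # distance from the equal starting point (10, 10)
    return 10 + 0.8 * gain - 0.02 * gain ** 2

# Sample the curve for x1 between 10 and 50 in steps of 0.5.
curve = [(x1, less_favored(x1)) for x1 in (10 + 0.5 * k for k in range(81))]

# The DP selects the point where the less-favored segment's expectations peak.
dp_point = max(curve, key=lambda point: point[1])

# Maximin: the point whose worst-off position is as high as possible.
maximin_point = max(curve, key=min)

# Expectations of the less-favored segment past the DP point.
beyond = [x2 for x1, x2 in curve if x1 > dp_point[0]]

print(dp_point, maximin_point == dp_point)
```

On this assumed curve the less-favored segment never overtakes the most-favored one, so maximizing its expectations and maximizing the minimum pick out the same point. Every sampled point in `beyond` leaves the less-favored segment worse off than at the DP point, which is the sense in which further gains for the most-favored segment stop being Pareto improvements.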
So far, so good. Intuitively, things become more complicated once we attempt to generalize this characterization. Suppose now that the society is divided into three segments: S1 (the most well-off), S2 (the least well-off), and S3, whose endowment in primary social goods is intermediate. There are no real difficulties as long as any move along the contribution curve (in three-dimensional space) strictly increases the expectations of all segments. In figure 1 below, where the expectations of segments S2 and S3 are represented as a function of the expectations of segment S1, the DP selects point A, i.e., the point beyond which the expectations of segment S2 start to decrease. Consider figure 2, however. Here, the expectations of segment S2 are increasing up to point B, but the expectations of segment S3 are decreasing beyond point C, which comes before B. As formulated by Rawls, the DP selects point B, but it is clear that the reciprocity requirement of cooperation is not fully satisfied past point C.
Rawls acknowledges the difficulty but at first puts it on the sidelines by making “certain natural assumptions”. He asks us to “suppose that inequalities in expectations are chain-connected: that is, if an advantage has the effect of raising the expectations of the lowest position, it raises the expectations of all positions in between” (p. 80, my emphasis). The assumption of chain-connectedness makes the case depicted in figure 2 impossible, or at least not covered by the DP. Now, according to Weithman, a failure of chain-connectedness is exactly what has happened in the U.S., leading to Trump’s election in 2016. This is a bold claim. If true, it also means that the strong focus economists have put on the “one percent” (i.e., the increasing gap between the income and wealth of the top 1% and the rest of the population) since the financial crisis is misplaced, at least if we want to understand the current democratic crisis in Western countries. The problem is not – only – that the wealthiest are getting more, but that those in the intermediate segments of the population believe that the segments below them are receiving benefits while their own expectations are decreasing.
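The chain-connectedness assumption and the two configurations discussed above can be made concrete with a small Python sketch. The series below are hypothetical expectations for S2 (lowest) and S3 (intermediate) sampled at five successive steps as the expectations of S1 rise; the numbers and the helper names `chain_connected` and `peak` are assumptions for illustration, not taken from Rawls or from the figures.

```python
def chain_connected(lowest, intermediate):
    """Check chain-connection on sampled expectations: every step that
    raises the lowest position must also raise the intermediate one."""
    for i in range(1, len(lowest)):
        lowest_rises = lowest[i] > lowest[i - 1]
        intermediate_rises = intermediate[i] > intermediate[i - 1]
        if lowest_rises and not intermediate_rises:
            return False
    return True

def peak(values):
    """Index at which a sequence of expectations reaches its maximum."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical expectations at successive steps along the contribution curve.
# In both cases S3 stays at or above S2, so S3 remains the intermediate segment.
fig1 = {"S2": [10, 12, 14, 15, 14], "S3": [10, 14, 18, 20, 21]}
fig2 = {"S2": [10, 12, 14, 15, 14], "S3": [10, 14, 18, 17, 16]}

point_a = peak(fig1["S2"])  # step selected by the DP in the first case
point_b = peak(fig2["S2"])  # step selected by the DP in the second case
point_c = peak(fig2["S3"])  # step after which S3's expectations decline

print(chain_connected(fig1["S2"], fig1["S3"]), chain_connected(fig2["S2"], fig2["S3"]))
```

In the second configuration the DP still selects step 3 (point B), yet the expectations of S3 already peak at step 2 (point C), so the move from C to B benefits S2 at the expense of S3. That is the kind of case the chain-connectedness assumption rules out.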
We may of course wonder whether this is actually what is happening. The first difficulty in assessing empirically whether inequalities are chain-connected is that, in Rawls’s account, they are expressed in terms of the primary social goods index. Looking at the evolution of income and wealth distributions is not sufficient, though it may be a good approximation. Second, what is relevant is not so much facts as beliefs. Rawls is indeed clear that he is reasoning in terms of expectations. There is no presumption that people’s expectations must be correct, including over the long run. At this point, a potential Rawlsian account of populism can connect with more standard ones that emphasize the lack of information among parts of the population, reinforced by the political communication and strategies of populist leaders.
Anyway, a legitimate question to ask is “why should we consider that chain-connectedness holds in the first place?” Rawls does not give any real indication about the likelihood that the assumption holds, nor about the reasons why it should hold or not. Still, he asserts that in conditions where it does not hold, “those who are better off should not have a veto over the benefits available for the least favored” (p. 80). I would say that the assumption was fairly plausible at the time A Theory of Justice was written, when Western societies were still enjoying sustained growth in the context of redistribution-friendly institutions. This is obviously less true today: growth has considerably slowed down and most of its fruits are captured by individuals belonging to the top decile or even centile of the income distribution. Also, a case can be made for the claim that people are more likely to perceive the chain-connectedness assumption to hold the more homogeneous the society they live in. This follows from Rawls’s basic assumption that society is a “cooperative venture for mutual advantage”. The very possibility of cooperation supposes that individuals share some minimal views about their interests. Rawls of course came to realize that this can be a threat to his theory as a whole, leading to its reformulation as a “political” theory of justice. But this reformulation made the limit of Rawls’s theory of justice even more evident: it only applies to societies with a democratic political morality and culture that can sustain an overlapping consensus between conceptions of the good. Though Rawls never made the connection, as far as I know, it might be that the fate of the chain-connectedness assumption is also related to the stability problem that preoccupied Rawls in the last part of his life.
Now, Brian Barry has also suggested that a defense of the DP without the chain-connectedness assumption can be made.[3] Rawls alludes to this defense in Section 17 of A Theory of Justice, where he suggests that the DP “provides an interpretation of the principle of fraternity” (p. 105). Rawls takes the family to illustrate the case of an institution where maximizing the sum of advantages is rejected as a principle. Implicitly, the same should apply to society conceived as a cooperative venture for mutual advantage. As Barry puts it, if the DP is understood as an exemplification of the principle of fraternity, then from the point of view of the worst-off, the fact that some segments gain while others lose, yet remain better off than they are, is irrelevant. It is the worst-off who must have the definitive say, and they will have no reasonable complaint as long as they gain.
From a Rawlsian perspective, then, the rise of populism indicates one of two things: (a) the DP might not be a good account of the fraternity principle, and more generally of the conceptions of justice that actually prevail in Western societies; or (b) it might be that the place of fraternity in the political morality of Western societies has receded and that considerations of justice are less politically relevant than they used to be, due to the cultural, social, and economic transformations of the last three or four decades.
[1] A Theory of Justice, p. 60. All page references in the post are to the original edition.
[2] Among economists, John Harsanyi and Ken Binmore have been some of the harshest critics of the defense of the DP based on the maximin. Michael Moehler has argued convincingly – in my view at least – that these criticisms overlook the morally grounded features of Rawls’s original position that justify the use of this choice criterion. See his book Minimal Morality (Oxford University Press, 2018).
[3] See Brian Barry, Theories of Justice (1989, University of California Press), especially pp. 231-4.
Populism points to another problem with Rawls' approach. Behind the veil of ignorance, presumably people know and agree on the cause and effect principles determining the outcomes of policies. But populists disagree sharply in this area, viewing immigration and trade policies as having consequences that differ from those predicted by conventional economics. If people really agreed on their predictions regarding policy outcomes, the negotiation over which to pick would be more straightforward.
Is this a typo? It makes no sense as is.
“are to be distributed equally unless an equal distribution of any, or all, of these values is to everyone’s advantage”