AGGREGATION AND THE STRUCTURE OF VALUE

Weng Kin San

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (PHILOSOPHY)

August 2024

Table of Contents

List of Tables
List of Figures
Chapter 1 Introduction
  1.1 The Components of Classical Utilitarianism
  1.2 Pitfalls of the Standard Framework of Distributive Ethics
  1.3 A Look Ahead
Chapter 2 Replaceable Value in People and Time
  2.1 Introduction
  2.2 The Framework
    2.2.1 Replacement
    2.2.2 Distributionism
    2.2.3 Anonymity
  2.3 Time-Shift Invariance
    2.3.1 The Paretian Argument
    2.3.2 The Argument from Relativity
  2.4 Time-Partition Dominance
    2.4.1 The Argument from Decision-Making
    2.4.2 Extension-Creation Neutrality
  2.5 Additivism
    2.5.1 Uniqueness of Representation
    2.5.2 Cardinal Comparisons
    2.5.3 Partially Ordered Abelian Groups
  2.6 Conclusion
  2.7 Appendix
    2.7.1 The Framework
    2.7.2 Time-Shift Invariance
    2.7.3 Time-Partition Dominance
    2.7.4 The Result
Chapter 3 Patternism about Value
  3.1 Introduction
  3.2 Patternism, a First Pass
  3.3 Framework
    3.3.1 Outcomes
    3.3.2 Prospects
    3.3.3 Axiology
    3.3.4 Patternism
  3.4 Prioritarianism
  3.5 Patternism Decomposed
  3.6 Principles
    3.6.1 Temporal Neutrality
    3.6.2 Statewise Anonymity
    3.6.3 Timeslice Stochasticism
  3.7 Numerical Representations and Additivity
  3.8 Conclusion
Chapter 4 Additive Relations on Cartesian Products
  4.1 Introduction
  4.2 Primer
  4.3 Overview
  4.4 Results
  4.5 Applications
    4.5.1 Social Welfare
    4.5.2 Social Choice
    4.5.3 Ordered Algebraic Structures
    4.5.4 Decision Theory
    4.5.5 Probability
  4.6 Connections to Existing Work
  4.7 Further Research
  4.8 Appendix
Bibliography

List of Tables

Table 3.1 Tradeoffs
Table 3.2 Sample Distribution
Table 3.3 Simple Tradeoff
Table 3.4 Expected Occurrence of an Outcome Level
Table 3.5 Interpersonal Tradeoffs
Table 3.6 Prioritarianism
Table 3.7 Interpersonal-Intertemporal Tradeoffs
Table 3.8 Interpersonal-Risky Tradeoffs
Table 3.9 Forms of Prioritarianism
Table 3.10 Dose Allocation 1
Table 3.11 Dose Allocation 2
Table 3.12 Dose Allocation 3
Table 3.13 Dose Allocation 4
Table 3.14 Uniform Reduction 1
Table 3.15 Uniform Reduction 2
Table 3.16 Uniform Reduction 3
Table 3.17 Temporal Neutrality
Table 3.18 Statewise Anonymity
Table 3.19 Objection from Fairness
Table 3.20 Modal Theory of Value
Table 3.21 Modal Theory of Value 2
Table 3.22 Choice of Designators 1
Table 3.23 Choice of Designators 2
Table 3.24 Timeslice Stochasticism
Table 3.25 Covax vs. Contravax
Table 3.26 Covax vs. Itchax
Table 3.27 Contaminax
Table 3.28 Equalax vs. Contravax
Table 3.29 Covax∗ vs. Contravax∗
Table 3.30 Numerical Representation
Table 3.31 Asymmetry 1
Table 3.32 Asymmetry 2
Table 4.1 The Multi-Dimensional Nature of the Aggregator Approach

List of Figures

Figure 1.1 Standard Diagrams in Population Ethics
Figure 1.2 Possible Friendship Pairings
Figure 1.3 Cat not Existing
Figure 1.4 Equalising Welfare
Figure 2.1 Hasten vs. Delay
Figure 2.2 Extension vs. Creation
Figure 2.3 Extension vs. Creation−
Figure 2.4 Distributions across People and Time
Figure 2.5 Sample Distribution
Figure 2.6 Replacement 1
Figure 2.7 Replacement 2
Figure 2.8 Coincident vs. Disjoint
Figure 2.9 Replacement 3
Figure 2.10 Replacement 4
Figure 2.11 Anonymity 1
Figure 2.12 Anonymity 2
Figure 2.13 Time-Shift Invariance 1
Figure 2.14 Time-Shift Invariance 2
Figure 2.15 Spacetime Diagrams
Figure 2.16 Relativity Argument
Figure 2.17 Coffee vs. Tea 1
Figure 2.18 Coffee vs. Tea 2
Figure 2.19 Time Partition Dominance 1
Figure 2.20 Time Partition Dominance 2
Figure 2.21 City vs. Suburb
Figure 2.22 City∗ vs. Suburb∗
Figure 2.23 Thai vs. Indian
Figure 2.24 Extension vs. Creation
Figure 2.25 Extension-Creation Neutrality
Figure 2.26 Paralysis vs. Amnesia
Figure 2.27 Death’s Deal
Figure 2.28 Value of Continuity
Figure 2.29 The Equivalence of Every World with a Solitary World
Figure 2.30 Why Lifetime Averagism isn’t Additive
Figure 2.31 Why Lifetime Egalitarianism isn’t Additive
Figure 2.32 Quality vs. Quantity
Figure 2.33 Quality∗ vs. Quantity∗
Figure 3.1 Diminishing Moral Value of Welfare
Figure 3.2 Web of Entailments
Figure 4.1 Venn Diagrams 1
Figure 4.2 Venn Diagrams 2

Chapter 1 Introduction

1.1 The Components of Classical Utilitarianism

Perhaps the most well-known statement of utilitarianism comes from Bentham: “it is the greatest happiness for the greatest number that is the measure of right and wrong” (1977, 393). This can be broken down into three components:

1. Hedonism. Value consists fundamentally in happiness.
2. Additivity. Overall value is just the total sum of fundamental value.
3. Consequentialism. One ought to maximise overall value.

The first is a substantive view about what the basic constituents of value are. The second is a structural view about how these basic constituents combine. And the third is a thesis about the connection between what’s good and what’s right.[1]

[1] This way of carving up utilitarianism is common. See, for instance, Sumner (1996, 3), who characterises utilitarianism in terms of welfarism (“welfare is the only value which an ethical theory need take seriously, ultimately and for its own sake”), aggregation (“the general good is the sum total of individual goods”), and consequentialism (“the right consists in maximising the general good”).

This package of views, which I’ll call classical utilitarianism, is subject to familiar objections. For instance, it seems to imply that the immense suffering of a few can be offset by the mild pleasure of sufficiently many:

“Suppose that Jones has suffered an accident in the transmitter room of a television station. Electrical equipment has fallen on his arm, and we cannot rescue him without turning off the transmitter for fifteen minutes. A World Cup match is in progress, watched by many people, and it will not be over for an hour. Jones’s injury will not get any worse if we wait, but his hand has been mashed and he is receiving extremely painful shocks. Should we rescue him now or wait until the match is over?” (Scanlon, 1998, 235).

Intuitively, it would be wrong to let Jones suffer for the enjoyment of others because that would be unfair to Jones. But classical utilitarianism makes no room for considerations of fairness: not in what makes up value, not in how value is aggregated, and not in any gap between what’s good and what’s right.

Hardcore classical utilitarians might bite the bullet. With the details suitably filled in (the match sufficiently enjoyable to sufficiently many...), we should let Jones suffer. More moderate utilitarians face a choice. One option is to reject Hedonism, so that fairness can somehow be baked in from the outset as one of the basic constituents of value. Another is to reject Additivity, allowing for fairness to emerge during the aggregation process as a kind of holistic pattern good. Yet another alternative is to reject Consequentialism, treating fairness as an extra consideration on top of goodness in determining what ought to be done.

This kind of choice arises more generally in the context of other objections against classical utilitarianism. Many such objections underdetermine which of classical utilitarianism’s answers to the following three questions should go:

1. The Value Question. What does value consist in fundamentally?
2. The Aggregation Question. What’s the relationship between smaller constituents of value and overall value?
3. The Normative Question. What’s the relationship between what’s right or what one ought to do and what’s good?

The choice of which component of classical utilitarianism to reject isn’t straightforward. We might have strong intuitions about whether it’s wrong to sacrifice Jones, but less so about whether fairness is a basic or holistic good, a normative or purely axiological consideration, and so on.
We might subject each component of classical utilitarianism to independent inquiry, in hopes of identifying the weakest link. Of particular interest in this dissertation is the extent to which the Aggregation Question can be examined in isolation from the other two questions.

1.2 Pitfalls of the Standard Framework of Distributive Ethics

This kind of “divide and conquer” approach is standard in distributive ethics. For instance, separating the problem of aggregation from substantive issues about value’s content, we find in seminal texts statements like:

“‘welfare’ is a term that will be used often in this essay. This concept has, not surprisingly, acquired a number of different meanings. On the one hand, we need to narrow down the possible meanings of this expression so that we know what the examples and principles that we shall discuss involve. On the other hand, we want to avoid taking a stand on controversial issues about welfare which don’t affect the nature of the problems that we are going to discuss – we don’t want to narrow the scope of our discussion unnecessarily.” (Arrhenius, 2000, 6).

“I do not assume any particular account of temporal wellbeing. The wellbeing of a person at a time is how well her life goes at that time. It takes into account everything that is good or bad for her at the time. There are narrow accounts of temporal wellbeing, such as hedonism, and broader accounts. I am happy to accommodate any of them. This book is about the aggregation of wellbeing, and I prefer to remain as uncommitted as possible about the nature of the wellbeing that is aggregated. If you think you know some sort of good that does not appear in the distribution of temporal wellbeing, you may need to broaden your conception of temporal wellbeing. Once you have done that, the good may appear in the distribution after all” (Broome, 2004b, 45).
And separating the problem of aggregation from questions about how the good and the right are related, they write:

“Utilitarianism is often taken to be the normative theory that tells us to maximise welfare. This theory can be broken down into two components: Consequentialism, which is the view that an action is right if and only if it maximises the good, and an axiological component which identifies the good with the sum total of people’s welfare. In this chapter, we shall discuss the axiological component of Utilitarianism. Likewise, many of the theories that we shall discuss in chapters 3-9 were originally formulated as normative theories, but we shall focus the discussion on the axiological part of these theories.” (Arrhenius, 2000, 37).

“When a person faces a choice between a number of alternative acts, which ought she do? Teleology is the view that the answer depends only on the goodness of the alternative acts: what the person ought to do supervenes on the goodness of the alternative acts... Some normative theories are firmly teleological; hedonistic utilitarianism is one of these. But many more plausible theories are not teleological under a narrow axiology. In determining what one ought to do, they give a place to considerations that do not fit into a narrow axiology. To a large extent, I can remain noncommittal about teleology.” (Broome, 2004b, 31-36).

This standard approach attempts to distance the structural problem of aggregation from substantive normative pictures or views about what constitutes value. The problem of aggregation is framed abstractly: How does more basic value (whatever that may be) relate to overall value (however that’s related to what’s right)? This treats aggregation like formal problems in philosophy and elsewhere.
Mereologists investigate the structural properties of the parthood relation, while remaining mostly neutral about what exactly the parts and wholes in question are. Mathematicians and logicians investigate abstract structures like Boolean algebras and groups without taking a stance on what exactly “p” in “p ∧ q” stands for (a set? a sentence? a proposition?) or what exactly “1” in “1 + 2” is (a von Neumann ordinal? a Platonic object? an equivalence class of objects satisfying certain properties?).

As in other kindred areas, the standard approach in distributive ethics often makes use of placeholder objects as stand-ins for value. For instance, rife across population ethics are diagrams like:

            Time 1   Time 2
Person 1      5        8
Person 2      8        5

[Figure 1.1: Standard Diagrams in Population Ethics. The table above, alongside a bar chart with axes “population size” and “welfare”.]

where the numbers and bar heights are supposed to admit of a variety of interpretations depending on one’s preferred theory of welfare or value.

But these standard representational devices aren’t entirely neutral. Embedded within them are some assumptions about the structure of value. To illustrate, consider a toy theory on which friendship is all that matters, and how good things are for a person scales in direct proportion to how many friends they have. Two friends is twice as good as one, three friends thrice as good, and so on.

This theory limits the possible patterns in which value can be instantiated. Consider a population with three people: Ann, Ben, and Cat. Plausibly, friendship is a symmetric relationship (I don’t truly count as your friend unless you count as mine). There are then only three pairings to consider: Ann and Ben, Ann and Cat, and Ben and Cat. Each of these pairings is either a friendship or it isn’t. So, there are only four kinds of possibilities. Either all three pairings are friendships (A), only two are (B), only one is (C), or none are (D).

[Figure 1.2: Possible Friendship Pairings. Four bar charts, A to D, each marked as possible (✓), with welfare levels 0, 1, and 2 on the vertical axis.]

Modulo exactly which pairings are friendships (that is, modulo rearrangements of the bars), these are the only possible ways in which value can be distributed. This has implications for what questions can be meaningfully asked about distribution. For instance, it wouldn’t make sense to ask: holding fixed how well Ann and Ben’s lives go in B, would it have been better for Cat never to have existed (E)?

[Figure 1.3: Cat not Existing. Bar charts B (possible, ✓) and E (impossible, ✗).]

E can’t be instantiated. It’s impossible for Ann’s life to go as well as it does in B in a world where only one other person exists. Removing Cat removes a potential friend.

Similarly, we cannot ask: would B be improved by equalising the spread of welfare, either holding total welfare fixed (F) or with a slight reduction in total welfare (G)?

[Figure 1.4: Equalising Welfare. Bar charts B (possible, ✓), F (impossible, ✗), and G (impossible, ✗).]

F can’t be instantiated. The number of friends one has is a whole number. It can’t lie strictly in between two neighbouring integers. G is also impossible. “One” is a possible number of friends to have. But given the symmetry of friendship, it’s impossible for three people to each have exactly one friend. (If Ann’s friend is Ben, then Ben’s friend is Ann. So, if they each have exactly one friend, that must leave Cat with no friends.)

Whether some distribution is possible might also depend on information not encoded in the diagrams above. Maybe people who are too far separated in time and space can’t be friends. Some questions, like whether making merely spatial or temporal changes can make a difference, then become more fraught. It might sometimes be impossible to move a person in space and time without changing their welfare.

Here’s the upshot. On this toy theory, things like Ann’s welfare, existence, or spatio-temporal location can’t be freely dialed up and down or switched on and off while holding all else equal. How well Ann’s life goes is a matter of how many friends she has.
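The toy theory’s constraints can be checked by brute force. The Python sketch below is an illustration only and not part of the original argument: the function name welfare_profiles and the encoding of a person’s welfare as their friend count are assumptions made here. It enumerates every set of symmetric friendships and collects the resulting welfare profiles, sorted so that rearrangements of the bars count as the same pattern.

```python
from itertools import combinations

# Hypothetical encoding: a person's welfare is simply their number of friends.
PEOPLE = ("Ann", "Ben", "Cat")


def welfare_profiles(people):
    """Return every achievable welfare profile, ignoring who is who.

    Each unordered pair of people either is a friendship or is not
    (friendship is symmetric), so we enumerate all subsets of pairs.
    """
    pairs = list(combinations(people, 2))
    profiles = set()
    for size in range(len(pairs) + 1):
        for friendships in combinations(pairs, size):
            friends = {person: 0 for person in people}
            for a, b in friendships:
                friends[a] += 1
                friends[b] += 1
            # Sort so that rearranging the bars gives the same profile.
            profiles.add(tuple(sorted(friends.values(), reverse=True)))
    return profiles


if __name__ == "__main__":
    three = welfare_profiles(PEOPLE)
    print(sorted(three, reverse=True))
    # G: everyone with exactly one friend.
    print((1, 1, 1) in three)
    # E: Ann keeps two friends, Ben keeps one, Cat does not exist.
    print((2, 1) in welfare_profiles(("Ann", "Ben")))
    # F: B's total welfare (2 + 1 + 1) split equally among three people.
    print(sum((2, 1, 1)) % len(PEOPLE) == 0)
```

On these assumptions, the three-person population admits exactly four profiles, matching A to D: (2, 2, 2), (2, 1, 1), (1, 1, 0), and (0, 0, 0). The profile for G never appears, removing Cat rules out any profile in which Ann keeps two friends (E), and B’s total of 4 cannot be shared equally among three people in whole numbers (F).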
And how many friends she has is inextricably linked to all sorts of other parameters: how many people exist, how well their lives go, their location, and so on. Sometimes, changing one parameter changes another. This means that, contrary to standard practice, one can’t just assign any combination of welfare levels or freely perform operations like adding and deleting bars or levelling them down and raising them up to arbitrary heights. This limits the questions that can be meaningfully asked.

The intelligibility of various questions about aggregation presupposes some things about the nature and structure of value. At some level, this is obvious. For instance, the problem of interpersonal aggregation clearly doesn’t get off the ground until we assume that (at least some) sources of value are things that can be associated with people. Other times, it’s less obvious. Probing whether inequality is in itself bad requires being able to sometimes vary inequality independently of confounding parameters, like total welfare. Asking whether it can matter intrinsically where or when some amount of welfare is enjoyed presupposes the possibility of merely spatial or temporal changes. The Aggregation Question and the Value Question are tangled up in ways that are sometimes obfuscated by the standard framework and its representational devices.

1.3 A Look Ahead

This dissertation consists of three independent chapters. What connects them is a set of issues about aggregation that can roughly be sorted into three kinds: foundational, substantive, and technical.

Foundational. One aim of the dissertation is to carefully unpack the assumptions required to probe various questions about aggregation. Two questions will be of particular interest:

The Additivity Question. Is value additive across people and time?
The Pattern Question. Does the pattern in which value is distributed across people, time, and risk matter?
These are the subject of chapters one and two, respectively. Part of the concern of those chapters is the background assumptions required to render those questions intelligible. Other subsidiary questions will arise along the way, like: Can temporal or spatial differences matter intrinsically? Does it matter how the distribution of value is correlated across time?

We’ve had a glimpse of how the standard framework and its free use of numbers or number-like things (like bar heights) can be treacherous. To avoid these pitfalls, our investigation won’t take numerical representations as given from the get-go. Instead, questions about aggregation will be set up in minimal qualitative frameworks. In these frameworks, further assumptions can be gradually and explicitly introduced.

But naively, it is only numbers that we can add up, subtract, multiply, increase or decrease in various increments, and so on. Purged along with the free use of numbers are thus straightforward answers to questions like: How do we “add up” value? What does it mean for a theory to be additive? Can we make sense of comparisons of magnitude? What is it for one thing to be twice as good, slightly better, or much worse than another? These foundational questions about measurement and the interpretation of numerical representations are also taken up along the way.

Substantive. Sorting out the foundational issues clears the path for an examination of different answers to the questions about aggregation. Broadly speaking, this dissertation advocates for substantive conclusions that can be seen as highly stripped back forms of utilitarianism. In particular, Chapter 1 argues for an affirmative answer to the Additivity Question: value is additive across people and time. And Chapter 2 argues for a negative answer to the Pattern Question: how value is distributed across people, time, and risk doesn’t matter.
In each chapter, the conclusion is shown to follow from more fundamental principles about how to balance trade-offs across people, time, and risk. These principles are backed by strong theoretical considerations. But they are also propped up by background assumptions that do a fair bit of heavy lifting.

Partial answers to the Value Question and the Normative Question aren’t just prerequisites for asking certain questions about aggregation. It turns out they can also stack the deck in favour of certain answers to those questions. In general, the richer the structure of value is assumed to be (the more closely it’s assumed to approximate how value would be structured given Hedonism), the more congenial the initial conditions are towards utilitarian views on aggregation. Similarly if the connection between what’s good and what’s right is assumed to be more robust.

It’s no surprise that Additivity, though detachable from Hedonism and Consequentialism, is so naturally allied with those views. Certain features of those views (that aren’t necessarily exclusive to them) lay the groundwork for a strong case to be mounted for Additivity. The best bet for skeptics of utilitarian views about aggregation will sometimes be to reject some of the underlying assumptions, rather than to challenge the arguments directly.

Technical. Our investigation will also organically give rise to some technical questions. In a qualitative framework where numbers aren’t taken for granted, the most natural way of understanding what it means for value to be “additive” is in terms of there being an additive representation. This means that there’s some way of assigning numbers to each component such that better distributions are precisely those whose components’ values have a greater total sum.

Understanding additivity in terms of the existence of an additive representation prompts various technical questions. Are representations unique?
Can additive theories have non-additive representations? Might theories that are traditionally introduced in a non-additive form in fact also have additive representations? Are there non-additive theories which disregard the pattern in which value is distributed? Or does a disregard for pattern imply the existence of an additive representation? And so on. Subsuming many of these questions is the general question of when an additive representation exists. The interest of this question extends beyond distributive ethics to various other areas in philosophy, economics, and mathematics. This technical question is the subject of Chapter 3.

The intended contributions of the dissertation (the foundational, the substantive, and the technical) are, to some extent, modular. And the dissertation is, in equal parts, argumentative and exploratory. Though the discussion will inevitably be biased in favour of views I find more promising, particular emphasis will also be placed on laying out the different choice points and branching paths that might be pursued. Hopefully, even hardened skeptics of utilitarian views about aggregation will find value in the clarity gained throughout: the hidden assumptions and distinctions uncovered, the landscape of possible views and objections carefully mapped out, the technical issues clarified and answered.

Abstracts

Chapter 1. Replaceable Value in People and Time

Roughly, the view I call ‘Additivism’ sums up value across time and people. Given some standard assumptions, I show that Additivism follows from two principles. The first, Time-Shift Invariance, says that how lives align in time can’t, in itself, matter. The second, Time-Partition Dominance, roughly says that a world can’t be better unless it’s better within some period or another. These two principles are supported by strong theoretical considerations.
The non-Additivist’s best prospect is to reject some standard background assumptions of population ethics about value being freely replaceable.

Chapter 2. Patternism about Value

The separateness of persons objection against theories like utilitarianism is invoked often but rarely made precise. This paper carefully isolates one interpretation of the objection. According to Patternism, a mere difference in how value is distributed across people, time, and possibilities can make for a difference in overall value. Anti-Patternism says otherwise. This paper lays out the issue precisely and offers some considerations in favour of Anti-Patternism.

Chapter 3. Additive Relations on Cartesian Products

At an abstract level, aggregation problems involve balancing trade-offs along different components to determine which possibility is overall better. Such problems arise not just in distributive ethics but also in many other areas of philosophy, economics, and mathematics. In many applications, it’s desirable to be able to assign numbers to each component and to think of the better possibilities as exactly those with a greater total numerical sum. This chapter considers the question of when such additive representations are possible. The general results proved in this chapter supplement existing results and provide a new perspective on the issue. And they extend and generalise various more specific representation theorems in social welfare theory, social choice theory, decision theory, abstract algebra, and comparative probability theory.

Chapter 2 Replaceable Value in People and Time

Utilitarians might say: ‘If it does not, as such, matter when something happens, why does it matter to whom it happens? Both of these are mere differences in position ... Part of the disagreement is, then, this. Non-Utilitarians take the question ‘Who?’ to be quite unlike the question ‘When?’.
Derek Parfit, Reasons and Persons Roughly, the view I call ‘Additivism’ sums up value across time and people. Given some standard assumptions, I show that Additivism follows from two principles. The first, Time-Shift Invariance, says that how lives align in time can’t, in itself, matter. The second, Time-Partition Dominance, roughly says that a world can’t be better unless it’s better within some period or another. These two principles are supported by strong theoretical considerations. The nonAdditivist’s best prospect is to reject some standard background assumptions of population ethics about value being freely replaceable. 2.1 Introduction What’s best for one person isn’t always best for another. From this arises the question of interpersonal aggregation central to distributive ethics: how does what’s good for each person combine to determine what’s best overall? 11 12 INTRODUCTION But also, people’s lives are extended in time. And what’s best at one time needn’t be best at another. So, there’s also the question of intertemporal aggregation: how does what’s good at different times combine to determine what’s best overall? Considering both questions in tandem can be highly fruitful. A careful investigation of issues at the interface of the dimensions of time and people will lead us towards an “additive” picture of value.1 To foreshadow some of the arguments to come, consider a non-additive theory like Egalitarianism. Most of us think that inequality is bad—it can badly affect how well people’s lives go. But some think that, even once we account for inequality’s bad effects, an unequal distribution of welfare among people is intrinsically bad. For such egalitarians, a question arises when we take into account the dimension of time. What’s intrinsically bad: inequality at a time or inequality over a lifetime? 2 Suppose it’s inequality at each time. This kind of Timeslice Egalitarianism is objectionably sensitive to mere differences in how lives align in time. 
Consider:

Hasten vs. Delay. Ann can induce labour and give birth earlier. This will affect neither her nor her child’s wellbeing. But elsewhere, Bea is also pregnant and about to deliver. Although Ann and Bea’s children will never meet or interact in any way, when Ann gives birth will affect whether their children’s lives are contemporaneous.

[Figure: two panels, Hasten (time axis t0 to t2) and Delay (time axis t0 to t3). Each panel has rows for Ann’s child and Bea’s child, and each life consists of welfare 4 followed by welfare 8.]

Insofar as it has no bearing on how well anyone’s life goes, there’s no reason to try to align the two children’s timelines one way or another. And that’s indeed the case on a view like Total Utilitarianism, which compares worlds by their total welfare. But not so according to Timeslice Egalitarianism. Only in Delay is there any time where there’s interpersonal inequality. The distribution of welfare is unequal from t1 to t2, when the low of one life happens to coincide with the peak of the other. So, unless Delay is better in some other respect that makes up for the inequality, Timeslice Egalitarians should prefer Hasten.

[1] A similar theme is explored, notably, in Parfit (1984) and Broome (1991, 2004b).
[2] See McKerlie (1989, 2001, 2012), Temkin (1993), Adler (2007).

It’s presumably for reasons like this that most egalitarians are Lifetime Egalitarians instead: what’s bad is inequality in lifetime welfare. For instance, Nagel says that “the subject of an egalitarian principle is not the distribution of particular rewards to individuals at some time, but the prospective quality of their lives as a whole, from birth to death” (1995, 69). But Lifetime Egalitarians face problems of their own at the intersection of the interpersonal and intertemporal dimensions. For instance, there can sometimes be trade-offs between adding people and adding time:

Extension vs. Creation. Bertha isn’t pregnant. But if she were to be, there’s no scenario in which she and her child both survive childbirth.
In fact, this condition is hereditary: Bertha’s mother, Anna, died this way giving birth to Bertha. And Bertha’s child, Cat, wouldn’t even have a long life: only as long as Bertha herself would otherwise go on to live. The child’s quality of life would also be only as good as Bertha’s would have been.[3]

[Figure: two panels, Extension and Creation, each on a time axis t0 to t3. Extension has rows for Anna and Bertha; Creation has rows for Anna, Bertha, and Cat. Every episode shown has welfare 4.]

Even if Bertha has the prerogative to have a child, it surely can’t be justified here on the basis of bringing about a better world. But it might on Lifetime Egalitarianism. Assuming lifetime welfare to be obtained by summing up welfare across time, Creation contains perfect lifetime equality (all three people enjoy a total lifetime welfare of 4) while Extension contains two people with unequal total lifetime welfare of 4 and 8. So, unless something else about Extension makes it better, Creation must be better for Lifetime Egalitarians, an objectionable conclusion.

Worse still, if Creation is better, it’s better by some margin. So, things would presumably still be better than Extension, even if Bertha’s child were to have a slightly worse life as in Creation−.

[3] Similar cases of prolonging vs. creating are discussed in Broome (1991, 2004a) and Arrhenius (2011).

[Figure: two panels, Extension and Creation−, each on a time axis t0 to t3. Extension has rows for Anna and Bertha; Creation− has rows for Anna, Bertha, and Cat. Every episode shown has welfare 4, except Cat’s, which has welfare 3.]

Bertha’s choice, however, affects only what happens after t2, prior to which the distribution of welfare is the same. And after t2, one person is at welfare level 4 in Extension whereas one person is at welfare level 3 in Creation−. So, Creation− is the same as Extension before t2 and worse after. How then can Creation− be better when there’s no time when it’s better?

These problems aren’t decisive but they gesture at the arguments to come. And the problems aren’t exclusive to Egalitarianism.
For instance, taking into account the dimension of time, there are also at least two salient forms of Averagism.[4] A Timeslice Averagist first averages welfare across people at each time before aggregating those averages across time. Like Timeslice Egalitarianism, this runs into problems with cases like Hasten vs. Delay, where the only difference is in when one person lives relative to another. Both violate the following principle, to be explained and defended in §2.3:

Time-Shift Invariance. Worlds that differ only by a time-shift are equally good.

A Lifetime Averagist, on the other hand, first aggregates welfare across time for each person before averaging their resulting lifetime welfare. Like Lifetime Egalitarianism, this runs into problems with cases like Extension vs. Creation. On both theories, one possibility can be better than another even though there’s no period when it’s better. They violate the following principle, explored in greater depth in §2.4:

Time-Partition Dominance. If one world is at least as good as another both before and after time t, then it’s at least as good overall. If, furthermore, it’s also better either before or after t, then it’s better.

[4] See Hurka (1982a,b).

Both principles are satisfied by a family of views I call ‘Additivism’. On those views, roughly, value can be thought of as being summed up across people and time. Of course, Egalitarianism, Averagism, and Additivism don’t exhaust the space of possible theories. But it turns out that the problems for Egalitarianism and Averagism generalise to any “non-additive” theory.
The main technical result of this paper in §2.5 shows that given some standard background assumptions, the only kind of theory that satisfies both Time-Shift Invariance and Time-Partition Dominance is additive: jointly, the two principles axiomatically characterise Additivism.[5]

Presenting the case for Additivism requires a deep dive into some foundational issues in axiology. Such issues arise even just in formulating the two key axioms. Take Time-Shift Invariance. Clearly, when a person lives can make a difference: my life would be much worse were I to be magically transported, as I am, to the Middle Ages. So, the claim that it’s irrelevant when someone lives is plausible only if all else besides the time difference is held constant. To control for this, Hasten vs. Delay held fixed the numbers in the diagram when shifting the boxes sideways along the time axis. This is supposed to indicate that the difference is purely temporal. But this raises the question of what these numbers represent and what exactly is involved in holding them fixed. Formulating the other axiom, Time-Partition Dominance, requires explicating what it means for things to be better or worse within a given time period. This raises similar foundational questions.

The interpretation of Additivism is similarly tangled up with some foundational issues. Additivism isn’t quite what a naive understanding of “adding up” value across time and people might suggest. The values that are added up can be interpreted in many different ways and they don’t have to be representable by real numbers. Unpacking the exact content of Additivism helps defuse some standard objections against an additive picture of value. But it requires delving into delicate issues concerning the use of numbers in representing value, issues left for §2.5.

A careful examination of these foundational issues isn’t just an idle exercise in rigour.
The standard framework of population ethics that trades freely in numbers encourages an uncritical acceptance of the legitimacy of certain numerical operations, like holding numbers fixed while shifting boxes left or right in time. But the ability to perform these formal operations depends on some substantive assumptions about the “ontology” of value. Value is implicitly assumed to be freely recombinable: the space of value forms a rich mosaic of tiles that we can freely stack, remove, and rearrange however we like. This assumption, to be codified later in §2.2 as “Replacement”, permeates much of population ethics. But once it’s made explicit, we’ll see that Replacement and closely related principles provide a strong foundation for the case in favour of Additivism. The conclusion of this paper is therefore disjunctive: accept an additive theory of value or reject replacement principles.

[5] This paper’s key result continues in the rich tradition of deriving additive representations using separability-like principles (see Debreu (1960), Gorman (1968), Krantz et al. (1971), Wakker (2013)). Generalisations beyond real-valued representations to representations taking values in arbitrary ordered Abelian groups, applied specifically to the context of population ethics, can be found in Pivato (2014) and Thomas (2022).

The paper proceeds as follows. §2.2 lays out a framework in which questions of personwise and timewise aggregation can be formulated precisely while remaining as neutral as possible on substantive questions about what exactly value consists in. §2.3 then introduces the first axiom of Time-Shift Invariance and provides two arguments for it. §2.4 does the same for the second axiom, Time-Partition Dominance. The main formal result shows that given the framework’s assumptions, these two axioms are equivalent to Additivism.
Relegating the proof to the Appendix, §2.5 is devoted to clarifying some subtle issues regarding the interpretation of Additivism. Along the way, some standard worries with “adding up” value are addressed. §2.6 concludes by noting that the most promising avenues for resisting Additivism are all best understood as rejections of Replacement and cognate principles. This upshot has major methodological implications for how we conceptualise and approach the enterprise of population ethics.

2.2 The Framework

Work on interpersonal aggregation often contains diagrams like the following:

            World A    World B
Person 1       50        100
Person 2       80         80

In a setting that involves not just people but time, a natural two-dimensional extension of the diagram above is a diagram like the one below:

[Figure: two panels, World A (time axis t0 to t2) and World B (time axis t0 to t3), each with rows for Person 1 and Person 2 and a number in each time interval. The numbers shown are 20, 30, 40, 40, 50, 50, 40, 40.]

Similar diagrams are found in Broome (1991, 2004b) and elsewhere. But hidden in the use of such diagrams are some non-trivial assumptions about the nature and structure of value. These assumptions, though often left implicit, can seriously tip the scale in favour of particular ways of aggregating value. So, in laying out the foundational framework, it’s worth proceeding slowly, assuming as little as possible from the start and properly codifying any assumptions along the way.

The first thing to note is that the diagrams above assign numbers to each person within each time interval. This assumes two things. One is that we can make sense of not just how good things are overall but how good things are for each person at each fixed period in time. Another is that the structure of value is similar enough to the structure of the real numbers that the latter can be used to faithfully represent the former. Both of these assumptions have raised suspicions. But they can be relaxed significantly.
At a very general level, the problem of value aggregation is that of deriving the value of wholes from the value of their parts. For us, the wholes of interest are worlds: historically complete descriptions of all that ever happens. Now, some of the things that happen in each world are “locatable” in time and in people: they are things that can be associated with some person within some period of time or another. These are the parts of worlds I’ll call life episodes. The question of aggregating parts into wholes is then: how does a world’s value depend on the value of the episodes it contains?

To expand on this notion of life episodes, they have two key features. First, they are localisable: each episode that’s instantiated in a world is attached to some person and some time interval. The time intervals don’t necessarily have to be of a fixed length. Episodes could last a millisecond or a lifetime. We could also call episodes ‘life events’, ‘life stages’, or ‘life segments’, all of which capture the localisable parts of worlds we’re interested in.

But ‘episodes’ is evocative of a second crucial feature. An episodic television series is one with a sufficiently disconnected narrative arc so that each episode is self-contained enough to be watched and judged on its own. The episodes that make up a person’s life are meant to be modular in the same way. An episode contains everything that matters to determining its value, so that each episode’s value can be evaluated in isolation from those that precede or follow it.

I take these properties of localisability and modularity to completely characterise what I’ll call ‘episodes’. Beyond these structural properties, we can remain relatively neutral on what constitutes an episode and when one episode is better than another. This leaves room for a wide range of substantive views about what’s valuable. Consider a life like the one below.
[Figure: a single life on a time axis t1 to t4, carved up in two ways. Hedonic carving: “I attend a concert” (episode of pleasure) and “I get stuck in traffic” (episode of displeasure). Desire-based carving: “My desire to enjoy good music is satisfied” (episode of desire satisfaction) and “My desire to be home early isn’t satisfied” (episode of desire frustration).]

For hedonists, all that matters is pleasure and pain. So, the life can be carved into two localisable and modular bits, corresponding to the two separate hedonic episodes. One life episode spans t1 to t3 and the other spans t3 to t4. And the former (the episode of pleasure) is better than the latter (the episode of displeasure).

Similarly for desire-satisfaction theorists who care only about the satisfaction or frustration of desires. There would be two episodes corresponding to the separate instances of desire satisfaction and frustration: one from t1 to t2 and another from t2 to t4. And the former episode is better than the latter.[6]

And it doesn’t always have to be possible to carve a life up into episodes in a non-trivial way. Sometimes, a life might consist of only a single episode. For instance, on a pluralist view where both pleasure and desire-satisfaction matter, the life above can only be decomposed into a single episode that lasts from t1 to t4. That’s because there is no finer modular individuation of times where each period can be assigned independent value.

[6] Hedonists can differ on how they individuate instances of pleasure and pain. Perhaps episodes of pleasure and pain can be individuated so finely that each occupies barely a millisecond. Similar questions of individuation arise in the context of desire satisfaction too. Does the value of a satisfied desire accrue in the entire period of time between a desire being obtained and it being fulfilled? What if I want something, cease to want it, and want it again later? What if what I desire has, unbeknownst to me, already been realised before my desire was even formed?
Issues to do with locating the value of satisfied desires in time are discussed by Dorsey (2013) and Sarch (2013), amongst others. There are also questions to do with how to compare episodes: Is an intense basal pleasure better than a mild refined one? Is it better to satisfy a strong but uninformed desire over a weak but informed one? All of these questions have to do with the exact form of a hedonist, or desire-satisfaction, or some other kind of theory of value. They are orthogonal to the issue of aggregation and we can remain neutral on them.

Hopefully, the flexibility and neutrality of the setup is clear. Episodes can occupy variably long stretches of time (possibly an entire lifetime, in some cases). This makes room for those skeptical of the possibility of comparing value at a particular instant or within fixed periods in time.[7] And the framework can accommodate not only hedonist or desire-satisfaction theories of value. There’s room for theories that value friendship, knowledge, achievement, and so on. There’s even room for traditionally non-welfarist values like virtue, desert, and so on. These values can be built into whatever makes an episode better or worse, provided that they can be attached to some person and time interval in the same way that pleasure or desire-satisfaction can.

The life episodes across all of time for everyone who ever exists can be represented with diagrams like the one below with boxes of varying lengths:

[Figure: a two-row diagram on a time axis t1 to t4, with boxes a and b in the first row and box c in the second row.]

This represents that the first person experiences episode a from t1 to t2 and episode b from t2 to t4. And the life of a second person consists of a single episode c spanning t1 to t3. Each such diagram corresponds to what I’ll call a distribution. A distribution is a finite collection of episodes, each of which is associated with some person and some time interval.
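For readers who find a data structure easier to hold in mind than a diagram, the notion of a distribution can be sketched as follows. This is only an illustration: the class name `Episode`, the field names, the helper `rows_by_person`, and the use of integers for the times t1 to t4 are assumptions made here, and the precise formal definition is the one left for the Appendix.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """One box in a diagram: an episode attached to a person and a time interval."""
    label: str    # which episode it is, e.g. "a"
    person: str   # who has it
    start: int    # index of the time at which it begins (t1 is 1, and so on)
    end: int      # index of the time at which it ends


def rows_by_person(distribution):
    """Group a distribution's episodes by person, ordering each row in time."""
    rows = defaultdict(list)
    for episode in distribution:
        rows[episode.person].append(episode)
    return {person: sorted(eps, key=lambda e: e.start) for person, eps in rows.items()}


# The diagram described above: a and b for the first person, c for the second.
DIAGRAM = frozenset({
    Episode("a", "Person 1", 1, 2),
    Episode("b", "Person 1", 2, 4),
    Episode("c", "Person 2", 1, 3),
})

# The empty world containing no one corresponds to the empty collection.
EMPTY = frozenset()

if __name__ == "__main__":
    for person, episodes in sorted(rows_by_person(DIAGRAM).items()):
        print(person, [(e.label, e.start, e.end) for e in episodes])
```

Nothing in the sketch says how good any episode is. That is deliberate, since the framework leaves the comparison of episodes to the underlying theory of value.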
Going forward, we can use diagrams like the one above to illustrate key concepts and ideas, leaving the precise formal descriptions for the Appendix.

I’ll assume that each world can be described by a distribution. This should be fairly uncontroversial once we recall just how flexible the framework is. For instance, there can be distributions where each life can only be evaluated holistically and each consists of only a single episode. And the empty world containing no one can be described by the empty distribution. However, the fact that every world can be described by a distribution shouldn’t be taken to imply that each world has a unique distribution. As we’ll see in §2.3.2, some worlds admit of descriptions by multiple distributions, since there’s some arbitrariness in the choice of time coordinates.

The remainder of this section introduces three substantive assumptions which are extensions of standard assumptions in population ethics: Replacement (§2.2.1), Distributionism (§2.2.2), and Anonymity (§2.2.3).

[7] See Brännmark (2001), Bramble (2014, 2017), Slote (2017), King (2018), and Rosati (2021).

2.2.1 Replacement

Even if every world can be described by a distribution, not every distribution describes a world. Certain distributions might simply not be metaphysically possible. For instance, certain relationships might be necessarily symmetric. If we’re the only people who ever exist, then either both of us are in a romantic relationship together at the same time or neither of us is. Suppose a and b below are life episodes that consist, among other things, of being in a romantic relationship.

[Figure: distribution α (✓, realisable) and distribution β (✗, not realisable), each containing episodes a and b.]

Then, α might be a distribution that can be realised by some world but β isn’t. A romantic relationship is either doubly instantiated at the same time or not at all. Similarly, whether some episode can be instantiated might depend on what other episodes obtain.
[Figure: distribution γ (✓, realisable), containing episodes c and d, and distribution δ (✗, not realisable), containing only episode d.]

For instance, if I desire for someone in the past to have enjoyed episode c and d is an episode that partly consists in the satisfaction of that desire, then distribution γ can be realised but δ can’t. A desire can’t be satisfied without its content obtaining.

In general, whether some episode can be had might depend on what other episodes are had, by whom, and when. So, not all combinatorially possible patterns of episode allocation correspond to metaphysically possible worlds. Nevertheless, most work in population ethics that trades in numbers implicitly takes levels of goodness to be freely recombinable. Consider a diagram similar to the one in Hasten vs. Delay:

[Figure: two panels, Coincident (time axis t0 to t1) and Disjoint (time axis t0 to t2), each with rows for Ann and Ben. Every episode shown has welfare 4.]

Suppose that in Coincident, how good things are for Ann and Ben is derived from their friendship. Separating them temporally in Disjoint deprives them of that friendship, thereby removing the original source of their welfare.

Still, it’s often taken for granted that the distribution in Disjoint can be instantiated. The thought is that we can fix how good things are for each of them by making up the “loss” of their friendship with another source of value. This is a source that’s exactly as good as being friends but that doesn’t require their lives to overlap in time. Perhaps, it’s some achievement, some amount of knowledge, or a dose of dopamine. That there is always such an equivalent source of value is a non-trivial assumption about the content of value.

Call two distributions equivalent if they have the same pattern of distribution and differ at most in that each episode in one is replaced in the other with an equally good episode. Our qualitative framework doesn’t assume numerical representability from the outset. In this framework, the assumption that it’s always possible to make compensations like the one above can be expressed as follows:

Replacement.
For every distribution, there’s a world with an equivalent distribution.

For instance, distribution β below might be impossible to realise because episodes a and b consist partly in being in a symmetric relationship that can only be instantiated together or not at all.

[Figure: distribution β (✗, not realisable), containing episodes a and b, and distribution β∗ (✓, realisable), containing episodes a∗ and b∗.]

Replacement says that there’s nevertheless an equivalent distribution β∗ that can be realised. This is a distribution in which a and b are replaced, respectively, by equally good episodes a∗ and b∗. These episodes, unlike a and b, can be instantiated at separate times. They contain some amount of pleasure, knowledge, achievement, or whatever else that jointly make for the same amount of value as being in a relationship. Similarly, if d is an episode that cannot be instantiated on its own, δ might be a metaphysical impossibility.

[Figure: distribution δ (✗, not realisable), containing episode d, and distribution δ∗ (✓, realisable), containing episode d∗.]

But according to Replacement, there should be an episode d∗ that’s exactly as good as d that can be instantiated on its own so that the distribution δ∗ equivalent to δ is realisable.

Replacement is a kind of domain condition on the set of worlds. Though there might not be a world corresponding to every distribution, Replacement guarantees that there’s at least one world for each equivalence class of distributions. Of course, what possible worlds there are isn’t a freely adjustable parameter: it’s constrained by our best metaphysical theories. So, Replacement is better understood as a constraint on which distributions ought to count as equivalent. That is, in turn, a constraint on our underlying theory of value. It requires that the underlying conception of the good make for a sufficiently rich space of episodes so that for any configuration of episodes, we can find replacements for each episode to create an equivalent configuration that’s metaphysically possible.

Replacement is such a widespread assumption that it’s rarely ever even made explicit.
It’s taken for granted in any work that freely makes stipulations of the kind “let this person have this level of goodness at this time, that person have that level of goodness at that time, . . . ” without justification. But we’ll see over and over again that Replacement and similar principles, like Spacetime Replacement (§2.3.2) and Choice Replacement (§2.4.1), are by no means weak assumptions. They lay the groundwork for a strong case in favour of Additivism. Because of this, I’ll ultimately suggest that the non-Additivist’s most promising path forward is to reject Replacement and related principles. This will have ramifications for how we conceptualise and do population ethics.

2.2.2 Distributionism

The complete history of all that ever happens includes more than just facts about how well each individual’s life goes. But it’s common in population axiology to be presented only with facts about individual goodness and to be expected to compare different worlds solely on that basis. The implicit assumption is that there can be no difference in how good worlds are without any difference in how well individual lives go: the former supervenes on the latter. The analogous assumption in the present context is:

Distributionism. Worlds that can be described by equivalent distributions are equally good.

Consider first the weaker assumption that worlds described by the very same distribution must be equally good. This precludes aspects of a world not captured by its description as a distribution, like any extratemporal or non-human value contained in it, from factoring into the calculation of the world’s value. For instance, a distribution doesn’t tell us how much biodiversity or natural beauty a world contains. So, according to Distributionism, non-localisable value like ecological value or aesthetic value can’t affect how good a world is, except indirectly via the influence they might have on people’s lives.
Note that Distributionism doesn’t rule out the existence of what Broome calls “pattern goods” (2004b, 44), that is, holistic value that arises from the shape of a distribution. An example of a pattern good might be the value of equality in the distribution of episodes. An equal world has a different distribution from an unequal world. So, it’s consistent with Distributionism for an unequal world to be worse.

Now, recall that equivalent distributions are those with the same pattern of allocation that differ at most in that each episode in one is replaced with an equally good episode in the other. Distributionism says that worlds are equally good not just when they share the same distribution but also when they have equivalent distributions. The idea is that for the purposes of comparing worlds, equivalent distributions can be treated as if they were the same. It matters not what each episode is exactly but only how good it is. Things aren’t made better or worse when an episode is replaced by one that’s exactly as good. This idea is built into the standard numerical framework, since numerical levels conflate all sources of value that make for the same amount of goodness.[8]

2.2.3 Anonymity

The final assumption is that it shouldn’t matter who has which life. If I were born into your life, going through exactly all of the things that you do from birth to death, and you mine, the world would not thereby be better or worse.

Now, maybe we can’t always swap the lives of any two individuals. Perhaps the mutated gene that makes cilantro taste like soap is essential to me so that I couldn’t be me and have the kind of life where I enjoy having cilantro daily. Perhaps episodes are things that are bound to each individual (like Ann experiencing pleasure) rather than things that aren’t (like the property of experiencing pleasure) so that we can’t freely assign episodes to people.
Or, perhaps we want to expand our circle of moral concern to animals and there are lives that human beings can have that cows can’t.

To accommodate these and other metaphysical subtleties, let’s call a sequence of episodes ordered in time a life. Pictorially, the lives in a distribution correspond to the rows of a diagram. The basic thought was that swapping who has which life (that is, permuting the rows of a diagram) shouldn’t change how good a world is. But because some lives might be unavailable to certain individuals, we need the notion of equivalent lives. One life is equivalent to another if they differ only in that each episode in one sequence is replaced by an equally good episode in the other.

[8] Believers in non-distributional value, like biodiversity and aesthetic value, will find Distributionism objectionable. For such believers, interest in what follows can still be salvaged if distributional value and non-distributional value are “separable” in that the two kinds of value can be entertained separately. If so, non-distributional value can be safely set aside. Any future occurrence of value terms like ‘good’ or ‘better’ can be understood as ‘good qua distribution’ or ‘better distributionally’.

[Figure: three distributions, each with rows for Me and You. α: Me has a, b and You have c, d. β (✗, not realisable): Me has c, d and You have a, b. β∗ (✓, realisable): Me has c∗, d∗ and You have a∗, b∗.]

So, for instance, swapping the rows in the distribution α above might result in a distribution β that’s not realisable, perhaps because episodes c and d can’t be had by me. But Replacement guarantees that there is an equivalent distribution β∗ that is realisable, where each x∗ in β∗ is exactly as good an episode as x in β.

Let’s call two distributions permutation-equivalent if one can be obtained from the other by permuting the rows vertically in a diagram, while possibly replacing episodes with equally good ones. For instance, α and β∗ above are permutation-equivalent. Then, our final assumption states that:

Anonymity. Worlds with permutation-equivalent distributions are equally good.
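The definition of permutation-equivalence can be made concrete with a small check. The sketch below rests on assumptions introduced purely for illustration: episodes are written as (label, start, end) triples with made-up time indices, a hypothetical lookup table `GOODNESS_CLASS` records which episodes are equally good, and the four classes are taken to be pairwise distinct. None of this is part of the formal framework.

```python
from collections import Counter

# Hypothetical table: equally good episodes share a goodness class.
GOODNESS_CLASS = {
    "a": "A", "a*": "A",
    "b": "B", "b*": "B",
    "c": "C", "c*": "C",
    "d": "D", "d*": "D",
}


def life_signature(life):
    """Describe a life by the timing and goodness class of each of its episodes."""
    return tuple((start, end, GOODNESS_CLASS[label]) for label, start, end in life)


def permutation_equivalent(dist1, dist2):
    """True if the rows match up to reordering and equally good replacements."""
    rows1 = Counter(life_signature(life) for life in dist1.values())
    rows2 = Counter(life_signature(life) for life in dist2.values())
    return rows1 == rows2


# Each distribution maps a person to their life (a row of the diagram).
ALPHA = {"Me": [("a", 0, 1), ("b", 1, 2)], "You": [("c", 0, 1), ("d", 1, 2)]}
BETA_STAR = {"Me": [("c*", 0, 1), ("d*", 1, 2)], "You": [("a*", 0, 1), ("b*", 1, 2)]}
# A hypothetical distribution that swaps only the second episode of each row.
MIXED = {"Me": [("a*", 0, 1), ("d*", 1, 2)], "You": [("c*", 0, 1), ("b*", 1, 2)]}

if __name__ == "__main__":
    print(permutation_equivalent(ALPHA, BETA_STAR))
    print(permutation_equivalent(ALPHA, MIXED))
```

The check compares whole rows, so swapping complete lives between people passes while exchanging parts of rows does not. Comparing multisets of rows rather than sets matters when two people have equivalent lives.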
This idea that the identity of individuals doesn’t matter shouldn’t be confused with the much more controversial idea (discussed in §2.4.2) that facts about personal identity and continuity or persistence over time don’t matter. Anonymity allows us to permute entire rows vertically without thereby changing how good things are but not parts of rows, as in:

[Figure: two distributions, each with rows for Me and You. α: Me has a, b and You have c, d. γ∗: Me has a∗, d∗ and You have c∗, b∗.]

Anonymity doesn’t imply that α and γ∗ have to be equally good.

Having laid out the foundational framework, we can now introduce and argue for the two axioms that will jointly lead to Additivism.

2.3 Time-Shift Invariance

In §2.1, Hasten vs. Delay showed how Timeslice Egalitarianism is sensitive to when one person lives relative to another. But holding fixed how much good there is within the segments of each life, a mere difference in how lives align temporally shouldn’t matter. This is the basic idea behind Time-Shift Invariance. This section spells the principle out in more detail and outlines two arguments for it.

Suppose the a’s below are equivalent episodes (that is, a is exactly as good an episode as a′), as are the b’s and the c’s. Then, the distributions below are time-shifts of each other.

[Figure: distribution α (time axis t0 to t3), containing episodes a, b, and c, and distribution β (time axis t1 to t4), containing episodes a′, b′, and c′.]

Modulo the replacement of each episode with an equivalent one, the two distributions differ only in how lives align in time. In general, two distributions differ only by a time-shift roughly when one can be obtained from the other by moving rows in the diagram horizontally, possibly replacing some life episodes with equivalent ones (see the Appendix for a precise definition). Two worlds are then said to differ only by a time-shift when they can be described by distributions that differ only by a time-shift. Intuitively, a pure time-shift shouldn’t affect how good things are:

Time-Shift Invariance. Worlds that differ only by a time-shift are equally good.
Of course, changing when someone lives can make things better or worse. Separate two friends in time and you deprive them of their friendship. But such knock-on effects are controlled for in the relevant kinds of time-shift. Any episode lost with a change in time has to be compensated for with an equivalent episode. Replacement guarantees the possibility of such compensations.

Once the effects of a time-shift are properly controlled for, a mere change in temporal order shouldn’t matter. This sentiment echoes Parfit:

“Most of us believe that a mere difference in when something happens, if it does not affect the nature of what happens, cannot be morally significant. Certain answers to the question ‘When?’ are of course important. We cannot ignore the timing of events. . . But we aim for [a certain timing] only because of its effects. We do not believe [timing] is, as such, morally important.” (Parfit, 1984, 340).

Certain considerations might nevertheless seem to tell against Time-Shift Invariance. If there’s intrinsic value in the longevity of humanity’s existence, then it’d be better all else equal to spread lives out in time. Or, the opposite would be true if, for whatever reason, there’s intrinsic value in people coexisting at the same time. Strong forms of time-discounting might also favour good things being had at certain times, like earlier rather than later.[9] But these considerations don’t hold up to scrutiny. That’s because besides its intuitive plausibility, Time-Shift Invariance is supported by two compelling theoretical arguments: an argument from a Pareto principle (§2.3.1) and an argument from relativity (§2.3.2).

2.3.1 The Paretian Argument

The first argument is that between two worlds that differ only by a time-shift, there’s no one for whom one world is better than another. And a world can’t be better or worse unless it’s better or worse for someone.
More precisely, recall that two lives are equivalent if they differ only in that each episode in one is replaced by an equivalent episode in the other. Now, in any two distributions which differ only by a time-shift, the life of each person in one distribution is equivalent to their life in the other. So, Time-Shift Invariance follows from a highly plausible principle:

Fixed-Population Pareto Equivalence. Let α and β be distributions containing the same people. If each person’s life in α is equivalent to their life in β, then worlds described by α and β are equally good.

The principle can be broken down into two bits. The first thought is that equivalent lives are equally good. And the second is that if two worlds contain the same people with equally good lives in either world, then the two worlds must be equally good. There can be no difference in how good worlds are without a difference in how well the lives in those worlds go.

[9] On moral reasons for time-discounting, see, for instance, Mogensen (2022).

2.3.2 The Argument from Relativity

A second argument arises from scientific considerations about time. In the special theory of relativity, there is no frame-independent fact about what’s simultaneous or what precedes what. Although our assessment of how good things are goes via picking some arbitrary spacetime coordinates and representing worlds as distributions relative to those coordinates, facts about how good things are shouldn’t depend on our choice of representation. So, the second argument for Time-Shift Invariance roughly goes: how lives align in time can depend on the choice of reference frame; but changing reference frames doesn’t change which worlds are better; so, how lives align in time shouldn’t make a moral difference.

This sketch of the argument needs refinement. That’s because while the order in which life episodes unfold can depend on the choice of reference frame, it doesn’t always.
Whether it does depends on how far apart individuals are located in space. To illustrate the argument more carefully, consider the distributions below, where the a’s are all equivalent, as are the b’s and the c’s.

[Figure: distribution α (episodes a, b, c) and distribution γ (episodes a*, b*, c*).]

Consider a world A with distribution α and a world C with distribution γ. According to Time-Shift Invariance, these worlds must be equally good, since they differ only by a time-shift. Let’s see how this follows from considerations of relativity.

First, note that the lives in α might be jointly realisable only under certain proximity constraints. Maybe the episodes a, b, and c consist partly in friendship or physical intimacy, things that can’t be had by people too far apart in space. If the episodes in α require that the inhabitants of any world with that distribution be close together, that limits the extent to which we can “shift” the episodes in time simply by changing the reference frame. In that case, relativistic concerns alone could not establish that A and C must be equally good.

The missing ingredient is a spatial analogue of Replacement. When introducing Replacement in §2.2.1, we saw that “shifting” people’s lives in time as in Coincident vs. Disjoint can deprive them of certain time-sensitive sources of value, like friendship. The idea behind Replacement was that the “loss” of such value can be made up for with equivalent sources of value that aren’t time-sensitive in the same way.

There’s a natural spatial analogue. “Moving” people apart in space might mean the loss of certain distance-sensitive sources of value, like friendship or physical intimacy. Still, we can hold fixed how good things are for each person in each period while shifting their position. We do this by replacing some episodes with equivalent episodes that aren’t distance-sensitive in the same way.

Let’s state this more precisely. So far, the diagrams used associate each episode with a person and a time interval.
Spacetime diagrams like the ones above add spatial information. (For ease of visualisation, we’ll focus on only one dimension of space instead of three.) These diagrams associate each episode with a spatial location (the horizontal axis x), a temporal location (the vertical axis t), and a person (each person’s life is coded with groups of similar colours, e.g. red and pink vs. blue).[10] Each spacetime diagram, subject to certain constraints, corresponds to a spacetime distribution.[11]

[10] The spacetime diagrams contain some innocuous simplifications (like that a person is point-sized and has no spatial extension or that, perhaps, each person has a “center point” where we can locate the episodes attached to them).

[11] One might introduce additional constraints on the spacetime diagrams that correspond to spacetime distributions. For instance, we might require that the worldlines in the diagram don’t intersect so that people aren’t allowed to phase through one another.

Not every spacetime distribution can be instantiated. For instance, episodes a, b, and c might contain distance-sensitive goods like friendship or physical intimacy. Then, a spacetime distribution like the one on the left above with the two individuals far apart might not correspond to any metaphysically possible world. But just as Replacement says that every distribution has an equivalent distribution that can be realised, Spacetime Replacement says the same thing of spacetime distributions:

Spacetime Replacement. For every spacetime distribution, there’s an equivalent spacetime distribution that’s instantiated by some world.

This means even if the people in A can’t be far apart (as in the spacetime distribution on the left above) because the episodes a, b, and c contain distance-sensitive goods, there are nevertheless equally good episodes a′, b′, and c′ such that the spacetime distribution on the left above is realised by some world B.
This intermediary world B serves as a bridge between world A with distribution α and world C with distribution γ. As the spacetime diagrams above illustrate, there is more than one possible choice of spacetime coordinates.

[Figure: distributions α (episodes a, b, c), β1 (a′, b′, c′), β2 (a′, b′, c′), and γ (a*, b*, c*).]

Relative to the reference frame (t, x) on the left, B can be described as having distribution β1 above. In this reference frame, c′ starts (t = 0) before a′ starts (t = 1). And both end at the same time (t = 2) as b′ starts. But relative to a second reference frame (t′, x′) on the right (which differs from the first by a Lorentz transformation), the same world B can be described as having distribution β2. In this reference frame, c′ starts at the same time that a′ ends and b′ starts (t′ = 3). And c′ ends (t′ > 5) after b′ ends (t′ < 5).

So, this intermediary world B can be described by (at least) two distributions: β1 (which is equivalent to A’s distribution α) and β2 (which is equivalent to C’s distribution γ). By Distributionism, worlds with equivalent distributions must be equally good. So, A and B are equally good, as are B and C. It follows by transitivity that A and C are equally good, exactly as Time-Shift Invariance requires.

This kind of argument for Time-Shift Invariance follows in full generality from Spacetime Replacement. For any two worlds A and C that differ by a time-shift, Spacetime Replacement furnishes a world B. The inhabitants of this world are spatially located such that B has a distribution that’s equivalent to A’s relative to one reference frame and a distribution that’s equivalent to C’s relative to another. Distributionism then entails that A and C must be equally good.

Spacetime Replacement isn’t unassailable. Perhaps some distance-sensitive sources of value, like friendship, have no distance-insensitive substitute. Perhaps certain levels of goodness are necessarily unachievable by spatially isolated lives.
If this is what it takes to be a non-Additivist, then it would be a surprising enough conclusion in itself. The upshot of denying replacement principles is further explained in §2.6. We now turn to the second axiom.

2.4 Time-Partition Dominance

We often compare worlds not just over their complete historical timelines but also over more restricted time periods. We might say that a policy makes things worse in the short term but better in the long run. Granting the possibility of these local comparisons, it’s plausible that a world that’s at least as good as another in every period must be at least as good overall. And it must be better if there’s also some period when it’s better. We saw in Extension vs. Creation from §2.1 how Lifetime Egalitarianism runs afoul of this thought. But making this principle of Time-Partition Dominance precise requires explicating what it means exactly for a world to be better or worse within some period. This section does this and outlines two arguments for the principle.

Let’s begin by assuming numerical levels of goodness and comparing two possibilities.

Coffee vs. Tea. Ann could have a cup of coffee with breakfast and another after lunch. Alternatively, she could have two cups of tea instead, one with breakfast and another after lunch. Ann generally enjoys coffee more but because of her limited tolerance for caffeine, coffee has more quickly diminishing marginal value than tea. So, she would enjoy a first cup of coffee more than a first cup of tea but a second cup of tea more than a second cup of coffee.

[Figure: Coffee gives Ann 7 before t and 3 after t; Tea gives her 6 before t and 4 after t.]

Intuitively, Tea is better than Coffee after t, since Ann enjoys a second cup of tea more than a second cup of coffee. This can be understood in terms of what’s better overall. We can compare worlds within some period by ignoring what happens outside of that period.
For instance, we compare Coffee and Tea after t by imagining a world Coffee-after-t that contains 3 units of goodness for Ann after t but nothing before. Similarly, Tea-after-t contains 4 units of goodness for Ann after t but nothing before.

[Figure: Coffee-after-t keeps only the 3 units after t; Tea-after-t keeps only the 4 units after t.]

To say that Tea is better than Coffee after t is just to say that a world with a distribution like Tea-after-t is better than one with a distribution like Coffee-after-t.

But what exactly should we picture when we try to imagine a world with distribution Tea-after-t? We can’t just imagine a world where Ann simply skips her morning cup of tea. How much Ann enjoys her afternoon cup depends on whether she’s already had a cup in the morning. Tea-after-t is supposed to hold fixed, relative to Tea, how good things are for Ann after t but deprive her of any prior sources of value. To achieve this, Tea-after-t must be a world in which Ann undergoes something after t that’s exactly as good as having a second cup of tea but that doesn’t require her to have previously enjoyed anything of value. (Perhaps she has a cup of tea with an especially high caffeine content.) That there is something fitting this description that Ann can undergo presupposes Replacement.

To formalise this idea within our framework, let’s call a time a partition for a distribution if every episode in the distribution ends before that time or begins after it. Diagrammatically, the partitions correspond to those vertical lines that don’t intersect the interior of any boxes. A partition cleanly separates the episodes in a distribution into two periods. For instance, t is a partition for distribution α below.

[Figure: distribution α with episodes a, b, c, d, e and partition t, shown alongside α-before-t and α-after-t.]

Then, α-before-t is the distribution that’s exactly like α up till t and contains nothing after. And α-after-t is the distribution that’s exactly like α after t but contains nothing before.
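The definitions of a partition, α-before-t and α-after-t can be sketched in code. What follows is only an illustrative sketch, not part of the formal framework: the `Episode` record, the function names, and the sample times and people are all assumptions introduced here. It also assumes, following the diagrammatic gloss, that an episode may end or begin exactly at the partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """Hypothetical record: who has the episode and when (not from the text)."""
    person: str
    start: float
    end: float
    label: str


def is_partition(distribution, t):
    """True if t cuts through the interior of no episode."""
    return all(e.end <= t or e.start >= t for e in distribution)


def before(distribution, t):
    """The distribution exactly like the input up till t, with nothing after."""
    if not is_partition(distribution, t):
        raise ValueError(f"{t} is not a partition for this distribution")
    return [e for e in distribution if e.end <= t]


def after(distribution, t):
    """The distribution exactly like the input after t, with nothing before."""
    if not is_partition(distribution, t):
        raise ValueError(f"{t} is not a partition for this distribution")
    return [e for e in distribution if e.start >= t]


# Illustrative distribution with five episodes; the times are made up.
alpha = [
    Episode("P1", 0, 2, "a"),
    Episode("P1", 2, 4, "b"),
    Episode("P2", 1, 2, "c"),
    Episode("P2", 2, 3, "d"),
    Episode("P3", 3, 4, "e"),
]

print(is_partition(alpha, 2))                 # True
print(is_partition(alpha, 3))                 # False: episode b spans 2 to 4
print([e.label for e in before(alpha, 2)])    # ['a', 'c']
print([e.label for e in after(alpha, 2)])     # ['b', 'd', 'e']
```

The sketch only truncates a distribution. It is silent on whether the truncated distribution is realisable by any world, which is the work Replacement does in the text.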
Even if α is a realisable distribution, α-before-t and α-after-t needn’t be. Suppose episode a consists partly in the pleasure derived from having a first cup of tea and b consists partly in the pleasure from a second cup. Then, α-after-t describes a world in which someone enjoys a second cup of tea without ever having had a first, which is a metaphysical impossibility.

[Figure: α-after-t (marked ✗) and α*-after-t (marked ✓), both with partition t.]

Nevertheless, Replacement guarantees that there’s an equivalent distribution α*-after-t that is instantiated by some world. That’s a world in which episodes that become impossible to instantiate as a side-effect of “deleting” all episodes outside of some time period are made up for with equivalent episodes. And similarly with α-before-t.

Now, suppose worlds A and B have distributions α and β, respectively, with a common partition t. We can then define:

(i) A is at least as good as (or better than) B after t just in case a world with a distribution equivalent to α-after-t is at least as good as (better than) a world with a distribution equivalent to β-after-t;

(ii) A is at least as good as (or better than) B before t just in case a world with a distribution equivalent to α-before-t is at least as good as (better than) a world with a distribution equivalent to β-before-t.[12]

The second axiom can then be stated precisely:[13]

Time-Partition Dominance. If world A is at least as good as world B both before and after t, then A is at least as good as B. And if, furthermore, A is better than B either before or after t, then A is better than B.

Thinking of the partition as the present time, the principle roughly says that a world can’t be worse unless it’s worse in the past or in the future.

I’ll now consider an argument for Time-Partition Dominance from decision-making (§2.4.1) and address some potential objections that arise from an implication of Time-Partition Dominance (§2.4.2).
[12] More than one world might have a distribution equivalent to α-after-t, and similarly for β-after-t. But given Distributionism, the choice of worlds doesn’t matter. If one world whose distribution is equivalent to α-after-t is at least as good as one world whose distribution is equivalent to β-after-t, then every world whose distribution is equivalent to α-after-t is at least as good as each world whose distribution is equivalent to β-after-t. The same applies in the case of α-before-t and β-before-t.

[13] Going forward, whenever one world is said to be at least as good or better than another before or after t, it’s assumed that the worlds have distributions with t as a common partition.

2.4.1 The Argument from Decision-Making

When making decisions, we usually focus only on what would change as a result of our choice and ignore the things that would remain the same regardless. Decision-making would be incredibly difficult otherwise. Fleshing out this thought gives rise to an argument for Time-Partition Dominance.

To begin, consider this case from Nagel designed to elicit egalitarian sympathies:

“Suppose I have two children, one of which is normal and quite happy, and the other of which suffers from a painful handicap. . . I must decide between moving to an expensive city where the second child can receive special medical treatment and schooling. . . or else moving to a pleasant semi-rural suburb where the first child, who has a special interest in sports and nature, can have a free and agreeable life. . . the gain to the first child of moving to the suburb is substantially greater than the gain to the second child of moving to the city. If one chose to move to the city, it would be an egalitarian decision.” (Nagel, 1979, 123-4)

What’s notable is the lack of description of how the children fared in the past. For all that’s said, we could have faced many similar decisions in the past.
And perhaps in all of them, the second child has always been favoured at the expense of greater gain to the first. If so, the choice we face now might bring about something like either City or Suburb.

[Figure: City gives both children 5 before t and 5 after t; Suburb gives both children 5 before t and, after t, 8 to the first child and 4 to the second.]

Or maybe our previous decisions didn’t favour the second child, in which case the choice might be between something like City* and Suburb* instead.

[Figure: City* gives the first child 8 and the second child 4 before t, and both children 5 after t; Suburb* gives the first child 8 and the second child 4 both before and after t.]

But intuitively, how value was distributed in the past is irrelevant, insofar as it is unaffected by the choice at hand. Generally, in trying to identify what’s best, we should be able to focus only on the periods in a distribution when our choice would make a difference:

Past Separability. If worlds A and B have equivalent distributions before t, then A is at least as good as B just in case it’s at least as good after t.

So, for instance, if it’s better to bring about City rather than Suburb, then it’s also better to bring about City* over Suburb*.

What holds of the past holds equally of the future. Often, the causal influence of our choice decays over time so that our choice ceases to make a difference to the distribution of value after some time. We might, for instance, be deciding between Thai (which we like equally) and Indian (which I like much more and you a little less than Thai) for dinner.

[Figure: Thai gives each of us 5 before t; Indian gives me 8 and you 4 before t; in both cases the values after t are marked “?”.]

Let’s assume our choice will have no impact on how value is distributed past tonight. Then, we should be able to bracket off future unaffected times and base our decision solely on the period before tomorrow. More generally:

Future Separability. If worlds A and B have equivalent distributions after t, then A is at least as good as B just in case it’s at least as good before t.

The significance of this pair of principles, which I’ll refer to as ‘Separability’, is that they jointly entail Time-Partition Dominance.
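To see what Past Separability rules in and out, it may help to compare two simple evaluation rules on a case loosely modelled on City vs. Suburb. The sketch below rests on several assumptions not made in the text: all the numbers are illustrative, and `lifetime_maximin` is only a crude stand-in for an egalitarian rule, not a statement of Lifetime Egalitarianism as defined in this chapter. The additive rule ranks the two options the same way whatever the shared past; the stand-in rule’s ranking changes with a past that the choice doesn’t affect.

```python
# Each world maps a person to (value before t, value after t).
# All numbers are illustrative assumptions.

def total(world):
    """Additive rule: sum the value of every episode."""
    return sum(b + a for b, a in world.values())


def lifetime_maximin(world):
    """Hypothetical egalitarian stand-in: the worst-off lifetime total."""
    return min(b + a for b, a in world.values())


def with_past(past, future):
    """Attach a shared past to an option's after-t values."""
    return {person: (past[person], future[person]) for person in future}


city_after = {"first": 5, "second": 5}
suburb_after = {"first": 8, "second": 4}

equal_past = {"first": 5, "second": 5}
unequal_past = {"first": 4, "second": 8}


def prefers_city(rule, past):
    """Does the rule rank City at least as high as Suburb, given this past?"""
    city = with_past(past, city_after)
    suburb = with_past(past, suburb_after)
    return rule(city) >= rule(suburb)


for rule in (total, lifetime_maximin):
    print(rule.__name__,
          prefers_city(rule, equal_past),
          prefers_city(rule, unequal_past))
# total False False
# lifetime_maximin True False
```

On these numbers the additive rule never needs to consult the past, while the stand-in rule cannot settle the choice without it. That is the informational burden the argument from decision-making objects to.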
Most descriptions of decision problems, as exemplified by the quote from Nagel, betray an implicit commitment to something like Separability. They neglect to specify how value would be distributed outside of some narrow period where our choice would make a difference, the assumption being that unaffected times are irrelevant and can be ignored. Indeed, a full description of a decision problem would be unrealistic and often impossible. Pinning down the exact world that an action would bring about requires describing all that ever happens down to the smallest details. This includes details about the lives of ancient Egyptians or those of beings on distant planets beyond our observable universe: lives wholly unaffected by anything we’ll ever do.

But as Parfit puts it, “research in Egyptology cannot be relevant to our decision whether [for instance] to have children” (Parfit, 1984, 420). Separability sets reasonable informational constraints on deliberation, thereby keeping widespread decision paralysis and skepticism about what we ought to do at bay. Without Separability, even our most mundane choices could theoretically depend on unobtainable information about the distant past or distant planets.

Now, Separability says that how any two worlds A and B compare depends only on the time periods where their distributions differ. The argument is that deciding between A and B would be intractable otherwise. But what if there’s never any decision situation in which we’d have to choose between that particular pair of worlds? Perhaps some worlds are simply not in my ability to bring about. Or perhaps any choice situation in which I have the opportunity to bring about A is one that precludes the possibility of realising B. If so, then the argument from decision-making in favour of Separability doesn’t generalise to arbitrary pairs of worlds.
The argument would generalise if the space of choices that we could, in principle, face is sufficiently rich. The following assumption would suffice:

Choice Replacement. For any two worlds A and B, there are worlds: (i) A′ whose distribution is equivalent to A’s; and (ii) B′ whose distribution is equivalent to B’s, such that some possibility requires choosing between bringing about A′ and B′.

This grants that we might never be in a position to choose between certain pairs of worlds, say A and B. Nevertheless, like Replacement, the idea is that the underlying theory of the good makes for a sufficiently rich space of episodes. So, by replacing some episodes in A and B with equally good ones, there’d be worlds A′ and B′ with equivalent distributions that could figure in our choices.

Denying this underlying assumption would make the rejection of certain instances of Separability more palatable. The dependence of certain comparisons on unaffected parts of a distribution would not interfere with our ability to deliberate and make informed decisions. But denying Choice Replacement, as with denying other replacement principles, does not come without cost, as I’ll note in §2.6.

2.4.2 Extension-Creation Neutrality

Let’s now turn to addressing some potential objections to Time-Partition Dominance. Recall the case of Extension vs. Creation from the introduction. In it, Bertha’s life would be cut short if she were to have a child.

[Figure: Extension (rows for Anna and Bertha) and Creation (rows for Anna, Bertha, and Cat), over times t0 to t3; each episode shown has value 4.]

We saw how Lifetime Egalitarianism has the unpalatable implication that Bertha should cut her life short in order to equalise total lifetime welfare. Total Utilitarianism avoids this implication. But it still implies that we should be indifferent between Bertha extending her life and having the child. Some might find this objectionable. Intuitively, Extension is better than Creation.

Let’s frame the problem qualitatively.
Call two distributions extension-creation equivalent if they differ only in whether a life is extended or a new one is created instead. For instance, if a is exactly as good as a′ and b is exactly as good as b′, the distributions α and β below are extension-creation equivalent.

[Figure: in α, episodes a and b form one life running through t; in β, episodes a′ and b′ belong to two different lives on either side of t.]

Given our framework’s background assumptions, Time-Partition Dominance implies that, all else equal, we should be indifferent between prolonging a life and bringing about a new one:

Extension-Creation Neutrality. Worlds with distributions that are extension-creation equivalent are equally good.

To illustrate, consider a world A with distribution α above and a world B with distribution β. It follows from Distributionism that A and B are equally good before t, since their distributions before t are equivalent. Furthermore, Anonymity implies that A and B are also equally good after t, since their distributions after t are permutation-equivalent. A and B must therefore be equally good, according to Time-Partition Dominance, since they are equally good both before and after t.

For some, this is grounds for rejecting Time-Partition Dominance. One reason that prolonging a life might be better than bringing about a new one could be that a long life (as in α) is better than multiple shorter ones (as in β).

However, it’s possible to explain the intuition favouring longevity in terms of the countless things only adults can enjoy: self-sufficiency, freedom, self-actualisation, intellectual fulfilment. It’s no surprise that we generally prefer long lives replete with the goods that typically accompany maturity over short lives made up of goods characteristic of newborns. If longevity’s value is fully accounted for by its promotion of other goods, then it poses no problem for Extension-Creation Neutrality, which applies only once each episode’s value is held fixed.
Take away the association of longer lives with better goods, as in:

Paralysis vs. Amnesia. A horrible accident has left Carlos with injuries to his body and a severe concussion. Time is of the essence and the doctors face a choice: (i) treat the concussion and leave Carlos paralysed from the neck down from the injuries (Paralysis), or; (ii) treat his injuries and leave him with total amnesia from the concussion (Amnesia).

[Figure: Paralysis contains episodes a and b; Amnesia contains episodes a′ and b′.]

Suppose, following a prominent view whose history traces back to Locke, that some kind of psychological continuity is necessary for persistence over time. Specifically, suppose that post-amnesia “Carlos” would, strictly speaking, be a different person from pre-amnesia “Carlos”.

Amnesia might deprive Carlos of certain good things: reminiscing with loved ones, reconnecting with childhood friends. But equally, an amnesiac can more easily enjoy many things that a quadriplegic can’t: freedom of movement, ease in performing daily tasks. Assume that anything Carlos would lose as a result of amnesia or paralysis would be made up for with something exactly as good in the other possibility. That is, assume that Paralysis and Amnesia are extension-creation equivalent: for each episode x in Paralysis, there’s an equivalent episode x′ in Amnesia.

All else fixed, do the doctors have reason to prevent amnesia rather than paralysis simply because there would otherwise be two short lives rather than a long one? Intuitively not. Once we equalise the effects of memory and mobility loss on Carlos’ quality of life, no grounds remain for preferring one of Paralysis and Amnesia over the other.

As a litmus test, imagine that Carlos’ bodily injuries were slightly worse so that the potential paralysis would be a little more debilitating. With the effects of paralysis worse, the doctors should now prioritise preventing paralysis.
The fact that amnesia would technically bring about a new person carries no countervailing moral weight. This is an indication that the value of longevity is purely instrumental. Once we equalise each episode’s value and control for the association of longer lives with better goods, longevity contributes no extra value.

One can deny that such equalising is always possible. Maybe some good things only come with maturity, with nothing of comparable value available on overly short lives. That would be to reject Replacement, not Extension-Creation Neutrality.

Here’s a sketch of another argument against the intrinsic value of longevity. Consider:

Death’s Deal. The life you’ve lived so far is overall neutral: it’s neither good nor bad (α). As your time nears, Death offers you a deal. You can live a little longer. The catch is that your extra borrowed time will contain none of the things that make life good: no pleasure, no friends, no desires satisfied, . . . In fact, it’ll even be a little bad (β1). Perhaps your nose will itch the entire time. Still, you reason: ‘if longevity is intrinsically good, living longer must surely be worth enduring a slight annoyance’. So, you take the deal. Encouraged, Death extends a “better” deal (β2). ‘If living a little longer can offset a little badness’, he says, ‘then surely living even longer can offset even more badness’. You can’t help but agree. Death continues: ‘For an even longer life. . . ’ By the time you realise where this is going, it’s too late: you’ve agreed to an incredibly long life, almost all of which is bad and none of which is good (βn). You’ve traded a neutral life for one that isn’t worth living.

[Figure: the chain α ≺ β1 ≺ β2 ≺ . . . ≺ βn, where α consists of episode a and each βi consists of a followed by bi.]

Besides longevity, there are other grounds on which one might challenge Extension-Creation Neutrality. For instance, some might argue that continuity is valuable: a continuous life is better than multiple fragmented ones.
Others might argue that there is value in a life having a certain “shape”: perhaps an upwards-trending life is better than a constant or downwards-trending one.[14] When a single life is broken into two, we potentially lose some of the value derived from a life’s shape.

I’ll only sketch a brief response, since the dialectic will be similar to our response to the objection from longevity. On the one hand, the intuitions that favour long, continuous, upwards-trending lives can be explained by what typically accompanies such lives: self-actualisation, self-discovery, feelings of achievement, fulfilment of lifelong goals. If the value of longevity, continuity, and a positive trajectory lies entirely in the promotion of these sources of value, then they give rise to no genuine objection to Extension-Creation Neutrality.

On the other hand, if their value isn’t just instrumental but intrinsic, then problems like Death’s Deal arise. The problems stem from the idea that any alleged intrinsic value would be able to offset the introduction of more and more badness into each part of a distribution. For instance, consider a world with distribution α below containing many neutral lives made up of a single episode a, an episode that’s neither good nor bad.

[Figure: the chain α ≺ β ≺ γ ≺ . . ., where α contains many separate lives each consisting of a, β joins two of them into one life made up of b, and γ joins three into one life made up of c.]

If continuity had intrinsic value, then conjoining two such lives into a single continuous life (as in β) should result in a better world. That’s the case even if b is ever so slightly worse than a. Co-opting a third life to form an even longer life made up of an even worse episode c (as in γ) should further improve things. Carrying this line of reasoning through leads us to an absurd end-point: if continuity were intrinsically valuable, then one continuous life, every single part of which is bad, is better than many neutral lives.
[14] See Velleman (1991).

2.5 Additivism

We saw in the introduction how various forms of Egalitarianism violate at least one of Time-Shift Invariance and Time-Partition Dominance. We also noted that the same is true of various forms of Averagism. That’s no accident: given the framework and its background assumptions, the only kind of theory that satisfies both principles is an additive one. The aim of this section is to state this result precisely and unpack what it means exactly for value to be additive.

To begin, value is often represented using real numbers. But this is overly restrictive. As we’ll see, familiar problems associated with an additive picture of value stem from properties of the real numbers, properties which aren’t shared by more general structures in which we can also add and compare things in relatively well-behaved ways.

For example, we might represent how good things are using pairs of real numbers. Any two pairs can be added up by adding each component individually:

(x1, x2) + (y1, y2) = (x1 + y1, x2 + y2).

To compare pairs, there are at least two possibilities. One is the lexical ordering. We begin by comparing the first components: if x1 > y1, then (x1, x2) > (y1, y2). If the first components are equal, we move on to compare the second: if x1 = y1 and x2 ≥ y2, then (x1, x2) ≥ (y1, y2). Another way to compare pairs is using the dominance ordering, where one pair is no less than another just in case it’s no less along both components. That is, (x1, x2) ≥ (y1, y2) if and only if x1 ≥ y1 and x2 ≥ y2.

Pairs can be generalised naturally to triples, quadruples, and arbitrarily long sequences of real numbers. Adding sequences up and comparing them in the ways defined above result in what I’ll call lexical spaces and dominance spaces.
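The two orderings just defined can be compared side by side in a short sketch. The function names below are assumptions introduced for illustration, and sequences are represented as Python tuples of equal length. The example pair is chosen so that the orderings disagree: the lexical ordering ranks the two pairs, while the dominance ordering leaves them incomparable.

```python
def _check(x, y):
    """Both sequences must have the same number of components."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")


def add(x, y):
    """Componentwise addition: (x1, x2) + (y1, y2) = (x1 + y1, x2 + y2)."""
    _check(x, y)
    return tuple(a + b for a, b in zip(x, y))


def lex_geq(x, y):
    """Lexical ordering: the first component where x and y differ decides."""
    _check(x, y)
    return tuple(x) >= tuple(y)  # Python compares tuples lexicographically


def dom_geq(x, y):
    """Dominance ordering: x is no less than y along every component."""
    _check(x, y)
    return all(a >= b for a, b in zip(x, y))


x, y = (1, 0), (0, 5)
print(add(x, y))                      # (1, 5)
print(lex_geq(x, y), lex_geq(y, x))   # True False
print(dom_geq(x, y), dom_geq(y, x))   # False False
```

The last line shows why the dominance ordering is only partial: neither pair is at least as great as the other. The lexical ordering, by contrast, always ranks two sequences of the same length.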
Lexical spaces and dominance spaces are examples of partially ordered Abelian groups, which are general structures that share many of the real numbers’ desirable mathematical properties while possibly lacking some of their overly restrictive features (see §2.5.3). Instead of restricting ourselves to the real numbers, we can allow value to be represented using any partially ordered Abelian group.

Call a reflexive and transitive ordering ≽ on the set of worlds an axiology, where intuitively A ≽ B means that world A is at least as good as world B. Throughout, we’ve also assumed background facts about which episodes are equally good, for instance in the notion of equivalent distributions. Let’s call a function V from worlds and episodes to a partially ordered Abelian group a representation of an axiology if it assigns better worlds greater values and vice versa, that is, for all worlds A and B:

A ≽ B if and only if V(A) ⩾ V(B),

and it assigns equally good episodes equal value. A representation V is additive if whenever world A has a distribution that contains episodes a1, . . . , an, then:

V(A) = V(a1) + · · · + V(an).

An additive axiology is one that has an additive representation. Additivism is the class of additive axiologies.

The main formal result of this paper is that given our framework and its background assumptions, Additivism is equivalent to the two axioms defended in the previous sections:

Theorem. Given Replacement, Distributionism, and Anonymity, an axiology satisfies Time-Shift Invariance and Time-Partition Dominance if and only if it is additive.

The proof is a little involved and is left for the Appendix. But the gist of it isn’t difficult to grasp. To illustrate, consider worlds with distributions below, where the a’s are equally good (i.e. a, a′, and a* are equally good), as are the b’s and the c’s:

[Figure: worlds A (episodes a, b, c), B (episodes a′, b′, c′), and C (episodes a*, b*, c*).]

Time-Shift Invariance implies that A and B are equally good.
And Extension-Creation Neutrality (which follows from Time-Partition Dominance) implies that B and C are equally good. Together, they imply that A, a two-person world, is exactly as good as C, a one-person world.

This generalises. Time-Shift Invariance allows us to stack multiple lives up back-to-back in time. Extension-Creation Neutrality then allows us to agglomerate those lives into one long life. In this way, every world can be “collapsed” into an equally good solitary world, that is, a world in which at most one person ever lives. For solitary worlds, the question of interpersonal aggregation is trivial, since there is at most one person. So, the question of how to aggregate across time and people reduces to the question of how to aggregate across just time.

To derive Additivism, all that remains is to show that the two axioms imply that timewise aggregation must be additive. This is proved in the Appendix. (The strategy is to use equivalence classes of certain kinds of distributions to construct a structure with an addition operation and an ordering. The resulting structure can be used to represent the value of worlds and episodes. The two axioms guarantee that the constructed structure has certain properties that make it a partially ordered Abelian group.)

The remainder of this section is devoted to unpacking the exact content of Additivism. Doing so will help clarify the upshot of the theorem and defuse some possible worries.

2.5.1 Uniqueness of Representation

First, recall that an additive axiology is one with an additive representation. This doesn’t preclude an additive axiology from also having non-additive representations. To illustrate, suppose V is a real-valued representation for an axiology that takes only non-negative values. Then, so is the function $V^2$ which assigns to each world and episode the square of its value on V, since, for any non-negative real numbers x and y, x is greater than y just in case $x^2$ is greater than $y^2$.
But $V^2$ needn’t be additive even if V is, since $x + y = z$ doesn’t generally imply that $x^2 + y^2 = z^2$. In that case, $V^2$ would be a non-additive representation of an additive axiology.

The non-uniqueness of a representation makes identifying an additive axiology less straightforward than one might expect. In the standard framework that assumes numbers from the outset, axiologies are often introduced via functional equations. For instance, Lifetime Averagism can be described as the ordering of worlds obtained as follows. First, we sum up the value of the episodes in a world’s distribution. Then, we average that sum by the total number of people who ever exist. Better worlds are those with greater averages.

But this is just one method of arriving at which worlds Lifetime Averagism deems to be better or worse. What’s to say that the same ranking of worlds doesn’t also have an additive representation? To see why not, consider three worlds with the distributions below:

[Figure: distributions of worlds A, B, and C. A contains a single life consisting of episode a, B contains a single life consisting of episode b, and C contains both of these lives.]

A and B each contains only one person whose life consists of a single episode, a in one case and b in the other. Assume that these are equally good episodes. And C contains both of the lives in A and B. According to Lifetime Averagism, C must be exactly as good as A and B.

Now, suppose V is a representation of Lifetime Averagism, in which case $V(A) = V(B) = V(C)$. If V were an additive representation, then it must be that $V(A) + V(B) = V(C)$. It’s familiar in the case of real numbers that $x + x = x$ implies that $x = 0$. This is also a feature of any partially ordered Abelian group more generally. So, $V(A) = V(B) = V(C)$ and $V(A) + V(B) = V(C)$ imply that $V(A) = V(B) = V(C) = 0$. So, Lifetime Averagism is additive only in the trivial case where every world is exactly as good as the empty world in which no one exists.

Or, consider Lifetime Egalitarianism. How do we know that it doesn’t have an additive representation?
To see that it doesn’t, consider worlds like the ones below:

[Figure: distributions of worlds A and B. In each, one person has episode a and another has episode b before time t, and episode c occurs after t. In A, c is had by the person who had a; in B, c is had by the person who had b.]

Suppose that c is a positive episode (that is, a world whose distribution is equivalent to one that contains only c is better than the empty world). For Lifetime Egalitarians, it’s better for c to be had by whoever had the worse life prior to t. So, if a is better than b, then B is better than A. But it’s clear that can’t be the case on an additive axiology, since on any additive representation, $V(A) = V(B) = V(a) + V(b) + V(c)$. So, Lifetime Egalitarianism isn’t additive. More generally, any two worlds with distributions that have exactly the same episodes must be equally good according to an additive axiology. So, Additivism rules out axiologies, like Egalitarianism, that are sensitive to pattern goods.

So, while Additivism simply requires that the true axiology have some additive representation and not that every representation of it be additive, this is a genuine constraint that rules out axiologies that are standardly thought to be non-additive. At the same time, Additivism allows for more than just the kinds of view usually associated with labels like ‘Total Utilitarianism’ or ‘Classical Utilitarianism’. The ways in which Additivism is more general open up avenues for resisting some standard worries about an additive picture of value.

2.5.2 Cardinal Comparisons

Total Utilitarianism is often criticised for allowing quantity to trump quality, as in:

Repugnant Conclusion (Parfit, 1984). Suppose that a lifetime of listening to Mozart (a) is a thousand times better than a lifetime of listening to Muzak (b), a life that, let’s suppose, would be barely worth living. Consider a world, Quality, in which a hundred people spend their lives listening to Mozart. Compare that to a world, Quantity, in which a million people spend their lives listening to Muzak. Intuitively, Quality is better than Quantity.

[Figure: Quality, in which 100 people each live a; Quantity, in which 1,000,000 people each live b.]

The problem for an additive theory is supposed to be that it implies otherwise. The thought is that if the value of b is $x$, then the value of a, which is a thousand times as good, must be $1000x$. But adding $1000x$ a hundred times over still amounts to less than adding $x$ a million times over.

But what exactly does it mean for a lifetime of listening to Mozart (henceforth, a ‘Mozart-life’) to be a thousand times better than a lifetime of listening to Muzak (henceforth, a ‘Muzak-life’)? There are three salient possibilities, none of which results in an interpretation of the Repugnant Conclusion that poses a problem for Additivism.

One possibility is that cardinal facts are derived from how we make interpersonal trade-offs:

A Mozart-life is 1,000 times better$_{\text{personwise}}$ than a Muzak-life = A world with one Mozart-life is exactly as good as a world with a thousand Muzak-lives.

On this picture, cardinal comparisons simply encode facts about how many lives of a worse kind it takes to make the world as good as one with a single better life. This renders the Repugnant Conclusion almost trivial. Take the following ‘replication invariance’ principle: if a world with one Mozart-life is exactly as good as a world with 1,000 Muzak-lives, then a world with $N$ many Mozart-lives is exactly as good as a world with $1{,}000 \times N$ many Muzak-lives, for any natural number $N$. This principle is highly plausible and is satisfied by most axiologies.[15] Given the principle, the “Repugnant” Conclusion follows from the very definition of what it means for one life to be a thousand times better.

[15] See Blackorby et al. (2005, ch. 4).

A second possibility is that cardinal facts are derived from some independent dimension, like the strength of a pleasure or desire, or trade-offs under risk. Perhaps there’s a natural cardinal scale for measuring the strength of pleasure (by the concentration of dopamine, say). A hedonist might then define:

A Mozart-life is 1,000 times better$_{\text{strengthwise}}$ than a Muzak-life = The pleasure contained in a Mozart-life is 1,000 times as strong as that contained in a Muzak-life.

Or, cardinal comparisons could track considerations of risk:

A Mozart-life is 1,000 times better$_{\text{riskwise}}$ than a Muzak-life = A one-in-a-thousand chance of a Mozart-life is exactly as good as a Muzak-life for sure.

Neither of these interpretations renders the Repugnant Conclusion trivial. However, on these interpretations, Additivism doesn’t imply that Quantity is better than Quality. Recall that an additive axiology is simply one with an additive representation, on which value can be thought of as added up over time and people. Nothing in this definition requires that the values assigned by an additive representation also be the ones that track hedonic strength or risk management. Absent any further argument, there’s no requirement in Additivism that the personwise, strengthwise, and riskwise interpretations above coincide.

So, as it stands, either cardinal comparisons arise from interpersonal trade-offs, in which case the “Repugnant” Conclusion is trivial and unobjectionable. Or, they arise from some independent scale, in which case the Repugnant Conclusion isn’t actually a consequence of Additivism.

But there’s a third possibility. Cardinal facts could be understood in terms of how value is aggregated across time:

A Mozart-life is 1,000 times better$_{\text{timewise}}$ than a Muzak-life = A Mozart-life is exactly as good as a life that consists of 1,000 episodes (each equivalent to a Muzak-life) back-to-back.

Additivism does require a parity between the dimensions of people and time. For Additivists, the personwise and timewise definitions of cardinal comparisons must coincide. So, this interpretation gives rise to a version of the Repugnant Conclusion that is both non-trivial and also a genuine implication of Additivism.
But the interpretation fails to retain the intuitive force of the Repugnant Conclusion. Most of those averse to the Repugnant Conclusion would be equally troubled by its temporal analogue:

Temporal Repugnant Conclusion. Consider a world, Quality*, which consists of a single life spent listening to Mozart. Compare that to a world, Quantity*, which consists of a single life, that’s a thousand times as long, made up of a thousand episodes back-to-back, each exactly as good as listening to Muzak. Intuitively, Quality* is better than Quantity*.

[Figure: Quality*, a single life consisting of a; Quantity*, a single life consisting of episodes $b_1, b_2, \dots, b_{1000}$ back-to-back.]

Indeed, cases like these have also been used to argue against forms of Total Utilitarianism.[16] But to accept that Quality* is better than Quantity* is precisely to reject the supposition that a Mozart-life is 1,000 times better$_{\text{timewise}}$ than a Muzak-life. So, this version of the Repugnant Conclusion on which cardinal comparisons are interpreted temporally would have dialectical force against Additivism only for those who are troubled by the interpersonal Repugnant Conclusion but somehow untroubled by its temporal analogue, which is an unlikely combination of intuitions.

So far, we’ve focused on a very specific form of the Repugnant Conclusion. There might be a worry that clarifications on how cardinal comparisons are understood can only go so far. Some might retort that it’s immaterial whether a Mozart-life is a thousand, or a million, or a trillion times better than a Muzak-life. What’s supposed to be objectionable is the implication that there’s some sufficiently large number of Muzak-lives that would be as good as a hundred Mozart-lives. Call this the ‘generalised Repugnant Conclusion’.

This generalised Repugnant Conclusion is supposed to be an implication of Additivism no matter how cardinal claims are interpreted.
And that’s because if a Mozart-life has a value of $x$ (however large) and a Muzak-life has a value of $y$ (however small), there’s bound to be some large enough positive integer $N$ such that $N \times y$ is greater than $x$. This is indeed the case if $x$ and $y$ are real numbers: it follows from what’s called the ‘Archimedean property’ of the real numbers. But recall that we allow for representations to take value in more general structures, partially ordered Abelian groups, which can lack this property. This leads us to yet another respect in which Additivism is more general than Total Utilitarianism standardly conceived: the possibility of non-real valued representations.

[16] See Crisp (1997, 24).

2.5.3 Partially Ordered Abelian Groups

Recall the lexical spaces introduced earlier, in which sequences of real numbers are compared by comparing the first components, moving on to the second only if there’s a tie in the first, moving on to the third only if there’s a tie in the first two components, and so on. These spaces lack the Archimedean property. For instance, assign a Mozart-life a value of $(1, 0)$ and a Muzak-life $(0, 1)$. Then, a Mozart-life is “infinitely” better in the following sense. Any arbitrarily large finite number $N$ of Muzak-lives would still only add up to a value of $(0, N)$. That is less than $(1, 0)$ on the lexical ordering, since 0 is less than 1 along the first components. Since an additive representation can assign values like the ones above, the generalised Repugnant Conclusion isn’t forced upon an Additivist.

While lexical spaces relax the Archimedean property of the real numbers, the dominance spaces also introduced earlier relax its completeness. On the standard ordering of the real numbers, any two real numbers are comparable: either one must be at least as great as the other.
But some values might be incommensurable.[17] Perhaps a Mozart-life and a Monet-life are so different that neither is better than the other but neither are they equally good: they’re just incomparable. Such incomparability can be accommodated using dominance spaces, on which one sequence of real numbers is at least as great as another just in case it’s at least as great along all components. For instance, valuing a Mozart-life at $(1, 0)$ and a Monet-life at $(0, 1)$ would make them incomparable according to the dominance ordering, since neither is at least as good as the other along both components of the pairs. Intuitively, each component would represent the magnitude of value along some aspect, like musical vs. visual edification.

[17] See Chang (2002), though see Dorr et al. (forthcoming).

It’s possible for an axiology to have both things that are infinitely better than others as well as things that are incomparable. Such axiologies might be represented using a combination of lexical and dominance spaces. For instance, instead of pairs, consider 2-by-2 matrices like:

$$x = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \qquad y = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

These matrices can be added entry-wise. So, for instance, $x + y$ would be a 2-by-2 matrix with 1 everywhere except for a 0 on the bottom right. To compare matrices like the ones above, we first compare them row-wise using the lexical ordering. For instance, $x$ is better than $y$ along the first row (since $(1, 0)$ is better than $(0, 1)$ on the lexical ordering) but worse along the second row (since $(0, 0)$ is worse than $(1, 0)$). Then, we compare them using the dominance ordering, where one matrix is at least as good as the other if it’s at least as good along every row. In the case above, $x$ is at least as good as $y$ only along the first row. So, $x$ is not at least as good as $y$. And neither is $y$ at least as good as $x$. So, they are incomparable.
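The two examples in this subsection can be checked mechanically. The following is a minimal Python sketch under stated assumptions: matrices are modelled as tuples of rows, and the helper names `mat_add`, `mat_geq`, and `comparable` are hypothetical, introduced only for illustration.

```python
from typing import Tuple

Row = Tuple[float, ...]
Matrix = Tuple[Row, ...]


def mat_add(x: Matrix, y: Matrix) -> Matrix:
    """Entry-wise addition of two matrices of the same shape."""
    return tuple(tuple(a + b for a, b in zip(rx, ry)) for rx, ry in zip(x, y))


def mat_geq(x: Matrix, y: Matrix) -> bool:
    """Rows are compared lexically; the matrices are then compared by dominance."""
    # Python tuples compare lexicographically, so rx >= ry is the lexical test.
    return all(rx >= ry for rx, ry in zip(x, y))


def comparable(x: Matrix, y: Matrix) -> bool:
    return mat_geq(x, y) or mat_geq(y, x)


# The matrices x and y from the text.
x = ((1, 0), (0, 0))
y = ((0, 1), (1, 0))

print(mat_add(x, y))     # ((1, 1), (1, 0)): 1 everywhere except the bottom right
print(comparable(x, y))  # False: x wins on the first row, y on the second

# Failure of the Archimedean property in a lexical space:
# N Muzak-lives are worth (0, N), which stays below one Mozart-life at (1, 0).
mozart = (1, 0)
print(all((0, n) < mozart for n in (1, 1000, 10**9)))  # True
```

Only a few sample values of N are tested in the last line; the general claim follows from the definition of the lexical ordering, not from the check.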
These combinations of lexical and dominance spaces exhaust the partially ordered Abelian groups.[18] So, we can think of representations of axiologies as assigning values that are matrices like the ones above, summed up and compared in the way just described. This means that the cumulative effect of allowing representations to take values in arbitrary partially ordered Abelian groups, rather than just the real numbers, has a simple description. It allows for axiologies that are possibly lexical (on which some worlds are infinitely better than others) and possibly incomplete (on which some worlds might be incomparable).

2.6 Conclusion

Value can be distributed not just across people but also across time. These dimensions aren’t independent: how value is aggregated across time greatly constrains how it must be aggregated across people. In particular, Additivism follows from two compelling principles about time. The first, Time-Shift Invariance, says that simply shifting lives in time shouldn’t change how good things are. The second, Time-Partition Dominance, says roughly that things can’t be better unless they’re better within some period or another.

Undergirding these principles, and some arguments for them, are various replacement principles. Under certain conditions (like when people are too far apart in time or in space, or when a time period is empty of value, or when agents face particular choices) some sources of value might be impossible to instantiate. Replacement principles are principles to the effect that those sources can be made up for with other sources of value that are equally good.

These principles played a prominent role even in formulating principles like Time-Shift Invariance and Time-Partition Dominance. The notion of a “pure” time-shift assumes that we can change when a life is lived while holding the value of its segments fixed.
Similarly, we compare worlds within some time period by ignoring all value outside of that period while holding the values within it fixed. The possibility of holding value fixed in these ways requires being able to compensate any “lost” sources of value with equivalent ones. Arguments for both Time-Shift Invariance and Time-Partition Dominance also relied heavily on replacement principles, like Spacetime Replacement in the argument from relativity (§2.3.2) and Choice Replacement in the argument from decision-making (§2.4.1).

[18] The theorem that underlies this fact is a generalisation of the Hahn Embedding Theorem (see Hausner & Wendel (1952), Conrad (1953), Clifford (1954)).

Since replacement principles lay the groundwork for a strong case in favour of Additivism, I think the non-Additivist’s best prospect is to reject them. Isolating the issue of replacement as the core of the disagreement about the additivity of value has major methodological ramifications for how we conceptualise and engage in the enterprise of population ethics.

First, two kinds of questions in axiology are often entertained separately. On the one hand are substantive questions about the nature and content of value: what things make for value and when is one such thing better than another? On the other are structural questions to do with aggregation: how do individual value (whatever that may consist in) and the pattern in which it is distributed combine to determine overall value? Much of population axiology takes itself to be concerned with structural questions, while attempting to remain as neutral as possible about substantive ones.

But if the choice is indeed between accepting an additive picture of value and rejecting replacement principles, then the structural and substantive questions aren’t so easily separated. The plausibility of replacement principles turns on substantive questions about exactly what value consists in.
For instance, are there substitutes for friendship under conditions that make friendship impossible, like when people are too far apart in time or in space? An affirmative answer is much more plausible on a theory like hedonism than on one that places overwhelming importance on our social relationships. It’s no accident then that, sociologically, certain substantive theories of value (like hedonism) tend to incline ethicists towards certain structural views (like an additive picture of value). Identifying the rejection of replacement principles as the most promising basis for a non-additive view partly demystifies this sociological fact.

A second methodological upshot concerns a reformation of the standard framework of population ethics. The use of numbers obfuscates many important foundational issues and must thus be approached with care. Mathematical representations of value can be justified, for instance via a representation theorem like the one proved in the Appendix. Even then, though, subtle issues of interpretation remain, as we saw in §2.5.

But non-Additivists whose best course of action is to reject replacement principles should be especially wary of the use of numbers. A value of ‘10’ could stand for many equally good things: an adrenaline rush, a feeling of peace, an intellectual achievement, a bonding moment. Some of these ‘10’s might be possible to instantiate at some time but not others, for some people but not others, at some spatial locations but not others, in conjunction with some other sources of value but not others, and so on. The use of quantitative surrogates that conflate all these different qualitative sources of value strongly suggests a picture of value in which replacement principles hold. In the absence of such principles, we must carefully specify and keep track of exactly what each instance of ‘10’ in a distribution stands for.
For there’s no guarantee that there will always be some source that makes for that amount of value. In light of this, those who reject replacement principles do best by abandoning the standard numerical framework of population ethics altogether.

2.7 Appendix

The goal of this Appendix is to show that given the background assumptions, Time-Shift Invariance and Time-Partition Dominance entail Additivism. We already saw, roughly, how given the two axioms, every world is equivalent to some world with a solitary distribution, in which at most one person ever lives. In a nutshell, the proof strategy will be to construct a partially ordered Abelian group whose members are equivalence classes of solitary distributions, whose addition operation is the concatenation of solitary distributions in time, and whose ordering is the ordering of solitary distributions according to the axiology. The axioms and assumptions of the framework guarantee that the algebraic structure and ordering structure, defined this way, have the desired properties.

2.7.1 The Framework

Let $E$ be a set of episodes, including an empty episode 0. And let $E^* = E \setminus \{0\}$. Let $N$ be an index set for possible individuals. A distribution is a set $\{\alpha_n : \mathbb{R} \to E\}_{n \in N}$ of functions such that the union of their images $\bigcup_{n \in N} \alpha_n(\mathbb{R})$ is a finite subset $\{a_1, \dots, a_k\}$ of $E$ and for each $a \in \{a_1, \dots, a_k\}$ where $a \neq 0$, there is a finite subset $K \subset N$ such that the preimage $(\alpha_n)^{-1}(a)$ is a non-empty interval of $\mathbb{R}$ for $n \in K$ and is empty for $n \notin K$. Intuitively, $a_1, \dots, a_k$ are the episodes contained in the distribution and $(\alpha_n)^{-1}(a)$ is the interval of time for which episode $a$ is had by individual $n$.

Let $D$ be the set of all distributions. Formally, we can identify a world with the set of all distributions that describe it. So, the set $W$ of all worlds is some subset of the powerset of $D$. And $\alpha \in A$ means that world $A$ can be described by distribution $\alpha$.
An axiology $\succcurlyeq$ is a preorder on $W$. Intuitively, $A \succcurlyeq B$ means that world $A$ is at least as good as world $B$. We say that $A$ and $B$ are equally good ($A \sim B$) when both $A \succcurlyeq B$ and $B \succcurlyeq A$. And we say that $A$ is strictly better than $B$ ($A \succ B$) when $A \succcurlyeq B$ but not $B \succcurlyeq A$.

An axiology induces a preorder on the set of episodes. Intuitively, some worlds are singleton worlds, whose distributions are such that exactly one episode is ever instantiated. So, for each episode $a$, we can find a singleton world $A$ whose distribution instantiates only $a$. We can then define episodes $a$ and $b$ to be equally good ($a \equiv b$) when the singleton world $A$ associated with $a$ and the singleton world $B$ associated with $b$ are equally good.[19] Two distributions $\alpha, \beta$ are then said to be equivalent (written ‘$\alpha \cong \beta$’) if for each $n \in N$ and $a \in A \subseteq E$ such that $\bigcup_{a \in A} (\alpha_n)^{-1}(a) = \mathbb{R}$, there exists $b \in E$ where $(\alpha_n)^{-1}(a) = (\beta_n)^{-1}(b)$ and $a \equiv b$. Informally, distributions are equivalent if they have the same pattern but with episodes possibly replaced by equally good ones.

The basic assumptions of our framework can then be stated as follows:

Replacement. For any $\alpha \in D$, there exist $\beta \in D$ and $B \in W$ such that $\alpha \cong \beta$ and $\beta \in B$.

Distributionism. For any $A, B \in W$ and $\alpha, \beta \in D$, if $\alpha \in A$ and $\beta \in B$ and $\alpha \cong \beta$, then $A \sim B$.

Anonymity. For any $\alpha \in D$ and bijection $\pi : N \to N$, if $\alpha = \{\alpha_n\} \in A$ and $\alpha_\pi = \{\alpha_{\pi(n)}\} \in B$, then $A \sim B$.

Given Replacement and Distributionism, an axiology $\succcurlyeq$ induces a ranking on distributions as follows (which we can also represent with ‘$\succcurlyeq$’ without any risk of confusion). For any $\alpha, \beta \in D$, let $\alpha \succcurlyeq \beta$ if and only if there exist $\alpha' \cong \alpha$ and $\beta' \cong \beta$ with $\alpha' \in A$ and $\beta' \in B$ such that $A \succcurlyeq B$. The ordering of distributions, defined this way, can easily be checked to be a well-defined preorder given Replacement and Distributionism.

[19] Of course, there are many possible choices of singleton worlds for each episode but given our axioms later, the choice will turn out not to matter.
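The informal gloss on equivalent distributions (same pattern, with episodes possibly swapped for equally good ones) can be sketched in Python. This is only an illustration under simplifying assumptions that are not part of the formal framework: a distribution is a dictionary from individuals to lists of `(start, end, episode)` triples, the empty episode is left implicit, and the relation of equal goodness between episodes is supplied by a hypothetical lookup table `value_class`.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float, str]        # (start, end, episode name)
Distribution = Dict[int, List[Interval]]   # individual index -> their episodes


def equivalent(alpha: Distribution, beta: Distribution,
               value_class: Dict[str, str]) -> bool:
    """True if alpha and beta have the same pattern of intervals, person by
    person, with each episode matched by an equally good one."""
    for n in set(alpha) | set(beta):
        life_a = sorted(alpha.get(n, []))
        life_b = sorted(beta.get(n, []))
        if len(life_a) != len(life_b):
            return False
        for (s1, e1, a), (s2, e2, b) in zip(life_a, life_b):
            same_interval = (s1, e1) == (s2, e2)
            equally_good = value_class[a] == value_class[b]
            if not (same_interval and equally_good):
                return False
    return True


# Hypothetical data: episodes "a" and "a2" are stipulated to be equally good.
value_class = {"a": "A", "a2": "A", "b": "B"}
alpha = {0: [(0, 1, "a")], 1: [(0, 1, "b")]}
beta = {0: [(0, 1, "a2")], 1: [(0, 1, "b")]}
swapped = {0: [(0, 1, "b")], 1: [(0, 1, "a")]}

print(equivalent(alpha, beta, value_class))     # True
print(equivalent(alpha, swapped, value_class))  # False
```

In the second comparison the two distributions differ by a permutation of individuals, so they are not equivalent in this sense, although Anonymity would still rank the corresponding worlds as equally good.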
2.7.2 Time-Shift Invariance

We now have the requisite vocabulary to state our first axiom precisely. Call a function $f : \mathbb{R} \to \mathbb{R}$ a translation function if it is of the form $f : t \mapsto t + x$ for some translation constant $x \in \mathbb{R}$. Say that distributions $\alpha, \beta \in D$ differ only by a time-shift if for each $n \in N$, there exists a translation function $f_n : \mathbb{R} \to \mathbb{R}$ such that $\alpha_n = f_n \circ \beta_n$ (i.e. $\alpha_n$ is the result of applying $\beta_n$ followed by $f_n$). Two worlds $A, B$ are then said to differ only by a time-shift if there exist distributions $\alpha, \alpha', \beta$ with $\alpha \in A$ and $\beta \in B$ such that $\alpha$ and $\alpha'$ differ only by a time-shift and $\alpha' \cong \beta$. Then, according to the first axiom:

Time-Shift Invariance. For any $A, B \in W$, if $A$ and $B$ differ only by a time-shift, then $A \sim B$.

It follows that any two distributions that differ only by a time-shift must be equally good. For suppose $\alpha, \beta \in D$ differ only by a time-shift. Given Replacement, there exist $\alpha', \beta' \in D$ such that $\alpha' \in A$ and $\beta' \in B$ for some $A, B \in W$. By definition, $A$ and $B$ differ only by a time-shift and so by Time-Shift Invariance, $A \sim B$. And so, by the definition of the preorder on distributions, it follows that $\alpha \sim \beta$.

2.7.3 Time-Partition Dominance

For the second axiom, let us first define a binary concatenation operation $\oplus_t$ on functions $\alpha_n, \beta_n : \mathbb{R} \to E$ as follows. For each $x \in \mathbb{R}$:

$$(\alpha_n \oplus_t \beta_n)(x) = \begin{cases} \alpha_n(x) & \text{if } x \leq t; \\ \beta_n(x) & \text{if } x > t. \end{cases}$$

We can then define the distribution that’s a concatenation of distributions at $t$ by letting $\alpha \oplus_t \beta = \{\alpha_n \oplus_t \beta_n\}_{n \in N}$. Intuitively, this is the spliced distribution that looks like $\alpha$ before time $t$ and $\beta$ after $t$.

Let $\emptyset = \{\emptyset_n\}_{n \in N}$ be the empty distribution, where $\emptyset_n(t) = 0$ for all $n \in N$ and $t \in \mathbb{R}$. We call $t \in \mathbb{R}$ a partition for a distribution $\alpha$ if for each $n \in N$ and $a \in E \setminus \{0\}$, either $\min\{(\alpha_n)^{-1}(a)\} \geq t$ or $\max\{(\alpha_n)^{-1}(a)\} \leq t$. In other words, no one enjoys a non-empty episode whose interval spans a time that includes $t$.
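The splicing operation and the notion of a partition can be sketched as follows. This is a simplified Python model, not the formal definition: a distribution is a dictionary from individuals to lists of `(start, end, episode)` triples, the empty episode is whatever is left uncovered, and questions about open versus closed endpoints are ignored. The names `is_partition` and `splice` are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float, str]        # (start, end, episode name)
Distribution = Dict[int, List[Interval]]   # individual index -> their episodes


def is_partition(alpha: Distribution, t: float) -> bool:
    """t is a partition if no non-empty episode straddles t."""
    return all(start >= t or end <= t
               for life in alpha.values()
               for (start, end, _) in life)


def splice(alpha: Distribution, beta: Distribution, t: float) -> Distribution:
    """Sketch of the concatenation at t: alpha up to t, beta after t."""
    result: Distribution = {}
    for n in set(alpha) | set(beta):
        # Keep the part of each alpha-episode lying before t ...
        before = [(s, min(e, t), a) for (s, e, a) in alpha.get(n, []) if s < t]
        # ... and the part of each beta-episode lying after t.
        after = [(max(s, t), e, b) for (s, e, b) in beta.get(n, []) if e > t]
        result[n] = before + after
    return result


alpha = {0: [(0, 1, "a1"), (1, 2, "a2")]}
beta = {0: [(0, 1, "b1"), (1, 2, "b2")]}
empty: Distribution = {}

print(is_partition(alpha, 1))    # True: a1 ends and a2 starts at t = 1
print(is_partition(alpha, 0.5))  # False: a1 spans t = 0.5
print(splice(alpha, beta, 1))    # {0: [(0, 1, 'a1'), (1, 2, 'b2')]}
print(splice(alpha, empty, 1))   # {0: [(0, 1, 'a1')]}
```

Splicing with the empty distribution, as in the last line, is how the comparisons before and after a time in the next definitions isolate one side of a partition.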
We say that world $A$ is at least as good as $B$ before time $t$ ($A \succcurlyeq_{<t} B$) if there exist $\alpha \in A$ and $\beta \in B$ such that:

(i) $t \in \mathbb{R}$ is a common partition for $\alpha, \beta$; and
(ii) for all $\alpha' \cong (\alpha \oplus_t \emptyset)$ and $\beta' \cong (\beta \oplus_t \emptyset)$, if $\alpha' \in A'$ and $\beta' \in B'$, then $A' \succcurlyeq B'$.

Similarly, $A$ is at least as good as $B$ after time $t$ ($A \succcurlyeq_{>t} B$) if there exist $\alpha \in A$ and $\beta \in B$ such that:

(i) $t \in \mathbb{R}$ is a common partition for $\alpha, \beta$; and
(ii) for all $\alpha' \cong (\emptyset \oplus_t \alpha)$ and $\beta' \cong (\emptyset \oplus_t \beta)$, if $\alpha' \in A'$ and $\beta' \in B'$, then $A' \succcurlyeq B'$.

The second axiom then says that:

Time-Partition Dominance. For all $A, B \in W$ and any $t \in \mathbb{R}$:

(i) if $A \succcurlyeq_{>t} B$ and $A \succcurlyeq_{<t} B$, then $A \succcurlyeq B$; and
(ii) if, furthermore, either $A \succ_{>t} B$ or $A \succ_{<t} B$, then $A \succ B$.

It follows that for any common partition $t \in \mathbb{R}$ of distributions $\alpha, \beta \in D$:

1. if $(\alpha \oplus_t \emptyset) \succcurlyeq (\beta \oplus_t \emptyset)$ and $(\emptyset \oplus_t \alpha) \succcurlyeq (\emptyset \oplus_t \beta)$, then $\alpha \succcurlyeq \beta$; and
2. if, furthermore, either $(\alpha \oplus_t \emptyset) \succ (\beta \oplus_t \emptyset)$ or $(\emptyset \oplus_t \alpha) \succ (\emptyset \oplus_t \beta)$, then $\alpha \succ \beta$.

2.7.4 The Result

A valuation $V : W \to G$ is a function from the set of worlds to a partially ordered Abelian group $G = \langle G, +, \geq \rangle$. A valuation $V : W \to G$ represents an axiology $\succcurlyeq$ if for all $A, B \in W$:

$$A \succcurlyeq B \iff V(A) \geq V(B).$$

As already mentioned, we can conflate each episode and some singleton world whose distribution contains only that episode. Identifying each episode $a$ with an arbitrary singleton world $A$ in this way, we can abuse notation slightly and let $V(a) = V(A)$. An axiology $\langle W, \succcurlyeq \rangle$ is then said to be additive if there exists a valuation $V : W \to G$ representing it such that for all $A \in W$, if $\alpha \in A$ and the episodes contained in the distribution are $\bigcup_{n \in N} \alpha_n(\mathbb{R}) = \{a_1, \dots, a_k\}$, then:

$$V(A) = V(a_1) + \cdots + V(a_k).$$

We want to show:

Theorem. Given Replacement, Distributionism, and Anonymity, an axiology satisfies Time-Shift Invariance and Time-Partition Dominance if and only if it is additive.

The right-to-left direction is easy to check.
For the converse direction, fix an axiology $\langle W, \succcurlyeq \rangle$ and suppose it satisfies Time-Shift Invariance and Time-Partition Dominance. Our proof strategy will be, first, to construct a partially ordered commutative monoid $M = \langle M, +, \geq \rangle$ from equivalence classes of distributions containing at most one person and, then, to show that the constructed monoid has the properties required to be embedded into a partially ordered Abelian group.

Step 1. The monoid set $M$

Let $D_0 \subset D$ be the subset of solitary distributions in which no one besides possibly person 0 enjoys any non-empty episodes. That is, $\alpha \in D_0$ just in case for all $n \neq 0$, $\alpha_n(t) = 0$ for all $t \in \mathbb{R}$. For any $\alpha \in D_0$, let $[\alpha] = \{\beta \in D_0 : \alpha \sim \beta\}$ be the equivalence class of distributions in $D_0$ that are exactly as good as $\alpha$. We define the underlying set of the monoid to be $M = \{[\alpha] : \alpha \in D_0\}$.

Step 2. The monoid operation $+$

Let $\alpha, \beta \in D_0$. First, we define $\overleftarrow{\alpha}$ to be the time-shifted distribution where the final non-empty episode in $\alpha$ ends at time 0, i.e. $\overleftarrow{\alpha} = f_\alpha \circ \alpha$, where $f_\alpha : r \mapsto r - \max\{\alpha_0^{-1}[E^*]\}$. Similarly, $\overrightarrow{\beta}$ is the time-shifted distribution where the first non-empty episode in $\beta$ starts at 0, i.e. $\overrightarrow{\beta} = g_\beta \circ \beta$, where $g_\beta : r \mapsto r - \min\{\beta_0^{-1}[E^*]\}$. Now, we define the concatenation operation $\odot$ by letting $\alpha \odot \beta = \overleftarrow{\alpha} \oplus_0 \overrightarrow{\beta}$. Informally, this is the distribution in which person 0 lives the life they would have in $\alpha$ followed by the life they would have in $\beta$.

Now, we define the monoid operation $+$, where for all $[\alpha], [\beta] \in M$:

$$[\alpha] + [\beta] = [\alpha \odot \beta].$$

First, we need to check that this is well-defined in that it doesn’t depend on the choice of representatives. That is, suppose $\alpha' \in [\alpha]$ and $\beta' \in [\beta]$. We need to show that $[\alpha \odot \beta] = [\alpha' \odot \beta']$. Now, $\alpha$ and $\alpha \odot \emptyset$ differ only by a time-shift and so, by Time-Shift Invariance, $\alpha \sim (\alpha \odot \emptyset)$. Similarly, $\alpha' \sim (\alpha' \odot \emptyset)$. By assumption, $\alpha \sim \alpha'$. So, $(\alpha \odot \emptyset) \sim (\alpha' \odot \emptyset)$. By the same reasoning, $(\emptyset \odot \beta) \sim (\emptyset \odot \beta')$.
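So, by Time-Partition Dominance, $(\alpha \odot \beta) \sim (\alpha' \odot \beta')$, which means that $[\alpha \odot \beta] = [\alpha' \odot \beta']$.

Next, we need to show that the operation, as defined above, satisfies the properties required for it to be the operation of a commutative monoid:

(i) Identity. There is an identity element $[\emptyset] \in M$ such that $[\alpha] + [\emptyset] = [\alpha]$ for all $[\alpha] \in M$. This follows from Time-Shift Invariance, since $(\alpha \odot \emptyset) = \overleftarrow{\alpha}$ differs from $\alpha$ only by a time-shift, and so $[\alpha] + [\emptyset] = [\alpha \odot \emptyset] = [\alpha]$.

(ii) Associativity. For all $[\alpha], [\beta], [\gamma] \in M$, $([\alpha] + [\beta]) + [\gamma] = [\alpha] + ([\beta] + [\gamma])$. Again, this follows from Time-Shift Invariance. It’s easy to see that $(\alpha \odot \beta) \odot \gamma$ and $\alpha \odot (\beta \odot \gamma)$ differ only by a time-shift, and so $([\alpha] + [\beta]) + [\gamma] = [(\alpha \odot \beta) \odot \gamma] = [\alpha \odot (\beta \odot \gamma)] = [\alpha] + ([\beta] + [\gamma])$.

[Figure: $(\alpha \odot \beta) \odot \gamma \sim \alpha \odot (\beta \odot \gamma)$, each a single life with episodes $a_1, a_2, b_1, b_2, c_1, c_2$ in sequence, positioned differently relative to $t = 0$.]

(iii) Commutativity. For all $[\alpha], [\beta] \in M$, $[\alpha] + [\beta] = [\beta] + [\alpha]$. This amounts to showing that $(\alpha \odot \beta) \sim (\beta \odot \alpha)$. Let $\pi : N \to N$ be a bijection such that $\pi(0) \neq 0$. Given Anonymity, $\pi \circ \beta \sim \beta$. So, by Time-Partition Dominance, $(\alpha \odot \beta) \sim (\alpha \odot (\pi \circ \beta))$ and $(\beta \odot \alpha) \sim ((\pi \circ \beta) \odot \alpha)$. By Time-Shift Invariance, $(\alpha \odot (\pi \circ \beta)) \sim ((\pi \circ \beta) \odot \alpha)$. So, $(\alpha \odot \beta) \sim (\beta \odot \alpha)$.

[Figure: the chain $\alpha \odot \beta \sim \alpha \odot (\pi \circ \beta) \sim (\pi \circ \beta) \odot \alpha \sim \beta \odot \alpha$ over episodes $a_1, a_2, b_1, b_2$, with the three equivalences labelled 1, 2, and 3.]

Step 3. The monoid preorder $\geq$

Define $\geq$ on $M$ as follows:

$$[\alpha] \geq [\beta] \text{ iff } \alpha \succcurlyeq \beta, \text{ for all } [\alpha], [\beta] \in M.$$

This is well-defined in that it doesn’t depend on the choice of representatives. For, suppose $\alpha' \in [\alpha]$ and $\beta' \in [\beta]$. Then, $\alpha' \sim \alpha$ and $\beta' \sim \beta$. Since $\succcurlyeq$ is a preorder, this means that $\alpha \succcurlyeq \beta$ if and only if $\alpha' \succcurlyeq \beta'$. It’s also clear that $\geq$, as defined, is a preorder, since $\succcurlyeq$ is.

Finally, we need to check that the monoid preorder is compatible with the monoid operation previously defined in the sense that for all $[\alpha], [\beta], [\gamma] \in M$:

$$[\alpha] \geq [\beta] \text{ iff } [\alpha] + [\gamma] \geq [\beta] + [\gamma].$$

To see this, first note that $\alpha$ and $\alpha \odot \emptyset$ differ only by a time-shift and so, by Time-Shift Invariance, $\alpha \sim (\alpha \odot \emptyset)$.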
Similarly, β ∼ (β ⊙ ∅). So, α ≽ β if and only if (α ⊙ ∅) ≽ (β ⊙ ∅). Furthermore, by reflexivity, (∅ ⊙ γ) ∼ (∅ ⊙ γ). So, by Time-Partition Dominance, (α ⊙ ∅) ≽ (β ⊙ ∅) if and only if (α ⊙ γ) ≽ (β ⊙ γ). Putting all these together:

[α] ≥ [β] ⇐⇒ α ≽ β
⇐⇒ (α ⊙ ∅) ≽ (β ⊙ ∅)
⇐⇒ (α ⊙ γ) ≽ (β ⊙ γ)
⇐⇒ [α ⊙ γ] ≥ [β ⊙ γ]
⇐⇒ [α] + [γ] ≥ [β] + [γ].

This completes our construction of the partially ordered commutative monoid.

Now, our partially ordered commutative monoid can be embedded into a partially ordered Abelian group if it’s cancellative, i.e. for all [α], [β], [γ] ∈ M:

[α] + [γ] = [β] + [γ] ⇒ [α] = [β]

Suppose [α] + [γ] = [β] + [γ]. Then, [α ⊙ γ] = [β ⊙ γ] and so, (α ⊙ γ) ∼ (β ⊙ γ). By Time-Partition Dominance, α ∼ β, which implies that [α] = [β], as desired.

Step 4. The valuation function V

First, note that every distribution is exactly as good as some distribution in D₀. For instance:

[Figure: three distributions containing the episodes a₁, a₂, a₃, a₄, related by equivalences labelled 1 and 2.]

where 1 follows by Time-Shift Invariance and 2 by Time-Partition Dominance and Anonymity. So, we can define V : W → G, where G is a partially ordered Abelian group in which the partially ordered commutative monoid M constructed above embeds, as follows:

V(A) = [α′], where α ∈ A, α ∼ α′, and α′ ∈ D₀.

This is an additive representation for the axiology, since if A contains episodes a₁, …, aₙ:

V(A) = [α] = [α₁ ⊙ ⋯ ⊙ αₙ] = [α₁] + ⋯ + [αₙ] = V(a₁) + ⋯ + V(aₙ),

where each singleton distribution αᵢ contains only episode aᵢ.

Chapter 3

Patternism about Value

The separateness of persons objection against theories like utilitarianism is invoked often but rarely made precise.¹ This paper carefully isolates out one interpretation of the objection. According to Patternism, a mere difference in how value is distributed across people, time, and possibilities can make for a difference in overall value. Anti-Patternism says otherwise.
This paper lays out the issue precisely and offers some considerations in favour of Anti-Patternism.

¹ For instance, Norcross writes: “The [separateness of persons] charge is often made, but rarely explained in any detail, much less argued for” (2009, 76). Woodard calls the objection “very slippery, because it is not at all clear exactly what the mistake is supposed to be” (2019, 25).

3.1 Introduction

Consider the following case:

Tradeoffs. Ann and Ben will be sick for the next two days. A drug can help alleviate their symptoms. Unfortunately, supply of the drug is limited—only one dose per day. Here are three possible plans on how to allocate the doses:²

A1. Give Ann the drug both days.
A2. Give the drug to Ann the first day and to Ben the second day.
A3. Give one person the drug both days but a coin toss decides who.

² Suppose that the choice of plans has no knock-on effects besides the potential alleviation of symptoms. For instance, suppose neither Ann nor Ben knows about the choice that’s faced. So, even if they might perceive some plans to be fairer, neither would feel regret or resentment at certain plans being chosen over others.

             A1                   A2                   A3
        Heads    Tails       Heads    Tails       Heads    Tails
         1/2      1/2         1/2      1/2         1/2      1/2
Ann    (4, 4)   (4, 4)      (4, 0)   (4, 0)      (4, 4)   (0, 0)
Ben    (0, 0)   (0, 0)      (0, 4)   (0, 4)      (0, 0)   (4, 4)
        t1 t2    t1 t2       t1 t2    t1 t2       t1 t2    t1 t2

This choice requires balancing tradeoffs along three dimensions. What’s best for one person, at one time, given one possibility of how things could turn out isn’t always best for another person, at another time, or given another possibility. From this arises the general problem of aggregation: when is one distribution of value across people, time, and possibilities better than another? Call such distributions social prospects. Some prospects, like the three depicted by the diagrams above, contain the same amount of value and differ only in the pattern in which value is distributed.
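As a quick arithmetic check on the claim that the three plans contain the same amount of value, the Tradeoffs prospects can be encoded and compared. This is only an illustrative sketch: the dictionary encoding and the helper names `cells`, `total_expected_value`, and `value_profile` are assumptions introduced here, not part of the text's framework, and the numbers are simply read off the diagrams.

```python
from collections import Counter
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical encoding: state -> (probability, {person: (value at t1, value at t2)}).
A1 = {"Heads": (HALF, {"Ann": (4, 4), "Ben": (0, 0)}),
      "Tails": (HALF, {"Ann": (4, 4), "Ben": (0, 0)})}
A2 = {"Heads": (HALF, {"Ann": (4, 0), "Ben": (0, 4)}),
      "Tails": (HALF, {"Ann": (4, 0), "Ben": (0, 4)})}
A3 = {"Heads": (HALF, {"Ann": (4, 4), "Ben": (0, 0)}),
      "Tails": (HALF, {"Ann": (0, 0), "Ben": (4, 4)})}


def cells(prospect):
    """Yield (probability, value) for every person-time-state cell."""
    for prob, social_outcome in prospect.values():
        for life in social_outcome.values():
            for value in life:
                yield prob, value


def total_expected_value(prospect):
    """Sum values across people and times in each state, weighted by probability."""
    return sum(prob * value for prob, value in cells(prospect))


def value_profile(prospect):
    """Multiset of (probability, value) cells, ignoring where each cell is located."""
    return Counter(cells(prospect))


totals = [total_expected_value(p) for p in (A1, A2, A3)]
same_profile = value_profile(A1) == value_profile(A2) == value_profile(A3)
print(totals, same_profile)
```

On this encoding, the three prospects have equal totals and identical multisets of cells, so any difference between them lies only in which person, time, and state each value is assigned to.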
Towards a solution to the general problem of aggregation, we might first consider an intermediary question: can a mere difference in pattern make for a difference in overall value? The view I’m going to call Patternism says that it can whereas Anti-Patternism says that it can’t. The issue of Patternism, as I’ll frame it, hasn’t thus far been the subject of any thorough, focused investigation. One purpose of this paper is to carefully distinguish the various objections that are often lumped together under the banner of ‘separateness of persons’—and in doing so, carefully isolate out Patternism as an issue worthy of further study.³ Beyond that, this paper also makes the case for Anti-Patternism. A key observation of this paper is that three plausible principles about how the dimensions of people, time, and risk interact jointly rule out Patternism. Any Patternist theory must violate at least one of these three principles. There’s a case to be made for each of these principles, so accepting Patternism comes at some significant cost.

³ Previous work attempting to make precise different versions of the separateness of persons objection includes Brink (2011), Hirose (2013), and Chappell (2015).

The paper proceeds as follows. §3.2 introduces the general idea of Patternism, and §3.3 then outlines a framework in which Patternism can be stated precisely. In probing the case against Patternism, concrete examples of Patternist and Anti-Patternist theories will be helpful. For reasons that will become clear, especially germane for this purpose are various versions of Prioritarianism introduced in §3.4. Then, §3.5 shows how three principles—Temporal Neutrality, Statewise Anonymity, and Timeslice Stochasticism—jointly entail Anti-Patternism. The case for and against each principle is then explored in §3.6. Some further issues, like the relation between Patternism and an additive theory of value, are discussed in §3.7.
Throughout the paper, different versions of the separateness of persons objection are distinguished and their relation to Patternism clarified.

3.2 Patternism, a First Pass

This section introduces the general idea of Patternism. Consider again the case of Tradeoffs above. Were we to sum up the values across people and time in each possibility and then take the expectation, the result would be the same in each case. The total expected value of each plan is 8. Total utilitarians, for whom a prospect is better just in case its total expected value is greater, would thus be indifferent among the three plans. As Rawls observes:

“The striking feature of the utilitarian view. . . is that it does not matter, except indirectly, how this sum of satisfactions is distributed among individuals any more than it matters, except indirectly, how one man distributes his satisfactions over time.” (Rawls, 1971, 26).

Some find this objectionable. The problem is supposed to be that total utilitarians care only about the amount of value and not distributional patterns. If no value is lost in the process, blocks of value can be freely shifted around from one person to another, one time to another, and one possibility to another without thereby changing how good things are overall. This invites criticisms like the following:

“It is as if sentient beings are receptacles of something valuable and it does not matter if a receptacle gets broken, so long as there is another receptacle to which the contents can be transferred without any getting spilt” (Singer, 1993, 121).

“Recipients of good and evil function merely as “vessels” into which value may be poured. The theory [utilitarianism] implies that the value should be poured out in whatever way will yield the greatest total.” (Feldman, 1995a, 582).

We encounter, in these statements, the first of several versions of the separateness of persons objection we’ll consider in this paper—the idea that where value is located matters:

1. Separateness of Persons as Patternism
The pattern in which value is distributed can make a prospect better or worse.

Understood this way, the objection generalises beyond total utilitarianism to a much broader class of views I’m going to call ‘Anti-Patternist’. Before defining Patternism more precisely, let’s start with a rough characterisation and some examples.

Consider the three diagrams in Tradeoffs. They are what we might call permutation equivalent—each can be obtained from the others by shifting around the entries in the cells across people, time, and equiprobable possibilities. Permutation equivalent prospects differ at most in how they distribute value. According to Anti-Patternism, patterns don’t matter—permutation equivalent prospects are always equally good.

Total utilitarianism is a paradigmatic example of an Anti-Patternist theory. Total expected value doesn’t change with a change in distributional pattern. But there are Anti-Patternist theories besides total utilitarianism, some of which are related to views endorsed by the most prominent champions of the separateness of persons objection. Here are some examples:

Example 1. A trivial example is the theory according to which all prospects are equally good. This theory is maximally insensitive—insensitive not just to distributional patterns but to all other distributional features, like how good things are for each person.

Example 2. A less trivial example is what I’m going to call 3D-Maximin. Maximin says to maximise the worst possible value. For instance, Maximin about the interpersonal dimension says to maximise value for the worst-off person. Maximin about risk says to maximise the value of the worst possible outcome. Similarly, Maximin about time says to maximise the value at the worst time. 3D-Maximin can be thought of as the combination of Maximin about all three of those dimensions. On this view, we simply compare prospects by looking at the smallest number.
The prospect whose smallest number is largest among all the prospects is the best. For instance, the smallest number in prospects A1, A2, and A3 above is 0. So, they are all equally good according to 3D-Maximin. More generally, shifting numbers around clearly doesn’t change what the smallest number is. So, 3D-Maximin is indifferent between permutation equivalent prospects—it’s Anti-Patternist.

Example 3. Leximin modifies Maximin in order to avoid some well-known problems. Instead of just comparing the worst outcomes, Leximin compares the second-worst possible outcomes if there’s a tie in the worst, the third-worst if there’s a tie in the worst and second-worst, and so on. It’s easy to see that 3D-Leximin, like 3D-Maximin, is Anti-Patternist.

Example 4. On theories that some have called ‘generalised utilitarian’, we first apply some transformation f : ℝ → ℝ to the numbers before comparing prospects by the total expectation of the transformed values.⁴ There are many possible forms of generalised utilitarianism depending on the choice of transformation. For instance, if f is a strictly increasing concave transformation, like the square-root function, then while the total expectation of prospect A1 above is 4 + 4 + 0 + 0 = 8, the total expectation of its transformed values is $\sqrt{4} + \sqrt{4} + \sqrt{0} + \sqrt{0} = 4$. The theory that maximises total expectation under this transformation, as we’ll see in §3.4, is a form of Prioritarianism. Alternatively, if f is the identity function, then the resulting form of generalised utilitarianism is just total utilitarianism. Or, if f is a function that maps every number to the same number (like f : x ↦ 0), then the resulting form of generalised utilitarianism is the aforementioned trivial theory that deems every prospect equally good. And so on. Generalised utilitarian theories like these are all Anti-Patternist.
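The examples above can be checked numerically on the three Tradeoffs prospects. The following is a minimal sketch under stated assumptions: the list-of-states encoding and the function names `generalised_utilitarian`, `maximin_3d`, and `leximin_3d_key` are hypothetical, and the leximin key simply sorts all cell values, which is adequate only because the states here are equiprobable.

```python
import math

# Hypothetical encoding: each prospect is a list of (probability, values) pairs,
# one per state, with values ordered as (Ann t1, Ann t2, Ben t1, Ben t2).
A1 = [(0.5, (4, 4, 0, 0)), (0.5, (4, 4, 0, 0))]
A2 = [(0.5, (4, 0, 0, 4)), (0.5, (4, 0, 0, 4))]
A3 = [(0.5, (4, 4, 0, 0)), (0.5, (0, 0, 4, 4))]
prospects = (A1, A2, A3)


def generalised_utilitarian(prospect, f):
    """Total expectation of the values after applying the transformation f."""
    return sum(p * sum(f(v) for v in values) for p, values in prospect)


def maximin_3d(prospect):
    """Smallest value across all people, times, and states."""
    return min(v for _, values in prospect for v in values)


def leximin_3d_key(prospect):
    """Values sorted from worst to best; compare keys lexicographically.

    Assumes equiprobable states, as in Tradeoffs.
    """
    return sorted(v for _, values in prospect for v in values)


total_scores = [generalised_utilitarian(p, lambda v: v) for p in prospects]
sqrt_scores = [generalised_utilitarian(p, math.sqrt) for p in prospects]
trivial_scores = [generalised_utilitarian(p, lambda v: 0) for p in prospects]
maximin_scores = [maximin_3d(p) for p in prospects]
leximin_keys = [leximin_3d_key(p) for p in prospects]

print(total_scores, sqrt_scores, trivial_scores, maximin_scores, leximin_keys)
```

Each list contains three identical entries, one per prospect, which is what indifference between permutation equivalent prospects amounts to for these particular theories.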
Whatever the transformation, permutation equivalent prospects have the same total expectation after the transformation. These theories, like total utilitarianism, are indifferent to where value resides.

⁴ See, for instance, Blackorby et al. (2005).

The objection against utilitarianism, it seems, should thus generalise to all Anti-Patternist theories. Yet the metaphor of value as liquid and people as containers, though evocative, is imprecise. Tangled in it are many distinct objections, which will be disambiguated through the course of the paper. For now, two such objections will suffice to motivate the need for a more general framework in which Patternism can be stated precisely.

First, many statements of the objection aren’t neutral about what the value to be distributed—the “liquid”—is. Often, it’s taken to be pleasure, happiness, well-being, and so on:

“The total version of utilitarianism regards sentient beings as valuable only in so far as they make possible the existence of intrinsically valuable experiences like pleasure [own emphasis]” (Singer, 1993, 121).

“When we decide actions based on aggregated sums of happiness [own emphasis], we no longer think about individuals as individuals. Instead, they are treated more like happiness containers.” (Slater, 2023).

“[critics argue] utilitarian treats people as mere containers for well-being [own emphasis] and not separate individuals” (Bykvist, 2009, 5).

The focus on pleasure or welfare to the exclusion of other considerations is the target of criticisms like the following:

“utilitarianism advocates the distribution of goods and evils in whatever way will maximize total utility—with no regard for the character or the past behavior of the various recipients. Recipients of good and evil function merely as “vessels” into which value may be poured.” (Feldman, 1995a, 582).
From the statements above, it’s unclear if the problem is that utilitarianism treats people as mere containers of value, abstractly conceived, or that it treats them as mere containers of pleasure or welfare.⁵ Suppose it’s possible to incorporate the relevant non-welfarist considerations into our interpretation of the “liquid”—for instance, by adjusting for desert.⁶ In that case, would any objection to an insensitivity to how the liquid is dispersed remain? If not, then the criticism might be less of an objection against an indifference to distributional patterns per se but more of an objection against an overly narrow conception of what value fundamentally consists in:

2. Separateness of Persons as Anti-Hedonism/Welfarism
The value of a social prospect depends on more than just the amount of pleasure or welfare for each individual.

⁵ Brink insists not: “The separateness of persons objection is usually applied to hedonistic or desire-satisfaction versions of utilitarianism, but it is supposed to apply in virtue of the utilitarian, consequentialist, or teleological structure that these theories possess” (2011, 253).

⁶ Desert-adjusted utilitarianism and prioritarianism have been defended by Feldman (1995a,b) and Arneson (2006, 2019, 2022). For further discussion of the issue of adjusting for desert and justice, see Temkin (1993, 273-276), Carlson (1997), Arrhenius (2003), and Adler (2018).

This issue is separate from Patternism. To be sure, diagrams like the ones above use numbers to represent value—numbers which are naturally interpreted as representing pleasure or welfare. But as we’ll see in §3.3, the issue of Patternism can be formulated in a framework that doesn’t presuppose hedonism or welfarism and leaves the question of what value consists in fairly open.

But even if we interpret the numbers as representing amounts of pleasure, it’s clear that Anti-Patternism doesn’t entail a neglect of non-hedonist or non-welfarist considerations. As we saw, there are Anti-Patternist theories—generalised utilitarian theories—on which the quantity whose total expectation is maximised by better prospects isn’t pleasure but a quantity that’s related to pleasure by some relevant transformation.

Another issue that the “containers of value” objection might be latching on to is that liquids are transferable. This makes comparisons of volume, like those of the form ‘there’s exactly as much liquid in this container as there is liquid in that container’ or ‘there’s twice as much liquid in this container as in that container’, possible. To check if there are equal amounts of liquid in two containers, A and B, simply check if each exactly fills some third container C. To check if there’s twice as much liquid in A as in B, find some container C with exactly as much liquid as B and check if B and C together exactly fill up A. Part of the problem with utilitarianism’s alleged treatment of people as containers and value as liquid might be the assumption that value, like liquid, is transferable and that interpersonal comparisons of value of various kinds are possible.

3. Separateness of Persons as Interpersonal Incomparability
Certain kinds of interpersonal comparisons of value are not possible.

Some think that respecting the separateness of persons mandates certain kinds of value incomparability:⁷

“The traditional utilitarian practice of assigning exact numerical values might also seem inconsistent with respecting the separateness of persons” (Chappell, 2015, 323).

“The value of persons merits a kind of regard—“respect” we could call it— that forbids comparisons. . . when we arrive at inequalities through aggregation. . . we start to feel the incipient dread of commodification. But aggregation is not the real culprit (or at least not the only one)—it’s the making of comparisons” (Walden, 2020, 107).
⁷ Though some reject this as an interpretation of the separateness of persons objection: “I believe that those who advocate the notion of the separateness of persons are not concerned with interpersonal incomparability or incommensurability.” (Hirose, 2013, 186).

Recall the rough idea behind Anti-Patternism: prospects that differ only in distributional pattern are equally good. In a sense to be fully unpacked later, stating this requires only equivalence comparisons (this is or isn’t exactly as good as that). It doesn’t require any appeal to ordinal comparisons (this is better or worse than that), much less to comparisons involving magnitudes like ratio comparisons (this is twice as good as that) or interval comparisons (the difference in value between these two things is twice that of those two things). So, Patternism and Anti-Patternism are consistent with a fair amount of value incomparability. We’ll be in a better position to expand on this once the issue of Patternism is formulated precisely—a task to which we now turn.

3.3 Framework

This section formulates Patternism precisely in a qualitative framework that makes only minimal assumptions about the underlying theory of value and the extent of interpersonal comparisons of value. At a very general level, the problem of aggregation concerns how the value of a whole relates to the value of its parts. This requires the notion of parts (§3.3.1), the notion of wholes (§3.3.2), and the possibility of value comparisons (§3.3.3). These components of the framework are introduced in turn.

3.3.1 Outcomes

For us, the parts of interest are what I’ll call outcomes. These are meant to be the qualitative analogue of the numbers in diagrams like the ones above. We can remain fairly neutral on what exactly outcomes are. We require only that they satisfy two structural properties: (i) localisability and (ii) modularity.
The requirement that an outcome be localisable means that it can potentially be associated with some person, some period of time, and some possibility. For example, John being surprised on his birthday is localisable. It might happen to John on his birthday in the event that he’s thrown a surprise party. On the other hand, a distant star exploding isn’t localisable—there’s no particular person with which that can be naturally associated. Similarly, classical logic being the one true logic isn’t localisable—it’s not something that’s located in time.⁸

⁸ Why can’t we say that a distant star exploding is something that’s associated with everyone? Or that classical logic being the one true logic is something that’s located at all times? Getting precise on what it is for something to be “naturally associated” with some person, time, or possibility is a somewhat difficult task. But I’ll assume that we have some intuitive grasp of this notion and that it’s in good standing. (That doesn’t mean that there aren’t borderline cases or that it’s always easy to tell whether something is localisable.)

The second structural assumption is that outcomes are modular in value. To illustrate the idea of modularity, here’s an analogy. A collection of short stories is made up of individual stories. Those stories are, in turn, made up of sequences of words. In judging the literary value of a collection, there can be disagreement about how that relates to how good the stories in the collection are. Some might place more weight on holistic criteria like cohesion or thematic unity, in which case a collection could be greater or lesser than the sum of its parts. Such disagreements notwithstanding, judgments of how good each individual story is, on its own, are intelligible. By contrast, it doesn’t make sense to talk about how good each individual word in a short story collection is in isolation from the words that precede or follow it.
The stories in a collection are standalone units of literary evaluation. The words in a collection are not. Stories but not words are modular in literary value. Similarly, for outcomes to be modular is for there to be meaningful talk of how good they are independent of their place in relation to other outcomes. No further extrinsic information is required to specify the value of an outcome. Modularity is a purely structural constraint. Exactly how coarsely outcomes have to be individuated for them to be modular will depend on the underlying theory of value which we don’t want to prejudge. For instance, suppose I achieve a personal life goal after a period of strife. Hedonists might decompose this into fairly fine-grained outcomes. Simplifying, perhaps it’s many short episodes of unhappiness followed by instances of happiness. Non-hedonists might instead countenance other sources of fundamental value like the value of achievement or personal development. On such views, focusing on short hedonic moments is like focusing on the individual words in a collection. For them, outcomes might have to be carved up more coarsely. Striving-then-achieving can only be assigned value as a whole. We can’t judge how good the strife is in itself until we figure out whether it’s followed by success or failure. Given hedonism, being happy from achieving some goal is modular. On other views, it’s not. These two structural properties—localisability and modularity—completely characterise the abstract basic units of evaluation I’m calling outcomes. In relation to the separateness of persons objection as anti-hedonism or anti-welfarism, our characterisation of outcomes leaves room for a wide range of substantive views about the fundamental sources of value. Friendship, achievement, desert, knowledge, and so on can all, in principle, be built into whatever makes an outcome better or worse—so long as they are localisable and modular. 
The flexibility of this qualitative framework will help distill out objections against Anti-Patternism from objections aimed at overly narrow conceptions of value or at issues related to the use of numbers to represent value.

3.3.2 Prospects

Recall: the problem of aggregation is that of determining how the value of a whole relates to the value of its parts. If the parts of interest for us are outcomes, then the wholes they make up are what I’m going to call social prospects (or prospects, for short). Roughly, prospects are distributions of outcomes across people, time periods, and possible states. But not just any distribution goes. For various reasons, we’ll only be interested in certain kinds of distributions—the finitary, realisable ones—on which we bestow the name ‘prospects’. Let’s make this precise.

First, we assume a finite set of individuals and a finite set of time intervals (of possibly variable length). Assume there are at least two people and two time intervals. Otherwise, the problem of aggregating across people or time becomes trivial. For simplicity, most of the concrete examples I’ll consider later will involve exactly two people and two time intervals. On top of the dimensions of people and time, there’s a third dimension of risk, which informally consists of possible events with probabilities attached to them.

These ways of carving up the dimensions of people, time, and risk are assumed to be fixed. For instance, we are interested only in comparing distributions defined on the same set of individuals. In other words, the framework is a fixed-population framework rather than a variable-population one. The latter might involve comparisons of distributions involving populations of different sizes or different compositions. That raises thorny issues to do with the non-identity problem, the value of existence, and so on.

One such issue to do with the value of existence concerns replacement cases.
If total value remains the same regardless, should we be indifferent between extending the life of an existing person who would otherwise cease to exist and creating a new person altogether? Nestled within some formulations of the “containers of value” objection is the expression of an aversion to theories that answer in the affirmative:

“[On total utilitarianism, it] is as if sentient beings are receptacles of something valuable and it does not matter if a receptacle gets broken [own emphasis], so long as there is another receptacle to which the contents can be transferred without any getting spilt” (Singer, 1993, 121).

“Total utilitarianism has been much criticised for being an impersonal theory. Most notably, it has been objected that it treats persons as mere containers of utility that are replaceable and do not matter in their own right [own emphasis].” (Bader, 2022).

Thus, yet another interpretation of the objection is as follows:

4. Separateness of Persons as Non-Replaceability
All else fixed, we shouldn’t be indifferent between extending an existing person’s life and creating a new person.

As formulated in our fixed-population framework, the issue of Patternism is independent of the issue of replacement.

Next, a distribution is any arbitrary assignment of an outcome to each person, time, and state. A social outcome is an assignment of an outcome to each person and time. Distributions can be represented by diagrams like the following:

                          1/2       1/2      ← probability
distribution →     B      E1        E2       ← possible event
person →           Ann   (a, b)    (c, d)    ← outcome
social outcome →   Ben   (e, f)    (g, h)
                          t1 t2     t1 t2    ← time interval

Among the possible distributions are those we’re interested in, which I’ll call social prospects. These are the distributions that are finitary and realisable.

First, infinity raises difficult technical problems that distract from the moral issues that are of primary interest in this paper.
So, among the possible distributions, we focus only on the finitary ones. Roughly, these are those distributions that can be represented by diagrams with finitely many columns, like the diagram above.

Second, not all distributions are realisable. For instance, let E1 above be a world in which no one is ever happy. Then, if a is an outcome which involves Ann being happy, then distribution B above is not realisable. It’s an assignment of outcomes that’s combinatorially possible but metaphysically impossible. Arguably, non-realisable distributions like these are not proper objects of value comparisons.

Social prospects are those distributions that are finitary and realisable. While not all distributions are realisable, I’ll assume that sufficiently many of them are. Specifically, I’ll make the following richness assumption about the space of social prospects. Call two distributions equivalent if they differ at most in the substitution of each outcome with an equally good outcome (see below). I’ll assume:

Replacement. For each distribution, there’s an equivalent distribution that’s realisable.

Something like this is an implicit assumption of the standard numerical framework, where it’s taken for granted that for any possible level of value, any combinatorially possible pattern of assignment of numbers to people, times, and states is possible.

3.3.3 Axiology

Again, the problem of aggregation concerns the relationship between the value of a whole and the value of its parts. Having identified the parts and wholes of interest, the next course of action is to explain value comparisons.

Formulating Patternism requires only minimal kinds of value comparisons. We just have to be able to say, for any pair of outcomes, whether they are equally good. And similarly for any pair of prospects. Thus, we just have to grant two exactly as good as relations—one relating outcomes and another relating prospects.
Both are assumed to be equivalence relations.⁹

⁹ That is to say that the relations are reflexive, symmetric, and transitive.

Two forms of the separateness of persons objection possibly crop up here. The first is the aforementioned skepticism about interpersonal comparisons. Some such skeptics are skeptics of comparisons of magnitude, like those of the form ‘this is x times as good for me as that is for you’. Others are skeptics of full comparability, arguing that there are cases where one thing is not better, not worse, but also not exactly as good as another.¹⁰ As Hirose puts it:

“There is general scepticism about interpersonal comparability in both economics and philosophy. In economics, it is claimed that there is no scientific or objective basis for interpersonal comparisons of utility as utility represents a person’s mental state. In ethics, some people are sceptical about interpersonal comparison because of incommensurability of values.” (2013, 186).

¹⁰ See, for instance, Chang (2002).

Both kinds of skepticism about interpersonal comparisons are compatible with the present framework. The possibility of equally good comparisons of outcomes and of prospects leaves open the impossibility of magnitudinal comparisons and the possibility of widespread value incommensurability.

A more thoroughgoing skepticism would deny even the possibility of equivalence comparisons. One basis for this might be a skepticism about the notion of impersonal goodness. Consider, for instance:

“Individually, we each sometimes choose to undergo some pain or sacrifice for a greater benefit or to avoid a greater harm. . . Why not, similarly, hold that some persons have to bear some costs that benefit other persons more? But there is no social entity with a good that undergoes some sacrifice for its own good.” (Nozick, 1974, 32-33).
“[Utilitarianism’s] view of social cooperation is the consequence of extending to society the principle of choice for one man, and then, to make this extension work, conflating all persons into one. . . Utilitarianism does not take seriously the distinction between persons” (Rawls, 1971, 27).

“[theories like utilitarianism] suppose that mankind is a super-person, whose greatest satisfaction is the objective of moral action. . . But this is absurd. Individuals have wants, not mankind; individuals seek satisfaction, not mankind. A person’s satisfaction is not part of any greater satisfaction.” (Gauthier, 1963, 126).

“[consequentialism] treats the desires, needs, satisfactions, and dissatisfactions of distinct persons as if they were the desires, etc., of a mass person.” (Nagel, 1970, 134).

One way to interpret these statements is as ascribing to utilitarians the need to posit some supra-being—the mereological composite of a group of people—whose ontological status is on a par with individual persons. This claim about the ontological commitments of utilitarianism has been rightly criticised for being wrongheaded.¹¹

¹¹ See Norcross (2009).

A more charitable interpretation is in terms of a rejection of the notion of impersonal or overall value. The thought is that everything that’s better is better for something or other. And while individuals are proper bearers of value, groups of people aren’t. Each individual is a separate moral unit and they don’t unite to form any morally significant unit:

5. Separateness of Persons as Skepticism about Overall Value
There is no social entity for which a prospect is better, worse, or exactly as good. Therefore, overall comparisons of the value of prospects are unintelligible.¹²

McKerlie summarises the objection as follows:

“when we judge that one person’s claim outweighs the claim of someone else we are assuming an impersonal point of view.
We are not just looking at things from one person’s point of view and registering a loss, and then looking at things from another person’s point of view and registering a gain. We make a comparison that includes both people and their points of view. We judge that it is more important to help the first person, and this judgment is not made from any individual’s point of view.” (1988, 222).

I don’t find this to be a compelling reason for refusing to grant some kind of useful overall comparison. For heuristic purposes, grant the numerical representability of value for a moment and consider a simple case of interpersonal trade-off:

    C1        1              C2        1
    Ann    (1, 1)            Ann    (0, 0)
    Ben    (0, 0)            Ben    (1, 1)
           t1  t2                   t1  t2

Anti-Patternism says that prospect C1 is exactly as good as prospect C2, since they differ only in the pattern of distribution. Skeptics of impersonal value will retort: exactly as good for whom? Not for Ann: for her, C1 is better than C2. And not for Ben: for him, C1 is worse than C2. And not for the group, for that’s not an appropriate bearer of value.

To see where this objection fails, compare ascriptions of preferences and beliefs to groups. Suppose the election results in a tie between Biden and Trump. A headline that reads “America prefers neither Biden to Trump nor Trump to Biden” would not be surprising. And that’s so even if, in fact, every American is strongly pro-Biden or pro-Trump. There is a legitimate sense in which America can be indifferent even if no American is. Or, consider belief. Suppose that when it comes to Newcomb’s problem, every philosopher is either a staunch one-boxer or a firm believer in two-boxing. If the two camps are equally divided, we might say that the philosophical community is unsure about Newcomb cases, even if no philosopher is.

[12] Besides those cited above, others who press a similar objection against utilitarianism include Taurek (1977), Foot (2003), and Thomson (2008, 2009).
Now, there’s a debate about whether groups have the sort of unified agency or cognition required for genuine belief or preference.[13] Even those who think not can grant that there are weaker notions, extending the ordinary notions of belief and preference, that can be applied to groups. (We can call them ‘g-beliefs’ and ‘g-preferences’ if the hang-up is on nomenclature.) These extended notions may be interpreted much less robustly than genuine belief or preference. Perhaps to say that philosophers are unsure is simply shorthand for saying that there’s no consensus among them. These notions can play useful practical and normative roles in guiding action and belief. A group of experts being unsure might mean that deference to experts requires suspension of judgment.

I suggest thinking of overall goodness in a similar way. The need to make trade-offs is an ineliminable feature of social life. To aid us in the balancing of trade-offs, there’s a useful extension of exactly as good as relations from individuals to aggregates. For skeptics, the aggregate notion can be understood in a highly non-committal way. Whenever I say that two prospects are equally good, they may, for instance, choose to simply interpret that as saying that nothing favours making one trade-off over another (bracketing off certain deontic considerations like partial obligations or obligations of justice). Accepting the legitimacy of some useful notion of overall equivalence doesn’t commit us to thinking of groups as appropriate bearers of value, any more than accepting group “belief” or “preference” necessitates thinking of Americans and philosophers as hive minds capable of cognition and wants.[14]

3.3.4 Patternism

To sum up, our framework consists of the following components: (i) a set of outcomes, (ii) a set of prospects, and (iii) two equivalence relations, one relating outcomes and another relating prospects. With these, Patternism can be stated precisely.
Let’s define the expected occurrence of an outcome in a prospect to be the probability-weighted count of the person-time positions to which it’s assigned. For instance, the expected occurrence of outcome a in prospect D below is 1/3 + 1/3 + 2/3 = 4/3. That of b is 1/3 + 2/3 + 2/3 = 5/3. And that of c is 1/3 + 2/3 = 1.

    D         1/3        2/3
              E1         E2
    Ann     (a, a)     (b, c)
    Ben     (b, c)     (a, b)
            t1  t2     t1  t2

[13] See, for instance, List & Pettit (2011).
[14] A similar response is found in Bader (2022, 14).

In the standard numerical framework, equally good outcomes are conflated: they would be assigned the same number. The analogue of a number in our qualitative framework is an outcome level. An outcome level is the equivalence class of equally good outcomes. For instance, the outcome level [a] is the class of all outcomes exactly as good as a. The expected occurrence of an outcome level is the sum of the expected occurrences of the outcomes in it. Two prospects are then said to be permutation equivalent if, for each outcome level, its expected occurrence is the same in both prospects. Permutation equivalence is our precise definition of what it means for two social prospects to differ only in their pattern of distribution. Patternism and its negation can thus be stated as follows:

Patternism. Some permutation equivalent prospects are not equally good.

Anti-Patternism. Permutation equivalent prospects are equally good.

A final point of clarification is that these are purely axiological claims. Being Anti-Patternist doesn’t immediately mean that we ought to ignore distributional patterns in deciding which policies to enact or that we ought to circumvent people’s autonomy in order to impose the best prospects. Non-axiological considerations or side constraints might come into play. Thus, the issue of Patternism is independent of the following interpretation of the separateness of persons objection:

6. Separateness of Persons as Non-Fungibility
We should respect each person’s autonomy and not treat them as a mere means to an end.[15]

We find suggestions of this objection in statements such as Nozick’s: “To use a person in this way [have them bear some costs that benefit other persons more] does not sufficiently respect and take account of the fact that he is a separate person. . . no one is entitled to force this upon him” (Nozick, 1974, 32-33). The bite of objections like this will depend on the further questions of whether something being best always means that we should forcefully impose it or that we should “use” people to achieve it.

[15] For more in-depth discussion of this interpretation of the objection, see Chappell (2015).

More generally, we can distinguish between axiological and deontic versions of Patternism and Anti-Patternism. We are primarily interested in the axiological question of whether prospects that differ only in pattern are equally good. This can come apart from the deontic or normative question about whether we ought, or have reasons, to realise one pattern over another. Whether the questions come apart will depend on one’s view about the connection between what’s good and what’s right or ought to be done.

3.4 Prioritarianism

The case against Patternism will be that it’s inconsistent with three plausible principles about how the dimensions of people, time, and risk interact. Before probing these principles, it’ll be helpful to have concrete Patternist and Anti-Patternist theories against which we can test these principles. These theories will help structure the upcoming discussion, illustrating how Patternism runs into problems and revealing some possible paths of resistance. For reasons that will become clear, especially suited for this dialectical purpose are various forms of a theory called ‘Prioritarianism’.
Going forward, we will often freely avail ourselves of resources not officially available within the framework, such as the use of numbers to track value and the introduction of theories via functional equations defined on real numbers. This is merely to ease exposition and to have a more concrete grasp of theories and cases against which to test our intuitions. Where the extra assumptions might actually make a difference, I’ll flag them as such. In any case, most parties to the debate will countenance more than what the framework allows: for instance, ordinal comparisons of outcomes and prospects in addition to mere equivalence relations. Keeping the official framework bare and invoking extra assumptions as they become relevant will simply help make it clear, at each point, exactly what is up for contention and what isn’t.

To introduce Prioritarianism, consider the following case, suppressing for now the dimensions of time and risk:

Interpersonal Trade-offs. Ann has a cold and Ben has a headache. Totalax, a cold medication, would be good only for Ann. Priotin, on the other hand, partially treats both colds and headaches but is less than half as effective. Given a choice of only one medication to administer to both Ann and Ben, which should we choose?

    Totalax            Priotin
    Ann   9            Ann   4
    Ben   0            Ben   4

A maximiser of total welfare would prefer Totalax. But many think there’s something to be said in favour of Priotin. Despite the slight decrease in total welfare, it distributes welfare equally. One prominent view that incorporates considerations like this is Prioritarianism.
According to Prioritarians, we ought to give priority to the worse-off.[16] One way of understanding what this means begins with the following observation: to choose Priotin over Totalax is, in effect, to have Ann give up 5 units of welfare that she could have had in order that Ben benefit by 4:

    Totalax                       Priotin
    Ann  9    --(-5 units)-->     Ann  4
    Ben  0    --(+4 units)-->     Ben  4

To accord egalitarian considerations some weight is to think that leaky transfers of welfare like these, with a net loss in total welfare, can nevertheless sometimes make things better.[17]

[16] See, for instance, Parfit (2002), McCarthy (2006, 2008), Holtug (2006), Rabinowicz (2002), Fleurbaey (2010), Otsuka (2012, 2015), and Broome (2015).
[17] More precisely, see the (strong) Pigou-Dalton condition (Adler, 2019).

Central to Prioritarianism is the idea that the worse-off an individual is to begin with, the more value there is to be gained from improving their welfare by a fixed amount. According to Prioritarians, the relationship between welfare and moral value is like that between money and utility. Utility doesn’t increase linearly with wealth. The more money one already has, the less an extra dollar is worth. Similarly, according to Prioritarians, welfare has diminishing marginal moral value. This relationship can be visualised using a strictly increasing concave function, like the square-root function:

[Figure: plot of the square-root function. Horizontal axis: Welfare (0 to 3). Vertical axis: Moral Value (0, √1, √2, √3).]

This gives more weight to the worse-off in that the more welfare an individual already has, the less an extra unit will contribute to overall moral value. For instance, the corresponding increase in moral value going from 0 units of welfare to 1 is greater than that going from 1 to 2. Prioritarians would first apply the transformation to the welfare values to obtain the moral values before summing those up.
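In symbols, and purely as an expository sketch, the procedure just described can be written as follows. The notation ($w_i$ for person $i$'s welfare, $g$ for the transformation, $V$ for overall value) is an assumption introduced here for illustration and is not part of the official framework.

```latex
V(w_1, \dots, w_n) \;=\; \sum_{i=1}^{n} g(w_i),
\qquad g \text{ strictly increasing and strictly concave, e.g. } g(w) = \sqrt{w}.
% Diminishing marginal moral value under the square-root transformation:
% g(1) - g(0) = 1 \;>\; \sqrt{2} - 1 = g(2) - g(1).
```

Strict concavity is what encodes priority to the worse-off: the same one-unit gain adds more to $V$ at a lower starting level than at a higher one.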
For instance, using the square-root function, the overall value of Totalax would be $\sqrt{9} + \sqrt{0} = 3$ whereas that of Priotin would be $\sqrt{4} + \sqrt{4} = 4$. So, Prioritarianism does indeed accord weight to egalitarian considerations: it prefers the equal distribution with slightly less total welfare.

But welfare can be distributed not just across people but across time. And the dimension of time raises some questions for Prioritarians. Consider:

Interpersonal-Intertemporal Trade-offs. On the first day, Ann has a cold and Ben a headache. But their symptoms reverse the next day: Ann has a headache and Ben a cold. As before, Totalax only treats colds while Priotin partially treats both colds and headaches. Given only one medication to administer to both Ann and Ben on both days, which should we choose?

    Totalax                  Priotin
    Ann    (9, 0)            Ann    (4, 4)
    Ben    (0, 9)            Ben    (4, 4)
           t1  t2                   t1  t2

In this case, Totalax would make Ann worse-off than Ben one day but better-off the next. The inequalities cancel out across time so that their lifetime welfare is equalised. So, the Prioritarian’s exhortation to prioritise the worse-off can be understood in one of two ways: prioritise the worse-off at each time (Timeslice Prioritarianism) or prioritise the worse-off across time (Lifetime Prioritarianism).

Timeslice and Lifetime Prioritarians might disagree about which of Totalax and Priotin is better. The difference comes down to where the transformation is applied. Timeslice Prioritarians first apply the transformation to the welfare values at each time and sum those up across people before summing up across time. So, Totalax ($\sqrt{9} + \sqrt{0} + \sqrt{0} + \sqrt{9} = 6$) is worse than Priotin ($\sqrt{4} + \sqrt{4} + \sqrt{4} + \sqrt{4} = 8$). Lifetime Prioritarians, on the other hand, first sum up welfare for each person before applying the transformation to their lifetime welfare and adding those up. So, Totalax ($\sqrt{9 + 0} + \sqrt{0 + 9} = 6$) is better than Priotin ($\sqrt{4 + 4} + \sqrt{4 + 4} \approx 5.66$).
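The two orders of aggregation can be checked mechanically. The following is a minimal Python sketch, assuming the square-root transformation and the numerical welfare values from the case above; the function names `timeslice_value` and `lifetime_value` and the dictionary layout are hypothetical, chosen here for illustration only.

```python
from math import sqrt, isclose

# Welfare tables from the Interpersonal-Intertemporal Trade-offs case:
# each person maps to (welfare at t1, welfare at t2).
TOTALAX = {"Ann": (9, 0), "Ben": (0, 9)}
PRIOTIN = {"Ann": (4, 4), "Ben": (4, 4)}


def timeslice_value(distribution, g=sqrt):
    """Transform each person's welfare at each time, then sum everything."""
    return sum(g(w) for life in distribution.values() for w in life)


def lifetime_value(distribution, g=sqrt):
    """Sum each person's welfare over time, transform that total, then sum over people."""
    return sum(g(sum(life)) for life in distribution.values())


for name, dist in [("Totalax", TOTALAX), ("Priotin", PRIOTIN)]:
    print(name, round(timeslice_value(dist), 2), round(lifetime_value(dist), 2))
```

Run as written, this reports 6.0 and 6.0 for Totalax and 8.0 and 5.66 for Priotin, so the Timeslice reading favours Priotin while the Lifetime reading favours Totalax. Any other strictly concave `g` could be passed in; the square root is only the running example.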
Besides the dimensions of people and time, welfare can also be distributed across different possibilities under conditions of uncertainty. Consider:

Interpersonal-Risky Trade-offs. One of Ann and Ben has a cold and the other a headache. But we don’t know who has which. Given the choice of exactly one of Totalax and Priotin to administer to both Ann and Ben, which should we choose?

                       Totalax                            Priotin
            Ann cold        Ann headache       Ann cold        Ann headache
            Ben headache    Ben cold           Ben headache    Ben cold
    Ann        9               0                  4               4
    Ben        0               9                  4               4

In this case, if Ann has the cold, then Totalax will make her better-off than Ben. Otherwise, she’ll be worse-off. Either way, there’s bound to be inequality in the eventual distribution of welfare with Totalax. However, each of Ann and Ben has a fair shot at being the beneficiary of Totalax. In terms of expected welfare, neither would be worse-off than the other. So, when the Prioritarian says to prioritise the worse-off, that can be understood as prioritising the worse-off in eventual welfare (Ex-Post Prioritarianism) or in expected welfare (Ex-Ante Prioritarianism).

Again, it can make a difference whether the Prioritarian penalises inequalities in eventual welfare or expected welfare. Ex-Post Prioritarians penalise inequalities in eventual welfare. So, they first add up the transformed welfare values across people before taking the expectation of those sums. So, Totalax ($\tfrac{1}{2}(\sqrt{9} + \sqrt{0}) + \tfrac{1}{2}(\sqrt{0} + \sqrt{9}) = 3$) is worse than Priotin ($\tfrac{1}{2}(\sqrt{4} + \sqrt{4}) + \tfrac{1}{2}(\sqrt{4} + \sqrt{4}) = 4$).

Ex-Ante Prioritarians, on the other hand, are concerned with inequalities in expected welfare. This means taking each person’s expected welfare before applying the transformation to the expected welfare and adding those up across people. So, Totalax ($\sqrt{\tfrac{1}{2} \cdot 9 + \tfrac{1}{2} \cdot 0} + \sqrt{\tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 9} = 2\sqrt{4.5} \approx 4.24$) is better than Priotin ($\sqrt{\tfrac{1}{2} \cdot 4 + \tfrac{1}{2} \cdot 4} + \sqrt{\tfrac{1}{2} \cdot 4 + \tfrac{1}{2} \cdot 4} = 4$).
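The Ex-Post and Ex-Ante calculations differ only in whether the expectation is taken after or before the transformation. A minimal Python sketch of both, again assuming the square-root transformation and the numbers from the case above (the function names and the list-of-states representation are hypothetical, introduced here for illustration):

```python
from math import sqrt, isclose

# Each prospect is a list of (probability, {person: welfare}) states,
# following the Interpersonal-Risky Trade-offs case.
TOTALAX = [(0.5, {"Ann": 9, "Ben": 0}), (0.5, {"Ann": 0, "Ben": 9})]
PRIOTIN = [(0.5, {"Ann": 4, "Ben": 4}), (0.5, {"Ann": 4, "Ben": 4})]


def ex_post_value(prospect, g=sqrt):
    """Sum transformed welfare within each state, then take the expectation."""
    return sum(p * sum(g(w) for w in state.values()) for p, state in prospect)


def ex_ante_value(prospect, g=sqrt):
    """Take each person's expected welfare, transform it, then sum over people."""
    people = prospect[0][1].keys()
    return sum(
        g(sum(p * state[person] for p, state in prospect)) for person in people
    )


for name, prospect in [("Totalax", TOTALAX), ("Priotin", PRIOTIN)]:
    print(name, round(ex_post_value(prospect), 2), round(ex_ante_value(prospect), 2))
```

On these inputs the Ex-Post values are 3 for Totalax and 4 for Priotin, while the Ex-Ante values are roughly 4.24 and 4, matching the reversal described in the text.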
Putting the dimensions of time and risk together, there are at least four possible disambiguations of Prioritarianism, each corresponding to a different order in which the dimensions of people, time, and risk can be resolved and a difference in where the transformation is applied. These serve as a good template for structuring the upcoming discussion. That’s because one of these Prioritarian theories is Anti-Patternist and the other three are Patternist, each illustrating the violation of one of the three principles that jointly contradict Patternism.

                Timeslice                                  Lifetime
    Ex-Ante     prioritise the worse-off in                prioritise the worse-off in
                expected welfare at each time              expected lifetime welfare
                (risk → people → time)                     (time → risk → people)
                violates Temporal Neutrality (§3.6.1)      violates Statewise Anonymity (§3.6.2)

    Ex-Post     prioritise the worse-off in                prioritise the worse-off in
                eventual welfare at each time              actual lifetime welfare
                (people → risk → time)                     (time → people → risk)
                is Anti-Patternist                         violates Timeslice Stochasticism (§3.6.3)

3.5 Patternism Decomposed

This section introduces the three principles that are jointly equivalent to Anti-Patternism. Consider: Ann and Ben will be sick for the next two days, Monday and Tuesday. We have only two doses of a medication, a weaker dose and a stronger one. The effects of each dose will only last a day. Compare two plans on how to distribute the doses based on the outcome of a coin toss:

E1: Either way, give Ann the stronger dose and Ben the weaker dose. But if heads, give the doses on Tuesday. If tails, give them on Monday.

E4: Either way, give one person both the stronger dose on Monday and the weaker dose on Tuesday. But if heads, that person is Ann. If tails, it’s Ben.

    E1      Heads     Tails          E4      Heads     Tails
    Ann    (0, 6)    (6, 0)          Ann    (6, 2)    (0, 0)
    Ben    (0, 2)    (2, 0)          Ben    (0, 0)    (6, 2)
           t1  t2    t1  t2                 t1  t2    t1  t2

The two prospects are equally good according to Anti-Patternism.
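That E1 and E4 differ only in their pattern of distribution can be checked against the definition of permutation equivalence from §3.3.4. The Python sketch below computes expected occurrences and compares them. It is a simplification under stated assumptions: numbers stand in for outcome levels (so numerically equal entries count as the same level), and the names `expected_occurrence` and `permutation_equivalent` are hypothetical helpers, not the author's.

```python
from collections import Counter
from fractions import Fraction

HALF = Fraction(1, 2)

# A prospect is a list of (probability, {person: (level at t1, level at t2)}).
# Assumption: each number stands in for an outcome level.
E1 = [(HALF, {"Ann": (0, 6), "Ben": (0, 2)}),   # heads
      (HALF, {"Ann": (6, 0), "Ben": (2, 0)})]   # tails
E4 = [(HALF, {"Ann": (6, 2), "Ben": (0, 0)}),   # heads
      (HALF, {"Ann": (0, 0), "Ben": (6, 2)})]   # tails


def expected_occurrence(prospect):
    """Probability-weighted count of the person-time slots holding each level."""
    counts = Counter()
    for prob, state in prospect:
        for life in state.values():
            for level in life:
                counts[level] += prob
    return counts


def permutation_equivalent(p, q):
    """Two prospects are permutation equivalent if every level has the
    same expected occurrence in both."""
    return expected_occurrence(p) == expected_occurrence(q)


print(permutation_equivalent(E1, E4))
```

In both prospects, level 0 has expected occurrence 2 while levels 6 and 2 each have expected occurrence 1, so the check succeeds. Exact fractions are used instead of floats so that the equality test is not disturbed by rounding.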
We can get each prospect from the other by simply shifting fixed amounts of value around: across people, time, and equiprobable possibilities. In this section, we’ll see that any arbitrary permutation like the one above can, in fact, be achieved via a sequence of three fundamental kinds of permutation. Anti-Patternism is thus equivalent to the conjunction of three principles, each claiming that one kind of permutation preserves overall value.

For the first kind of permutation, compare the following plans:

E1: Either way, give Ann the stronger dose and Ben the weaker dose. But if heads, give the doses on Tuesday. If tails, give them on Monday.

E2: Regardless of how the coin lands, give Ann the stronger dose on Monday and Ben the weaker dose on Tuesday.

    E1      Heads     Tails          E2      Heads     Tails
    Ann    (0, 6)    (6, 0)          Ann    (6, 0)    (6, 0)
    Ben    (0, 2)    (2, 0)          Ben    (0, 2)    (0, 2)
           t1  t2    t1  t2                 t1  t2    t1  t2

The difference is merely temporal. On either plan, Ann gets the stronger dose and Ben the weaker one. E2 differs only in that Ann gets the dose earlier than she would in E1 given heads (Ann’s two entries under Heads are permuted) and Ben gets the dose later than he would in E1 given tails (Ben’s two entries under Tails are permuted). In a sense made precise later, prospects that differ only in this way are equivalent modulo temporal order. Intuitively, mere differences in exactly when fixed amounts of value obtain shouldn’t matter:

Temporal Neutrality. Prospects that are equivalent modulo temporal order are equally good.

Timeslice Ex-Ante Prioritarianism, which prioritises the worse-off in expected welfare at each time, violates Temporal Neutrality. In E1, Ann’s daily expected welfare is 3 (the average of 0 and 6) and Ben’s is 1 (the average of 2 and 0). In E2, Ann’s expected welfare is 6 on Monday and 0 on Tuesday whereas Ben’s is 0 on Monday and 2 on Tuesday.
Applying a concave transformation like the square-root function to the expected welfare at each time and adding them up, the result would be $2\sqrt{3} + 2 \approx 5.46$ for E1 and $\sqrt{6} + \sqrt{2} \approx 3.86$ for E2. So, Timeslice Ex-Ante Prioritarianism violates Temporal Neutrality and is sensitive to exactly when fixed amounts of value obtain. Temporal Neutrality, and exactly how bad violations of it are, are investigated further in §3.6.1.

For the second fundamental kind of permutation, compare the following plans:

E2: Regardless of how the coin lands, give Ann the stronger dose on Monday and Ben the weaker dose on Tuesday.

E3: Give one person the stronger dose on Monday and the other person the weaker dose on Tuesday. But if heads, Ann gets the stronger dose. If tails, Ben does.

    E2      Heads     Tails          E3      Heads     Tails
    Ann    (6, 0)    (6, 0)          Ann    (6, 0)    (0, 2)
    Ben    (0, 2)    (0, 2)          Ben    (0, 2)    (6, 0)
           t1  t2    t1  t2                 t1  t2    t1  t2

On both plans, one person will receive the stronger dose on Monday and the other person the weaker dose on Tuesday. The only difference lies in how the beneficiary of the stronger dose given heads is related to the beneficiary of the stronger dose given tails. They are the same person in E2 but different people in E3. (Under Tails, Ann’s and Ben’s rows are permuted.) Prospects that differ only in this way are what we will later call ‘statewise anonymous pairs’. Such differences shouldn’t matter according to:

Statewise Anonymity. Statewise anonymous pairs of prospects are equally good.[18]

[18] McCarthy et al. (2020) call a similar principle in their framework ‘Two-Stage Anonymity’.

Consider Lifetime Ex-Ante Prioritarianism, which prioritises the worse-off in expected lifetime welfare. In E2, Ann’s expected lifetime welfare is 6 whereas Ben’s is 2. In E3, both Ann and Ben have an expected lifetime welfare of 4 (the average of 6 and 2).
Applying a concave transformation like the square-root function to each person’s expected lifetime welfare and adding them up, the result would be $\sqrt{6} + \sqrt{2} \approx 3.86$ for E2 and $2\sqrt{4} = 4$ for E3. So, Lifetime Ex-Ante Prioritarianism violates Statewise Anonymity. Some arguments for Statewise Anonymity are explored in §3.6.2.

For the final kind of permutation, compare the plans:

E3: Give one person the stronger dose on Monday and the other person the weaker dose on Tuesday. But if heads, Ann gets the stronger dose. If tails, Ben does.

E4: Either way, give one person both the stronger dose on Monday and the weaker dose on Tuesday. But if heads, that person is Ann. If tails, it’s Ben.

    E3      Heads     Tails          E4      Heads     Tails
    Ann    (6, 0)    (0, 2)          Ann    (6, 2)    (0, 0)
    Ben    (0, 2)    (6, 0)          Ben    (0, 0)    (6, 2)
           t1  t2    t1  t2                 t1  t2    t1  t2

On either plan, Ann has a fifty-fifty chance of receiving the stronger dose on Monday and a fifty-fifty chance of receiving the weaker dose on Tuesday. And so does Ben. The only difference lies in how the outcomes are correlated across the two days. In E3, the chancy event that gives one person the stronger dose will give the other person the weaker dose the next day. On the other hand, the same person is guaranteed to get both doses on E4, though both Ann and Ben get a fair shot at being that person. (The two diagrams differ simply by swapping the t2 columns under Heads and Tails.) In a sense to be explained later, prospects that differ only in this way are “stochastically equivalent at each time”. The third principle says that such prospects should be equally good:

Timeslice Stochasticism. Prospects that are stochastically equivalent at each time are equally good.

Consider Lifetime Ex-Post Prioritarianism, which prioritises the worse-off in actual lifetime welfare. In E3, for each person, there’s an equal chance of their lifetime welfare being 6 and 2. In E4, for each person, there’s an equal chance of their lifetime welfare being 8 and 0.
Applying a concave transformation like the square-root function to the lifetime welfare and taking their total expectation, the result would be $\sqrt{6} + \sqrt{2} \approx 3.86$ for E3 and $\sqrt{8} + \sqrt{0} \approx 2.83$ for E4. So, Lifetime Ex-Post Prioritarians prefer E3, violating Timeslice Stochasticism. The case for Timeslice Stochasticism is explored in §3.6.3.

According to Temporal Neutrality, plans E1 and E2, which differ simply by exactly when fixed amounts of value obtain, are equally good. According to Statewise Anonymity, plans E2 and E3, which differ simply by how the beneficiaries of lifetime outcomes are related across possibilities, are equally good. Finally, according to Timeslice Stochasticism, plans E3 and E4, which differ simply by how the social outcomes at each time are correlated, are equally good. So, by the transitivity of the equally good relation between prospects, E1 and E4 must be equally good, exactly as Anti-Patternism requires.

In fact, any permutation can be achieved by chaining together permutations of the three fundamental kinds in one way or another. Let’s call a prospect uniform if in each possibility, the outcomes are the same for every person at every time. The key observation is that the fundamental permutations allow us to “equalise” the outcomes across time and people so that every prospect can be reduced to a uniform prospect. For instance, F1 can be reduced to F7, which we call its uniform reduction:

    F1      p      ...          F7     p/4       p/4       p/4       p/4     ...
    A     (a, c)   ...          A     (a, a)    (b, b)    (c, c)    (d, d)   ...
    B     (b, d)   ...          B     (a, a)    (b, b)    (c, c)    (d, d)   ...

To show that, first note that the fundamental permutations allow us to “equalise” outcomes across time:

    F1      p/2       p/2     ...
    A     (a, c)    (a, c)    ...
    B     (b, d)    (b, d)    ...

       ∼ (1)

    F2      p/2       p/2     ...
    A     (a, c)    (c, a)    ...
    B     (b, d)    (d, b)    ...

       ∼ (2)

    F3      p/2       p/2     ...
    A     (a, a)    (c, c)    ...
    B     (b, b)    (d, d)    ...

    (1) by Temporal Neutrality
    (2) by Timeslice Stochasticism

The fundamental permutations also then allow us to “equalise” outcomes across people:

    F3      p/4       p/4     ...
    A     (a, a)    (a, a)    ...
    B     (b, b)    (b, b)    ...
          t1  t2    t1  t2

       ∼ (4)

    F4      p/4       p/4     ...
    A     (a, a)    (b, b)    ...
    B     (b, b)    (a, a)    ...
          t1  t2    t1  t2

       ∼ (5)

    F5      p/4       p/4     ...
    A     (a, b)    (b, a)    ...
    B     (b, a)    (a, b)    ...
          t1  t2    t1  t2

       ∼ (6)

    F6      p/4       p/4     ...
    A     (b, a)    (a, b)    ...
    B     (b, a)    (a, b)    ...
          t1  t2    t1  t2

       ∼ (7)

    F7      p/4       p/4     ...
    A     (a, a)    (b, b)    ...
    B     (a, a)    (b, b)    ...
          t1  t2    t1  t2

    (4) by Statewise Anonymity
    (5) by Timeslice Stochasticism
    (6) by Temporal Neutrality
    (7) by Timeslice Stochasticism

These chains of equivalence allow us to collapse each prospect to its uniform reduction. Clearly, if two prospects are permutation equivalent, then they have the same uniform reduction. So, if each prospect is exactly as good as its uniform reduction, permutation equivalent prospects must be equally good, as Anti-Patternism requires. So, the three principles entail Anti-Patternism. It’s easy to see that the converse also holds. So, Anti-Patternism is equivalent to the conjunction of Temporal Neutrality, Statewise Anonymity, and Timeslice Stochasticism.

There are thus, at base, three ways of being Patternist, one corresponding to the rejection of each of the three principles. Decomposing Anti-Patternism in this way thus provides a further disambiguation of the separateness-of-persons objection.

3.6 Principles

This section states the principles that lead to Anti-Patternism precisely and explores their plausibility. I’ll suggest that there are serious costs attached to rejecting any of them, though some avenues of resistance are more plausible than others.

3.6.1 Temporal Neutrality

Let’s begin with some clarifications. Clearly, temporal order matters. It’s better to have a big meal after a run rather than before. It’s better to have orange juice before brushing your teeth rather than after. Timing matters.
But that’s only because the effects might depend on the order: running on a full stomach causes cramps and toothpaste makes juice taste worse. Intuitively, without a difference in effect, no temporal order is intrinsically better than another. As Parfit puts it: “Most of us believe that a mere difference in when something happens, if it does not affect the nature of what happens, cannot be morally significant. Certain answers to the question ‘When?’ are of course important. We cannot ignore the timing of events. . . But we aim for [a certain timing] only because of its effects. We do not believe [for instance] that the equality of benefits at different times is, as such, morally important.” (1984, 340).

When using numbers to represent value, any difference in effect should be reflected in the numbers. If juice-then-brush is (5, 0), then brush-then-juice should be, say, (0, 4) on account of the juice tasting worse. Temporal Neutrality doesn’t apply here: it doesn’t mandate neutrality about whether brushing happens before or after drinking juice. On our initial gloss of Temporal Neutrality, we said that the order in which fixed amounts of value obtain shouldn’t matter. Holding value fixed requires controlling for any downstream effects of a difference in temporal order; only then does Temporal Neutrality apply.

Formulating Temporal Neutrality precisely in our qualitative framework requires unpacking this idea of holding value fixed. Consider the following prospects, where each x and x* is a pair of equivalent outcomes (e.g. a and a* are equally good):

    G1     ...      E       ...          G2     ...       E        ...
    Ann    ...    (a, b)    ...          Ann    ...    (b*, a*)    ...
    Ben    ...    (c, d)    ...          Ben    ...    (c*, d*)    ...
                  t1  t2                               t1  t2

Prospects that differ in this way are said to differ only in temporal order or to be equivalent modulo temporal order. Temporal Neutrality says that this kind of difference can’t make for a difference in overall value:

Temporal Neutrality. Prospects that are equivalent modulo temporal order are equally good.

Note that the outcomes x and x* just have to be equally good; they needn’t be the same. As we saw, certain outcomes might only be instantiable under certain conditions: only at some times, for some people, in some possibilities, in conjunction with certain other outcomes, and so on. For instance, suppose a represents a happy childhood and b a troubled adolescence. Then, (a, b) is a possible life but (b, a) isn’t. In that case, simply permuting Ann’s entries a and b in G1 above wouldn’t result in a realisable prospect. But Replacement guarantees that there’s a realisable prospect like G2. That’s obtained by permuting Ann’s entries in G1, while possibly replacing each outcome with an equally good one. For instance, there might be some kind of childhood b* that’s exactly as good an outcome as a troubled adolescence and an adolescence a* that’s exactly as good as a happy childhood, such that (b*, a*) is a possible life. According to Temporal Neutrality, changing the temporal order of outcomes within a life, while possibly substituting outcomes with equally good ones, shouldn’t affect how good things are overall.

The clarification about controlling for the effects of temporal changes helps address many objections against Temporal Neutrality. One is that temporal order matters because a positive life trajectory is better than a negative one (Velleman, 1991). Another is that even if a life’s trajectory doesn’t affect how good it is, temporal order matters because how the trajectories of different lives align can matter. For instance, that can affect whether there’s inequality within a given time period (McKerlie, 1989). I think the intuitions favouring certain temporal orders dissipate once we properly control for side-effects to ensure that the difference is purely temporal.
Take Velleman’s case: “Consider two different lives that you might live. One life begins in the depths but takes an upward trend: a childhood of deprivation, a troubled youth, struggles and setbacks in early adulthood, followed finally by success and satisfaction in middle age and a peaceful retirement. Another life begins at the heights but slides downhill: a blissful childhood and youth, precocious triumphs and rewards in early adulthood, followed by a midlife strewn with disasters that lead to misery in old age. . . after the tally of good times and bad times had been rung up, the fact would remain that one life gets progressively better while the other gets progressively worse” (1991, 50).

What’s the basis for the intuitive preference for upwards-trending lives like the one Velleman describes? A natural answer is that upwards-trending lives often contain more of the things many find valuable: the satisfaction of lifelong goals, happiness that’s derived from hard work rather than circumstance, achievements possible only with sufficient maturity (like intellectual or creative achievements), and so on. An overly narrow theory of value, like hedonism, might not accommodate some of these sources of value. And maybe it can’t distinguish between different kinds of happiness: deserved vs. undeserved, instant gratification vs. the deep satisfaction of a lifelong ambition achieved, and so on. But our framework is flexible. If desert or intellectual achievement affects value, then it should be reflected in the outcomes. Once the sources of value that typically accompany a positive life trajectory are properly accounted for, it’s questionable whether an upwards trend has intrinsic value.

Similarly, consider McKerlie’s case for the badness of temporal inequality: “[imagine] a feudal society in which peasants and nobles exchange roles every ten years. The result is that people’s lives as wholes are equally happy.
Nevertheless during a given time period the society contains great inequality. . . If equality between complete lives were all that mattered, an egalitarian could not object to it. But I think that many egalitarians would find it objectionable” (McKerlie, 1989, 479). We can easily explain why it’s bad for some people to be in subordinate positions at some time—even if the inequality is later cancelled out by inequality in the opposite direction. And that’s because inequality can cause resentment and envy, a lack of self-worth, a sense of insecurity from the potential for abuses of power (even if never exercised), and so on. I think the intuitions favouring certain temporal orders are entirely explained by the good things that typically come with those orders. Once we control for the typical associations, the intuitions disappear. For instance, suppose we’re at dinner. I’m the sort of person who starts with the part of the meal I like least, saving what I like best for last. You’re the opposite. Surely, there isn’t a case to be made for you to eat like me instead (on account of an increase in goodness over time being better than a decrease). Neither is there a case in favour of us coordinating how we eat (lest there be a difference in how well we fare at a given time).

I think Patternists will find more promise in resisting one of the next two principles instead. In any case, rejecting Temporal Neutrality isn’t the most natural way of cashing out the separateness of persons objection—at least not without also rejecting one of the other principles.

3.6.2 Statewise Anonymity

For the second principle, consider the following prospects, where each x is exactly as good an outcome as the corresponding x∗:

H1        E           F
Ann     (a, b)      (e, f)
Ben     (c, d)      (g, h)
        t1 t2       t1 t2

H2        E             F
Ann     (a∗, b∗)     (g∗, h∗)
Ben     (c∗, d∗)     (e∗, f∗)
        t1  t2       t1  t2

Call two lives equivalent if they contain the same amount of value in each period.
In qualitative terms, (x, y) and (x∗, y∗) are equivalent if x and x∗ are equally good, as are y and y∗. Whichever possibility obtains, the two prospects above bring about similar social outcomes. Given E, there’ll be an (a, b)-equivalent life and a (c, d)-equivalent life. And given F, there’ll be an (e, f)-equivalent life and a (g, h)-equivalent life. The key difference lies in how the lives in different states are related. In H1 but not H2, the person who would have an (a, b)-equivalent life given E is the one who would have an (e, f)-equivalent life given F. Call prospects that differ as H1 and H2 do statewise anonymous pairs. Such pairs must be equally good according to:

Statewise Anonymity. Statewise anonymous pairs of prospects are equally good.

Heuristically, this allows us to permute who gets which life within each possibility (like the blue and red rows above), while possibly replacing each life with an equivalent one. Previous caveats about controlling for side-effects and holding value fixed apply here as well. Suppose Ann loves nuts but Ben’s allergic. Then, of course it matters who has the life with a nut-rich diet, just as it matters when orange juice is had. A happy nut-laden life (e, f) isn’t possible for Ben in possibilities in which he’s allergic. Nevertheless, Replacement guarantees that there’s always an equivalent life (e∗, f∗) that Ben can have. Perhaps Ben loves fruits as much as Ann loves nuts—then, (e∗, f∗) might be a happy fruit-laden life that’s possible for Ben. The point is that outcomes sometimes have to be substituted for equivalent ones to account for differences in value that different experiences might have for different people. Only once we hold value “fixed” by controlling for such differences is Statewise Anonymity applicable. This caveat should ward off more naive objections against Statewise Anonymity.
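The permutation heuristic behind Statewise Anonymity can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: outcomes are replaced by hypothetical numerical values (so that equally good outcomes share a number), and the function name `statewise_anonymous` is made up for this sketch rather than taken from the text.

```python
from collections import Counter

# A prospect maps each state to {person: life}; a life is a tuple of
# per-period values. Numbers stand in for outcomes on the assumption that
# equally good outcomes (x and x*) have been given the same value.
def statewise_anonymous(p1, p2):
    """True if, in every state, both prospects contain the same lives,
    differing at most in who gets which life."""
    if p1.keys() != p2.keys():
        return False
    return all(
        Counter(p1[state].values()) == Counter(p2[state].values())
        for state in p1
    )

# Hypothetical values standing in for the outcomes a..h of H1 and H2.
a, b, c, d, e, f, g, h = 1, 2, 3, 4, 5, 6, 7, 8
H1 = {"E": {"Ann": (a, b), "Ben": (c, d)},
      "F": {"Ann": (e, f), "Ben": (g, h)}}
H2 = {"E": {"Ann": (a, b), "Ben": (c, d)},
      "F": {"Ann": (g, h), "Ben": (e, f)}}

print(statewise_anonymous(H1, H2))
```

The check compares the collection of lives state by state, so it ignores exactly the information the principle says is irrelevant: how a person's life in one state lines up with that same person's life in another.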
A more serious objection stems from considerations to do with fairness in the distribution of chances. Compare a prospect that randomises the recipient of a benefit with one that doesn’t:

I1      Heads   Tails
Ann       1       0
Ben       0       1

I2      Heads   Tails
Ann       1       1
Ben       0       0

(The temporal dimension doesn’t play a huge role in assessing Statewise Anonymity, so it’ll often be suppressed in this section).[19] While I1 gives Ann and Ben an equal shot at receiving the benefit, I2 doesn’t. That might make I1 preferable on grounds of fairness.[20] There are a few possible ways to render Statewise Anonymity consistent with a concern for fairness. One way is to recognise the importance of fairness but relegate it to some non-axiological domain. Perhaps fairness is an important component of what’s right or what’s just but not of what’s good. (Again, recall that we’re primarily interested in the axiological version of Patternism). Another approach maintains that the value of fairness is actually fully accounted for once we properly interpret the numbers above. Unfairness can have all sorts of bad effects. Holding the numbers in the diagrams above fixed requires adjusting for these negative consequences—resentment, envy, distrust, political instability. But there seems to be a limit to this response. Many typical negative consequences of unfairness arise from perceptions of unfairness or ingrained systemic bias. Absent such features, some will still find an unfair distribution of chances objectionable.

[19] Numbers, like the ones above, can be interpreted as representing aggregate lifetime value or value within some period, assuming that value is perfectly uniform for everyone at all other times.
[20] See, for instance, Diamond (1967).

But depending on our underlying theory of value, even the badness of unperceived, one-off bias could be reflected in the value of each individual outcome.
To explain, theories like hedonism are internalist—outcomes can’t be better or worse without a difference in felt experience.[21] Internalist theories, in turn, belong to a broader class of what we might call ‘amodal’ theories. On such theories, the value of outcomes is determined entirely by what in fact happens. On modal theories, on the other hand, facts about what would or could have been can have a bearing on how good things are. This isn’t so implausible. There might, for instance, be intrinsic value in fulfilling one’s potential or doing the best with what’s been given.[22] Perhaps being some way when that’s the best things could have been is a better outcome than being that same way given that one could have done much better. And that’s so independent of any difference in levels of regret. On modal theories, outcomes might have to be individuated quite finely since it isn’t always enough to specify what in fact happens. This allows for the value of fairness to be captured locally in the outcomes, even absent any actual adverse effects. To illustrate, let:

a = having $10 when one would otherwise have $5;
b = having $5 when one would otherwise have $10;
a− = having $10 when one would have $10 regardless;
b+ = having $5 when one would have $5 regardless.

On a toy modal theory where the value of what happens can depend counterfactually on whether things would have been better or worse, a might be better than a− and b worse than b+. Given this background theory, it’s perfectly consistent with Statewise Anonymity to prefer randomising who gets the tenner and the fiver:

J1      Heads       Tails
Ann     $10 (a)     $5 (b)
Ben     $5 (b)      $10 (a)

J2      Heads       Tails
Ann     $10 (a−)    $10 (a−)
Ben     $5 (b+)     $5 (b+)

Statewise Anonymity applies only when the outcomes are substituted for equivalent ones to compensate for any losses or gains in value incurred by the modal differences. This means adjusting a− up and b+ down. Perhaps the equivalent of having $10 when the alternative is $5 (a) is having $15 for sure (a∗). And the equivalent of having $5 when the alternative is $10 (b) is having $4 for sure (b∗). Statewise Anonymity would then deem the following prospects equally good:

J1      Heads       Tails
Ann     $10 (a)     $5 (b)
Ben     $5 (b)      $10 (a)

J3      Heads       Tails
Ann     $15 (a∗)    $15 (a∗)
Ben     $4 (b∗)     $4 (b∗)

This judgment isn’t so obviously contrary to fairness.

[21] Some call these ‘mentalist’ or ‘experientialist’ theories of welfare.
[22] See, for instance, Pettit (2015) and Masny (2023).

There are many similar moves available to proponents of Statewise Anonymity. For instance, there are a variety of other possible modal theories. Perhaps there’s intrinsic value not just in my experiences being good but in them being safely so (in the sense that they couldn’t easily have been worse). This can help explain why the potential for perceptions of bias, an unstable political system, or a fragile social fabric can be bad even if bad effects never actually come to fruition. In general, more permissive views about what can factor into the value of individual outcomes might help reconcile Statewise Anonymity with concrete judgments about fairness. How good of a defence strategy this is will depend partly on the independent plausibility of the background theory of goodness appealed to.

Turning now from defence to offence, there are two strong positive arguments in favour of Statewise Anonymity. The first is that it follows from two highly plausible principles—one ethical and another decision-theoretic. The first principle is an impartiality principle. Roughly, it says that in judging the value of a social outcome, it doesn’t matter exactly who gets which life, all else equal. More precisely, suppose Ann has an (a, b) life and Ben has a (c, d) life in some social outcome. In another, Ann has a (c, d)-equivalent life and Ben has an (a, b)-equivalent life.
These two social outcomes are anonymous pairs—they contain the same lives (modulo the substitution of equivalent outcomes), differing only in who has which life. Such a difference can’t make for a difference in value according to:

Anonymity. Anonymous pairs of social outcomes are equally good.

The second principle, a decision-theoretic one, says that how good a prospect is depends only on how good its possible outcomes are—not what those outcomes happen to be exactly. Suppose that whatever happens, the social outcomes that would result from two prospects are always equally good. Call prospects like these equivalent modulo the substitution of equivalent social outcomes. Intuitively, such prospects should be equally good:

Substitution of Equivalents. Prospects that are equivalent modulo the substitution of equivalent social outcomes are equally good.

This is a very weak form of the Dominance principle that’s a standard feature of most decision theories. Clearly, Anonymity and Substitution of Equivalents jointly entail Statewise Anonymity, so deniers of the latter will have to reject one of the former.

A second argument for Statewise Anonymity arises from an observation of Mahtani’s (2017; 2021). In cases of uncertainty about identity, the description of a prospect can depend on the choice of designators. For instance, consider:

You have two patients—Ann and Bea—who are both sick. Unfortunately, only one dose of medicine remains. One possible plan is to give the dose to whoever was the first to arrive. According to the receptionist, that’s Ms. Carlson. But you’re not sure who that is. One of Ann and Bea is Ms. Carlson and the other Ms. Daniels but you have no idea which last name is whose.
Since you’re uncertain whether Ann is Carlson and Bea Daniels or the other way around, the plan can be represented in two different ways:

K1         Ann is Carlson,    Ann is Daniels,
           Bea is Daniels     Bea is Carlson
Ann              1                  0
Bea              0                  1

L1         Ann is Carlson,    Ann is Daniels,
           Bea is Daniels     Bea is Carlson
Carlson          1                  1
Daniels          0                  0

Compare this with an alternative plan:

You know Ann to be a major donor to the hospital. Her donations have helped pay for the treatment of many low-income patients. Instead of giving the medicine to the first to arrive, you can give the medicine to the biggest donor. (While you know that Ann has donated more than Bea, you’re completely in the dark about which of Ms. Carlson and Ms. Daniels is the bigger donor, since you’re unsure which of them is Ann).

This plan can again be represented in two ways:

K2         Ann is Carlson,    Ann is Daniels,
           Bea is Daniels     Bea is Carlson
Ann              1                  1
Bea              0                  0

L2         Ann is Carlson,    Ann is Daniels,
           Bea is Daniels     Bea is Carlson
Carlson          1                  0
Daniels          0                  1

A concern for fairness in the distribution of chances will lead to opposing conclusions about which plan is better depending on the choice of designators. On the one hand, the first-come first-served plan seems to be better. It gives Ann and Bea an equal shot at receiving the medicine, since you’re unsure which of them was the first to arrive. Conversely, the biggest-donor plan is certain to benefit Ann at the expense of Bea, since you’re sure Ann is the bigger donor. On the other hand, the biggest-donor plan seems to be better. It gives Ms. Carlson and Ms. Daniels an equal shot at receiving the medicine, since you’re unsure which of them is the bigger donor. Conversely, the first-come first-served plan is certain to benefit Ms. Carlson at the expense of Ms. Daniels, since you’re sure Ms. Carlson was the first to arrive. Plausibly, it can’t be that one plan is better when we think about it in terms of Ann and Bea, and worse when we think about it in terms of Ms. Carlson and Ms. Daniels.
The plan is either better or it isn’t—betterness isn’t relative to the choice of designators.

Reference Invariance. How prospects compare can’t depend on the choice of designators.

That is to say that K1 is better than K2 just in case L1 is better than L2. This means that K1 and K2 must be equally good, as must L1 and L2—exactly as Statewise Anonymity requires. For, it can’t be that somehow a fair distribution of chances makes things better when it comes to Ann and Bea but worse when it comes to Ms. Carlson and Ms. Daniels. The upshot then is that fairness in the distribution of chances can’t make a difference to overall value, at least in cases of uncertainty about identity. Of course, not all uncertainty is uncertainty about identity. But a bedrock assumption of much of decision theory is that it doesn’t matter what the content of the uncertainty is—all that matters are the probabilities. More precisely, call two prospects stochastically equivalent if for each social outcome, the probability of it obtaining is exactly the same in one prospect as in the other. Then, Statewise Anonymity follows from the conjunction of Reference Invariance and:

Stochasticism. Stochastically equivalent prospects are equally good.

Rejecting Statewise Anonymity thus requires rejecting either Reference Invariance or Stochasticism.[23]

3.6.3 Timeslice Stochasticism

For the final principle, consider the following two prospects where each outcome is exactly as good as its asterisked counterpart:

          1/2         1/2
M1         E           F
Ann     (a, b)      (c, d)
Ben     (e, f)      (g, h)
        t1 t2       t1 t2

          1/2           1/2
M2         E             F
Ann     (a∗, d∗)     (c∗, b∗)
Ben     (e∗, h∗)     (g∗, f∗)
        t1  t2       t1  t2

These two prospects are stochastically equivalent at each time. In each time interval t and for any outcomes x and y, the probability that Ann gets an outcome equivalent to x and Ben gets an outcome equivalent to y at t is the same in M1 and M2.
Besides the potential substitution of outcomes with equally good ones, the prospects differ only by a permutation of the blue and red columns. Such differences can’t make a difference according to:

Timeslice Stochasticism. Prospects that are stochastically equivalent at each time are equally good.

Previous caveats about substituting outcomes for equivalent ones and controlling for downstream effects apply here as well. How the distribution of value is correlated across time can differ for prospects that are stochastically equivalent at each time. Such differences in correlation can be grounds for rejecting Timeslice Stochasticism. Consider:

Covax vs. Contravax. Two kids, Ann and Ben, have a cough. One of them has a gene that the other lacks. Unfortunately, their medical files have been mixed up and we don’t know who has the gene. One cough medicine, Covax, is more effective for those with the gene. Another, Contravax, is more effective for those without the gene. But evidence also links the gene to slightly higher dopamine levels in adulthood. Given a choice of only one kind of medication to administer to both Ann and Ben, which should we pick?

Covax        Ann has      Ben has
             the gene     the gene
Ann           (1, 1)       (0, 0)
Ben           (0, 0)       (1, 1)

Contravax    Ann has      Ben has
             the gene     the gene
Ann           (0, 1)       (1, 0)
Ben           (1, 0)       (0, 1)

[23] There are ways to press on Reference Invariance—for instance, by arguing that there’s some privileged choice of designators (see Mahtani (2017, 2021)). I don’t find this path of resistance particularly promising. For further discussion, see Gustafsson & Kowalczyk (forthcoming).

The key feature of the case is that Covax is better now for whoever has the gene. And whoever has the gene will also be better-off in the future. So, Covax is better now for whoever will also be better-off later. This means it’s sure to compound inequalities in lifetime welfare.
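The numbers in the Covax and Contravax tables can be tallied mechanically. The following Python sketch is an illustration only; the data layout and the helper names are assumptions made for this example. It checks that the two prospects have the same distribution of values at each time taken separately, and then computes lifetime totals and expected lifetime welfare.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Each prospect is a list of (probability, {person: (value at t1, value at t2)}).
# First entry: Ann has the gene. Second entry: Ben has the gene.
covax = [(HALF, {"Ann": (1, 1), "Ben": (0, 0)}),
         (HALF, {"Ann": (0, 0), "Ben": (1, 1)})]
contravax = [(HALF, {"Ann": (0, 1), "Ben": (1, 0)}),
             (HALF, {"Ann": (1, 0), "Ben": (0, 1)})]

def timeslice_distribution(prospect, t):
    """Probability of each (Ann's value, Ben's value) pair at time index t."""
    dist = {}
    for prob, lives in prospect:
        key = (lives["Ann"][t], lives["Ben"][t])
        dist[key] = dist.get(key, 0) + prob
    return dist

def lifetime_totals(prospect):
    """(Ann's total, Ben's total) in each state, in the order listed."""
    return [(sum(lives["Ann"]), sum(lives["Ben"])) for _, lives in prospect]

def expected_lifetime(prospect, person):
    """Probability-weighted lifetime total for one person."""
    return sum(prob * sum(lives[person]) for prob, lives in prospect)

same_at_each_time = all(
    timeslice_distribution(covax, t) == timeslice_distribution(contravax, t)
    for t in (0, 1)
)
print(same_at_each_time)
print(lifetime_totals(covax), lifetime_totals(contravax))
print(expected_lifetime(covax, "Ann"), expected_lifetime(contravax, "Ann"))
```

On these figures the two medicines agree period by period and in each person's expected lifetime welfare; they come apart only in how lifetime totals are spread across Ann and Ben within each state.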
No matter who has the gene, with Covax, one person will end up with a total lifetime welfare of 2 and the other 0. Conversely, Contravax is better now for whoever doesn’t have the gene. And whoever doesn’t have the gene will be worse-off in the future. So, Contravax is sure to be better now for whoever will be worse-off later and worse now for whoever will be better-off later. This has the effect of cancelling out inequalities in lifetime welfare. Either way, both Ann and Ben will have an equal total lifetime welfare of 1. So, Lifetime Ex-Post Prioritarians and others who care about equality in eventual lifetime welfare would prefer Contravax. But the choice of Covax or Contravax affects only which of Ann and Ben will benefit more from the medicine. Either way, exactly one person will benefit more and Ann and Ben both have an equal shot at being that person. So, in terms of their direct effects, nothing tells in favour of one medication over another. The only difference is that responding well to Covax is an indicator of higher dopamine levels in the future, by way of a common factor being the presence of the gene. Conversely, a good response to Contravax is correlated with lower future dopamine levels, in virtue of both being correlated with the absence of the gene. Intuitively, this difference in mere correlation between a drug’s effectiveness and future dopamine levels, by way of a common factor being the presence or the absence of a gene, shouldn’t make a moral difference. To test this intuition, we can consider several variations of the case. If Contravax were truly better than Covax, then making Contravax slightly worse or Covax slightly better in some respect shouldn’t change that. But that’s not what intuition suggests. For instance, suppose:

Covax vs. Itchax. There’s a third cold medication—Itchax. Like Contravax, it’s more effective for whoever doesn’t have the gene.
But unlike Contravax, it has the side-effect of causing mild rashes.

Covax      Ann has        Ben has
           the gene       the gene
Ann         (1, 1)         (0, 0)
Ben         (0, 0)         (1, 1)

Itchax     Ann has        Ben has
           the gene       the gene
Ann       (0 − δ, 1)     (1 − δ, 0)
Ben       (1 − δ, 0)     (0 − δ, 1)

For those who prefer Contravax to Covax, there should be some sufficiently small uniform reduction in welfare δ to Contravax, as in Itchax, that they would tolerate while still deeming Covax worse. But intuitively, Itchax is a strictly inferior medicine. Like Covax, it will relieve exactly one person’s symptoms. But unlike Covax, it’ll give both Ann and Ben a rash. Consider a second variation:

Covax vs. Contaminax. Yet another cold medicine, Contaminax, works exactly like Contravax. But a recent contamination at the Contaminax factory means there’s a small chance that the prescribed batch would make Ann and Ben severely ill.

               (1 − δ)/2        (1 − δ)/2        δ/2              δ/2
Contaminax     Ann has gene     Ben has gene     Ann has gene     Ben has gene
               & good batch     & good batch     & bad batch      & bad batch
Ann              (0, 1)           (1, 0)          (−N, 1)          (−N, 0)
Ben              (1, 0)           (0, 1)          (−N, 0)          (−N, 1)

If Covax is worse than Contravax, then it’s worse by some margin. For any arbitrarily catastrophic outcome (−N), there’s a small enough risk (δ) of that outcome that would only make Contaminax worse than Contravax by less than that margin. But should we really prefer a drug with a risk of severe negative effects that, only in the best case scenario, will benefit exactly one of Ann and Ben to one that carries no such risk and is sure to benefit exactly one of Ann and Ben? Here’s a third variation:

Equalax vs. Contravax. Yet another cold medicine, Equalax, partially relieves cold symptoms regardless of the presence or the absence of the gene. It’s a little more than half as effective as a full treatment.
Equalax       Ann has          Ben has
              the gene         the gene
Ann         (0.5 + δ, 1)     (0.5 + δ, 0)
Ben         (0.5 + δ, 0)     (0.5 + δ, 1)

Contravax     Ann has          Ben has
              the gene         the gene
Ann            (0, 1)           (1, 0)
Ben            (1, 0)           (0, 1)

For a small enough δ, those who prefer Contravax to Covax would prefer Contravax to Equalax for the same reason. Equalax benefits both Ann and Ben equally now. But simply because the person with the gene will have higher dopamine levels in the future independently of Equalax’s effects, the world that Equalax brings about is sure to be one with lifetime welfare distributed unequally. In contrast, Contravax’s effectiveness is anti-correlated with higher future dopamine levels. So, Contravax results in an equal distribution of lifetime welfare, effectively by preemptively canceling out future inequalities through an unequal distribution of benefits in the opposite direction now. But in terms of their immediate effects, the difference between Equalax and Contravax is similar to the difference between Priotin and Totalax. Except that not only does Equalax distribute the benefits equally between Ann and Ben, it also brings about greater total benefit. So, the original intuition supporting Prioritarianism in Priotin vs. Totalax would suggest that Equalax is better. A fourth and final variation brings out another inegalitarian aspect of Lifetime Ex-Post Prioritarianism:

Covax∗ vs. Contravax∗. Cat has a cold but we’re not sure if she has the gene. Covax would be better if she does and Contravax if not. Again, the gene is linked to higher dopamine levels in the future. But Cat will be well-off regardless. Elsewhere, Dan is a worker at the factory that produces both drugs. Due to the production of Contravax releasing mild irritants as byproducts, choosing Contravax over Covax for Cat would be the difference between Dan’s life going from barely worth living to not worth living.

Covax∗         Cat has         Cat doesn’t
               the gene        have the gene
Cat           (101, 101)       (100, 100)
Dan           (+δ, +δ)         (+δ, +δ)

Contravax∗     Cat has         Cat doesn’t
               the gene        have the gene
Cat           (100, 101)       (101, 100)
Dan           (−δ, −δ)         (−δ, −δ)

As before, the fact that the best possible state for Cat now and her best possible future state are correlated in Covax∗ and anti-correlated in Contravax∗ means that the latter is better when it comes to Cat. This means that there is some sufficiently small uniform reduction in welfare at each time for Dan (+δ to −δ) where Contravax∗ would still be better overall. These variations illustrate some of the counterintuitive implications that come with rejecting Timeslice Stochasticism.

There’s also a more direct argument in favour of Timeslice Stochasticism. Note that Covax gives Ann a fifty-fifty chance of a total lifetime welfare of 2 or 0 whereas Contravax gives her a total lifetime welfare of 1 for sure. Either way, her expected lifetime welfare is 1. Similarly with Ben. So, if Ann and Ben were maximisers of expected lifetime welfare, they would each be indifferent between Covax and Contravax. So, theories that prefer Contravax to Covax prefer a prospect that neither Ann nor Ben themselves prefer. Such theories violate:

Ex-Ante Pareto Indifference. If everyone is indifferent between two prospects, then the two prospects are equally good.

In fact, we don’t need to assume that individuals maximise expected welfare. Instead, the following assumption about individual preferences suffices:

Individual Timeslice Stochasticism. Everyone is indifferent between prospects that are stochastically equivalent at each time.

This and Ex-Ante Pareto Indifference entail Timeslice Stochasticism. Individual Timeslice Stochasticism is, on its own, compatible with quite radical departures from expected utility theory. Take any decision theory that satisfies Stochasticism (stochastically equivalent prospects are equally good).
Consider an agent whose preferences can be thought of as obtained by applying the decision theory at each time before aggregating the resulting values across time. (For instance, we might take the risk-weighted expected value at each time before adding those values up). Such preferences will satisfy Individual Timeslice Stochasticism.

Individual Timeslice Stochasticism is, by no means, unassailable. It’s violated, for instance, by agents who are risk-averse with respect to total lifetime welfare and who maximise the risk-weighted expectation of their total lifetime welfare. But there’s a plausible case to be made that much work in decision theory betrays an implicit commitment to something like Timeslice Stochasticism. Standard examples in decision theory specify only the local effects of a choice and differences in outcomes within a restricted period of time. For instance, we are told that accepting a bet means winning big on heads and losing on tails. Or, bringing an umbrella means being dry if it rains and being slightly inconvenienced if it doesn’t. We aren’t usually told exactly how good the agent’s life would be from birth to death as a result of accepting the bet or bringing the umbrella. The fact that the outcomes aren’t usually specified across a lifetime suggests that decision theorists implicitly think of decision theory as applying locally within restricted periods of time. In that case, the widespread acceptance of Stochasticism for individual preferences is really widespread acceptance of Individual Timeslice Stochasticism.

3.7 Numerical Representations and Additivity

This section discusses two further interpretations of the separateness of persons objection and their relation to Patternism. Along the way, questions about the consequences of Anti-Patternism, like whether it entails that value is additive, are clarified. Let’s begin by introducing some concepts to do with numerical representations of value.
First, a numerical assignment is a function v that assigns each outcome a numerical value. When each outcome in a prospect N is replaced with its numerical value, the result v(N) is a numerical prospect.

N          1/2         1/2
Ann      (a, b)      (c, d)
Ben      (e, f)      (g, h)
         t1 t2       t1 t2

v(N)          1/2                1/2
Ann      (v(a), v(b))       (v(c), v(d))
Ben      (v(e), v(f))       (v(g), v(h))
          t1     t2          t1     t2

N  →  v(N) (applying v)  →  f(v(N)) (applying f)

An aggregation function f then tells us how to combine the numbers in v(N) to obtain the prospect’s overall value. That is, it’s a function that assigns each numerical prospect like v(N) an overall value f(v(N)).

Next, an axiology tells us how different prospects compare. Formally, it’s a binary relation ≼ on the space of social prospects, where N1 ≼ N2 means that N2 is at least as good as N1. A representation of the axiology is a pair ⟨v, f⟩ of numerical assignment and aggregation function, where N1 ≼ N2 just in case f(v(N1)) ≤ f(v(N2)). In other words, the overall numerical value of the prospects according to the representation correctly tracks which prospects are better. For most representations of interest, the aggregation function f can be thought of as the result of aggregating along each dimension individually. That is, there’s an interpersonal aggregation function fp, a temporal aggregation function ft, and a risk aggregation function fr such that f is the result of applying the three functions in some sequence or another. In that case, f is said to be decomposable. For instance, fp might be an increasing, strictly concave function, ft an additive function, and fr an expectation operator. We saw in §3.4 how different forms of Prioritarianism arise as the result of applying these functions in various orders. We are now ready to consider the first of two versions of the separateness of persons objection to be considered in this section.
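To make the idea of a decomposable aggregation function concrete, here is a minimal Python sketch. It rests on assumptions that go beyond the text: the interpersonal function is taken to be a sum of square roots (one hypothetical example of an increasing, strictly concave form), the two orders of application are chosen only for illustration and are not claimed to match any particular view from §3.4, and the numerical prospects reuse the Covax and Contravax figures.

```python
import math

def f_p(values):
    """Interpersonal aggregation: sum of a concave transform.
    The square root is a hypothetical choice and needs values >= 0."""
    return sum(math.sqrt(v) for v in values)

def f_t(values):
    """Temporal aggregation: plain addition."""
    return sum(values)

def f_r(weighted_values):
    """Risk aggregation: expectation over (probability, value) pairs."""
    return sum(p * v for p, v in weighted_values)

def time_people_risk(prospect):
    """Apply f_t within each life, then f_p within each state, then f_r."""
    return f_r([(p, f_p([f_t(life) for life in lives.values()]))
                for p, lives in prospect])

def risk_time_people(prospect):
    """Apply f_r at each person-time, then f_t per person, then f_p."""
    people = list(prospect[0][1].keys())
    periods = range(len(prospect[0][1][people[0]]))
    return f_p([
        f_t([f_r([(p, lives[person][t]) for p, lives in prospect])
             for t in periods])
        for person in people
    ])

# Numerical prospects: (probability, {person: (value at t1, value at t2)}).
covax = [(0.5, {"Ann": (1, 1), "Ben": (0, 0)}),
         (0.5, {"Ann": (0, 0), "Ben": (1, 1)})]
contravax = [(0.5, {"Ann": (0, 1), "Ben": (1, 0)}),
             (0.5, {"Ann": (1, 0), "Ben": (0, 1)})]

print(time_people_risk(covax), time_people_risk(contravax))
print(risk_time_people(covax), risk_time_people(contravax))
```

With these particular choices the first ordering ranks Contravax above Covax, while the second is indifferent between them, which illustrates how the same three component functions can yield different axiologies depending on the sequence in which they are applied.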
Consider some of the most well-known articulations of the objection:

“[Utilitarianism] ignores the distinction between persons. . . To sacrifice one individual life for another, or one individual’s happiness for another’s is very different from sacrificing one gratification for another within a single life.” (Nagel, 1970, 138).

“[Utilitarianism’s] view of social cooperation is the consequence of extending to society the principle of choice for one man. . . Utilitarianism does not take seriously the distinction between persons.” (Rawls, 1971, 27).

“The difference between the unity of the individual and the separateness of persons requires that there be a shift in the moral weight that we accord to changes in utility when we move from making intrapersonal trade-offs to making interpersonal tradeoffs. . . some ways of balancing benefits and burdens that are appropriate when these accrue to a single individual are inappropriate when benefits accrue only to some and burdens only to others.” (Voorhoeve & Fleurbaey, 2012, 381).

The idea is that there’s some difference between how to aggregate value across people and how to aggregate value within a person (like across time or under risk):

7. Separateness of Persons as Intrapersonal-Interpersonal Asymmetry. How to balance trade-offs across time or under risk is different from how to balance trade-offs across people.[24]

But what exactly an interpersonal-intrapersonal asymmetry amounts to requires further clarification. I’ll consider two possible interpretations—roughly, quantitative and qualitative. First, the quantitative interpretation. A rough gloss of the supposed asymmetry is that how we ought to aggregate numerical value across people is different from how we ought to aggregate numerical value across time or uncertain possibilities. What many proponents of the asymmetry objection have in mind is something like “we should add up value across time but not across people”.
To render this precise, take some representation ⟨v, f⟩ of an axiology, where f is decomposable into fp, ft, fr—one aggregation function for each dimension. On the quantitative interpretation, for there to be an interpersonal-intrapersonal asymmetry is for the interpersonal aggregation function fp to be different from the temporal ft or risk aggregation function fr. For instance, perhaps ft is additive but fp isn’t. But further questions of interpretation remain. First, what exactly is it for the functions to be different? Suppose time is partitioned into fewer or more time intervals than there are people. Then, the interpersonal and temporal aggregation functions fp and ft are functions of different variables—one takes a longer sequence of numbers as input. Strictly speaking then, the functions can’t be identical, even if they are both additive. Or, consider the fact that the symmetry objection of Rawls and others following him is directed towards classical utilitarianism. Classical utilitarians add up welfare across people and take expectations under risk. The objection is that they “extend to society the principle of choice for one man”. But in what sense exactly does an additive function for aggregating across people “extend” an expectational function for aggregating under risk? Of course, both are linear functions. But do classical utilitarians really treat both dimensions on a par simply on account of using linear functions to aggregate across both time and risk? If linearity counts, then why not other properties of functions like concavity or monotonicity? Do the forms of Prioritarianism considered earlier also disrespect the separateness of persons because they use aggregation functions that are all (weakly) concave and all monotonic?

[24] This interpretation of the objection is also taken up in Otsuka (2012); Otsuka & Voorhoeve (2018).
Even setting aside what makes functions different in the relevant sense, a second issue is that representations of an axiology aren’t unique. It’s possible that there are representations ⟨v, f⟩ and ⟨u, g⟩ of the same axiology, where both f and g are decomposable (into $f_p$, $f_t$, $f_r$ and $g_p$, $g_t$, $g_r$, respectively). Suppose $f_p$, $f_t$, $f_r$ are the “same” but $g_p$, $g_t$, $g_r$ aren’t. Does the axiology treat the dimensions on a par or not? In other words, does the mere existence of some representation where how we aggregate interpersonally is the same as how we aggregate intrapersonally suffice for an interpersonal-intrapersonal symmetry?

Or, suppose $f_p$ is the “same” as $g_t$ and $g_r$. That is, how we aggregate interpersonally on one representation is the same as how we aggregate intrapersonally on another representation. Perhaps interpersonal aggregation is additive on one representation and temporal aggregation is also additive on another, even though there isn’t a single representation where they are both additive. Does that suffice for symmetry?

Until these issues are resolved, this quantitative interpretation of the asymmetry is too imprecise for us to properly evaluate how it relates to the issue of Patternism.

A more promising interpretation circumvents the issue of numerical representations altogether. And on this interpretation, Anti-Patternism rules out any interpersonal-intrapersonal asymmetry. The thought is that the gain to one person required to compensate a loss to that same person is different from the gain to one person required to compensate a loss to another person. This can be stated in purely qualitative terms.
To illustrate with the temporal dimension, suppose that all else equal, things being worse for Ann at an earlier time ($a^-$ instead of a better outcome $a$) can be outweighed by things being better for her at a later time ($b^+$ instead of a worse outcome $b$):

$$
O_1:\ \begin{array}{l|cc} & t_1 & t_2 \\ \hline \text{Ann} & a & b \\ \text{Ben} & c & d \end{array}
\quad \preceq \quad
O_2:\ \begin{array}{l|cc} & t_1 & t_2 \\ \hline \text{Ann} & a^- & b^+ \\ \text{Ben} & c & d \end{array}
$$

The thought is that things being worse for Ann at an earlier time can’t be outweighed by making things better for Ben at some time in the same way:

$$
P_1:\ \begin{array}{l|cc} & t_1 & t_2 \\ \hline \text{Ann} & a & c \\ \text{Ben} & b & d \end{array}
\quad \not\preceq \quad
P_2:\ \begin{array}{l|cc} & t_1 & t_2 \\ \hline \text{Ann} & a^- & c \\ \text{Ben} & b^+ & d \end{array}
$$

On a qualitative interpretation, for there to be an asymmetry between the interpersonal and temporal dimensions is for there to be some prospects where the pattern of inequalities above is instantiated. Similarly with the dimensions of people and risk. Clearly, an asymmetry in this sense is inconsistent with Anti-Patternism. (Anti-Patternism requires $O_1$ and $P_1$ to be equally good, and similarly with $O_2$ and $P_2$.)

Let’s move on to the final version of the separateness of persons objection to be considered in this paper. Suppose classical utilitarians were to modify their aggregation procedure for time and risk to no longer be in line with their additive theory of interpersonal aggregation. Would this pacify defenders of the separateness of persons objection? Alternatively, do they find axiologies like 3D-Maximin to disrespect the separateness of persons in the same way that classical utilitarianism does, on account of the aggregation function for each dimension being the “same”? I suspect that for many defenders of the separateness of persons objection, the answers to both questions would be “no”. If so, then rather than an objection against any interpersonal-intrapersonal symmetry, the objection is better interpreted as an outright rejection of the utilitarian’s additive theory of interpersonal aggregation.

8. Separateness of Persons as Interpersonal Non-Additivity
Value isn’t interpersonally additive.

More precisely, this means that the correct axiology does not have a representation ⟨v, f⟩, where f is decomposable and the interpersonal aggregation function $f_p$ is additive.

Sometimes, Anti-Patternism is thought to lead directly to an additive theory of value. The choice is framed as one between either Patternism or additivity. For instance, Brink writes: “If the magnitude of goods and harms is of moral importance as such, but the location of goods and harms across lives is not, we should act so as to maximize net value rather than to achieve any particular distribution” (1993, 253). But the relationship between additivity and Patternism is a little more subtle.

On the one hand, the issues are logically independent. Anti-Patternism is consistent with value not being interpersonally additive. We saw that 3D-Maximin is Anti-Patternist. And 3D-Maximin has no representation with an additive interpersonal aggregation function.²⁵ And conversely, value being interpersonally additive is clearly consistent with Patternism, since that leaves open how value is to be aggregated across time and risk.²⁶

On the other hand, Anti-Patternism does get us a fair bit of the way towards an additive theory of interpersonal aggregation. We saw that on Anti-Patternist theories, trade-offs along each dimension must be balanced in exactly the same way. In that sense, Anti-Patternism mandates a parity among the different dimensions. Heuristically, this parity means that if Anti-Patternism were to be supplemented with additional assumptions that suffice for aggregation to be “additive” along one dimension, then aggregation must be additive along every dimension. Here’s a more precise result to that effect.
Call an axiology additive if it has a representation ⟨v, f⟩, where the aggregation function f is the result of adding up the value of each outcome across people and time, and taking the expected value under risk. It’s well-known that the following decision-theoretic axiom suffices for the ranking of prospects to have a representation in terms of expected values:

Independence. For any prospect Z, prospect X is at least as good as prospect Y just in case the prospect that yields X with probability r and Z with probability 1 − r is at least as good as the prospect that yields Y with probability r and Z with probability 1 − r.

It can be shown that any Anti-Patternist axiology that satisfies Independence is additive.²⁷

²⁵ The reason is that the ordering induced by 3D-Maximin isn’t separable along the interpersonal dimension (nor along the temporal or risk dimension) and separability is necessary for an additive representation.

²⁶ An interesting open question concerns what representation theorem can be proved of Anti-Patternist theories. The results of Aczél & Maksa (1996); Aczel et al. (1997); Aczél (2019); Maksa (1999) seem relevant here, though such a representation theorem is left for future work.

²⁷ We saw in §3.5 how given Anti-Patternism, every social prospect can be reduced to a uniform prospect, in which the social outcome in each possibility is uniform; that is, everyone is equally well-off at every time. Given Independence, representation theorems like that found in Hara et al. (2019) furnish a function u that assigns each uniform social outcome a numerical value and the ranking of the social prospects is correctly tracked by their expectation relative to u.
3.8 Conclusion

The aim of this paper was to explore Patternism: to state it precisely, carefully distinguish it from other possible interpretations of the separateness of persons objection, reduce its truth and falsity down to that of three principles, and make a case against it by furnishing considerations for the three principles that jointly contradict it. The web of entailment relating the various principles introduced throughout this paper is summarised in the diagram below.

Given the expository and exploratory nature of this paper, the case presented against Patternism is unlikely to be conclusive. The goal isn’t to foreclose discussion but to kickstart it. Hopefully, even hardened skeptics of utilitarianism will find value in the issues clarified and potential critiques of utilitarianism sharpened.

[Diagram: web of entailment among the principles. Node labels: Additivity, Anti-Patternism, Independence, Temporal Neutrality, Modal Anonymity, Timeslice Stochasticism, Individual Timeslice Stochasticism, Ex-Ante Pareto Indifference, Stochasticism, Reference Invariance, Anonymity, Substitution of Equivalents.]

Chapter 4
Additive Relations on Cartesian Products

4.1 Introduction

Many problems in philosophy, economics, mathematics, and elsewhere involve comparisons of tuples:

$$
(a_i)_{i \in I} = (\dots, a_i, a_j, \dots) \preceq (b_i)_{i \in I} = (\dots, b_i, b_j, \dots).
$$

Consider, for instance, social welfare theory, where each $i \in I$ corresponds to an individual, $(a_i)_{i \in I}$ is the distribution where each individual $i$ receives goods $a_i$, and ≼ is a ranking of the possible distributions. Or, consider decision theory, where each component $i \in I$ corresponds to a state in a sample space, $(a_i)_{i \in I}$ is the risky prospect on which outcome $a_i$ obtains given state $i$, and ≼ is a ranking of prospects. In fact, as we’ll see, many other domains, ranging from social choice theory to abstract algebra and probability theory, involve similar structures.
In many cases, the objects $a_i$ of interest are, first and foremost, qualitative things, like the goods allocated to an individual or the outcome of a lottery. But often, it would be desirable to be able to assign numerical values to these objects and to think of the ordering as generated by a simple comparison of the sums of the values of the elements of the respective tuples. The primary concern of this paper is: under what conditions is an additive representation like this possible?

To state this more precisely, let ≼ be a binary relation on a Cartesian product $A = \prod_{i \in I} A_i$ of sets. So, $A$ consists of tuples of the form $(a_i)_{i \in I}$ with $a_i \in A_i$ for each $i \in I$. The relation ≼ is additive if for each $i \in I$, there exists a function $\phi_i$ that assigns each element in $A_i$ a “numerical value” such that for any $(a_i)_{i \in I}, (b_i)_{i \in I} \in A$:

$$
(a_i)_{i \in I} \preceq (b_i)_{i \in I} \iff \sum_{i \in I} \phi_i(a_i) \leq \sum_{i \in I} \phi_i(b_i).
$$

An additive representation isn’t always possible: not all relations on Cartesian products are additive. Thus arises the question: what conditions are necessary and sufficient for such a relation to be additive? This has been the subject of much investigation, especially under the banner of “conjoint measurement”.¹ Specific versions of the question as it arises in restricted contexts (like decision theory and comparative probability theory) have also been studied extensively, yielding well-known representation theorems such as von Neumann and Morgenstern’s (1944) axiomatic characterisation of expected utility theory.

The main results of this paper, roughly, characterise the additive relations in terms of the existence of what I’ll call a “symmetric and monotonic aggregator”. The intended contribution of this result to the existing literature is two-fold. First, this novel characterisation in terms of aggregators supplements existing characterisations (in terms of cancellation or separability axioms) by providing a new, productive perspective on additive relations.
Existing representation theorems in various areas can easily be recast in terms of aggregators. The general result linking additivity to the existence of certain aggregators has as simple corollaries many results that otherwise require bespoke proofs. As we’ll see, this not only helps shed light on the conceptual connections unifying these results but also serves as a general recipe for generating new representation theorems.

Second, with some exceptions, most existing work is focused on real-valued representations. This requires various restrictions in scope. For instance, there might be infinitely many possible values that a component might be allowed to take. That is, some $A_i$ might be infinite, in which case the Cartesian product $\prod_{i \in I} A_i$ is said to be infinitary. (The product is finitary otherwise.) Real numbers have certain convergence and limit properties. So, real-valued additive representations of relations on infinitary products generally require extra constraints to ensure that such properties are respected. Such constraints often come in the form of Archimedean axioms or continuity conditions (which require positing extra topological structure on the sets).

Also, real numbers are always comparable (for any two real numbers $x$ and $y$, either $x \leq y$ or $y \leq x$). This means that any relation ≼ that has a real-valued additive representation must, correspondingly, be total: for any two tuples $(a_i)_{i \in I}$ and $(b_i)_{i \in I}$, either $(a_i)_{i \in I} \preceq (b_i)_{i \in I}$ or $(b_i)_{i \in I} \preceq (a_i)_{i \in I}$. This excludes relations on which some tuples might be incomparable.

¹ See, for instance, Debreu (1960), Gorman (1968), Krantz et al. (1971), Wakker (1988, 2013).

So, the focus on real-valued representations requires various restrictions: that the Cartesian product be finitary or otherwise that extra continuity conditions are satisfied, or that the relation be total.
In many applications, these requirements are overly restrictive and lack motivation. Fortunately, real numbers aren’t the only things that we can add and compare in a well-behaved manner. General mathematical structures called preordered Abelian groups share many of the desirable features of the real numbers. Another main contribution of this paper is the extension of real-valued representation theorems to infinitary products and relations that aren’t total.

The paper will proceed as follows. §4.2 first introduces the general idea of the main results by outlining a simple derivation of additivity in the specific setting of social welfare theory. §4.3 then formally introduces key concepts and definitions. §4.4 then introduces the main results of the paper. These results are then applied in §4.5 to various domains: social welfare theory, social choice theory, abstract algebra, decision theory, and probability theory. §4.6 discusses the relationship between the approach of this paper and existing approaches to the problem of additive representations.

4.2 Primer

Let’s begin with a simple derivation of an additive representation in the case of a total relation ≼ on $\mathbb{R}^n$. For concreteness, we can think of $\mathbb{R}^n$ as the space of possible welfare distributions for $n$ individuals. For instance, the vector $(a_1, \dots, a_n) \in \mathbb{R}^n$ represents the social distribution that confers the first individual $a_1$ units of welfare, the second $a_2$ units, and so on. The relation ≼ on $\mathbb{R}^n$ can then be thought of as a ranking that tells us which distributions are at least as good as which others. For instance, classical utilitarianism ranks the distributions by their total welfare. So, $(a_1, \dots, a_n) \preceq (b_1, \dots, b_n)$ just in case $a_1 + \dots + a_n \leq b_1 + \dots + b_n$. This is an example of an additive total relation on $\mathbb{R}^n$.

Below is one possible justification of classical utilitarianism that’s reminiscent of Harsanyi’s (1955) axiomatisation of classical utilitarianism.
Begin with a distribution $a^1 = (a_1, a_2, \dots, a_{n-1}, a_n)$. Now, consider the distribution $a^2 = (a_2, a_3, \dots, a_n, a_1)$ obtained by shifting the entries in $a^1$ one step leftwards, wrapping around the edges. That is, $a^2$ is the distribution in which individual $i$ has however much welfare individual $i + 1$ (for each $i \in \mathbb{Z} \bmod n$) would have in distribution $a^1$. Similarly, $a^3 = (a_3, a_4, \dots, a_1, a_2)$ is the distribution that differs from $a^1$ only in that the entries are shifted two steps leftwards. And so on, until $a^n = (a_n, a_1, \dots, a_{n-2}, a_{n-1})$. Similarly, the distributions $b^1 = (b_1, b_2, \dots, b_{n-1}, b_n)$, $b^2 = (b_2, b_3, \dots, b_n, b_1)$, …, $b^n = (b_n, b_1, \dots, b_{n-2}, b_{n-1})$ are related by a gradual shift of the entries leftwards.

Now, suppose that $a^1 \preceq b^1$. Then, plausibly, also $a^j \preceq b^j$ for each $j = 2, \dots, n$:

$$
\begin{aligned}
a^1 = (a_1, a_2, \dots, a_{n-1}, a_n) &\preceq (b_1, b_2, \dots, b_{n-1}, b_n) = b^1 \\
a^2 = (a_2, a_3, \dots, a_n, a_1) &\preceq (b_2, b_3, \dots, b_n, b_1) = b^2 \\
a^3 = (a_3, a_4, \dots, a_1, a_2) &\preceq (b_3, b_4, \dots, b_1, b_2) = b^3 \\
&\ \ \vdots \\
a^n = (a_n, a_1, \dots, a_{n-2}, a_{n-1}) &\preceq (b_n, b_1, \dots, b_{n-2}, b_{n-1}) = b^n
\end{aligned}
$$

After all, for each level of welfare, we ought only care about how many people are at that level in a distribution and not exactly who those people happen to be. More generally, in many contexts, a reasonable case could be made for the following constraint on ≼:

(1) For any $(x_1, \dots, x_n), (y_1, \dots, y_n) \in \mathbb{R}^n$ and permutation $\pi$ on $\{1, \dots, n\}$, if $(x_1, \dots, x_n) \preceq (y_1, \dots, y_n)$, then $(x_{\pi(1)}, \dots, x_{\pi(n)}) \preceq (y_{\pi(1)}, \dots, y_{\pi(n)})$.

Next, for any sequence of distributions $x^1, \dots, x^m \in \mathbb{R}^n$, let $\mathrm{avg}(x^1, \dots, x^m) \in \mathbb{R}^n$ be the distribution in which the welfare of each individual $i$ is the average of their welfare in distributions $x^1, \dots, x^m$. That is, avg is the operation of component-wise averaging. Plausibly, whenever each $y^j$ is at least as good as the corresponding $x^j$ ($j = 1, \dots, m$), the “average” of the $y^j$’s should also be at least as good as that of the $x^j$’s:

(2) For any $x^j, y^j \in \mathbb{R}^n$ ($j = 1, \dots, m$), if $x^j \preceq y^j$ for each $j$, then $\mathrm{avg}(x^1, \dots, x^m) \preceq \mathrm{avg}(y^1, \dots, y^m)$.

One way to motivate this in the social welfare setting is to consider plans that randomise which distributions get brought about. Suppose a fair $m$-sided die is to be tossed. Consider two plans. On Plan X, if the die lands on $j$, then distribution $x^j$ comes about. On Plan Y, if the die lands on $j$, then distribution $y^j$ comes about. Suppose that $x^j \preceq y^j$ for each $j = 1, \dots, m$. So, no matter how the die lands, the distribution that would result on Plan Y would be at least as good as that which would result on Plan X. Intuitively, then, Plan Y should be at least as good as Plan X. Of course, each individual’s expected welfare given Plans X and Y is their welfare in distributions $\mathrm{avg}(x^1, \dots, x^m)$ and $\mathrm{avg}(y^1, \dots, y^m)$, respectively. So, the fact that Plan Y is at least as good as Plan X provides some support for $\mathrm{avg}(x^1, \dots, x^m) \preceq \mathrm{avg}(y^1, \dots, y^m)$.

So, given that $a^j \preceq b^j$ for each $j = 1, \dots, n$, (2) implies that:

$$
\mathrm{avg}(a^1, \dots, a^n) = (a, \dots, a) \preceq (b, \dots, b) = \mathrm{avg}(b^1, \dots, b^n),
$$

where $a = (a_1 + \dots + a_n)/n$ and $b = (b_1 + \dots + b_n)/n$. This must mean that $a \leq b$. More generally:

(3) For any $(x_1, \dots, x_n), (y_1, \dots, y_n) \in \mathbb{R}^n$, if $(x_1, \dots, x_n) \preceq (y_1, \dots, y_n)$, then $x_i \leq y_i$ for some $i = 1, \dots, n$.

In other words, a distribution $y$ can’t be at least as good as another $x$ unless at least one person’s welfare is at least as great in $y$ as in $x$. Now, given $a \leq b$, multiplying by $n$ on both sides yields $(a_1 + \dots + a_n) \leq (b_1 + \dots + b_n)$.

Thus, from the supposition that $(a_1, \dots, a_n) \preceq (b_1, \dots, b_n)$, it followed from constraints (1), (2), and (3) that $(a_1 + \dots + a_n) \leq (b_1 + \dots + b_n)$.
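The steps of this derivation can be traced numerically. What follows is only a minimal sketch, assuming exact rational arithmetic; the helper names `cyclic_shifts` and `avg_rows` and the two sample distributions are illustrative choices that do not appear in the text.

```python
from fractions import Fraction

def cyclic_shifts(dist):
    """All n leftward cyclic shifts of a distribution (a^1, ..., a^n in the text)."""
    n = len(dist)
    return [tuple(dist[(i + k) % n] for i in range(n)) for k in range(n)]

def avg_rows(rows):
    """Component-wise average of equally long distributions (the text's avg)."""
    m = len(rows)
    return tuple(Fraction(sum(column), m) for column in zip(*rows))

# Hypothetical welfare distributions for n = 3 individuals.
a = (3, 0, 6)
b = (1, 2, 9)

a_shifts = cyclic_shifts(a)  # [(3, 0, 6), (0, 6, 3), (6, 3, 0)]
b_shifts = cyclic_shifts(b)

# Averaging all the shifts yields a uniform distribution whose entries
# all equal sum(dist) / n.
a_uniform = avg_rows(a_shifts)
b_uniform = avg_rows(b_shifts)

# Comparing the uniform distributions entry-wise gives the same verdict
# as comparing total welfare.
same_verdict = (a_uniform[0] <= b_uniform[0]) == (sum(a) <= sum(b))
print(a_uniform, b_uniform, same_verdict)
```

The sketch only illustrates why averaging the shifted distributions collapses each one to its mean; it does not check constraints (1) to (3) for any particular relation.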
Equally, replacing ≼ and ≤ with their strict counterparts ≺ and < in (1), (2), and (3), a similar argument would show that $(a_1, \dots, a_n) \prec (b_1, \dots, b_n)$ implies $(a_1 + \dots + a_n) < (b_1 + \dots + b_n)$. If ≼ is total, then $(b_1, \dots, b_n) \not\preceq (a_1, \dots, a_n)$ is equivalent to $(a_1, \dots, a_n) \prec (b_1, \dots, b_n)$. In that case, it follows that:

$$
(a_1, \dots, a_n) \preceq (b_1, \dots, b_n) \iff (a_1 + \dots + a_n) \leq (b_1 + \dots + b_n).
$$

Thus, constraints (1)-(3) and their strict counterparts suffice for an additive representation of a total relation ≼ on $\mathbb{R}^n$.

A brief moment’s reflection reveals various generalisations of the derivation. For instance, instead of the component-wise average $\mathrm{avg}(a^1, \dots, a^m)$ of vectors, we could have considered their component-wise sum $\mathrm{tot}(a^1, \dots, a^m)$. The derivation would go through with little modification if we were to replace constraint (2) with:

(2’) For any $x^j, y^j \in \mathbb{R}^n$ ($j = 1, \dots, m$), if $x^j \preceq y^j$ for each $j = 1, \dots, m$, then $\mathrm{tot}(x^1, \dots, x^m) \preceq \mathrm{tot}(y^1, \dots, y^m)$,

and the strict counterpart of (2) with the strict counterpart of (2’). This allows us to extend the derivation above beyond $\mathbb{R}^n$ to total relations on any Cartesian power of sets with a sufficiently well-behaved ordering and addition operation, like $\mathbb{Z}^n$, $\mathbb{N}^n$, or $G^n$ (where $G$ is a preordered Abelian group).

More significant hurdles present themselves in attempting to generalise the argument to relations on arbitrary Cartesian products $A = \prod_{i \in I} A_i$. Unlike $\mathbb{R}^n$, $A$ might not be uniform: it could be that $A_i \neq A_j$ for some $i, j \in I$ and so, reordering the elements of a tuple $(a_i)_{i \in I} \in A$ might result in a tuple $(a_{\pi(i)})_{i \in I}$ that doesn’t belong to $A$, contrary to what (1) presupposes. Similarly, the formulation of constraints like (2) and (2’) relied on our ability to average and add up real numbers. And constraint (3) presupposes an ordering on the real numbers.
But the $A_i$’s in an arbitrary Cartesian product $\prod_{i \in I} A_i$ aren’t generally assumed to have an internal algebraic or order structure that would allow us to define operations like avg and tot.

But, in fact, the derivation above contains the kernels of ideas that, once suitably generalised, characterise the additive relations over arbitrary Cartesian products and their subsets. Of particular importance are functions like avg and tot, which are examples of what I’ll call aggregators. Specifically, those are examples of aggregators with two crucial features: they are symmetric and monotonic. The existence of such aggregators suffices for a relation on finite-dimensional Cartesian products to be additive (Theorem 1).

4.3 Overview

This section defines some key notions. First:

Definition 1. A binary relation ≼ on a set $X$ is total if for all $x, y \in X$, either $x \preceq y$ or $y \preceq x$. It is reflexive if $x \preceq x$ for all $x \in X$. It is transitive if $x \preceq y$ and $y \preceq z$ imply that $x \preceq z$, for all $x, y, z \in X$. A preorder is a reflexive and transitive binary relation.

By convention, for a relation ≼ on $X$, we’ll let $x \prec y$ if $x \preceq y$ and not $y \preceq x$. And let $x \sim y$ if $x \preceq y$ and $y \preceq x$. Similarly, for a binary relation ⊑, the relations ⊏ and ≡ are its strict and equivalence counterparts. Next:

Definition 2 (Cartesian Product). For any index set $I$ and sets $A_i$ ($i \in I$), the Cartesian product $A = \prod_{i \in I} A_i$ is the set of all tuples $(a_i)_{i \in I}$ where $a_i \in A_i$ for each $i \in I$. $A$ is finite-dimensional if $I$ is finite and infinite-dimensional if $I$ is infinite. $A$ is finitary if each $A_i$ is finite and infinitary if some $A_i$ is infinite. The product is uniform if $A_i = A_j$ for all $i, j \in I$.

For instance, $\mathbb{R}^n$ is a finite-dimensional, infinitary, uniform Cartesian product. Going forward, to avoid notational clutter, members of $\prod_{i \in I} A_i$ will often be written as $a = (a_i)$, $b = (b_i)$, … instead of $(a_i)_{i \in I}$, $(b_i)_{i \in I}$, and so on.
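Definitions 1 and 2 can be checked mechanically on a small example. The following is a rough sketch under stated assumptions: the product `A`, the two relations `by_first` and `componentwise`, and the helper names are all hypothetical illustrations rather than anything defined in the text.

```python
from itertools import product

def is_reflexive(X, leq):
    return all(leq(x, x) for x in X)

def is_transitive(X, leq):
    return all(leq(x, z)
               for x in X for y in X for z in X
               if leq(x, y) and leq(y, z))

def is_total(X, leq):
    return all(leq(x, y) or leq(y, x) for x in X for y in X)

# A finite-dimensional, finitary, non-uniform product A_1 x A_2
# (hypothetical components: three numbers and two labels).
A = list(product([0, 1, 2], ["lo", "hi"]))
rank = {"lo": 0, "hi": 1}

def by_first(a, b):
    # Compare tuples by their first component only.
    return a[0] <= b[0]

def componentwise(a, b):
    # a is below b only if it is below b in both components.
    return a[0] <= b[0] and rank[a[1]] <= rank[b[1]]

for name, rel in [("by_first", by_first), ("componentwise", componentwise)]:
    print(name, is_reflexive(A, rel), is_transitive(A, rel), is_total(A, rel))
```

Both relations are preorders on this product, but only `by_first` is total: under `componentwise`, the tuples `(0, "hi")` and `(1, "lo")` are incomparable.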
We are interested in the conditions under which a binary relation on a Cartesian product has an additive representation. The desired representation needn’t take values in the real numbers but can be allowed to take values in any algebraic structure with a sufficiently well-behaved addition operation and ordering. Such structures are:

Definition 3 (Preordered Abelian Group). An Abelian group $G = \langle G, + \rangle$ consists of a set $G$ and a binary operation $+ : G \times G \to G$ such that:
1. (Associativity) $(x + y) + z = x + (y + z)$ for all $x, y, z \in G$;
2. (Commutativity) $x + y = y + x$ for all $x, y \in G$;
3. (Identity) there exists an identity element $0 \in G$ such that $x + 0 = x$ for all $x \in G$;
4. (Inverse) for each $x \in G$, there exists $y \in G$ such that $x + y = 0$, where $0$ is the identity element.
A preordered Abelian group is an Abelian group with a preorder ≤ such that:
5. (Order) $x \leq y$ if and only if $x + z \leq y + z$ for all $x, y, z \in G$.

Since the order of addition doesn’t matter in an Abelian group, we can define $\sum_{i \in I} x_i = x_{i_1} + \dots + x_{i_n}$ for any finite set $I = \{i_1, \dots, i_n\}$ without any ambiguity. Then, we can define:

Definition 4 (Additivity). A binary relation ≼ on a finite-dimensional $A = A_1 \times \dots \times A_n$ is additive if there exists a preordered Abelian group $G$ and for each $i \leq n$, a function $\phi_i : A_i \to G$ such that for all $(a_i)_{i \leq n}, (b_i)_{i \leq n} \in A$:

$$
(a_i)_{i \leq n} \preceq (b_i)_{i \leq n} \iff \sum_{i=1}^{n} \phi_i(a_i) \leq \sum_{i=1}^{n} \phi_i(b_i).
$$

If $G$ is the real numbers with the usual ordering and addition operation, then ≼ is said to be real additive.

The additive relations can be characterised in terms of the existence of aggregators (like avg and tot) satisfying certain properties.

Definition 5 (Aggregator). Let ≼ be a binary relation on a Cartesian product $A = \prod_{i \in I} A_i$ and ⊑ a binary relation on a set $B$. Let $A^{<\omega} = \bigcup_{n \in \mathbb{N}} A^n$ be the set of finite sequences of elements in $A$. An aggregator is a function $F : A^{<\omega} \to B$.

(Note that each aggregator is defined relative to particular binary relations on $A$ and $B$. So, for instance, if ≼ and ≼* are different relations on $A$, then there are strictly speaking two aggregators corresponding to the same function $F : A^{<\omega} \to B$, one relative to ≼ and another relative to ≼*. This becomes important later as certain properties of aggregators are defined relative to the relations on its domain and codomain.)

Elements in $A^{<\omega}$ can be thought of as matrices with $|I|$-many columns and arbitrarily finitely many rows, where the entries in the $i$-th column are in $A_i$ and each row is a tuple in the Cartesian product $A = \prod_{i \in I} A_i$. For instance, if $a^1 = (a^1_i)_{i \in I}, \dots, a^m = (a^m_i)_{i \in I} \in A$ for any arbitrarily large finite $m$, then the following $m \times |I|$ matrix is an element of $A^{<\omega}$:

$$
\begin{bmatrix} a^1 \\ a^2 \\ \vdots \\ a^m \end{bmatrix}
=
\begin{bmatrix}
\dots & a^1_i & \dots & a^1_j & \dots \\
\dots & a^2_i & \dots & a^2_j & \dots \\
 & \vdots & & \vdots & \\
\dots & a^m_i & \dots & a^m_j & \dots
\end{bmatrix}
$$

An aggregator thus assigns each such matrix a value in a set $B$.

For instance, consider the Cartesian product $\mathbb{R}^n$. The set $(\mathbb{R}^n)^{<\omega} = (\mathbb{R}^n)^1 \cup (\mathbb{R}^n)^2 \cup (\mathbb{R}^n)^3 \cup \dots$ is the set of matrices with $n$ columns and arbitrarily finitely many rows. An example of an aggregator is the function $\mathrm{avg} : (\mathbb{R}^n)^{<\omega} \to \mathbb{R}^n$ we saw in the previous section, which takes the average of each column. So, for any $(a^1_1, \dots, a^1_n), \dots, (a^m_1, \dots, a^m_n) \in \mathbb{R}^n$:

$$
\mathrm{avg}
\begin{bmatrix}
a^1_1 & \dots & a^1_n \\
a^2_1 & \dots & a^2_n \\
\vdots & & \vdots \\
a^m_1 & \dots & a^m_n
\end{bmatrix}
=
\left( \frac{a^1_1 + a^2_1 + \dots + a^m_1}{m}, \dots, \frac{a^1_n + a^2_n + \dots + a^m_n}{m} \right).
$$

Similarly, the component-wise sum function $\mathrm{tot} : (\mathbb{R}^n)^{<\omega} \to \mathbb{R}^n$ is an aggregator.

Both of these are of the form $F : A^{<\omega} \to A$ but, in general, the codomain of an aggregator is allowed to be any arbitrary set $B$. The aggregators avg and tot have two important properties:

Definition 6 (Symmetric Aggregator). Let ≼ be a binary relation on a Cartesian product $A = \prod_{i \in I} A_i$ and ⊑ a binary relation on a set $B$. An aggregator $F : A^{<\omega} \to B$ is symmetric if for any $[a^1, \dots, a^m] \in A^{<\omega}$ and set $\{\pi_i : \{1, \dots, m\} \to \{1, \dots, m\}\}_{i \in I}$ of permutations:

$$
F
\begin{bmatrix}
\dots & a^1_i & \dots & a^1_j & \dots \\
\dots & a^2_i & \dots & a^2_j & \dots \\
 & \vdots & & \vdots & \\
\dots & a^m_i & \dots & a^m_j & \dots
\end{bmatrix}
\equiv
F
\begin{bmatrix}
\dots & a^{\pi_i(1)}_i & \dots & a^{\pi_j(1)}_j & \dots \\
\dots & a^{\pi_i(2)}_i & \dots & a^{\pi_j(2)}_j & \dots \\
 & \vdots & & \vdots & \\
\dots & a^{\pi_i(m)}_i & \dots & a^{\pi_j(m)}_j & \dots
\end{bmatrix}.
$$

That is, the matrices differ only by a vertical rearrangement of the entries in each column. (The rows don’t have to be permuted uniformly; that is, the permutations $\pi_i$ and $\pi_j$ for columns $i$ and $j$ can be different.) A symmetric aggregator is one that assigns each such matrix an “equally good” value in $B$. To illustrate, since the standard addition of reals is commutative and associative, $\mathrm{avg} : (\mathbb{R}^n)^{<\omega} \to \mathbb{R}^n$ is symmetric. For instance:

$$
\mathrm{avg}\begin{bmatrix} 0 & 2 \\ 4 & 6 \\ 8 & 10 \end{bmatrix}
= (4, 6) =
\mathrm{avg}\begin{bmatrix} 0 & 10 \\ 8 & 6 \\ 4 & 2 \end{bmatrix}
$$

For the same reason, tot is a symmetric aggregator. Another important property that aggregators can have is monotonicity:

Definition 7 (Dominance). Let ≼ be a binary relation on $A = \prod_{i \in I} A_i$. A sequence $[a^1, \dots, a^m] \in A^{<\omega}$ is dominated by another sequence $[b^1, \dots, b^m] \in A^{<\omega}$ of the same length if there exists some proper subset $J \subset \{1, \dots, m\}$ such that:
(i) $a^j \preceq b^j$ for each $j \in J$; and
(ii) $a^j \not\succeq b^j$ for each $j \notin J$.

Definition 8 (Monotonic Aggregator). Let ≼ be a binary relation on $A = \prod_{i \in I} A_i$ and ⊑ a binary relation on a set $B$. An aggregator $F : A^{<\omega} \to B$ is monotonic if whenever $[a^1, \dots, a^m] \in A^{<\omega}$ is dominated by $[b^1, \dots, b^m] \in A^{<\omega}$, then $F[a^1, \dots, a^m] \not\sqsupseteq F[b^1, \dots, b^m]$.

To illustrate the requirement of monotonicity diagrammatically with an example involving sequences of length 4:

$$
\begin{array}{rcl}
a^1 & \preceq & b^1 \\
a^2 & \preceq & b^2 \\
a^3 & \not\succeq & b^3 \\
a^4 & \not\succeq & b^4
\end{array}
\quad \Longrightarrow \quad
F\begin{bmatrix} a^1 \\ a^2 \\ a^3 \\ a^4 \end{bmatrix}
\not\sqsupseteq
F\begin{bmatrix} b^1 \\ b^2 \\ b^3 \\ b^4 \end{bmatrix}.
$$

The system of inequalities on the left captures the fact that the $a$’s are dominated by the $b$’s. Possibly, some of the $b$’s are at least as good as the corresponding $a$’s.
But even for those $b$’s where that’s not the case, it’s still not the case that the $a$’s are at least as good as the corresponding $b$’s. In that case, monotonicity requires that the aggregator not assign the $a$’s a value that’s at least as good as what it assigns the $b$’s.

The monotonicity property is much easier to grasp in the case where both ≼ and ⊑ are total. In that case, one thing not being at least as good as another means that the latter thing is strictly better. Monotonicity then amounts to the condition that whenever every $b$ is at least as good as the corresponding $a$ and some $b$ is strictly better, then the $b$’s must be strictly better according to the aggregator:

$$
\begin{array}{rcl}
a^1 & \preceq & b^1 \\
a^2 & \preceq & b^2 \\
a^3 & \prec & b^3 \\
a^4 & \prec & b^4
\end{array}
\quad \Longrightarrow \quad
F\begin{bmatrix} a^1 \\ a^2 \\ a^3 \\ a^4 \end{bmatrix}
\sqsubset
F\begin{bmatrix} b^1 \\ b^2 \\ b^3 \\ b^4 \end{bmatrix}.
$$

For instance, the requirement that $\mathrm{avg} : (\mathbb{R}^n)^{<\omega} \to \mathbb{R}^n$ be monotonic is a strengthened version of requirement (2) in §4.2.

We saw in §4.2 how avg and tot can be used to derive an additive representation. In fact, as we’ll now see, any symmetric and monotonic aggregator can play the same role. The existence of a symmetric and monotonic aggregator suffices for the additivity of relations on finite-dimensional Cartesian products.

4.4 Results

Let’s say that a binary relation ≼ on a Cartesian product $A$ has an aggregator satisfying certain properties if there exist a set $B$, a binary relation ⊑ on it, and an aggregator $F : A^{<\omega} \to B$ (relative to ≼ and ⊑) with those properties. Our first main result shows that for finite-dimensional products, the existence of a symmetric and monotonic aggregator suffices for the existence of an additive representation:

Theorem 1. If a binary relation on a finite-dimensional Cartesian product has a symmetric and monotonic aggregator, then it is additive.

(Most proofs are relegated to the Appendix.) A few things are worth noting here.
First, recall that for a binary relation ≼ on a finite-dimensional product $A$ to be additive is for it to have an additive representation in some preordered Abelian group. Clearly then, an additive relation must be a preorder, that is to say, reflexive and transitive. But note that neither of these is assumed in the theorem above. Instead, they follow from the existence of a symmetric and monotonic aggregator for ≼. To demonstrate the strength of the existence of such aggregators and to familiarise ourselves with those properties, it will be instructive to briefly illustrate how reflexivity and transitivity can be derived.

Consider reflexivity. Take any tuple $a \in A$ (so, $[a]$ is a sequence of length 1 in $A^{<\omega}$). If $F : A^{<\omega} \to B$ is a symmetric aggregator, then $F[a] \equiv F[a]$ (consider the identity permutations). But if $a \not\preceq a$ and $F$ is monotonic, then $[a]$ is dominated by itself and so $F[a] \not\sqsupseteq F[a]$: contradiction. So, having a symmetric and monotonic aggregator implies that ≼ must be reflexive.

Next, consider transitivity. Suppose $a \preceq b$ and $b \preceq c$ but $a \not\preceq c$. Then $[a, b, c]$ is dominated by $[b, c, a]$, so for any monotonic aggregator $F : A^{<\omega} \to B$, $F[a, b, c] \not\sqsupseteq F[b, c, a]$. But if $F$ is symmetric, then $F[a, b, c] \equiv F[b, c, a]$. So, having a symmetric and monotonic aggregator implies that ≼ must be transitive.

The second thing worth noting is that the theorem applies to binary relations that aren’t total. However, for total relations, the converse of the theorem also holds. The existence of a symmetric and monotonic aggregator isn’t just sufficient for additivity, it’s necessary:

Theorem 2. If a total binary relation on a finite-dimensional Cartesian product is additive, then it has a symmetric and monotonic aggregator.

Unlike the proof of the other direction, the proof here is simple. An additive representation itself provides us with an aggregator with the right properties.

Proof. Let ≼ be a total additive binary relation on a finite-dimensional product $A = A_1 \times \dots \times A_n$.
Then, there exists a preordered Abelian group G and functions $\phi_i : A_i \to G$ for each i = 1, …, n such that for any $(a_1, \dots, a_n), (b_1, \dots, b_n) \in A$:

\[
(a_1, \dots, a_n) \preccurlyeq (b_1, \dots, b_n) \iff \sum_{i=1}^{n} \phi_i(a_i) \leq \sum_{i=1}^{n} \phi_i(b_i).
\]

Define the aggregator $\Phi : A^{<\omega} \to G$, where:

\[
\Phi\begin{bmatrix} a^1_1 & \dots & a^1_n \\ \vdots & \ddots & \vdots \\ a^m_1 & \dots & a^m_n \end{bmatrix} = \sum_{j=1}^{m} \sum_{i=1}^{n} \phi_i(a^j_i).
\]

That is, each $\phi_i$ assigns a value in G to each entry in the i-th column. Φ simply assigns each m × n matrix (for any finite m) the sum of the values of the entries according to the $\phi_i$’s in G. Since addition in G is commutative and associative, Φ is symmetric. To show that it is also monotonic, suppose $[a^1, \dots, a^m] \in A^{<\omega}$ is dominated by $[b^1, \dots, b^m] \in A^{<\omega}$. Without loss of generality, suppose that $a^j \preccurlyeq b^j$ for j = 1, …, k and $a^j \not\succcurlyeq b^j$ for j = k + 1, …, m, where k < m. Since ≼ is total, $a^j \not\succcurlyeq b^j$ means that $a^j \prec b^j$. And so for j = 1, …, k:

\[
\sum_{i=1}^{n} \phi_i(a^j_i) \leq \sum_{i=1}^{n} \phi_i(b^j_i).
\]

And for j = k + 1, …, m:

\[
\sum_{i=1}^{n} \phi_i(a^j_i) < \sum_{i=1}^{n} \phi_i(b^j_i).
\]

It then follows from the order property of G that:

\[
\sum_{j=1}^{m} \sum_{i=1}^{n} \phi_i(a^j_i) < \sum_{j=1}^{m} \sum_{i=1}^{n} \phi_i(b^j_i).
\]

And thus, exactly as is required for Φ to be monotonic:

\[
\Phi\begin{bmatrix} a^1 \\ \vdots \\ a^m \end{bmatrix}
= \Phi\begin{bmatrix} a^1_1 & \dots & a^1_n \\ \vdots & \ddots & \vdots \\ a^m_1 & \dots & a^m_n \end{bmatrix}
< \Phi\begin{bmatrix} b^1_1 & \dots & b^1_n \\ \vdots & \ddots & \vdots \\ b^m_1 & \dots & b^m_n \end{bmatrix}
= \Phi\begin{bmatrix} b^1 \\ \vdots \\ b^m \end{bmatrix}.
\]

This isn’t generally true for non-total relations. A non-total relation that is additive might not have a symmetric and monotonic aggregator. For instance:

Example 5. Let ≼ be the binary relation on $\mathbb{R}^2$, where (a, b) ≼ (c, d) just in case a ≤ c and b ≤ d. Clearly, this has an additive representation. But note that for any monotonic aggregator $F : (\mathbb{R}^2)^{<\omega} \to B$:

\[
\begin{matrix} (1, 1) \not\succcurlyeq (2, 2) \\ (2, 0) \not\succcurlyeq (0, 1) \\ (0, 2) \not\succcurlyeq (1, 0) \end{matrix}
\quad\Rightarrow\quad
F\begin{bmatrix} 1 & 1 \\ 2 & 0 \\ 0 & 2 \end{bmatrix} \not\sqsupseteq F\begin{bmatrix} 2 & 2 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}
\]

which implies that F can’t be symmetric: the two matrices differ only by a permutation of the entries within each column, so a symmetric F would have to assign them equivalent values.
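The arithmetic behind Example 5 can be checked mechanically. The following Python sketch (the helper names `leq` and `columns` are illustrative and not part of the text) confirms that, row by row, no tuple in the first matrix is at least as good as its counterpart in the second, and that the two matrices nonetheless contain the same entries in each column. It is only a sanity check on the example, not a proof about aggregators in general.

```python
def leq(x, y):
    """Componentwise order on R^2: x is below y iff every coordinate of x is <= that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

# Rows of the two matrices in Example 5.
rows_a = [(1, 1), (2, 0), (0, 2)]
rows_b = [(2, 2), (0, 1), (1, 0)]

# "a is not at least as good as b" means that b is not below a.
dominated = all(not leq(b, a) for a, b in zip(rows_a, rows_b))

def columns(rows):
    """Sorted entries of each column, so that column-wise rearrangements compare equal."""
    return [sorted(col) for col in zip(*rows)]

# Each column of the second matrix is a rearrangement of the same column of the first.
same_columns = columns(rows_a) == columns(rows_b)

print(dominated, same_columns)
```

Both flags come out true, which is exactly the combination that a symmetric and monotonic aggregator cannot accommodate.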
Thus, for a complete characterisation of the additive binary relations that aren’t necessarily total, weaker constraints on the aggregator are required. I have some conjectures but proving them is left for future work.

A third thing worth noting is that in general, it’s crucial that the additive representations be allowed to take values in arbitrary preordered Abelian groups rather than just the reals. The restriction to real-valued representations is innocuous in the case of finitary Cartesian products (recall this means each $A_i$ is finite), where we have the following characterisation:

Proposition 1. A total binary relation on a finitary finite-dimensional Cartesian product is real additive if and only if it has a symmetric and monotonic aggregator.

But in the case of relations that aren’t total or products that are infinitary, relations that have additive representations might not have any real-valued additive representations. This is obvious in the case of non-total relations. The usual ordering of the real numbers is total, so any real additive relation must itself be total.

But for finitary products, we can give a straightforward description of the sorts of preordered Abelian groups in which additive non-total relations are additively representable. Let $\mathbb{R}^m$ be the product group containing m copies of the reals, with addition defined componentwise and the product preorder (where $(x_1, \dots, x_m) \leq (y_1, \dots, y_m)$ if and only if $x_j \leq y_j$ for every j ≤ m). Say that a binary relation is real-product additive if it has an additive representation taking values in such a product group of reals. Then, the generalisation of the right-to-left direction of Proposition 1 to relations that aren’t necessarily total is:

Proposition 2. If a binary relation on a finitary finite-dimensional Cartesian product has a symmetric and monotonic aggregator, then it is real-product additive.
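To make the definition of real-product additivity concrete, here is a minimal Python sketch of the product preorder on $\mathbb{R}^2$ together with one additive representation in it. The relation used is the componentwise order from Example 5, restricted to a small grid, and the maps `phi1` and `phi2` are illustrative choices of mine rather than anything given in the text. The sketch only illustrates the definition; it does not bear on Proposition 2 itself, since the relation in Example 5 has no symmetric and monotonic aggregator.

```python
from itertools import product

def product_leq(x, y):
    """Product preorder on R^m: x <= y iff x_j <= y_j in every coordinate."""
    return all(xj <= yj for xj, yj in zip(x, y))

def vec_add(x, y):
    """Componentwise addition in the product group R^m."""
    return tuple(xj + yj for xj, yj in zip(x, y))

# Illustrative (assumed) representation with values in R^2.
def phi1(a):
    return (a, 0)

def phi2(b):
    return (0, b)

def relation(p, q):
    """The componentwise relation of Example 5: (a, b) below (c, d) iff a <= c and b <= d."""
    return p[0] <= q[0] and p[1] <= q[1]

grid = list(product(range(4), repeat=2))

# The relation holds exactly when the summed values are ordered in the product preorder.
represented = all(
    relation(p, q)
    == product_leq(vec_add(phi1(p[0]), phi2(p[1])), vec_add(phi1(q[0]), phi2(q[1])))
    for p in grid
    for q in grid
)

# The product preorder is not total: these two elements are incomparable.
incomparable = not product_leq((1, 0), (0, 1)) and not product_leq((0, 1), (1, 0))

print(represented, incomparable)
```

The incomparable pair is the reason a single copy of the reals cannot do the same job for a non-total relation.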
For infinitary products, even total relations that are additive needn’t be real additive. This is obvious even in the case of a one-dimensional product (where |I| = 1) when the set is “too big”:

Example 6. Let κ be an infinite cardinal and A the set of cardinalities smaller than it. Let ≼ on A be the usual ordering of cardinalities. A real-valued additive representation of ≼ on A would just be an order embedding $\phi : A \to \mathbb{R}$. Clearly, such an embedding doesn’t exist for a sufficiently large κ where there are “more” cardinalities smaller than κ than there are real numbers. For, if $|A| > 2^{\aleph_0}$, then no $\phi : A \to \mathbb{R}$ is an injection, which means that ϕ(τ) = ϕ(λ) for some τ, λ ∈ A even though τ ̸∼ λ.

Even in the case of a finite-dimensional product where each component $A_i$ is no bigger than the reals, additivity doesn’t always imply real-additivity. In fact, this is true even in the case of a two-dimensional product $A_1 \times A_2$ with $A_1$ of the cardinality of the reals and $A_2$ finite:

Example 7. The lexicographic order ≼ on $A = \mathbb{R} \times \{0, 1\}$ is such that for all (a, c), (b, d) ∈ A:

(a, c) ≼ (b, d) ⇐⇒ either a < b, or a = b and c ≤ d.

This lexicographic order isn’t real additive. Suppose for a contradiction that $\phi_1 : \mathbb{R} \to \mathbb{R}$ and $\phi_2 : \{0, 1\} \to \mathbb{R}$ are such that for any (a, c), (b, d) ∈ A, (a, c) ≼ (b, d) if and only if $\phi_1(a) + \phi_2(c) \leq \phi_1(b) + \phi_2(d)$. Let $\varepsilon = \phi_2(1) - \phi_2(0)$, which is positive since (a, 0) ≺ (a, 1). Since (a, c) ≺ (b, d) for any c, d whenever a < b, this means that $|\phi_1(a) - \phi_1(b)| > \varepsilon$ for any a ≠ b in $\mathbb{R}$. Now, consider the real intervals $X_n = [n\varepsilon, (n + 1)\varepsilon)$ for each $n \in \mathbb{Z}$. Since the intervals are of length ε, for each $n \in \mathbb{Z}$ there is at most one $a \in \mathbb{R}$ such that $\phi_1(a) \in X_n$. But that is impossible: $\{X_n : n \in \mathbb{Z}\}$ is a countable partition of $\mathbb{R}$, so $\phi_1$ would take only countably many values, whereas it must take a distinct value at each of the uncountably many $a \in \mathbb{R}$.

While the relations in these examples don’t have real-valued additive representations, it’s not difficult to find symmetric and monotonic aggregators in both cases.
So, it follows from Theorem 1 that they have additive representations in some preordered Abelian group. Again, we can give a concrete description of the sorts of preordered Abelian group that will serve the required purpose.

At a very general level, the two examples above suggest that the inadequacy of the reals is that it lacks infinitely big elements and infinitesimally small ones. There is a standard method of enriching the reals with such elements in a way that still preserves the desirable algebraic and order properties of the reals. These enriched structures, which are preordered Abelian groups, are called hyperreals. An exposition of these structures is left for the Appendix. Let’s say that a binary relation is hyperreal additive if it has an additive representation taking values in a hyperreal space. And say that it’s hyperreal-product additive if it has an additive representation taking values in a direct product of hyperreals with the product preorder. Then, we have the following more concrete representation theorems for Cartesian products that are not necessarily finitary:

Proposition 3. If a total binary relation on a finite-dimensional Cartesian product has a symmetric and monotonic aggregator, then it is hyperreal additive.

Proposition 4. If a binary relation on a finite-dimensional Cartesian product has a symmetric and monotonic aggregator, then it is hyperreal-product additive.

Collecting into a single statement these results about the kinds of preordered Abelian groups in which an additive representation can be found:

Theorem 3. A binary relation ≼ on a finite-dimensional Cartesian product A that has a symmetric and monotonic aggregator is:
• real additive if A is finitary and ≼ is total;
• real-product additive if A is finitary;
• hyperreal additive if ≼ is total;
• hyperreal-product additive, in general.
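As a small sanity check on the finitary clause of Theorem 3, consider the lexicographic order of Example 7 with its first component cut down to a finite set. The Python sketch below assumes the first component is {0, …, N − 1} for an arbitrary illustrative N, and uses the maps `phi1(a) = 2a` and `phi2(c) = c`, which are my own choices rather than anything from the text. On such a finite grid the order does have a real-valued additive representation; the obstruction in Example 7 arises only because the reals cannot fit uncountably many gaps of size greater than ε.

```python
from itertools import product

N = 6  # hypothetical size of the finite first component {0, ..., N-1}
grid = list(product(range(N), (0, 1)))

def lex_leq(p, q):
    """Lexicographic order: (a, c) is below (b, d) iff a < b, or a == b and c <= d."""
    (a, c), (b, d) = p, q
    return a < b or (a == b and c <= d)

# Illustrative (assumed) real-valued component functions.
def phi1(a):
    return 2 * a

def phi2(c):
    return c

# The order holds exactly when the sums of component values are ordered in the reals.
represented = all(
    lex_leq(p, q) == (phi1(p[0]) + phi2(p[1]) <= phi1(q[0]) + phi2(q[1]))
    for p in grid
    for q in grid
)

print(represented)
```

The factor 2 in `phi1` keeps any difference in the first component larger than the largest possible difference in the second, which is what a lexicographic order needs.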
In the case of a uniform Cartesian product $A^I$, the relation ≼ might be permutation equivalent in the sense that for any permutation $\pi : I \to I$, $(a_i)_{i \in I} \sim (a_{\pi(i)})_{i \in I}$ for all $(a_i)_{i \in I} \in A^I$. In that case, if ≼ is additive, the numerical assignments can be chosen to be the same for each component: $\phi_i = \phi_j = \phi$ for all i, j ∈ I, and $(a_i)_{i \in I} \preccurlyeq (b_i)_{i \in I}$ just in case $\sum_{i \in I} \phi(a_i) \leq \sum_{i \in I} \phi(b_i)$.2

2 See Krantz et al. (1971, 303-305).

4.5 Applications

To illustrate the usefulness of the results in the previous section, this section will apply them to problems in social welfare (§4.5.1), social choice (§4.5.2), the theory of ordered algebraic structures (§4.5.3), decision theory (§4.5.4), and probability (§4.5.5).

4.5.1 Social Welfare

Let I = {1, …, n} be a finite set of individuals. For each i ∈ I, let $A_i$ be the set of possible outcomes for individual i. A tuple $a = (a_1, \dots, a_n)$ represents the social outcome in which outcome $a_i$ obtains for individual i. Assuming that the outcome that obtains for one person doesn’t constrain which outcomes can obtain for others, the set of possible social outcomes is the Cartesian product $A = \prod_{i=1}^{n} A_i$.

Sometimes, we’re not sure exactly which social outcome an act or policy will lead to. A social prospect is an assignment of a social outcome to each possible state. For current purposes, we’re interested only in equiprobable social prospects where each social outcome has an equal probability of obtaining. More precisely, for each $m \in \mathbb{N}$, suppose we have a partition of the sample space into m equiprobable events $E_1, \dots, E_m$. Then, for any possible social outcomes $a^1, \dots, a^m \in A$, $(a^1 \oplus \dots \oplus a^m)$ is the prospect with outcome $a^1$ given $E_1$, outcome $a^2$ given $E_2$, and so on. (Note that this does not assume, for instance, that $a^1 \oplus a^2 = a^2 \oplus a^1$.) Equiprobable social prospects can be visualised as m × n matrices (for arbitrarily large finite m), like:

\[
\begin{bmatrix} a^1 \\ a^2 \\ \vdots \\ a^m \end{bmatrix}
=
\begin{array}{c|ccc}
 & 1 & \dots & n \\ \hline
E_1 & a^1_1 & \dots & a^1_n \\
E_2 & a^2_1 & \dots & a^2_n \\
\vdots & \vdots & \ddots & \vdots \\
E_m & a^m_1 & \dots & a^m_n
\end{array}
\]

where $a^j_i$ is the outcome for individual i if the actual state is in $E_j$. Let $A^* = \{(a^1 \oplus \dots \oplus a^m) : a^1, \dots, a^m \in A, m \in \mathbb{N}\}$ be the set of all possible equiprobable social prospects. Let ≼∗ be a binary relation on $A^*$, where X ≼∗ Y means that the social prospect Y is at least as good as the social prospect X. (Note that even though it’ll be convenient to refer to ≼∗ as the “social prospect ordering” going forward, nothing is assumed about ≼∗ so far, not even that it’s reflexive or transitive.)

Among the equiprobable social prospects are the “degenerate” prospects which bring about the same social outcome no matter what happens. Each social outcome can be identified with the degenerate prospect that’s sure to bring it about and so, $A \subset A^*$. Consider the binary relation ≼ that’s the restriction of ≼∗ to A. This relation compares the possible social outcomes. It’s said to be generalised utilitarian if there exists a preordered Abelian group G and functions $\phi_i : A_i \to G$ such that for any social outcomes $(a_i)_{i \in I}, (b_i)_{i \in I} \in A$:

\[
(a_i)_{i \in I} \preccurlyeq (b_i)_{i \in I} \iff \sum_{i \in I} \phi_i(a_i) \leq \sum_{i \in I} \phi_i(b_i).
\]

As an easy application of the results in the previous section, we can show that two weak constraints on the relation ≼∗ on equiprobable social prospects suffice for its restriction ≼ to social outcomes to be generalised utilitarian. The two constraints correspond to monotonicity and symmetry.

Suppose that depending on how things turn out, it’s possible that the result given prospect X isn’t at least as good as that given prospect Y. And in all possibilities where that’s not so, the result given Y is at least as good as that given X. Then, it must be that X isn’t at least as good as Y:

Social Statewise Dominance. For every $a^1, \dots, a^m, b^1, \dots, b^m \in A$, if there’s some proper subset $J \subset \{1, \dots, m\}$ such that:
(i) $a^j \preccurlyeq b^j$ for each j ∈ J; and
(ii) $a^j \not\succcurlyeq b^j$ for each j ∉ J,
then $(a^1 \oplus \dots \oplus a^m) \not\succcurlyeq^* (b^1 \oplus \dots \oplus b^m)$.

The second constraint is:

Minimal Pareto. For any $X \in A^*$ and set $\{\pi_i : \{1, \dots, m\} \to \{1, \dots, m\}\}_{i \in I}$ of permutations:

\[
X = \begin{bmatrix} a^1_1 & \dots & a^1_n \\ \vdots & \ddots & \vdots \\ a^m_1 & \dots & a^m_n \end{bmatrix}
\sim^*
\begin{bmatrix} a^{\pi_1(1)}_1 & \dots & a^{\pi_n(1)}_n \\ \vdots & \ddots & \vdots \\ a^{\pi_1(m)}_1 & \dots & a^{\pi_n(m)}_n \end{bmatrix} = X^{\pi}
\]

The two matrices corresponding to X and $X^{\pi}$ differ only by a permutation of the entries in each column (the permutation for each column can be different). Minimal Pareto says that any two such social prospects must be equally good.

These two conditions suffice to ensure that the relation on social outcomes is generalised utilitarian:

Theorem 4. If ≼∗ on $A^*$ satisfies Minimal Pareto and Social Statewise Dominance, then its restriction ≼ on A is generalised utilitarian.

Proof. Define the aggregator $F : A^{<\omega} \to A^*$ so that $F[a^1, \dots, a^m] = (a^1 \oplus \dots \oplus a^m)$. It’s easy to check that if ≼∗ satisfies Minimal Pareto and Social Statewise Dominance, then F is symmetric and monotonic.

Here’s one possible justification for Minimal Pareto. The i-th column in matrices like the ones above corresponds to an individual prospect for individual i. The set of equiprobable prospects for individual i is $A^*_i = \{(a^1_i \oplus \dots \oplus a^m_i) : a^1_i, \dots, a^m_i \in A_i, m \in \mathbb{N}\}$. Let’s posit for each individual i a binary relation $\preccurlyeq^*_i$ on $A^*_i$ which represents their preferences over the equiprobable prospects for themselves.

Towards justifying Minimal Pareto, we can make two assumptions: one about each individual ordering and another about the relationship between the individual orderings and the social ordering. The first assumption is that each individual is indifferent between prospects that confer the same probabilities to each outcome:

Individual Stochasticism. For any $a^1, \dots, a^m \in A_i$ and permutation $\pi : \{1, \dots, m\} \to \{1, \dots, m\}$:

\[
(a^1 \oplus \dots \oplus a^m) \sim^*_i (a^{\pi(1)} \oplus \dots \oplus a^{\pi(m)}).
\]

For instance, each individual should be indifferent between winning on heads and losing on tails, on the one hand, and winning on tails and losing on heads, on the other, insofar as the coin is fair. This minimal assumption about individual preference is widely taken for granted and is baked into standard decision-theoretic frameworks (like the von Neumann & Morgenstern (1944) framework where the objects of preferences are simply taken to be probability distributions over outcomes).

Next, for each social prospect $X \in A^*$, let $X_i \in A^*_i$ be the individual prospect for i that corresponds to the projection of X to the i-th column. The second assumption says that two social prospects must be equally good if every individual is indifferent between them.

Ex-Ante Pareto Indifference. For any $X, Y \in A^*$, if $X_i \sim^*_i Y_i$ for all i ∈ I, then X ∼∗ Y.

Clearly, Individual Stochasticism and Ex-Ante Pareto Indifference entail Minimal Pareto. And so, we have:

Corollary 1. If ≼∗ on $A^*$ and each $\preccurlyeq^*_i$ on $A^*_i$ satisfy Individual Stochasticism, Ex-Ante Pareto Indifference, and Social Statewise Dominance, then the restriction ≼ of ≼∗ to A is generalised utilitarian.

This result can be seen as a dramatic generalisation of Harsanyi’s (1955) characterisation of utilitarianism (though it is also, in some ways, weaker). Harsanyi derived utilitarianism from three assumptions: (i) Ex-Ante Pareto Indifference; (ii) the assumption that each individual prospect ordering $\preccurlyeq^*_i$ satisfies the axioms of von Neumann & Morgenstern (1944) expected utility theory; and (iii) the assumption that the social prospect ordering ≼∗ also satisfies the vNM-axioms.

Corollary 1 generalises Harsanyi’s result in various ways. First, it focuses only on equiprobable prospects rather than all prospects. Second, assumption (ii) about the structure of individual preference is weakened all the way to Individual Stochasticism.
Individual Stochasticism is embedded into the vNM framework from the outset as a modelling assumption. Assuming it doesn’t assume that each individual’s preference satisfies any of the vNM-axioms, including reflexivity or transitivity. Third, assumption (iii) about the structure of the social ordering of prospects is weakened to Social Statewise Dominance, which is satisfied by some non-expected utility theories.

Of course, Harsanyi’s stronger assumptions would allow us to sharpen the result in various ways. Given the Archimedean or continuity assumptions in the vNM-axioms, the additive representation of the ordering of social outcomes can be made to take values in the real numbers. (As an application of Theorem 1, Theorem 4 and Corollary 1 could be sharpened in the same way under the assumption that the set of outcomes $A_i$ for each individual is finite.) The vNM-axioms would also allow us to strengthen the additive representation of the social outcome ordering ≼ to an expected utility representation of the social prospect ordering ≼∗. The latter representation could well follow from the weakened assumptions of Theorem 4 and Corollary 1 but that remains to be shown.3

3 Pivato & Tchouante (2023), expanding on Mongin & Pivato (2015), prove a related result. They derive an abstract expected utility representation in a Savage framework from axioms similar to our axioms of Social Statewise Dominance and Ex-Ante Pareto Indifference, though their results also require various strengthenings of our axioms and background assumptions. For instance, they assume that the social and individual preferences are total preorders. They assume Ex-Ante Pareto, instead of merely the “indifference” version of the principle. They also make various richness assumptions (their “Ex post social offset”, “Individual countervailing risk”, and “Certainty equivalence” conditions) roughly to the effect that whenever a prospect is better than another, there’s some change that can perfectly offset that difference.

4.5.2 Social Choice

A second application is in social choice theory. Let A be the set of possible preferences that a voter can register as their vote in an election (Pivato (2013) calls this a “signal”). This can be any set: the expressions of preference that are allowed in an election can range from very coarse-grained to very fine-grained. For an election that only allows binary votes, A = {yes, no}. Alternatively, an election might ask voters to rate a proposal or alternative on a scale of 1 to 5, in which case we can let A = {1, …, 5}. Or, voters might be asked to rank n candidates, in which case we can let A = {1st, 2nd, …, nth}. We might even let $A = \mathbb{R}$ (or A = G for some preordered Abelian group G) in social choice situations which require taking into account each voter’s degree of preference (as represented by their vNM-utilities, for instance).

Fixing A, a tuple of the form $(a_1, \dots, a_n) \in A^n$ represents a hypothetical candidate or alternative in an election with n-many voters. It is the candidate or alternative that receives vote $a_i$ from voter i. The set $A^n$ is thus the set of hypothetical candidates or alternatives for an n-sized electorate. A binary relation $\preccurlyeq_n$ on $A^n$ is then a voting rule for elections with n voters, where $(a_1, \dots, a_n) \preccurlyeq_n (b_1, \dots, b_n)$ means that the second candidate is socially preferred at least as much as the first or that the second doesn’t lose to the first.

Voting rules can take many forms. A voting rule is called a scoring rule if there exists a $\phi^n : A \to G$ (for some preordered Abelian group G) such that for any candidates $(a_1, \dots, a_n), (b_1, \dots, b_n) \in A^n$:

\[
(a_1, \dots, a_n) \preccurlyeq_n (b_1, \dots, b_n) \iff \sum_{i=1}^{n} \phi^n(a_i) \leq \sum_{i=1}^{n} \phi^n(b_i).
\]
That is, a scoring rule is a voting rule where each possible expression of preference is given a “score” and the candidates are ranked by their total score. An example of a voting rule that’s a scoring rule is the Borda count.

Applying the results in §4.4, we can characterise the scoring rules with some simple constraints in a variable electorate setting. For each finite $n \in \mathbb{N}$, we consider a voting rule $\preccurlyeq_n$ defined on $A^n$. For any $a = (a_1, \dots, a_m) \in A^m$ and $b = (b_1, \dots, b_n) \in A^n$, we define $a \circ b = (a_1, \dots, a_m, b_1, \dots, b_n) \in A^{m+n}$ to be their concatenation. That is, consider a constituency with m + n many voters. Suppose this constituency can be divided into two disjoint sub-constituencies, one with a population of m and the other n. Were these sub-constituencies to run their own elections, candidate $a \circ b$ would be represented as a in the first and as b in the second.

For the first constraint, consider an electorate with kn-many voters, which can be divided into k-many disjoint sub-electorates with n voters each. Consider two hypothetical candidates, a and b. Suppose that some sub-electorates don’t like a at least as much as b. And the other sub-electorates (if any), where that’s not the case, like b at least as much as a. The first constraint then says that the electorate as a whole can’t like a at least as much as b:

Multiple Expansion. For any $n \in \mathbb{N}$ and $a^i, b^i \in A^n$ (i = 1, …, k), if for some $J \subset \{1, \dots, k\}$:
(i) $a^i \preccurlyeq_n b^i$ for each i ∈ J; and
(ii) $a^i \not\succcurlyeq_n b^i$ for each i ∉ J,
then $(a^1 \circ \dots \circ a^k) \not\succcurlyeq_{kn} (b^1 \circ \dots \circ b^k)$.

The second constraint is:

Anonymity. For any $n \in \mathbb{N}$, $(a_1, \dots, a_n) \in A^n$, and permutation $\pi : \{1, \dots, n\} \to \{1, \dots, n\}$, $(a_1, \dots, a_n) \sim_n (a_{\pi(1)}, \dots, a_{\pi(n)})$.

That is, the voting rules shouldn’t assign greater importance to one voter over another. Candidates that receive the same number of votes of the same kind should be tied.
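The Borda count mentioned above can be sketched in this framework as follows. This is a minimal Python illustration under stated assumptions: the signal set is taken to be the ranks {1, 2, 3, 4}, the score of a rank is `NUM_RANKS - rank`, and the names `borda_score`, `total`, and `at_most` are hypothetical helpers of mine. A hypothetical candidate is represented, as in the text, by the tuple of votes it receives.

```python
from itertools import permutations

NUM_RANKS = 4  # assumed for illustration: voters rank four candidates, so A = {1, 2, 3, 4}

def borda_score(rank):
    """Score assigned to a single vote: first place earns the most points."""
    return NUM_RANKS - rank

def total(votes):
    """Total score of a hypothetical candidate, given as the tuple of ranks it receives."""
    return sum(borda_score(r) for r in votes)

def at_most(a, b):
    """The induced voting rule: a is ranked no higher than b iff its total score is no higher."""
    return total(a) <= total(b)

a = (1, 3, 2)  # ranked 1st, 3rd and 2nd by three voters
b = (2, 2, 4)

# Anonymity: reordering the voters never changes a candidate's total.
anonymous = all(total(p) == total(a) for p in permutations(a))

# Concatenating two sub-electorates adds their totals.
additive_under_concatenation = total(a + b) == total(a) + total(b)

print(total(a), total(b), at_most(b, a), anonymous, additive_under_concatenation)
```

Because totals simply add up under concatenation, a rule of this form cannot reverse a verdict shared by every sub-electorate, which is the intuition behind Multiple Expansion.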
These two constraints suffice to characterise the scoring rules:

Theorem 5. If a set of voting rules $\{\preccurlyeq_n : n \in \mathbb{N}\}$ satisfies Multiple Expansion and Anonymity, then each $\preccurlyeq_n$ is a scoring rule.

Proof. Fix a particular $\preccurlyeq_n$ on $A^n$. Let ⊑ be the binary relation on $\bigcup_{k \in \mathbb{N}} A^{kn}$, where a ⊑ b just in case $a, b \in A^{kn}$ for some $k \in \mathbb{N}$ and $a \preccurlyeq_{kn} b$. Define the concatenation aggregator $F : (A^n)^{<\omega} \to \bigcup_{k \in \mathbb{N}} A^{kn}$ where $F[a^1, \dots, a^k] = (a^1 \circ \dots \circ a^k)$. Clearly, F is monotonic if $\{\preccurlyeq_n : n \in \mathbb{N}\}$ satisfies Multiple Expansion and symmetric if it satisfies Anonymity.

This result can be seen as a generalisation of some of the results in Smith (1973). Our setup is slightly different but three points of generalisation are salient. The first is that Smith assumes that the set of candidates is finite whereas the result above doesn’t (in fact, $A^n$ can be of arbitrary cardinality). The second is that Smith assumes the preferences to be total preorders whereas the result above assumes nothing about the binary relations (though Multiple Expansion will entail that they are preorders). Finally, Multiple Expansion is a restricted version of the separability axiom in Smith, which concerns combinations of sub-electorates of varying sizes rather than just those of equal sizes.4

4 Other related results include Young (1975) and Pivato (2013). Those results are concerned with a procedure that selects a subset of the candidates as the “winners” of the election, whereas our result is concerned with a procedure that yields a collective preference ranking of all the candidates.

4.5.3 Ordered Algebraic Structures

This section uses the results in §4.4 to prove some embedding theorems for ordered algebraic structures in the vein of Hölder’s Embedding Theorem, the Hahn Embedding Theorem, and theorems regarding the embedding of weaker algebraic structures into stronger ones (e.g. preordered commutative monoids into preordered groups).5

5 See Clifford (1954), Conrad (1953), Hausner & Wendel (1952).

Let us first define generalisations of some common algebraic properties. For instance, instead of requiring addition to be commutative, we might impose the weaker requirement that it be commutative relative to some binary relation ≼. That is, adding two elements in a different order might not produce the same result but should produce elements that are, at the very least, equally good according to ≼.

Definition 9. Let X be a set and ⊕ : X × X → X a binary operation. For a binary relation ≼ on X, define:
• (≼-Associativity) ((a ⊕ b) ⊕ c) ∼ (a ⊕ (b ⊕ c)), for all a, b, c ∈ X;
• (≼-Zero) there exists 0 ∈ X such that (a ⊕ 0) ∼ (0 ⊕ a) ∼ a, for all a ∈ X;
• (≼-Commutativity) (a ⊕ b) ∼ (b ⊕ a), for all a, b ∈ X;
• (≼-Cancellation) a ≼ b if and only if (a ⊕ c) ≼ (b ⊕ c) and (c ⊕ a) ≼ (c ⊕ b), for all a, b, c ∈ X.
If X = ⟨X, ⊕⟩ is ≼-associative, then it is a ≼-semigroup. If it is ≼-associative and satisfies ≼-zero, then it is a ≼-monoid.

Note that familiar algebraic structures are subsumed under this definition. Letting ≼ be the identity relation, a =-semigroup is just a semigroup, a =-monoid is just a monoid, and so on. We can also consider multiple binary relations. For instance, where ≼ is a preorder, a ≼-cancellative =-semigroup is just a preordered semigroup, a ≼-cancellative =-monoid is just a preordered monoid, and so on. Then, as an application of our results, we have the following embedding theorem:

Theorem 6. Let ≼ and ⊑ be transitive binary relations on a set X such that ⊑ entails ≼ (i.e. a ≼ b whenever a ⊑ b, for all a, b ∈ X). Let X = ⟨X, ⊕⟩ be a ≼-cancellative ⊑-commutative ⊑-monoid. Then, there exist a preordered Abelian group G and a function ϕ : X → G such that:
(i) ϕ(a ⊕ b) = ϕ(a) + ϕ(b), for all a, b ∈ X;
(ii) a ≼ b if and only if ϕ(a) ≤ ϕ(b).
Furthermore, G can be chosen to be the reals if X is finite and ≼ is total, a product of reals if X is finite, a hyperreal space if ≼ is total, and a product of hyperreals otherwise.

Proof. Let ≼ and ⊑ be transitive binary relations on X such that ⊑ entails ≼. Let X = ⟨X, ⊕⟩ be a ≼-cancellative ⊑-commutative ⊑-monoid. Define the relation ≼∗ on X × X as follows: (a, b) ≼∗ (c, d) if (a ⊕ b) ≼ (c ⊕ d), for all a, b, c, d ∈ X. Then, define the “component-wise addition” aggregator $F : (X \times X)^{<\omega} \to (X \times X)$ (relative to ≼∗), where for any $(a_1, b_1), \dots, (a_n, b_n) \in X \times X$:

\[
F\begin{bmatrix} (a_1, b_1) \\ \vdots \\ (a_n, b_n) \end{bmatrix} = (a_1 \oplus \dots \oplus a_n,\; b_1 \oplus \dots \oplus b_n),
\]

where by convention, brackets associate to the left so that $a_1 \oplus a_2 \oplus \dots \oplus a_n = (((a_1 \oplus a_2) \oplus \dots) \oplus a_n)$. Given ⊑-associativity and ⊑-commutativity:

\[
(a_1 \oplus \dots \oplus a_n) \equiv (a_{\pi_1(1)} \oplus \dots \oplus a_{\pi_1(n)})
\]
\[
(b_1 \oplus \dots \oplus b_n) \equiv (b_{\pi_2(1)} \oplus \dots \oplus b_{\pi_2(n)})
\]

for any permutations $\pi_1, \pi_2$ on {1, …, n}. Since ⊑ entails ≼:

\[
(a_1 \oplus \dots \oplus a_n) \sim (a_{\pi_1(1)} \oplus \dots \oplus a_{\pi_1(n)})
\]
\[
(b_1 \oplus \dots \oplus b_n) \sim (b_{\pi_2(1)} \oplus \dots \oplus b_{\pi_2(n)})
\]

So, given ≼-cancellation:

\[
(a_1 \oplus \dots \oplus a_n,\; b_1 \oplus \dots \oplus b_n) \sim^* (a_{\pi_1(1)} \oplus \dots \oplus a_{\pi_1(n)},\; b_{\pi_2(1)} \oplus \dots \oplus b_{\pi_2(n)}).
\]

So, F is symmetric. Similarly, the monotonicity of F is a simple consequence of ≼-cancellation and the transitivity of ≼. So, by Theorem 3, there exists a ϕ : X → G (where G can be chosen as specified above) such that for any a, b, c, d ∈ X:

\[
(a \oplus c) \preccurlyeq (b \oplus d) \iff \phi(a) + \phi(c) \leq \phi(b) + \phi(d). \qquad (*)
\]

To check that ϕ has the desired properties, first note that we can find a ϕ such that ϕ(0) = 0. Then, since X is a ⊑-monoid, it follows by ⊑-zero that ((a ⊕ b) ⊕ 0) ≡ (a ⊕ b). Since ⊑ entails ≼, ((a ⊕ b) ⊕ 0) ∼ (a ⊕ b). So, by (∗), ϕ(a ⊕ b) + ϕ(0) = ϕ(a) + ϕ(b) and since ϕ(0) = 0:

(i) ϕ(a ⊕ b) = ϕ(a) + ϕ(b).

Next, by ≼-cancellation, a ≼ b if and only if (a ⊕ c) ≼ (b ⊕ c). Furthermore, in any preordered Abelian group, ϕ(a) + ϕ(c) ≤ ϕ(b) + ϕ(c) if and only if ϕ(a) ≤ ϕ(b).
So:

(ii) a ≼ b if and only if ϕ(a) ≤ ϕ(b).

With appropriate choices of binary relations, we obtain as simple corollaries some useful representation theorems.

Corollary 2. Every preordered commutative monoid is an ordered submonoid of a preordered Abelian group. In particular:
• every finite totally preordered commutative monoid is an ordered submonoid of the reals;
• every finite preordered commutative monoid is an ordered submonoid of a product of reals;
• every totally preordered commutative monoid is an ordered submonoid of a hyperreal group;
• every preordered commutative monoid is an ordered submonoid of a product of hyperreals.

Proof. Let ⊑ in Theorem 6 be the identity relation. The requirement that ⊑ entail ≼ amounts to ≼ being reflexive (which it is if ≼ is a preorder). Properties (i) and (ii) of ϕ are exactly the properties of an order embedding.

4.5.4 Decision Theory

Decision theory is, amongst other things, concerned with preferences under uncertainty. The standard framework of von Neumann & Morgenstern (1944) takes the objects of preferences to be lotteries. These are probability distributions over a set of outcomes. An example is the distribution that assigns an equal probability to the outcomes of winning $5 and losing $5. Von Neumann & Morgenstern (1944) showed that preferences over lotteries satisfying certain structural constraints can be thought of as expected-utility maximising.

However, embedded in this framework are some assumptions that restrict its application. For instance, it is ill-suited for modelling preferences where numerical probabilities aren’t readily available. In collapsing uncertain prospects with the same probability distribution (e.g. winning on heads and losing on tails vs. winning on tails and losing on heads), it also rules out from the outset “nonstochastic” preferences that distinguish between such prospects.

The alternative framework of Savage (1954) instead takes the objects of preference to be acts.
Formally, these are functions from a set Ω of possible states (things beyond the agent’s control) to a set O of possible outcomes (things that matter to the agent). For instance, the act of bringing an umbrella assigns to every state in which it rains the outcome of being dry and to every state in which it doesn’t rain the outcome of a heavier bag. Savage showed that from preferences over acts satisfying certain constraints, a probability function and utility function can be recovered. And relative to these functions, the preferred acts are precisely those with greater expected utility.

But Savage’s framework also contains some questionable assumptions. Foremost among them is the assumption that preferences are defined over all possible functions from states to outcomes. Not all such functions correspond to possible acts, intuitively conceived. Consider, for instance, the act that associates the outcome of me enjoying a nice meal tomorrow to states in which I live past tonight and the outcome of me no longer being alive tomorrow to states where I don’t live past tonight. The complementary function assigns the states in which I live past tonight the outcome of me no longer being alive tomorrow and states in which I don’t live past tonight the outcome of me enjoying a nice meal tomorrow. That doesn’t correspond to any possible act, however broadly “possible” is understood.6

6 See Joyce (1999, 65-67).

Savage’s framework might thus be modified so that an outcome that’s possible given one state needn’t always be possible given another. For each possible state x ∈ Ω, we have a set $O_x$ of possible outcomes given x. This allows for considerable flexibility. Any two sets $O_x$ and $O_y$ might be disjoint, the same, or partially overlap. An act in this modified framework is thus a tuple $(o_x)_{x \in \Omega} \in \prod_{x \in \Omega} O_x$, the act that results in outcome $o_x$ given state x.
A preference relation is a binary relation ≼ on the space of possible acts $\mathcal{O} = \prod_{x \in \Omega} O_x$, where a ≼ b means that act b is preferred at least as much as a.7 A representation theorem is possible in this framework if we posit, for each state x ∈ Ω and natural number $m \in \mathbb{N}$, an m-ary operation $\oplus^m_x : (O_x)^m \to O_x$. Each such operation is commutative in the sense that for any $a^1, \dots, a^m \in O_x$ and permutation π on {1, …, m}:

\[
\oplus^m_x(a^1, \dots, a^m) = \oplus^m_x(a^{\pi(1)}, \dots, a^{\pi(m)}).
\]

The outcomes and operations can be interpreted in various ways. An outcome $a^j$ might be some quantity of goods (e.g. amount of money), in which case $\oplus^m_x(a^1, \dots, a^m)$ might be the addition of those quantities. An outcome $a^j$ might be an event, in which case $\oplus^m_x(a^1, \dots, a^m)$ might be the conjunction of those events. An outcome $a^j$ might be a lottery, in which case $\oplus^m_x(a^1, \dots, a^m)$ might be the mixed outcome with an equal chance of each $a^j$. And so on.

7 Embedded in this modified framework are still some assumptions, for instance, that whether an outcome is possible given a state is independent of what outcomes are associated with other states. For future work, it would be interesting to relax this assumption by investigating preference relations on suitable subsets of the full Cartesian product $\mathcal{O} = \prod_{x \in \Omega} O_x$.

For any acts $a^j = (a^j_x)_{x \in \Omega} \in \mathcal{O}$ (j = 1, …, m), we define the componentwise sum $\oplus^m(a^1, \dots, a^m) = (\oplus^m_x(a^1_x, \dots, a^m_x))_{x \in \Omega}$. Consider the following constraint on the agent’s preferences:

Addition Invariance. For any $a^1, \dots, a^m, b^1, \dots, b^m \in \mathcal{O}$, if there exists some proper subset $J \subset \{1, \dots, m\}$ such that:
(i) $a^j \preccurlyeq b^j$ for all j ∈ J; and
(ii) $a^j \not\succcurlyeq b^j$ for all j ∉ J,
then $\oplus^m(a^1, \dots, a^m) \not\succcurlyeq \oplus^m(b^1, \dots, b^m)$.

For instance, let the outcomes be monetary amounts and the operations be addition. Each act is a gamble that yields possibly varying monetary payoffs depending on what happens. Suppose there are various separate occasions in which an agent can choose between two gambles. On some occasions, she doesn’t like gamble $a^j$ at least as much as $b^j$. And on the occasions where that’s not the case, she likes $b^j$ at least as much as $a^j$. Then, according to Addition Invariance, it can’t be that she likes the bundle of the $a^j$’s at least as much as the bundle of the $b^j$’s.

This constraint alone suffices for the preference relation to be additive (note again that not even reflexivity or transitivity needs to be assumed):

Theorem 7. A preference relation ≼ on $\mathcal{O} = \prod_{x \in \Omega} O_x$ (where Ω is finite and each $\oplus^m_x$ is commutative) that satisfies Addition Invariance is additive,

where being additive means that there exist functions $\phi_x : O_x \to G$ (where G is a preordered Abelian group) such that $(a_x)_{x \in \Omega} \preccurlyeq (b_x)_{x \in \Omega}$ just in case $\sum_{x \in \Omega} \phi_x(a_x) \leq \sum_{x \in \Omega} \phi_x(b_x)$.

Proof. Define the aggregator $F : \mathcal{O}^{<\omega} \to \mathcal{O}$ where for any $a^1, \dots, a^m \in \mathcal{O}$, $F[a^1, \dots, a^m] = \oplus^m(a^1, \dots, a^m)$. The symmetry of F follows from the fact that each $\oplus^m_x$ is commutative. F being monotonic follows from Addition Invariance.

It’s worth noting that Addition Invariance is debatable depending on the interpretation of the operations, and especially so in the context of non-total preference relations.

Example 8. The prizes of a raffle are fruit baskets with unspecified numbers of apples and oranges. Let $O = \mathbb{N}^2$ be the set of outcomes, where (a, b) stands for a basket with a apples and b oranges. The set of acts is $\mathcal{O} = O^{\Omega}$. Suppose that all else equal, more apples and more oranges are preferable. But apples and oranges are incomparable: a basket with more apples but fewer oranges, or more oranges but fewer apples, is neither better, worse, nor exactly as good. So, a raffle ticket $a^1$ that guarantees two oranges is incomparable with a ticket $b^1$ that guarantees one apple.
Similarly, a raffle ticket $a^2$ that guarantees two apples is incomparable with a ticket $b^2$ which guarantees one orange.

\[ a^1 = (0, 2)^\Omega \not\succeq (1, 0)^\Omega = b^1 \]
\[ a^2 = (2, 0)^\Omega \not\succeq (0, 1)^\Omega = b^2 \]

But contra Addition Invariance, one might prefer the bundle of the $a$'s, since two apples and two oranges is preferable to just one of each.

\[ \oplus^2(a^1, a^2) = (2, 2)^\Omega \succeq (1, 1)^\Omega = \oplus^2(b^1, b^2). \]

Similar problems for Addition Invariance in the context where the operation is interpreted as a risky mixture are cases of opaque sweetening.[8]

[8] See Hare (2010).

4.5.5 Probability

One event being more or less likely than another is often represented using numerical probabilities. What justifies this use of numbers? Let an event space $\Sigma$ be a Boolean algebra over a finite sample space $\Omega$. A comparative probability relation is a binary relation $\preceq$ on $\Sigma$ (not assumed to be reflexive, transitive, total, and so on), where $A \preceq B$ means that event $B$ is at least as likely as event $A$. This relation is probabilistic if it's representable by a probability function. That is to say, there exists a probability function $P : \Sigma \to \mathbb{R}$ such that for any $A, B \in \Sigma$:

\[ A \preceq B \text{ if and only if } P(A) \le P(B). \]

Under what conditions is a comparative probability relation probabilistic? Naturally, de Finetti (1937) posited conditions that are the qualitative analogue of the axioms of numerical probability:

Order. $\preceq$ is a total preorder;
Non-Triviality. $\emptyset \prec \Omega$;
Non-Negativity. For all $A \in \Sigma$, $\emptyset \preceq A$;
Simple Additivity. For all $A, B, C \in \Sigma$, if $A \cap C = \emptyset$ and $B \cap C = \emptyset$, then $A \succeq B$ implies $A \cup C \succeq B \cup C$ and $A \succ B$ implies $A \cup C \succ B \cup C$.

Unfortunately, Kraft et al. (1959) showed de Finetti's conjecture that these conditions suffice to be false even when $\Sigma$ is finite. Instead, necessary and sufficient conditions for a finite $\Sigma$ were identified by Kraft et al. (1959) and improved upon by Scott (1964). Scott's result replaces the Simple Additivity axiom with the much more complicated:
Finite Cancellation. Let $1_A : \Omega \to \mathbb{R}$ be the indicator function of $A \in \Sigma$. For any $A_1, \dots, A_n, B_1, \dots, B_n \in \Sigma$, if:

\[ \sum_{i=1}^{n} 1_{A_i} = \sum_{i=1}^{n} 1_{B_i}, \]

then $A_i \preceq B_i$ for all $i = 1, \dots, n - 1$ implies that $A_n \succeq B_n$.

The motivation for the axiom is as follows. The sum of the indicator functions being equal, as above, means that each state $x \in \Omega$ is in exactly as many of the $A$'s as the $B$'s; that is, $|\{i : x \in A_i\}| = |\{i : x \in B_i\}|$. In other words, it's "impossible" for more of the $A$'s than the $B$'s to obtain, or vice versa. In that case, it shouldn't be that each $B$ is at least as likely as the corresponding $A$ and, furthermore, that some $B$ is strictly more likely. Scott showed that:

Theorem 8 (Scott (1964)). Let $\Sigma$ be a finite Boolean algebra over a set $\Omega$. A total, reflexive binary relation $\preceq$ on $\Sigma$ is probabilistic if and only if it satisfies Non-Triviality, Non-Negativity, and Finite Cancellation.

This is a simple corollary of our results.

Proof. Define the aggregator $F : \Sigma^{<\omega} \to \mathbb{R}^\Omega$ so that for any $A_1, \dots, A_n \in \Sigma$, $F[A_1, \dots, A_n] = 1_{A_1} + \cdots + 1_{A_n}$. Define the relation $\sqsubseteq$ on $\mathbb{R}^\Omega$ as that induced by $\preceq$. Clearly, $F$ is symmetric and, given Finite Cancellation, it is monotonic. Given Non-Triviality and Non-Negativity, the additive representation can be renormalised to be a probability function.

Scott's characterisation is wanting in some respects. One is the little resemblance that the Finite Cancellation axiom bears to the additivity property of numerical probabilities. Insofar as one of the aims is to characterise probabilistic relations by intuitive properties that lie at the core of how we understand probabilities, it would be desirable to have constraints that are more obviously comparative analogues of the axioms for numerical probabilities.
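The Finite Cancellation axiom stated above can be checked by brute force on a small sample space. The sketch below is only an illustration under stated assumptions: the function names (`powerset`, `balanced`, `satisfies_finite_cancellation`) and the two-point sample space with weights 1/3 and 2/3 are hypothetical choices, and the search is exponential in the sequence length `n`.

```python
from fractions import Fraction
from itertools import combinations, product


def powerset(omega):
    """All subsets of omega, as frozensets."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def balanced(As, Bs, omega):
    """True if each state lies in exactly as many of the A's as the B's,
    i.e. the sums of the indicator functions coincide."""
    return all(sum(x in A for A in As) == sum(x in B for B in Bs)
               for x in omega)


def satisfies_finite_cancellation(leq, events, omega, n):
    """Brute-force check of Finite Cancellation for sequences of length n.

    leq(A, B) encodes 'B is at least as likely as A'.
    """
    for As in product(events, repeat=n):
        for Bs in product(events, repeat=n):
            if not balanced(As, Bs, omega):
                continue
            premise = all(leq(As[i], Bs[i]) for i in range(n - 1))
            # The axiom's conclusion A_n >= B_n is leq(B_n, A_n).
            if premise and not leq(Bs[n - 1], As[n - 1]):
                return False
    return True


# Hypothetical example: a relation induced by a probability function.
omega = {"x", "y"}
weights = {"x": Fraction(1, 3), "y": Fraction(2, 3)}
events = powerset(omega)


def prob(A):
    return sum((weights[s] for s in A), Fraction(0))


def prob_leq(A, B):
    return prob(A) <= prob(B)


holds = satisfies_finite_cancellation(prob_leq, events, omega, 3)
```

Any relation induced by a probability function should pass, since balanced sequences have equal total probability. A relation on which only the empty set is ranked below other events would be expected to fail.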
This lack of resemblance to additivity is related to Scott's own observation that the axiom cannot be formulated purely in the language of the theory of Boolean algebras: "The unpleasant feature of [the Finite Cancellation axiom] is that it is not a strictly Boolean condition: $x_0 + \cdots + x_{n-1}$ [$\sum_{i=1}^{n} 1_{A_i}$ in our notation] means the algebraic sum of characteristic functions and does not stand for the union of the $x_i$ [$A_i$ in our notation]" (1964, 247).

In fact, an axiom equivalent to Finite Cancellation that is more obviously a qualitative analogue of additivity can be found. To introduce it, it would be helpful to first reformulate the finite additivity property of probability functions. First, note the following elementary consequence of finite additivity: $P(A_1) + P(A_2) = P(A_1 \cup A_2) + P(A_1 \cap A_2)$ for any $A_1, A_2 \in \Sigma$. This is easily appreciated by inspecting a Venn diagram:

[Figure: Venn diagrams of $A_1$ and $A_2$ illustrating that the region $A_1$ plus the region $A_2$ equals the region $A_1 \cup A_2$ plus the region $A_1 \cap A_2$.]

The sum of the probabilities of two events double-counts the probability of their intersection. So, it's equal to the probability of their union plus that of their intersection. Less widely recognised but equally elementary is the observation that the sum of the probabilities of three events double-counts the probability that exactly two of the three events obtain and triple-counts the probability that all three obtain. So:

[Figure: Venn diagrams of $A_1$, $A_2$, and $A_3$ illustrating that the three regions summed separately equal the region where all three obtain, plus the region where at least two obtain, plus the region where at least one obtains.]

Thus, on any probability function $P$:

\[ P(A_1) + P(A_2) + P(A_3) = P(A_1 \cap A_2 \cap A_3) + P\big((A_1 \cap A_2) \cup (A_1 \cap A_3) \cup (A_2 \cap A_3)\big) + P(A_1 \cup A_2 \cup A_3). \]

Generalising to the sum of $n$ events, it's clear that the probability that all $n$ of them obtain is counted $n$ times over, the probability that exactly $n - 1$ of them obtain is counted $n - 1$ times over, and so on. To simplify notation, for any $A_1, \dots, A_n \in \Sigma$, let $A^{\binom{n}{k}}$ be the event that at least $k$-many of the $n$ $A$'s obtain.
That is:

\[ A^{\binom{n}{1}} = A_1 \cup \cdots \cup A_n \]
\[ A^{\binom{n}{2}} = (A_1 \cap A_2) \cup (A_1 \cap A_3) \cup \cdots \cup (A_{n-1} \cap A_n) \]
\[ A^{\binom{n}{3}} = (A_1 \cap A_2 \cap A_3) \cup (A_1 \cap A_2 \cap A_4) \cup \cdots \cup (A_{n-2} \cap A_{n-1} \cap A_n) \]
\[ \vdots \]
\[ A^{\binom{n}{n}} = A_1 \cap \cdots \cap A_n \]

More generally:

\[ A^{\binom{n}{i}} = \bigcup_{\pi \in S_n} \bigcap_{j \le i} A_{\pi(j)}, \]

where $S_n$ is the set of permutations on $\{1, \dots, n\}$. Generalising the previous observations in the case of $n = 2$ and $n = 3$, we have the following property of probability functions:

\[ P(A_1) + \cdots + P(A_n) = P\big(A^{\binom{n}{1}}\big) + \cdots + P\big(A^{\binom{n}{n}}\big), \tag{$*$} \]

for any $A_1, \dots, A_n \in \Sigma$. Indeed, property ($*$) is easily seen to be equivalent to Finite Additivity given the assumption that $P(\emptyset) = 0$, since whenever $A_1, \dots, A_n$ are pairwise disjoint, $A^{\binom{n}{k}} = \emptyset$ for all $k \ne 1$.

Now, suppose $P(A_i) \le P(B_i)$ for each $i = 1, \dots, n$ and $P(A_i) < P(B_i)$ for some $i$. Then, given ($*$), it must be that $P\big(A^{\binom{n}{i}}\big) < P\big(B^{\binom{n}{i}}\big)$ for some $i$. So, the qualitative analogue of ($*$) is:

Compositionality. For any $A_1, \dots, A_n, B_1, \dots, B_n \in \Sigma$, if:
(i) $A_i \preceq B_i$ for all $i = 1, \dots, n$; and
(ii) $A_i \prec B_i$ for some $i = 1, \dots, n$,
then $A^{\binom{n}{i}} \prec B^{\binom{n}{i}}$ for some $i$.

As a simple application of our results, we can show:

Theorem 9. Let $\Sigma$ be a finite Boolean algebra over a set $\Omega$. A binary relation $\preceq$ on $\Sigma$ is probabilistic if and only if it satisfies Non-Triviality, Non-Negativity, and Compositionality.

Proof. Every finite Boolean algebra is isomorphic to the powerset algebra over some finite set. So, without loss of generality, let's just consider a binary relation $\preceq$ on $\mathcal{P}(\Omega)$, where $\Omega$ is a finite set. In turn, identifying each set with its indicator function, $\mathcal{P}(\Omega)$ is isomorphic to the Cartesian product $\{0, 1\}^\Omega$. Now, define the aggregator $F : (\mathcal{P}(\Omega))^{<\omega} \to (\mathcal{P}(\Omega))^{<\omega}$, where $\mathcal{P}(\Omega)$ is equipped with the relation $\preceq$ and the codomain $(\mathcal{P}(\Omega))^{<\omega}$ is equipped with the relation $\sqsubseteq$ where $[A_1, \dots, A_n] \sqsubseteq [B_1, \dots, B_m]$ if and only if $m = n$ and $A_i \preceq B_i$ for all $i \le n$.
The aggregator is then defined:

\[ F\begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_n \end{bmatrix} = \begin{bmatrix} A^{\binom{n}{1}} \\ A^{\binom{n}{2}} \\ \vdots \\ A^{\binom{n}{n}} \end{bmatrix} \]

The symmetry of $F$ follows from the commutativity and associativity of the Boolean operations of union and intersection. The monotonicity of $F$ amounts exactly to the Compositionality condition on $\preceq$. Given Non-Triviality and Non-Negativity, the additive representation can be renormalised to be a probability function.

4.6 Connections to Existing Work

This section compares the results in this paper with existing characterisations of additive relations. One existing characterisation is in terms of strong separability.

Definition 10 (Separability). A binary relation $\preceq$ on a Cartesian product $A = \prod_{i \in I} A_i$ is separable on a subset $J \subseteq I$ if for all $(a_i)_{i \in I}, (b_i)_{i \in I}, (c_i)_{i \in I}, (d_i)_{i \in I} \in A$ such that $a_i = c_i$ and $b_i = d_i$ for all $i \in J$ and $a_i = b_i$ and $c_i = d_i$ for all $i \notin J$:

\[ (a_i)_{i \in I} \preceq (b_i)_{i \in I} \iff (c_i)_{i \in I} \preceq (d_i)_{i \in I}. \]

$\preceq$ is weakly separable if it's separable on each singleton $\{i\} \subseteq I$. It is strongly separable if it's separable on each subset $J \subseteq I$.

Put differently, for any binary relation $\preceq$ on $\prod_{i \in I} A_i$, we can define for each subset $J \subseteq I$ a relation $\preceq_J$ on $\prod_{i \in J} A_i$ by holding fixed the components outside of $J$. However, in general, the definition could depend on how exactly those components are fixed. The relations that are separable on $J$ are exactly those where that's not the case and the restriction $\preceq_J$ can be defined without ambiguity.

For finite-dimensional Cartesian products, separability on every subset of the components suffices for additivity:

Theorem 10. A binary relation on a finite-dimensional Cartesian product that is strongly separable is additive.

Where the relation is total and the product finitary, this follows from the results of Debreu (1960). That can then be generalised to non-total relations and infinitary products by the proof methods used in the Appendix.
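Definition 10 can be tested mechanically on small finite products. The following is a rough brute-force sketch; the function names and the two example relations (one comparing sums, one comparing products of components) are hypothetical illustrations rather than anything drawn from the text.

```python
from itertools import combinations, product


def is_separable_on(leq, factors, J):
    """Brute-force check of separability on the index set J (Definition 10)."""
    n = len(factors)
    tuples = list(product(*factors))
    outside = [i for i in range(n) if i not in J]
    for a, b, c, d in product(tuples, repeat=4):
        inside_ok = all(a[i] == c[i] and b[i] == d[i] for i in J)
        outside_ok = all(a[i] == b[i] and c[i] == d[i] for i in outside)
        if inside_ok and outside_ok and leq(a, b) != leq(c, d):
            return False
    return True


def is_strongly_separable(leq, factors):
    """Separable on every subset J of the index set."""
    n = len(factors)
    return all(is_separable_on(leq, factors, set(J))
               for r in range(n + 1)
               for J in combinations(range(n), r))


factors = [[0, 1], [0, 1]]


# Hypothetical additive relation: compare the sums of the components.
def sum_leq(a, b):
    return sum(a) <= sum(b)


# Hypothetical non-additive relation: compare the products of the components.
def product_leq(a, b):
    return a[0] * a[1] <= b[0] * b[1]


additive_is_separable = is_strongly_separable(sum_leq, factors)
product_is_separable = is_separable_on(product_leq, factors, {0})
```

The sum-based relation passes on every subset, as one would expect of an additive relation. The product-based relation fails already on the singleton `{0}`, because whether raising the first component matters depends on the value at which the second component is held fixed.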
Given their sufficiency for additivity, one might wonder about the relationship between strong separability and the existence of a symmetric and monotonic aggregator. For total binary relations, we saw in Theorem 2 that the latter isn't just sufficient for additivity, it's also necessary. The same is true of strong separability. So, for total relations on finite-dimensional products, the two conditions are equivalent:

Corollary 3. A total binary relation on a finite-dimensional Cartesian product is strongly separable if and only if it has a symmetric and monotonic aggregator.

In the general setting, it can be shown that strong separability is at most as strong a requirement as the existence of a symmetric and monotonic aggregator (this holds even for infinite-dimensional products):

Proposition 5. If a binary relation on a Cartesian product has a symmetric and monotonic aggregator, then it is strongly separable.

Proof. Suppose a binary relation $\preceq$ on a Cartesian product isn't strongly separable. Let $(a_i), (b_i), (c_i), (d_i) \in A$ be such that for some $J \subset I$, $a_i = b_i$ and $c_i = d_i$ for all $i \in J$ and $a_i = c_i$ and $b_i = d_i$ for all $i \notin J$. And suppose that $(a_i) \preceq (b_i)$ but $(c_i) \not\preceq (d_i)$. So, for any monotonic aggregator $F : A^{<\omega} \to B$, it must be that $F[(a_i), (d_i)] \not\sqsupseteq F[(b_i), (c_i)]$. But, for each $i \in I$, consider the permutation $\pi_i : \{a_i, d_i\} \to \{a_i, d_i\}$:

\[ \pi_i(a_i) = \begin{cases} a_i & \text{if } i \in J; \\ d_i & \text{if } i \notin J. \end{cases} \qquad \pi_i(d_i) = \begin{cases} d_i & \text{if } i \in J; \\ a_i & \text{if } i \notin J. \end{cases} \]

Since $a_i = b_i$ for all $i \in J$ and $d_i = b_i$ for all $i \notin J$, this means that $(\pi_i(a_i))_{i \in I} = (b_i)_{i \in I}$. Similarly, since $d_i = c_i$ for all $i \in J$ and $a_i = c_i$ for all $i \notin J$, this means that $(\pi_i(d_i))_{i \in I} = (c_i)_{i \in I}$. So, for any symmetric aggregator $F$, it must be that $F[(a_i), (d_i)] \equiv F[(b_i), (c_i)]$, which contradicts $F[(a_i), (d_i)] \not\sqsupseteq F[(b_i), (c_i)]$. So, if $\preceq$ isn't strongly separable, then it can't have a symmetric and monotonic aggregator.
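The column-wise permutation used in the proof of Proposition 5 can be illustrated on a small example. In the sketch below, the tuples `a`, `b`, `c`, `d`, the index set `J`, and the aggregator `columnwise_multisets` are all hypothetical; the aggregator is just one simple function that ignores the order of the entries within each column, chosen to illustrate symmetry.

```python
from collections import Counter

J = {0}  # hypothetical index set; the full index set is {0, 1, 2}

# Tuples arranged as in the proof: a = b and c = d on J; a = c and b = d off J.
a = ("p", "q", "r")
b = ("p", "s", "t")
c = ("u", "q", "r")
d = ("u", "s", "t")


def swap_off_J(first, second, J):
    """Apply the permutations pi_i: keep column i if i is in J, else swap."""
    n = len(first)
    new_first = tuple(first[i] if i in J else second[i] for i in range(n))
    new_second = tuple(second[i] if i in J else first[i] for i in range(n))
    return new_first, new_second


def columnwise_multisets(rows):
    """A hypothetical symmetric aggregator: the multiset of entries per column."""
    n = len(rows[0])
    return tuple(frozenset(Counter(row[i] for row in rows).items())
                 for i in range(n))


permuted = swap_off_J(a, d, J)
same_aggregate = columnwise_multisets([a, d]) == columnwise_multisets([b, c])
```

Permuting the columns outside `J` turns the pair of rows `[a, d]` into `[b, c]`, so an aggregator that is insensitive to the order within each column cannot tell the two pairs apart. This is the step that clashes with monotonicity in the proof.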
Whether the converse of Proposition 5 holds for non-total relations or products that are not finite-dimensional remains to be shown.

Despite their logical equivalence under certain conditions, the two approaches differ conceptually. To illustrate with the case of social welfare, let $I$ be a finite set of individuals, $A_i$ the set of possible outcomes for individual $i$, $A = \prod_{i \in I} A_i$ the set of possible distributions, and $\preceq$ a comparative relation on the possible distributions. Under what conditions is this relation additive?

On the separability approach, we would consider whether it's possible to decompose the components along the interpersonal dimension in a modular way. For the relation to be strongly separable is for the comparison of distributions to depend only on affected subpopulations. For unaffected subpopulations, exactly how they fare shouldn't make a difference. When that's the case, we can unambiguously make local comparisons about whether some distribution is better or worse for a given subpopulation.

The aggregator approach instead first prompts us to consider a second dimension along which the outcomes might be distributed. For instance, besides the interpersonal dimension, there might be an additional temporal dimension. The outcomes can then be arranged on a two-dimensional grid, like:

\[ \begin{array}{c|ccc} & p_1 & \cdots & p_n \\ \hline t_1 & a^1_1 & \cdots & a^1_n \\ t_2 & a^2_1 & \cdots & a^2_n \\ \vdots & \vdots & \ddots & \vdots \\ t_m & a^m_1 & \cdots & a^m_n \end{array} \]

Heuristically, what an aggregator does is condense this temporal dimension, collapsing the rows to construct a composite that can often be thought of as a "summary" or "consolidation" of what happens across time. The relation is additive (along the interpersonal dimension) if this composite or summary has certain characteristics. Roughly, the aggregator being monotonic amounts to the composite being unanimity preserving, in the sense that one composite can't be better than another unless some part of it is.
The symmetry of the aggregator amounts to the composite being minimally "modular", in the sense that the construction of the composite can't be overly sensitive to the order in which the elements in each column are arranged. So, while the separability approach is decompositional, the aggregator approach is compositional.

The aggregator approach is also inherently multi-dimensional. It characterises the relations that are additive along one dimension (the columns) in terms of the possibility of composing the relata along a second dimension (the rows) in a particular way. The applications in the previous section all leveraged one dimension to shed light on another.

                      Row Dimension                              Column Dimension
Social Welfare        risk/time                                  people
Social Choice         electorate size                            voters
Abstract Algebra      an algebraic operation                     an algebraic operation
Decision Theory       addition/conjunction/mixture of outcomes   possible states
Probability Theory    addition of indicator functions/           sample space
                      a Boolean operation

Of course, this description of the difference in ethos behind the two approaches is somewhat impressionistic. Given their close connection, it's no surprise that the divisions can be blurred with a slight change in perspective. Nevertheless, hopefully, the usefulness of the aggregator approach as a supplementary perspective on the problem of additive representations is clear.

4.7 Further Research

The results and discussion of this paper raise some further questions that I hope to explore in the future:

1. What weakenings of the symmetry and monotonicity conditions are required to fully characterise the additive relations on finite-dimensional products when the relation isn't necessarily total?
2. What are ways of extending the notion of additivity to relations on infinite-dimensional products? What are the necessary and sufficient conditions for a relation to be additive in those ways?
3.
In some applications, the relations aren't defined on a Cartesian product but some proper subset of it. Under what conditions can our results be generalised to relations on subsets of Cartesian products?
4. What properties of aggregators correspond to the existence of other kinds of representations, e.g. subadditive or superadditive representations?
5. Instead of requiring aggregators to be defined on matrices with arbitrarily finitely many rows, is there some large enough $N \in \mathbb{N}$ such that an aggregator of the form $F : A^{<N} \to B$ would suffice? Under what conditions? Might $N$ vary with certain parameters, like the size of the Cartesian product?

4.8 Appendix

The goal of this Appendix is to prove:

Theorem 3. A binary relation $\preceq$ on a finite-dimensional Cartesian product $A$ that has a symmetric and monotonic aggregator is:
(i) real additive if $A$ is finitary and $\preceq$ is total;
(ii) real-product additive if $A$ is finitary;
(iii) hyperreal additive if $\preceq$ is total;
(iv) hyperreal-product additive, in general.

We'll first prove (ii), from which (i) is easily derived as a special case. Then, (iii) and (iv) are proved by induction, using (i) and (ii), respectively, as the base case.

Let $A = A_1 \times \cdots \times A_n$ be a finitary finite-dimensional Cartesian product. Consider the finite-dimensional real vector space $V = \mathbb{R}^{|A_1|} \times \cdots \times \mathbb{R}^{|A_n|}$ with the elements of the disjoint union of $A_1, \dots, A_n$ as its basis vectors (so, the dimension of the vector space is $|A_1| + \cdots + |A_n|$). Now, each tuple $a = (a_1, \dots, a_n) \in A$ can be identified with a vector in $V$: specifically, the vector $1_a$ with 1's in the $a_1, \dots, a_n$ places and 0 elsewhere. For any subset $X \subseteq V$ of vectors, let $C(X)$ be the smallest convex cone containing $X$. And for any binary relation $\preceq$ on $A$, define the subsets of $V$:

\[ [\preceq] = \{1_b - 1_a : a \preceq b\}; \qquad [\prec] = \{1_b - 1_a : a \prec b\}. \]

The following lemma will be useful.

Lemma 1.
Let $\preceq$ be a binary relation on a finitary finite-dimensional Cartesian product $A = A_1 \times \cdots \times A_n$. If there exists a symmetric and monotonic aggregator $F : A^{<\omega} \to B$, then for any $a, b \in A$ such that $a \not\preceq b$:

\[ X = C([\prec] \cup \{1_a - 1_b\}) \quad \text{and} \quad Y = -C([\preceq]) \]

are disjoint.

Proof. Let $\preceq$ be a binary relation on a finitary finite-dimensional Cartesian product $A = A_1 \times \cdots \times A_n$. We'll prove the contrapositive of the lemma. Suppose that $a \not\preceq b$ but the sets $X$ and $Y$, as defined above, are not disjoint. (We then want to show that $\preceq$ can't have a symmetric and monotonic aggregator.) Let $a^0 = a$ and $b^0 = b$. An arbitrary vector in $X$ has the form:

\[ \sum_{i=0}^{j} \lambda_i (1_{a^i} - 1_{b^i}), \]

where $\lambda_0 \ge 0$ and $\lambda_i > 0$ and $a^i \prec b^i$ for all $i = 1, \dots, j$. And with a judicious choice of labeling in anticipation of what's to come, an arbitrary vector in $Y$ has the form:

\[ \sum_{i=j+1}^{k} \lambda_i (1_{b^i} - 1_{a^i}), \]

where $\lambda_i > 0$ and $a^i \preceq b^i$ for all $i = j + 1, \dots, k$. The fact that $X$ and $Y$ aren't disjoint means that there exist $a^i, b^i \in A$ ($i = 1, \dots, j, \dots, k$) such that:

\[ \sum_{i=0}^{j} \lambda_i (1_{a^i} - 1_{b^i}) = \sum_{i=j+1}^{k} \lambda_i (1_{b^i} - 1_{a^i}), \]

where the vector on the left is an element of $X$ and the one on the right is an element of $Y$. Rearranging:

\[ \sum_{i=0}^{k} \lambda_i 1_{a^i} = \sum_{i=0}^{k} \lambda_i 1_{b^i}. \]

By a theorem of Gauss, there's a rational solution, so by multiplying through by a common denominator, we can turn the scalar $\lambda_0$ into a natural number and the remaining scalars $\lambda_i$ into positive integers:

\[ \sum_{i=0}^{k} p_i 1_{a^i} = \sum_{i=0}^{k} p_i 1_{b^i}, \tag{$*$} \]

where $p_0 \in \mathbb{N}$ and $p_i \in \mathbb{Z}^+$ for $i = 1, \dots, k$. Now, consider the sequence $x^1, \dots, x^{p_0}, x^{p_0+1}, \dots, x^{p_0+\cdots+p_k}$ containing $p_0$ copies of $a^0$, $p_1$ copies of $a^1$, and so on. Similarly for $y$ and $b$.

\[ \begin{array}{rcccl} x^1 & = a^0 & \not\succeq & b^0 = & y^1 \\ & & \vdots & & \\ x^{p_0} & = a^0 & \not\succeq & b^0 = & y^{p_0} \\ x^{p_0+1} & = a^1 & \preceq & b^1 = & y^{p_0+1} \\ & & \vdots & & \\ x^{p_0+p_1} & = a^1 & \preceq & b^1 = & y^{p_0+p_1} \\ x^{p_0+p_1+1} & = a^2 & \preceq & b^2 = & y^{p_0+p_1+1} \\ & & \vdots & & \\ x^{p_0+\cdots+p_k} & = a^k & \preceq & b^k = & y^{p_0+\cdots+p_k} \end{array} \]

Suppose for a contradiction that there exists an aggregator $F : A^{<\omega} \to B$ that's symmetric and monotonic. Since $F$ is monotonic:

\[ F[x^1, \dots, x^{p_0+\cdots+p_k}] \not\sqsupseteq F[y^1, \dots, y^{p_0+\cdots+p_k}]. \]

However, ($*$) implies that any element of $A_1, \dots, A_n$ appears exactly as many times in the sequence $x^1, \dots, x^{p_0+\cdots+p_k}$ as in the sequence $y^1, \dots, y^{p_0+\cdots+p_k}$. So, if $F : A^{<\omega} \to B$ is a symmetric aggregator, then:

\[ F[x^1, \dots, x^{p_0+\cdots+p_k}] \sqsupseteq F[y^1, \dots, y^{p_0+\cdots+p_k}]. \]

So, $\preceq$ does not have an aggregator that's both symmetric and monotonic.

Proof of (ii) of Theorem 3. Let $\preceq$ be a binary relation on a finitary finite-dimensional Cartesian product $A = A_1 \times \cdots \times A_n$, for which there exists a symmetric and monotonic aggregator. Let $Z = \{\langle a, b \rangle \in A \times A : a \not\preceq b\}$. By the previous lemma, for each $\langle a, b \rangle \in Z$, $X = C([\prec] \cup \{1_a - 1_b\})$ and $Y = -C([\preceq])$ are disjoint. So, by a hyperplane separation theorem, there exists a linear functional $\phi^{\langle a,b \rangle} : (\mathbb{R}^{|A_1|} \times \cdots \times \mathbb{R}^{|A_n|}) \to \mathbb{R}$ such that:

\[ \phi^{\langle a,b \rangle}(Y) > 0 \ge \sup \phi^{\langle a,b \rangle}(X). \]

Let $\mathbb{R}^Z$ be the product-reals with $|Z|$ many copies of the reals and the dominance ordering. Then, for each $i = 1, \dots, n$, define $\phi_i : A_i \to \mathbb{R}^Z$ such that for any $a_i \in A_i$:

\[ \phi_i(a_i) = \big(\phi^{\langle x,y \rangle}_i(1_{a_i})\big)_{\langle x,y \rangle \in Z}. \]

Note that it follows from this definition that for any $a = (a_1, \dots, a_n) \in A$:

\[ \sum_{i=1}^{n} \phi_i(a_i) = \big(\phi^{\langle x,y \rangle}(1_a)\big)_{\langle x,y \rangle \in Z}. \]

We want to show that for any $a = (a_1, \dots, a_n), b = (b_1, \dots, b_n) \in A$:

\[ a \preceq b \iff \sum_{i=1}^{n} \phi_i(a_i) \le \sum_{i=1}^{n} \phi_i(b_i). \]

For the left-to-right direction, suppose $a \preceq b$. Then, $1_a - 1_b \in -C([\preceq])$. And so, for all $\langle x, y \rangle \in Z$, $\phi^{\langle x,y \rangle}(1_a - 1_b) \le 0$ and $\phi^{\langle x,y \rangle}(1_a) \le \phi^{\langle x,y \rangle}(1_b)$. Therefore, by the definition of the dominance ordering on $\mathbb{R}^Z$:

\[ \sum_{i=1}^{n} \phi_i(a_i) = \big(\phi^{\langle x,y \rangle}(1_a)\big)_{\langle x,y \rangle \in Z} \le \big(\phi^{\langle x,y \rangle}(1_b)\big)_{\langle x,y \rangle \in Z} = \sum_{i=1}^{n} \phi_i(b_i). \]

Conversely, suppose $a \not\preceq b$. Then, $\phi^{\langle a,b \rangle}(C([\prec] \cup \{1_a - 1_b\})) > 0$ and therefore $\phi^{\langle a,b \rangle}(1_a - 1_b) > 0$.
This means that $\phi^{\langle a,b \rangle}(1_a) \not\le \phi^{\langle a,b \rangle}(1_b)$. Thus:

\[ \sum_{i=1}^{n} \phi_i(a_i) = \big(\phi^{\langle x,y \rangle}(1_a)\big)_{\langle x,y \rangle \in Z} \not\le \big(\phi^{\langle x,y \rangle}(1_b)\big)_{\langle x,y \rangle \in Z} = \sum_{i=1}^{n} \phi_i(b_i). \]

Part (i) of the theorem is an easy corollary of (ii). The proof of parts (iii) and (iv) requires introducing the ultraproduct construction of preordered Abelian groups from a sequence of preordered Abelian groups. This first requires the notion of an ultrafilter.

Definition 11 (Ultrafilter). An ultrafilter $U$ on a set $I$ is a collection of subsets of $I$ such that:
1. (Non-Triviality) $\emptyset \notin U$;
2. (Upwards Closure) if $J \in U$ and $J \subseteq K$, then $K \in U$, for all $J, K \subseteq I$;
3. (Intersections) if $J, K \in U$, then $J \cap K \in U$, for all $J, K \subseteq I$;
4. (Complements) either $J \in U$ or $(I \setminus J) \in U$, for any $J \subseteq I$.

The elements of an ultrafilter can, very roughly, be taken to be the "large" subsets of $I$, where the large/small distinction is conceived of in analogy with the true/false distinction. The four conditions above mirror the fact that: (i) a contradiction is false, (ii) a truth entails a truth, (iii) the conjunction of two truths is a truth, and (iv) either a proposition or its negation is true.

Given sequences indexed by $I$, if we quotient by an ultrafilter $U$ on $I$, treating as identical those sequences that agree on a large subset of $I$, we can construct a new mathematical structure from sequences of existing mathematical structures that inherits certain nice properties shared by the existing structures:

Definition 12 (Ultraproduct). Let $\mathcal{G}_i = \langle G_i, +_i, \le_i \rangle$ be a collection of preordered Abelian groups indexed by $I$. Let $U$ be an ultrafilter on $I$. For each $(x_i)_{i \in I} \in \prod_{i \in I} G_i$, we define the equivalence class:

\[ [(x_i)_{i \in I}]_U = \{(y_i)_{i \in I} : \{i \in I : x_i = y_i\} \in U\}. \]

The ultraproduct $\mathcal{G}_U = \langle G, +, \le \rangle$ is defined as follows:
• $G = \{[(x_i)_{i \in I}]_U : x_i \in G_i \text{ for each } i \in I\}$;
• $[(x_i)_{i \in I}]_U + [(y_i)_{i \in I}]_U = [(x_i +_i y_i)_{i \in I}]_U$;
• $[(x_i)_{i \in I}]_U \le [(y_i)_{i \in I}]_U$ if and only if $\{i \in I : x_i \le_i y_i\} \in U$.
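The four ultrafilter conditions of Definition 11 can be verified directly on a small finite index set. The sketch below is an illustration only: the helper names are hypothetical, and it covers just principal ultrafilters (the only kind that exist on a finite set), whereas the proof of (iii) and (iv) relies on uniform ultrafilters on an infinite cardinal, which cannot be enumerated this way.

```python
from itertools import combinations


def powerset(I):
    """All subsets of I, as frozensets."""
    items = sorted(I)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_ultrafilter(U, I):
    """Check the four conditions of Definition 11 on a finite set I."""
    I = frozenset(I)
    U = set(U)
    subsets = powerset(I)
    non_trivial = frozenset() not in U
    upwards_closed = all(K in U for J in U for K in subsets if J <= K)
    closed_under_meets = all((J & K) in U for J in U for K in U)
    complements = all(J in U or (I - J) in U for J in subsets)
    return non_trivial and upwards_closed and closed_under_meets and complements


def principal_ultrafilter(I, point):
    """The collection of all subsets of I that contain the given point."""
    return {J for J in powerset(I) if point in J}


index_set = {0, 1, 2}
U = principal_ultrafilter(index_set, 1)
ok = is_ultrafilter(U, index_set)
```

Here the "large" sets are exactly those containing the chosen point. By contrast, the collection containing only the whole index set is a filter that violates the Complements condition, so the check rejects it.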
It follows from a fundamental theorem about ultraproduct construction that $\mathcal{G}_U$ is also a preordered Abelian group.[9]

[9] See Chang & Keisler (1973, 217).

Proof of (iii) and (iv) of Theorem 3. We prove this by induction on the cardinality of the Cartesian product. Call $A = \prod_{i=1}^{n} A_i$ $\kappa$-ary if the cardinality of each $A_i$ is at most $\kappa$, i.e. $|A_i| \le \kappa$ for each $i \le n$. Parts (i) and (ii) of the theorem provide the base case for finitary products. For the inductive step, suppose that (iii) and (iv) hold for $\lambda$-ary products for all $\lambda < \kappa$. We want to show that they then hold for any $\kappa$-ary products.

Let $\preceq$ be a binary relation on a $\kappa$-ary finite-dimensional product $A = \prod_{i=1}^{n} A_i$, for which there exists a symmetric and monotonic aggregator. For each $A_i$, there is a sequence of length $\kappa$ of increasing subsets of strictly smaller cardinality approaching $A_i$:

\[ A^1_i \subseteq A^2_i \subseteq \cdots \subseteq \bigcup_{\gamma < \kappa} A^\gamma_i = A_i, \]

where each $|A^\gamma_i| < \kappa$. Each subproduct $A^\gamma = \prod_{i=1}^{n} A^\gamma_i$ is then $\lambda$-ary for some $\lambda < \kappa$. Define $\preceq^\gamma$ to be the restriction of $\preceq$ to $A^\gamma$. If $\preceq$ has a symmetric and monotonic aggregator $F : A^{<\omega} \to B$, then each of its restrictions $F^\gamma : (A^\gamma)^{<\omega} \to B$ is a symmetric and monotonic aggregator for $\preceq^\gamma$. So, by the inductive hypothesis, there exist functions $\phi^\gamma_i : A^\gamma_i \to G^\gamma$ such that for any $(a_i)_{i \le n}, (b_i)_{i \le n} \in A^\gamma$:

\[ (a_i)_{i \le n} \preceq^\gamma (b_i)_{i \le n} \iff \sum_{i=1}^{n} \phi^\gamma_i(a_i) \le \sum_{i=1}^{n} \phi^\gamma_i(b_i), \]

where each $G^\gamma$ is the reals if $\preceq$ is total and a product of reals otherwise.

Now, let $U$ be a uniform ultrafilter on $\kappa$ (where if $X \in U$, then $|X| = \kappa$).[10] For each $i \le n$, define $\phi_i : A_i \to \mathcal{G}_U$ where for each $a_i \in A_i$:

\[ \phi_i(a_i) = \big[(\phi^\gamma_i(a^\gamma_i))_{\gamma < \kappa}\big]_U, \]

where $a^\gamma_i = a_i$ if $a_i \in A^\gamma_i$ and $a^\gamma_i$ is any arbitrary element of $A^\gamma_i$ otherwise. We want to show that for any $(a_i)_{i \le n}, (b_i)_{i \le n} \in A$:

\[ (a_i)_{i \le n} \preceq (b_i)_{i \le n} \iff \sum_{i=1}^{n} \phi_i(a_i) \le \sum_{i=1}^{n} \phi_i(b_i). \tag{$\dagger$} \]

Suppose $(a_i)_{i \le n} \preceq (b_i)_{i \le n}$. Let $\lambda$ be the least cardinality such that $(a_i)_{i \le n}, (b_i)_{i \le n} \in A^\lambda$.
Then, clearly, for all $\gamma \ge \lambda$, $(a_i)_{i \le n}, (b_i)_{i \le n} \in A^\gamma$ and since $(a_i)_{i \le n} \preceq (b_i)_{i \le n}$, it follows that $(a_i)_{i \le n} \preceq^\gamma (b_i)_{i \le n}$ and $\sum_{i=1}^{n} \phi^\gamma_i(a_i) \le \sum_{i=1}^{n} \phi^\gamma_i(b_i)$. This means that the set $\{\gamma < \kappa : \sum_{i=1}^{n} \phi^\gamma_i(a_i) \not\le \sum_{i=1}^{n} \phi^\gamma_i(b_i)\} \subseteq \{\gamma < \lambda\}$ has cardinality less than $\kappa$ and is not in $U$. This means that its complement $\{\gamma < \kappa : \sum_{i=1}^{n} \phi^\gamma_i(a_i) \le \sum_{i=1}^{n} \phi^\gamma_i(b_i)\} \in U$ and thus:

\[ \Big[\Big(\sum_{i=1}^{n} \phi^\gamma_i(a^\gamma_i)\Big)_{\gamma < \kappa}\Big]_U \le \Big[\Big(\sum_{i=1}^{n} \phi^\gamma_i(b^\gamma_i)\Big)_{\gamma < \kappa}\Big]_U. \]

[10] Such an ultrafilter always exists. In fact, there are $2^{2^\kappa}$ of them (see Jech (2006, 75)).

By definition:

\[ \sum_{i=1}^{n} \phi_i(a_i) = \sum_{i=1}^{n} \big[(\phi^\gamma_i(a^\gamma_i))_{\gamma < \kappa}\big]_U = \Big[\Big(\sum_{i=1}^{n} \phi^\gamma_i(a^\gamma_i)\Big)_{\gamma < \kappa}\Big]_U \]
\[ \sum_{i=1}^{n} \phi_i(b_i) = \sum_{i=1}^{n} \big[(\phi^\gamma_i(b^\gamma_i))_{\gamma < \kappa}\big]_U = \Big[\Big(\sum_{i=1}^{n} \phi^\gamma_i(b^\gamma_i)\Big)_{\gamma < \kappa}\Big]_U \]

And thus, we obtain the left-to-right direction of ($\dagger$). The proof of the right-to-left direction is similar, simply substituting $\le$ with $\not\le$ and vice versa where necessary.

Bibliography

Aczél, János. 2019. Bisymmetry and consistent aggregation: Historical review and recent results. Pages 225–233 of: Choice, decision, and measurement. Routledge.
Aczél, János, & Maksa, Gyula. 1996. Solution of the Rectangular m × n Generalized Bisymmetry Equation and of the Problem of Consistent Aggregation. Journal of Mathematical Analysis and Applications, 203(1), 104–126.
Aczél, János, Maksa, Gyula, & Taylor, Mark. 1997. Equations of generalized bisymmetry and of consistent aggregation: weakly surjective solutions which may be discontinuous at places. Journal of Mathematical Analysis and Applications, 214(1), 22–35.
Adler, Matthew D. 2007. Well-being, Inequality and Time: The Time-slice Problem and its Policy Implications. Tech. rept. Institute for Law & Economics, University of Pennsylvania Law School.
Adler, Matthew D. 2018. Prioritarianism: Room for Desert? Utilitas, 30(2), 172–197.
Adler, Matthew D. 2019. Measuring Social Welfare: An Introduction. Oxford University Press.
Arneson, Richard. 2006. Desert and Equality.
Pages 262–293 of: Holtug, Nils, & Lippert-Rasmussen, Kasper (eds), Egalitarianism: New Essays on the Nature and Value of Equality. Clarendon Press.
Arneson, Richard. 2019. Individual Well-Being and Social Justice. Proceedings and Addresses of the American Philosophical Association, 93, 39–66.
Arneson, Richard. 2022. Prioritarianism. Elements in Ethics. Cambridge University Press.
Arrhenius, Gustaf. 2000. Future Generations: A Challenge for Moral Theory. Ph.D. thesis, Uppsala University.
Arrhenius, Gustaf. 2003. Feldman’s Desert-Adjusted Utilitarianism and Population Ethics. Utilitas, 15(2), 225–236.
Arrhenius, Gustaf. 2011. The Impossibility of a Satisfactory Population Ethics. Pages 1–26.
Bader, Ralf M. 2022. Person-Affecting Utilitarianism. In: Arrhenius, Gustaf, Bykvist, Krister, Campbell, Tim, & Finneron-Burns, Elizabeth (eds), The Oxford Handbook of Population Ethics. Oxford University Press.
Bentham, Jeremy (ed). 1977. A Comment on the Commentaries and a Fragment on Government. [Atlantic Highlands], N.J.: Humanities Press.
Blackorby, Charles, Bossert, Walter, & Donaldson, David J. 2005. Population issues in social choice theory, welfare economics, and ethics. Cambridge University Press.
Bramble, Ben. 2014. Whole-Life Welfarism. American Philosophical Quarterly, 51(1), 63–74.
Bramble, Ben. 2017. The Passing of Temporal Well-Being. New York, NY: Routledge.
Brännmark, Johan. 2001. Good Lives: Parts and Wholes. American Philosophical Quarterly, 38(2), 221–231.
Brink, David. 1993. The Separateness of Persons, Distributive Norms, and Moral Theory. Pages 252–289 of: Frey, R. G., & Morris, Christopher (eds), Value, Welfare, and Morality. Cambridge University Press.
Brink, David O. 2011. Prospects for Temporal Neutrality. In: Callender, Craig (ed), The Oxford Handbook of Philosophy of Time. Oxford University Press.
Broome, John. 1991. Weighing Goods: Equality, Uncertainty and Time. Wiley-Blackwell.
Broome, John. 2004a. The Value of Living Longer.
Pages 243–260 of: Anand, Sudhir, Peter, Fabienne, & Sen, Amartya (eds), Public Health, Ethics, and Equity. Oxford University Press.
Broome, John. 2004b. Weighing Lives. Oxford University Press.
Broome, John. 2015. Equality Versus Priority: A Useful Distinction. Economics and Philosophy, 31(2), 219–228.
Bykvist, Krister. 2009. Utilitarianism: A Guide for the Perplexed. Continuum.
Carlson, Erik. 1997. Consequentialism, Distribution and Desert. Utilitas, 9(3), 307.
Chang, Chen Chung, & Keisler, H. Jerome. 1973. Model Theory. Amsterdam, Netherlands: North Holland.
Chang, Ruth. 2002. The Possibility of Parity. Ethics, 112(4), 659–688.
Chappell, Richard Yetter. 2015. Value Receptacles. Noûs, 49(2), 322–332.
Clifford, Alfred H. 1954. Note on Hahn’s theorem on ordered abelian groups. Proceedings of the American Mathematical Society, 5(6), 860–863.
Conrad, Paul F. 1953. Embedding theorems for abelian groups with valuations. American Journal of Mathematics, 75(1), 1–29.
Crisp, Roger. 1997. Mill on Utilitarianism. Routledge.
de Finetti, Bruno. 1937. La Prévision: Ses Lois Logiques, Ses Sources Subjectives. Annales de l’Institut Henri Poincaré, 17, 1–68.
Debreu, Gerard. 1960. Topological methods in cardinal utility theory. Pages 16–26 of: Mathematical Methods in the Social Sciences. Proceedings of the First Stanford Symposium. Stanford University Press.
Diamond, Peter A. 1967. Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparison of Utility: Comment. Journal of Political Economy, 75(5), 765–766.
Dorr, Cian, Nebel, Jacob M., & Zuehl, Jake. forthcoming. The Case for Comparability. Noûs.
Dorsey, Dale. 2013. Desire-Satisfaction and Welfare as Temporal. Ethical Theory and Moral Practice, 16(1), 151–171.
Feldman, Fred. 1995a. Adjusting Utility for Justice: A Consequentialist Reply to the Objection From Justice. Philosophy and Phenomenological Research, 55(3), 567–585.
Feldman, Fred. 1995b.
Justice, Desert, and the Repugnant Conclusion. Utilitas, 7(2), 189–206. Fleurbaey, Marc. 2010. Assessing Risky Social Situations. Journal of Political Economy, 118(4), 649–680. Foot, Philippa. 2003. Natural goodness. Clarendon Press. Gauthier, David P. 1963. Practical Reasoning. Oxford,: Clarendon Press. Gorman, William M. 1968. The structure of utility functions. The Review of Economic Studies, 35(4), 367–390. Gustafsson, Johan E., & Kowalczyk, Kacper. forthcoming. Ex-Ante Pareto and the Opaque-Identity Puzzle. Journal of Philosophy. Hara, Kazuhiro, Ok, Efe A, & Riella, Gil. 2019. Coalitional Expected Multi-Utility Theory. Econometrica, 87(3), 933–980. Hare, Caspar. 2010. Take the Sugar. Analysis, 70(2), 237–247. Harsanyi, John C. 1955. Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparisons of Utility. Journal of Political Economy, 63(4), 309–321. Hausner, Melvin, & Wendel, James G. 1952. Ordered vector spaces. Proceedings of the American Mathematical Society, 3(6), 977–982. Hirose, Iwao. 2013. Aggregation and the Separateness of Persons. Utilitas, 25(2), 182–205. Holtug, Nils. 2006. Prioritarianism. Pages 125–156 of: Holtug, Nils, & LippertRasmussen, Kasper (eds), Egalitarianism: New Essays on the Nature and Value of Equality. Clarendon Press. Hurka, Thomas M. 1982a. Average Utilitarianisms. Analysis, 42(2), 65–69. Hurka, Thomas M. 1982b. More Average Utilitarianisms. Analysis, 42(3), 115– 119. Jech, Thomas. 2006. Set Theory: The Third Millennium Edition, revised and expanded. Springer. Joyce, James M. 1999. The Foundations of Causal Decision Theory. Cambridge University Press. 152 BIBLIOGRAPHY King, Owen C. 2018. Pulling Apart Well-Being at a Time and the Goodness of a Life. Ergo: An Open Access Journal of Philosophy, 5, 349–370. Kraft, Charles H., Pratt, John W., & Seidenberg, A. 1959. Intuitive Probability on Finite Sets. The Annals of Mathematical Statistics, 30(2), 408 – 419. 
Krantz, David, Luce, Duncan, Suppes, Patrick, & Tversky, Amos (eds). 1971. Foundations of Measurement, Vol. I: Additive and Polynomial Representations. New York Academic Press. List, Christian, & Pettit, Philip. 2011. Group Agency: The Possibility, Design, and Status of Corporate Agents. New York: Oxford University Press. Mahtani, Anna. 2017. The Ex Ante Pareto Principle. Journal of Philosophy, 114(6), 303–323. Mahtani, Anna. 2021. Frege’s Puzzle and the Ex Ante Pareto Principle. Philosophical Studies, 178(6), 2077–2100. Maksa, Gyula. 1999. Solution of generalized bisymmetry type equations without surjectivity assumptions. aequationes mathematicae, 57(1), 50–74. Masny, Michal. 2023. Wasted Potential: The Value of a Life and the Significance of What Could Have Been. Philosophy and Public Affairs, 51(1), 6–32. McCarthy, David. 2006. Utilitarianism and Prioritarianism I. Economics and Philosophy, 22(3), 335–363. McCarthy, David. 2008. Utilitarianism and Prioritarianism II. Economics and Philosophy, 24(1), 1–33. McCarthy, David, Mikkola, Kalle, & Thomas, Joaquin Teruji. 2020. Utilitarianism with and Without Expected Utility. Journal of Mathematical Economics, 87, 77– 113. McKerlie, Dennis. 1988. Egalitarianism and the Separateness of Persons. Canadian Journal of Philosophy, 18(2), 205–225. McKerlie, Dennis. 1989. Equality and Time. Ethics, 99(3), 475–491. McKerlie, Dennis. 2001. Dimensions of Equality. Utilitas, 13(3), 263. McKerlie, Dennis. 2012. Justice Between the Young and the Old. Oxford University Press USA. BIBLIOGRAPHY 153 Mogensen, Andreas L. 2022. The Only Ethical Argument for Positive δ? Partiality and Pure Time Preference. Philosophical Studies, 179(9), 2731–2750. Mongin, Philippe, & Pivato, Marcus. 2015. Ranking multidimensional alternatives and uncertain prospects. Journal of Economic Theory, 157, 146–171. Nagel, Thomas. 1970. The Possibility of Altruism. Oxford,: Clarendon P. Nagel, Thomas. 1979. Mortal Questions. New York: Cambridge University Press. 
Nagel, Thomas. 1995. Equality and Partiality. Oxford University Press. Norcross, Alastair. 2009. Two Dogmas of Deontology: Aggregation, Rights, and the Separateness of Persons. Social Philosophy and Policy, 26(1), 76–95. Nozick, Robert. 1974. Anarchy, State, and Utopia. New York: Basic Books. Otsuka, Michael. 2012. Prioritarianism and the Separateness of Persons. Utilitas, 24(3), 365–380. Otsuka, Michael. 2015. Prioritarianism and the Measure of Utility. Journal of Political Philosophy, 23(1), 1–22. Otsuka, Michael, & Voorhoeve, Alex. 2018. Equality Versus Priority. Pages 65– 85 of: Olsaretti, Serena (ed), Oxford Handbook of Distributive Justice. Oxford University Press. Parfit, Derek. 1984. Reasons and Persons. Oxford University Press. Parfit, Derek. 2002. Equality or Priority? Pages 81–125 of: Clayton, Matthew, & Williams, Andrew (eds), The Ideal of Equality. Palgrave Macmillan. Pettit, Philip. 2015. The Robust Demands of the Good: Ethics with Attachment, Virtue, and Respect. Oxford, GB: Oxford University Press. Pivato, Marcus. 2013. Variable-population voting rules. Journal of Mathematical Economics, 49(3), 210–221. Pivato, Marcus. 2014. Additive Representation of Separable Preferences Over Infinite Products. Theory and Decision, 77(1), 31–83. Pivato, Marcus, & Tchouante, Élise Flore. 2023. Bayesian social aggregation with non-Archimedean utilities and probabilities. Economic Theory, 1–35. 154 BIBLIOGRAPHY Rabinowicz, Wlodek. 2002. Prioritarianism for Prospects. Utilitas, 14(1), 2–21. Rawls, John. 1971. A Theory of Justice. Harvard University Press. Rosati, Connie S. 2021. The Normative Significance of Temporal Well-Being. Res Philosophica, 98(1), 125–139. Sarch, Alexander. 2013. Desire Satisfactionism and Time. Utilitas, 25(2), 221–245. Savage, Leonard J. 1954. The Foundations of Statistics. Wiley Publications in Statistics. Scanlon, Thomas. 1998. What We Owe to Each Other. Cambridge: Belknap Press of Harvard University Press. Scott, Dana. 1964. 
Measurement structures and linear inequalities. Journal of Mathematical Psychology, 1(2), 233–247. Singer, Peter. 1993. Practical Ethics, 2nd Edition. Cambridge University Press. Slater, Joe. 2023. History of Utilitarianism. In: Internet Encyclopedia of Philosophy. Slote, Michael. 2017. Goods and Lives. Pacific Philosophical Quarterly, 63(4), 311– 326. Smith, John H. 1973. Aggregation of preferences with variable electorate. Econometrica: Journal of the Econometric Society, 1027–1041. Sumner, L. W. 1996. Welfare, Happiness, and Ethics. New York: Oxford University Press. Taurek, John. 1977. Should the Numbers Count? Philosophy and Public Affairs, 6(4), 293–316. Temkin, Larry S. 1993. Inequality. Oxford University Press. Thomas, Teruji. 2022. Separability and Population Ethics. Pages 271–295 of: The Oxford Handbook of Population Ethics. Oxford University Press. Thomson, Judith Jarvis. 2008. Normativity. Open Court. Thomson, Judith Jarvis. 2009. Goodness and Advice. Princeton University Press. Velleman, J. David. 1991. Well-Being And Time. Pacific Philosophical Quarterly, 72(1), 48–77. BIBLIOGRAPHY 155 von Neumann, John, & Morgenstern, Oskar. 1944. Theory of Games and Economic Behavior. Princeton University Press. Voorhoeve, Alex, & Fleurbaey, Marc. 2012. Egalitarianism and the Separateness of Persons. Utilitas, 24(3), 381–398. Wakker, Peter P. 1988. Additive Representations of Preferences. Theory and Decision Library C. Springer. Wakker, Peter P. 2013. Additive representations of preferences: A new foundation of decision analysis. Vol. 4. Springer Science & Business Media. Walden, Kenneth. 2020. Incomparable Numbers. Oxford Studies in Normative Ethics, 10. Woodard, Christopher. 2019. Taking Utilitarianism Seriously. Oxford, UK: Oxford University Press. Young, H Peyton. 1975. Social choice scoring functions. SIAM Journal on Applied Mathematics, 28(4), 824–838.
Asset Metadata
Creator
San, Weng Kin (author)
Core Title
Aggregation and the structure of value
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Philosophy
Degree Conferral Date
2024-08
Publication Date
08/19/2024
Defense Date
07/19/2024
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
aggregation,OAI-PMH Harvest,repugnant conclusion,separateness of persons,totalism,utilitarianism
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Russell, Jeff (committee chair), Hawthorne, John (committee member), Nebel, Jake (committee member), Rudin, Deniz (committee member), Wedgwood, Ralph (committee member)
Creator Email
sanwengkin@gmail.com,wsan@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113999BT9
Unique identifier
UC113999BT9
Identifier
etd-SanWengKin-13406.pdf (filename)
Legacy Identifier
etd-SanWengKin-13406
Document Type
Dissertation
Rights
San, Weng Kin
Internet Media Type
application/pdf
Type
texts
Source
20240821-usctheses-batch-1200 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu