Reasoning with Degrees of Belief
by Julia Staffel

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (PHILOSOPHY)

May 2013

Copyright 2013 Julia Staffel

Acknowledgements

The idea for this dissertation was born in an epistemology seminar that I took in the spring semester of 2009 at USC with James Van Cleve and Kenny Easwaran. I spent the following three years developing my ideas about degrees of incoherence, and about reasoning with degrees of belief. This would not have been possible without the support of many fellow philosophers. I would especially like to thank Jacob Ross, whose tireless support and advice were crucial in all stages of the project. He was always available to me (except in the morning) to discuss philosophical ideas, offer constructive criticism, and help me improve my writing. I could not have asked for a better advisor. I am also grateful to Kenny Easwaran, who was always willing to dive into technical details, and whose philosophical and mathematical genius has often helped me see how to express my ideas in formally adequate and elegant ways. Mark Schroeder deserves special thanks for always helping me keep the big picture in mind, and for being a lifesaver in several philosophical emergencies. Branden Fitelson and Ralph Wedgwood have also been invaluable sources of support. In the summer of 2009, I did an independent study with Branden, which really helped me kick-start my project, and he has been an enthusiastic supporter of my work ever since. Ralph joined USC during the later stages of my work, and his support and advice regarding my dissertation, and also regarding the presentation of my work during my job search, were very helpful. There are many other philosophers at USC and elsewhere that I am grateful to for supporting me in my research: Scott Soames, Barry Schein (whom I will count as an honorary philosopher), James Van Cleve, Alan Hájek, Teddy Seidenfeld, Justin Snedegar, Ben Lennertz, Shyam Nair, Justin Dallmann, Robert Shanklin, Rebekka Hufendiek, and Adam Keeney. I have also received useful comments and feedback from many more people than I can list here, who attended various conferences where I presented my work. Two more people deserve special mention: Eddie Maler, my high school philosophy teacher, who encouraged me to become a philosopher early on, and Uwe Scheffler, who taught me (almost) everything I know about logic, and whose dedication to teaching will always inspire me. I'd also like to thank my friends and family for their love and support, especially my husband Brian Talbot, who is a magnificent companion in philosophy and in life.

Table of Contents

Abstract
Chapter One: Can There Be Reasoning with Degrees of Belief?
  Introduction
  1. The Case for Reasoning with Degrees of Belief
  2. The Case against Reasoning with Degrees of Belief
    2.1 The Explicitness Argument
    2.2 The Complexity Argument
    2.3 The Intentionality Argument
  Conclusion
Chapter Two: Is Subjective Bayesianism a Theory of Reasoning?
  Introduction
  1. Basics: The Probability Calculus
  2. Logic vs. Reasoning
  3. Idealization to the Rescue?
  Conclusion
Chapter Three: Formulating Principles of Reasoning
  Introduction
  1. The Criterial View
  2. The Tradeoff View
  Conclusion
Chapter Four: Degrees of Incoherence and Dutch Books
  Introduction
  1. Why We Need a Measure of Degrees of Incoherence
  2. First Proposal: Zynda's Measure of Incoherence
  3. Incoherence and Dutch Books
  4. Schervish, Seidenfeld, and Kadane's Dutch Book Measure of Incoherence
  5. Two Conditions of Adequacy for Measures of Incoherence
  6. The Maximum Dutch Book Measure
  Conclusion
  Appendix 1: Justifying the Choice of Bet Normalization
  Appendix 2: Schervish, Seidenfeld, and Kadane's Dutch Book Measure
Chapter Five: Should I Pretend I'm Perfect?
  Introduction
  1. Reasoning with Degrees of Belief for Ideal Agents
  2. Imitating Ideal Agents in the Practical Domain
  3. Reasoning with Degrees of Belief for Non-Ideal Agents
  4. Measuring Incoherence
    4.1 A Desideratum for Evaluating Rules of Reasoning
    4.2 The Maximum Dutch Book Measure of Degrees of Incoherence
  5. Should Non-Ideal Agents Imitate Ideal Agents' Reasoning?
    5.1 An Easy Example: The Case of Sally
    5.2 A Counterexample to the Strong Imitation Thesis
    5.3 A Counterexample to the Weak Imitation Thesis
    5.4 Conditions Under Which IMB Credences are Optimal
  Conclusion
References

Abstract

In this dissertation, I lay the groundwork for developing a comprehensive theory of reasoning with degrees of belief. Reasoning, as I understand it here, is the mental activity of forming or revising one's attitudes based on other attitudes. I argue that we need such a theory, since degrees of belief, also called credences, play an important role in human reasoning. Yet, this type of reasoning has so far been overlooked in the philosophical literature. Discussions of reasoning, understood as a mental activity of human beings, focus almost exclusively on the traditional notion of outright belief, according to which an agent can believe, disbelieve, or suspend judgment about a proposition. The philosophical literature on degrees of belief, on the other hand, acknowledges that this model of belief is too coarse-grained: agents can have different levels of confidence in a proposition, and hence we should think of belief as a graded notion. Yet, the literature on degrees of belief is hardly concerned with the question of how agents should reason. The leading research paradigm is subjective Bayesianism, a theory according to which the probability axioms constitute the norms of rationality for degrees of belief. However, the norms of subjective Bayesianism should not be construed as principles of reasoning, and so this theory does not provide an account of reasoning with degrees of belief. One important constraint on a comprehensive theory of reasoning with degrees of belief is that it must apply to non-ideal reasoners, who have incoherent degrees of belief. This constraint provides one of the reasons why subjective Bayesianism cannot be viewed as a theory of reasoning: it is widely criticized for applying at best to ideal agents, since the coherence norms on degrees of belief it postulates seem impossibly demanding for human agents. I argue that when we try to establish principles of reasoning that apply to incoherent agents, a condition of adequacy on such principles is that they should minimize increases in incoherence. In order to evaluate whether a principle of reasoning meets this condition of adequacy, we need to be able to measure the degree to which an agent's credence function is incoherent. Yet, the standard Bayesian theory provides no way of measuring degrees of incoherence.
This theory allows us to distinguish between coherent and incoherent credence functions, but it does not allow us to distinguish between credence functions with higher and lower degrees of incoherence. I propose a way of extending the standard Bayesian framework by developing and defending a formal measure of such incoherence. I then show how this measure can be applied to formulate constraints on adequate rules of reasoning for incoherent agents. In particular, I use the measure to ascertain whether it is advisable for non-ideal agents to follow the same reasoning strategies as their ideal counterparts. I show that this is not always a good idea, because doing so can sometimes make an agent more incoherent than following some alternative reasoning strategy. Each chapter of the dissertation is written as an article that can be read and understood on its own, but together the chapters provide a larger picture of how to establish a theory of reasoning with degrees of belief.

Chapter Overview

1. Can There Be Reasoning with Degrees of Belief?

I begin my dissertation with a discussion of the question of whether degrees of belief are the kinds of attitudes that can figure in reasoning processes. I take reasoning to be a mental activity that may or may not be conscious, and that aims at forming, reaffirming, or changing mental attitudes on the basis of other attitudes. It is commonly assumed that humans can reason with outright beliefs. Yet it has been argued that reasoning with degrees of belief is impossible, on a number of grounds: because degrees of belief are not explicitly represented; because reasoning with degrees of belief is too complicated for humans to engage in; and because mental processes involving degrees of belief lack the conscious, intentional character that is supposed to be a necessary feature of reasoning. Based on philosophical as well as empirical considerations, I show that these arguments are unsuccessful, and that we have good reason to think that humans reason with degrees of belief all the time.

2. Is Subjective Bayesianism a Theory of Reasoning?

Having established that humans can reason with degrees of belief, the next step in my investigation is to check whether the philosophical literature already provides us with some clues about how we should reason with them. I begin by surveying an important discussion of reasoning with outright beliefs. Gilbert Harman (1986) argues that a theory of logic should not be regarded as providing a theory of reasoning: the former concerns logical relations between propositions, while the latter concerns normative relations between beliefs. I argue that we can draw a similar distinction among Bayesian theories. Subjective Bayesianism falls short of being a theory of reasoning with degrees of belief for exactly the same kinds of reasons that prevent deductive logic from being a theory of reasoning with outright beliefs. Some authors conclude from this that subjective Bayesianism is a type of logical system, while others have argued that subjective Bayesianism should be understood as a theory of reasoning that applies only to ideal agents. I argue that the norms proposed by subjective Bayesians can inform a comprehensive theory of reasoning with degrees of belief, but they do not provide such a theory, even for ideal agents.

3. Formulating Principles of Reasoning

In the third chapter, I discuss in general terms how rules of logic or probability can inform norms of reasoning.
I present two different approaches for developing norms of reasoning that I call the criterial view and the tradeoff view. The criterial view looks for a list of necessary conditions that a norm of reasoning needs to fulfill in order to be a good norm of reasoning. It is a very common approach that underlies many discussions of how to select norms of reasoning in the literature. By contrast, the tradeoff view specifies a number of dimensions of evaluation according to which the goodness of a principle of reasoning can be measured, and it specifies how these individual measurements are to be weighed against each other in evaluating candidate norms of reasoning. The tradeoff view is thus more flexible than the criterial view. And this flexibility is desirable, I argue, as it allows us to take into account that different reasoning strategies may be appropriate in different circumstances and for different agents. I also consider the question of what the aims are in relation to which principles of reasoning are to be evaluated, and I argue that these will include both purely epistemic aims, such as coherence, as well as aims of efficiency.

4. Degrees of Incoherence and Dutch Books

In the fourth chapter, I take up the project of developing one of the elements we need in order to use the tradeoff approach in evaluating principles of reasoning. Given that I take one of the aims to be probabilistic coherence, we need a way of measuring the degree to which an agent achieves this aim. I develop such a measure in this chapter. The measure exploits the fact that an agent with incoherent degrees of belief is vulnerable to a Dutch book – a collection of bets, each of which is fair by the lights of the agent's credences, that leads to a guaranteed loss for the agent. I show how we can set up a special kind of Dutch book that I call a maximum Dutch book, and that we can measure an agent's degree of incoherence in terms of how much money she is guaranteed to lose from such a Dutch book. I also show why the two previous proposals for measuring degrees of incoherence in the literature, put forth by Zynda, and by Schervish, Seidenfeld and Kadane, are not suitable for capturing the degree of incoherence of an agent's entire credence function.

5. Should I Pretend I'm Perfect?

In the last chapter, I explain how the measure of incoherence developed in chapter four can be put to work to evaluate principles of reasoning. In particular, I ask whether agents who have incoherent degrees of belief should try to imitate the reasoning strategies of ideal agents. We know that in the case of practical reasoning, when non-ideal agents attempt to imitate ideal agents, the results can be disastrous. Here I ask whether something similar is true with respect to theoretical reasoning. I argue that, in a restricted class of cases, imitating the ideal agent's reasoning is optimal, because it does not make the agent more incoherent than she was to begin with. However, I also show that there are other cases in which there is no way of finding an optimal credence assignment by imitating an ideal reasoning strategy.

Chapter One: Can There Be Reasoning with Degrees of Belief?

Introduction

In this paper I am concerned with the question of whether degrees of belief can figure in reasoning processes that are executed by humans. Reasoning, as I understand it here, is the mental activity of forming or revising one's attitudes based on other attitudes.
It is generally accepted that outright beliefs and intentions can be part of reasoning processes, but the role of degrees of belief remains unclear. The literature on subjective Bayesianism, which seems to be the natural place to look for discussions of the role of degrees of belief in reasoning, does not address the question of whether degrees of belief play a role in real agents’ reasoning processes. Subjective Bayesianism tends to be concerned instead with modeling reasoning processes of certain kinds of ideal agents, but it usually does not discuss how these models relate to human psychology. Some authors even think that subjective Bayesianism seems more akin to a logic of degrees of belief, which is quite different from a theory of reasoning. 1 On the other hand, the philosophical literature on reasoning, which relies much less heavily on idealizing assumptions about reasoners, is almost exclusively concerned with outright belief. 2 One possible explanation for why no philosopher has yet developed an account of reasoning with degrees of belief is that reasoning with degrees of belief is not possible for humans. I will investigate in this paper whether this claim is plausible. In the first part of the paper, I will discuss introspective and empirical considerations that suggest that we can reason with degrees of belief. In the second part, I will discuss three different arguments that purport to show that humans cannot reason with degrees of belief. Two of them have been suggested by Gilbert Harman in Change in View (1986), and the last one is based on claims commonly made about reasoning in the literature. I will show why these arguments are flawed, and conclude that, at least as far as these arguments are concerned, it seems like there is no good reason why the topic 1 For an insightful discussion of the difference between a theory of logic and a theory of reasoning, see Harman, 1986, Ch. 1&2. For an argument for the view that subjective Bayesianism is a kind of logical system, see Howson & Urbach, 2006. 2 In the manuscript for his new book Rationality through Reasoning, John Broome focuses on outright belief, and he begins his very brief discussion of degrees of belief with the remark that he does not know of a worked-out theory of reasoning with degrees of belief (Broome, 2009, p. 277). Other authors who have published important work on reasoning in the last twenty years or so also focus on outright belief (eg. Boghossian, 2011 (APA presentation); Streumer, 2007; Wedgwood, 2006; Grice, 2001; Harman, 1986; Walker, 1985). 6 of reasoning with degrees of belief has received so little attention. Any plausible theory of reasoning should consider degrees of belief as serious candidates for attitudes that can be involved in reasoning processes. 1. The Case for Reasoning with Degrees of Belief It is not my goal in this paper to defend a particular account of reasoning, but I should say a few words about what I take reasoning to be. I am only interested in reasoning of the kind that is done by one person, not reasoning that is done by a group of people. I think I am in agreement with good common sense if I take reasoning to be a mental activity that is directed at forming or revising mental attitudes on the basis of other such attitudes. So, the question I am trying to answer is whether reasoning, so understood, can involve degrees of belief. 
Degrees of belief differ from outright beliefs in the following way: the outright belief that p is what you ascribe to some subject S by saying that S believes that p. By contrast, degrees of belief are the kinds of attitudes we mean when we speak about how confident S is in p, or that S is more confident in p than in q. Degrees of belief are often represented in formal models as numbers between 0 and 1, and I will adopt this practice in some of my examples. By modeling degrees of belief in this way, we can express how confident a person is in a proposition, but it doesn’t mean that these numbers are actually present in that person’s mind. I won’t say much here about how exactly we know what numerical value to assign to a certain degree of confidence, and whether we should model degrees of belief with precise numbers or intervals. In general, people’s degrees of confidence manifest themselves in their behavioral dispositions and their decision-making. It is important not to confuse degrees of belief with outright beliefs about probabilities. The outright belief that the probability of p is 0.7 is not the same attitude as a degree of belief of 0.7 in p. It is possible to have a degree of belief in a proposition without having a corresponding outright belief in the probability of that proposition. No matter how we spell out what we mean by probability – objective probability, evidential probability, frequency etc. – it is always possible for a subject to have a degree of confidence in some proposition p, yet be uncertain what probability to assign to p, and thus to lack the corresponding outright belief. 3 3 See for example: Ross, 2006, p. 196; Christensen, 2004, p. 19. 7 I will argue that degrees of belief, just like outright beliefs, can function as attitudes that we reason from and attitudes we reason to. In other words, degrees of belief, just like outright beliefs, are available as starting points and end points of reasoning processes. I will now consider four different examples of reasoning processes, and I will argue that we can best capture the similarities and differences between these examples if we maintain that degrees of belief can function as premises and conclusions of reasoning processes. The first example is an instance of practical reasoning, in which outright beliefs serve as starting points. Suppose Waltraud is planning a party for her birthday, and she is trying to decide whether to have the party on Friday or on Saturday. It is of utmost importance to her that as many as possible of her three best friends Franz, Hinz and Kunz will be able to attend. Waltraud believes that Franz is unavailable on Friday because he has ballet practice, but is free on Saturday. She also believes that Hinz is unavailable on Friday because he’ll be babysitting his daughter, and is free on Saturday. Moreover, she believes that Kunz is free on Friday, but busy with his knitting circle on Saturday. From these beliefs, Waltraud reasons that since only one of her friends is free on Friday, but two of them are free on Saturday, and since she wants as many of them as possible to attend, she should have the party on Saturday. Compare this case to a second example, with the only difference that degrees of belief are the starting points of the process. Again, Waltraud is deciding between having the party on Friday or on Saturday. She knows that each of her friends is available on one of the two days, but unavailable on the other. 
Yet for each particular day, she doesn't have outright beliefs about each of her friends' plans; she only has her degrees of belief to work with. This may be, for example, because her friends were rather vague in giving her information about their schedules, or because she simply doesn't remember exactly what they said. Suppose Waltraud's credence that Franz is free on Saturday is 0.7. Her credence that Kunz is free on Friday is also 0.7. Moreover, her credence that Hinz is free on Saturday is 0.6. Again, Waltraud wants as many of her friends as possible to attend her party. She realizes that, given her credences about Franz and Kunz, Friday and Saturday seem equally good, but since she is slightly more confident that Hinz is free on Saturday rather than on Friday, she decides to have the party on Saturday. It is easy to imagine oneself in each of these predicaments, and each case seems like a paradigmatic case of practical reasoning.

We can produce a similar pair of examples in the realm of theoretical reasoning. Suppose Franz believes that Hinz, Kunz, or Waltraud will soon become his new boss. He also believes that each of them values his work very highly and would offer him a promotion if they were his superior. Thus, Franz comes to believe on the basis of this information that he will soon be promoted. This third example is an instance of theoretical reasoning with outright beliefs. We can easily construct a fourth example, which is similar except that it results in a degree of belief. Suppose again that Franz believes that either Hinz, Kunz, or Waltraud will soon become his new boss. He also believes that Hinz and Kunz would immediately promote him if they became his boss, but that Waltraud wouldn't. On the basis of these beliefs, Franz forms a degree of belief of 2/3 that he will soon be promoted. Again, we have two very similar deliberation processes, which differ with respect to the mental state that serves as their respective conclusion. It is certainly uncontroversial that the first and the third example, which involve only outright beliefs, are instances of reasoning. And given the similarity between the second and the first example, and the similarity between the fourth and third example, it seems very natural to think that the examples involving degrees of belief are instances of reasoning as well. One might object to my characterization of these examples by arguing that reasoning with degrees of belief is really the same as reasoning with outright beliefs about probabilities. Thus, one might claim that in the last example, Franz's reasoning concludes with the outright belief that the probability that he will be promoted is 2/3, rather than a degree of belief of 2/3 that he will be promoted, and similarly in the second example. This would be a natural view to hold if degrees of belief were nothing over and above outright beliefs of a certain kind. In other words, if degrees of belief were the same thing as outright beliefs about probabilities, then reasoning with degrees of belief would plausibly not be different from reasoning with outright beliefs. However, as I mentioned above, a subject can have a degree of belief in some proposition p without having an outright belief about the probability of p, no matter how we spell out the relevant sense of probability. This is because she may be uncertain about the probability of p, while still having a specific degree of belief in p.
Thus, we can simply assume that my examples of reasoning with degrees of belief are cases in which the agents have degrees of belief, but lack outright beliefs in the probabilities of the relevant propositions. If the examples are specified in this way, the possibility that the subjects in the examples are reasoning with outright beliefs instead of degrees of belief is ruled out. The claim that degrees of belief play a distinct role in cognitive processing is also vindicated by empirical studies, for example, by some interesting research by Parsons & Osherson (2001). They conducted several experiments in which they asked subjects to either judge the deductive validity of an argument in premise-conclusion format, or to judge whether they considered a certain conclusion highly likely given a specific set of premises. Meanwhile, researchers were monitoring the subjects' brain activity. They found that non-numerical, credence-based processing involves neural activations that are distinct from the activation patterns observed in deductive reasoning, and they conclude that "the findings confirm that deduction and induction are distinct processes, consistent with psychological theories enforcing their partial separation." (p. 954) The fact that credence-based tasks seem to be executed by the brain in a different way than deductive reasoning tasks lends support to the view that there is a real difference between the outright belief-based and the credence-based reasoning processes that served as examples above. I conclude based on my arguments that we have strong prima facie reasons to assume that humans can reason with degrees of belief. In the next section, I will discuss different attempts to establish the opposite conclusion.

2. The Case against Reasoning with Degrees of Belief

In the second part of my paper, I will discuss three arguments against the possibility of reasoning with degrees of belief. The first two arguments have been given by Gilbert Harman in his book Change in View, and the third argument is constructed from claims about reasoning that have been made in various places in the literature.

2.1 The Explicitness Argument

The explicitness argument is simple: Harman claims that any attitude that can be part of a reasoning process must be an explicit attitude, and he claims that since degrees of belief are not explicit attitudes, one can't reason with them. I will argue that Harman's premise that degrees of belief are not explicit is false, and that it relies on a flawed account of the nature of degrees of belief. Yet, I accept the first premise of Harman's argument – that only explicit attitudes can enter into reasoning processes. Harman explains the difference between explicit and implicit attitudes as follows:

I assume one believes something explicitly if one's belief in that thing involves an explicit mental representation whose content is the content of that belief. On the other hand something is believed only implicitly if it is not explicitly believed, but, for example, is easily inferable from one's explicit beliefs. (Harman, 1986, p. 13)

Footnote 4: Harman points out that this is not the same as the distinction between conscious and unconscious beliefs, or between occurrent and dispositional beliefs. Unconscious beliefs are not the same as implicit beliefs, because the latter, but not the former, can be easily brought to one's awareness. Conscious beliefs are not the same as explicit beliefs, because it might be that unconscious beliefs involve explicit representations. Furthermore, the distinction between occurrent and dispositional (merely potentially occurrent) beliefs does not map onto the distinction between explicit and implicit beliefs either, because a belief can be explicitly represented in someone's mind without being currently in the focus of awareness (Harman, 1986, pp. 13-4). I should note here that there is also a different use of the explicit-implicit distinction in the psychological literature. The distinction used in psychology is much closer to the conscious-unconscious distinction than the one proposed by Harman. Eric Schwitzgebel explains this in more detail in his article on belief in the SEP.

The same distinction Harman draws here between individually represented, explicit beliefs, and merely inferable, implicit beliefs has been adopted by cognitive scientists in order to distinguish attitudes that can enter into cognitive processes directly from attitudes that need to be retrieved via some computational process in order to be available for processing.
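The computational reading of this distinction can be pictured with a small toy model, anticipating the sibling example discussed just below. The sketch is purely illustrative, and nothing in it is proposed by Harman or Kirsh; the belief store and the query functions are stand-ins I have made up. The point is only that explicitly stored information can be looked up directly, whereas implicitly represented information must first be computed from what is stored before it can do any further cognitive work.

# Toy illustration only (not any author's proposal): an "explicit" belief is
# stored as such and is directly available, while an "implicit" belief is not
# stored anywhere and has to be derived from the explicit store on demand.

explicit_store = {"number_of_siblings": 2}  # explicitly represented belief

def explicitly_represented(key):
    # Direct lookup: no inference is needed to make this available.
    return key in explicit_store

def infer_does_not_have_n_siblings(n):
    # Implicit: derivable from the explicit store, but only via a computation.
    return explicit_store["number_of_siblings"] != n

print(explicitly_represented("number_of_siblings"))  # True, available directly
print(infer_does_not_have_n_siblings(12546))         # True, but only after inference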
The more complex the process is by which a piece of information must be retrieved, the more implicit is the way it is stored. (Harman, 1986, p. 22) This distinction between explicit and implicit attitudes is not only applicable to beliefs and reasoning, but to all mental attitudes and cognitive processes. Any mental attitude that participates in a cognitive process must be maximally explicit. As David Kirsh explains, "the computational complexity of the process of interpretation determines where on the continuum of explicit to implicit a given representation lies. If the interpretative process [...] extracts the content quickly and without substantial involvement of the rest of the cognitive system, then the information it extracts is directly available and hence explicitly encoded." (Kirsh, 2003, p. 479) Thus, if the distinction between explicit and implicit attitudes is made in this way, it becomes a definitional truth that only explicit attitudes can enter into cognitive processes. That is because, by definition, explicit attitudes are those kinds of attitudes that are represented in a format that is directly available for cognitive processing. A piece of information that is implicitly represented cannot be used in processes like reasoning unless it is first retrieved, or – in Harman's words – inferred and thus made explicit, at least temporarily. For example, I might implicitly believe that I don't have 12.546 siblings, because that is implied by my explicit belief that I have exactly two siblings, but the mere fact that it is implied by one of my explicit beliefs does not by itself make it ready for entering into a causal process. In order to make this belief ready to participate in cognitive processing, I would have to actually draw the inference, so that my belief that I don't have 12.546 siblings is at least temporarily represented in an explicit format that is immediately available as a premise in reasoning.

Based on these considerations I accept Harman's first premise: only explicit attitudes can participate in reasoning. More generally, we have seen that any mental attitude that participates in any cognitive process must be explicit, because that means it is represented in our minds in the right way to enter into such a process. Let us now turn to Harman's second premise: that degrees of belief are implicit attitudes.
In light of the considerations in support of the first premise, it is clear that by holding the position that degrees of belief are always implicit, Harman commits himself to claiming that degrees of belief cannot participate in reasoning. If degrees of belief were always implicit, they could never be represented, even temporarily, in a format that makes them accessible as starting points and end points of reasoning processes. And this holds not only for reasoning processes. I have argued above that for any attitude to participate in any mental process, that attitude must be represented explicitly. Thus, Harman commits himself to the position that degrees of belief cannot participate in any mental process. Yet, this position seems difficult to maintain in light of the introspective and empirical considerations I presented in the first part of the paper. Thus, the burden of proof is on Harman to show that his second premise is correct. Harman’s endorsement of the claim that degrees of belief are implicit stems from a very idiosyncratic view of degrees of belief. He thinks that they are an emergent property of the way our outright beliefs are linked. In Change in View, he proposes the following explanation for how beliefs can have varying strengths: 5 Drawing the distinction between explicit and implicit attitudes also provides a neat solution to a kind of storage problem. In the sibling example, it seems very natural to say that, besides believing that you have exactly two siblings, you also believe that you don’t have three siblings, and that you don’t have four siblings, and so on. However, given that the mind only has a limited storage capacity, it seems implausible to claim that there is a separate, explicit representation for each of these beliefs. Distinguishing between explicit and implicit beliefs is one strategy for avoiding this problem, because having just one explicit belief about having two siblings makes it unnecessary to waste storage capacities on the countless other beliefs about how many siblings you don’t have. Those beliefs are implicit, and they can easily be inferred from your explicit belief. 12 I am inclined to suppose that these varying strengths are implicit in a system of beliefs one accepts in a yes/no fashion. My guess is that they are to be explained as a kind of epiphenomenon resulting from the operation of rules of revision. For example, it may be that P is believed more strongly than Q if it would be harder to stop believing P than to stop believing Q, perhaps because it would require more of a revision of one’s view to stop believing P than to stop believing Q. (Harman 1986, p. 22) Harman suggests here that degrees of belief can be reduced to outright beliefs. He thinks that the degree of belief that one has in a proposition depends on how robustly embedded the belief is in one’s overall web of beliefs. And since he believes that these features of one’s explicit beliefs are not themselves explicitly represented in one’s belief-box, Harman regards degrees of belief as implicit. As Keith Frankish argues in his book Mind and Supermind, Harman’s view requires that one have an outright belief in every proposition that one has a degree of belief in. But that seems absurd. Say I am at the horse races, and I am watching a race with five horses. I have a credence of 0.5 that ‘Darwin’s Pride’ will win. In this situation, I certainly have neither an outright belief that ‘Darwin’s Pride’ will win, nor that it won’t win. 
But according to Harman, I cannot have a degree of belief in this proposition unless I have an outright belief in it. Harman’s account conflicts with the possibility of such middling degrees of belief. (Frankish, 2004, p. 18) Moreover, it is implausible to claim that the degree to which one believes a given proposition varies with the degree to which it would be difficult for one to give up one’s outright belief in this proposition. Harman’s view implies that if I have a credence of 0.9 in some proposition p and a credence of 0.95 in another proposition q, then it would be more difficult to revise my belief that q than to revise my belief that p, because a higher degree of belief reflects a stronger connection to other beliefs. However, as the following example shows, a lower credence can in certain cases be more robust than a higher credence. Say there is a big jar full of red and black marbles, but you don’t know the ratio between the numbers of red and black marbles. In each case, you know that you will draw a sequence of two million marbles, with replacement. In case A, so far you have drawn twenty marbles, nineteen black and one red. As a result, your credence that the last marble you draw will be black is 0.95. In case B, you have drawn a million marbles, 900,000 of which have been black. As a result, your credence that the last marble you draw will be black is 0.9. Your rational credence in case A is higher than your rational credence in case B, but it is much less robust. In case A, if you were to go on to draw a sequence of twenty red marbles, you would cease to be confident that the 13 last marble you draw will be black, but in case B, drawing a sequence of twenty red marbles would have virtually no effect on your confidence that the last marble will be black. These two arguments show that Harman’s thesis that degrees of belief are implicit because they are an emergent property of full beliefs is flawed, since his claim is based on an extremely implausible conception of degrees of belief. Moreover, it should be noted that even if Harman were correct, and degrees of belief were implicit in the way he suggests, namely by being an epiphenomenon of the way our explicit beliefs are related, it still does not follow immediately that they cannot somehow be made explicit and thus be used in cognitive processing. What Harman seems to have in mind is that we cannot “easily infer” degrees of belief, and thus make them explicit, because they are a structural feature of our web of explicit beliefs. Yet, in order for his argument to go through, he would have to show that this view completely precludes that we can access our degrees of belief in a way that would make them usable in cognitive processes. He provides no argument to this effect, but since I have already shown that his basic conception of degrees of belief is problematic, I won’t pursue this line of reasoning any further. I thus conclude that the explicitness argument fails to show that reasoning with degrees of belief is impossible. 2.2 The Complexity Argument After presenting the explicitness argument, Harman considers whether it would even be possible for us to reason with degrees of belief if we had them explicitly. He argues that even if we tried to reason with explicit degrees of belief, we wouldn’t be able to do so, because it would be too complicated. (p. 22) His basic argument has the following structure: 1. For any being S, if S reasons with degrees of belief, S makes extensive use of updating by conditionalization. 2. 
Humans can’t make extensive use of updating by conditionalization, because it is too complicated for them. 3. Therefore, humans don’t reason with degrees of belief. Harman does not explicitly argue for the first premise of his argument, only for the second one. Here’s what he says: 14 One can use conditionalization to get a new probability for P only if one has assigned a prior probability not only to E [the evidence proposition], but to P & E. If one is to be prepared for various possible conditionalizations, then for every proposition P one wants to update, one must already have assigned probabilities to various conjunctions of P together with one or more of the possible evidence propositions and/or their denials. Unhappily, this leads to a combinatorial explosion, since the number of such conjunctions is an exponential function of the number of possibly relevant evidence propositions. In other words, to be prepared for coming to accept or reject any of ten evidence propositions, one would have to record probabilities of over a thousand such conjunctions for each proposition one is interested in updating. (Harman, 1986, p. 25-6) Thus, the idea behind premise 2 is that a reasoner would need to assign degrees of belief to far too many conjunctions of propositions in order to be prepared to employ conditionalization as an updating rule, which is supposed to show that reasoning with degrees of belief wouldn’t be manageable for humans even if they had explicit degrees of belief. 6 I will argue that we should reject this premise, as well as the first premise of the argument. The first premise of the argument assumes that if we reasoned with degrees of belief, we would have make extensive use of the conditionalization rule, i.e we would update our degrees of belief in the ideally rational manner. Harman is correct in pointing out that depending on the particular situation, conditionalization can be a process of considerable mathematical complexity. However, Harman does not seem to consider that our minds might use certain shortcuts or heuristics, i.e. procedures that are less complex than the ideal procedures, but that yield outcomes that are “good enough” most of the time. There is a large literature in psychology that investigates these kinds of heuristics, and it has produced credible evidence that our mind cuts corners in order to produce outcomes efficiently with the limited capacities it has. 7 Thus, even if ideal reasoning with degrees of belief requires updating by conditionalization, it does not follow that anyone who reasons with degrees of belief must always employ the conditionalization rule, or even employ it most of the time. A heuristic or simplified rule might be used instead. Moreover, the first premise also neglects the fact that much reasoning with degrees of belief is done without taking into account new evidence, so conditionalization is irrelevant in these cases. These are cases in which the reasoner forms a new credence on the basis of her existing credences, in combination with the rules of probability. Such cases surely count as 6 Harman assumes here that conditional probabilities are defined by the ratio formula: P(p|q) = P(p & q) / P(q). 7 One of the classic collections of papers in this topic is Kahneman, Slovic & Tversky (1982). Another one is Gilovich, Griffin & Kahneman (2002). There is some controversy about whether the heuristics people reason by produce bad results or results that are “good enough”. 
For our discussion, it doesn’t matter which perspective on this issue is more plausible. 15 reasoning, and they don’t require employing conditionalization. 8 We can conclude from this and the previous argument that the first premise should be rejected. The second premise of the argument states that making extensive use of conditionalization as an updating rule is too complicated for humans. As the passage I cite earlier shows, Harman believes that a) the amount of data required to update by conditionalization is too large for humans to cope with, and b) the reason why there is so much data to be handled is that reasoners must be prepared for all sorts of incoming evidence, which means that they must have vast numbers of different conditional degrees of belief. I will argue that both of these assumptions are questionable. Harman claims that reasoning with degrees of belief, and more specifically updating by conditionalization, would be too complicated for a normal human mind. Yet he never makes explicit what level of complexity he thinks the human mind can handle, and to what extent this level is exceeded by reasoning with degrees of belief. In the context of the principles he proposes as feasible, he appears to hold that the reasoning processes we actually employ cannot outstrip the capacities of our conscious reasoning and working memory. (c.f. Harman, 1986, Ch. 2, Ch. 4) However, not all cognitive processes that may be employed in reasoning are of the conscious, working-memory-based kind. There is a broad consensus in psychology that humans have two very different kinds of cognitive processing levels, or systems, which play a role in reasoning, decision-making, and social cognition. One type of processing is fast, effortless, automatic, and non-conscious. The other type is slow, effortful, controlled, and conscious. Both types of processing can tackle the same kinds of tasks, and sometimes deliver conflicting results. The automatic, non-conscious processing mechanisms are sometimes referred to as System 1, the controlled, conscious mechanisms as System 2. There is some controversy among psychologists as to whether they are actually two different cognitive systems in the mind that execute those different kinds of processing, or whether they are different modes of operation of the same underlying mental architecture, but those details don’t really matter here. 9 What matters for this argument is the fact that human beings have processing capacities that are independent of working memory, and can handle vastly more data than the conscious, controlled System 2 processes. System 1 processes can operate on the attitudes we have and generate new attitudes 8 Thanks to Alan Hájek for pointing this out. 9 See, for example, Frankish (2009); Evans (2008); Oaksford and Chater (2007); Evans & Over (2004); Sloman (1996). 16 without our conscious involvement. This is what happens when we infer a conclusion or make a decision without consciously applying any particular rule to the attitudes that constitute the starting points of our reasoning. Rather, our mind “spits out” a conclusion that we become aware of, but the generation of the conclusion happens automatically, and the reasoner is unaware of the exact process by which she reaches the conclusion. 10 Moreover, System 1 processes don’t require that we consciously call to mind every single attitude that is used as a starting point of reasoning. 
Thus, Harman might be right that we are bad at conscious, System 2-based probability math, because it requires too much working memory. However, that does not disqualify degrees of belief from playing an essential part in System 1 reasoning, because it can handle vastly more data (cf. Evans & Over, 1996, p.50). Harman also argues for the second premise by claiming that conditionalization requires the reasoner to be prepared for various kinds of incoming evidence, which means she would need to have assigned degrees of belief to a very large number of different conjunctions of evidence propositions in order to have the corresponding conditional credences. The point I made earlier about System 1 having large processing capacities independently of our working memory applies here, but furthermore, it is not clear why we need to be “prepared” for various kinds of incoming evidence. Harman is right that in order to update one’s credence in a proposition p by conditionalization, one needs to have a credence in the proposition conditional on relevant piece of evidence. However, it is not clear why Harman assumes that we need to have these credences before we even encounter the evidence. It would cut down the complexity of the task if we could just generate the relevant conditional credences on the fly as we encounter pieces of evidence that we need to update on. If this were the case, it would not be necessary to have stored degrees of 10 To give another example of this, consider language processing. When we understand an utterance, we are usually not aware of the semantic and pragmatic norms by which we infer what the speaker meant. Also, this kind of processing happens very quickly, and would be much slower, and probably not even feasible, if we had to consciously walk ourselves through applying Grice’s maxims in order to find out what our interlocutor was trying to communicate with her utterance. 17 belief for all possible types of evidence we might encounter. 11 As long as we come up with an explicit degree of belief when it is needed, there isn’t a problem. 12 We can illustrate this idea with a toy example. Suppose I am about to watch a horse race, and there are four horses competing that are named Aristotle, Bacon, Confucius, and Descartes. I am about to place my bets, and I have a degree of belief of 0.4 that Aristotle will win, a degree of belief of 0.3 that Bacon will win, a degree of belief of 0.15 that Confucius will win, and a degree of belief of 0.15 that Descartes will win. Then I learn from a trustworthy source that Aristotle definitely won’t win the race. Upon learning this, I need to update my degrees of belief accordingly, which means that for each horse, I need a conditional degree of belief that this horse will win, given that Aristotle won’t win, which is determined by the ratio formula. For example, my conditional degree of belief that Bacon will win given that Aristotle won’t win is Cr(Bacon wins & Aristotle doesn’t win) / Cr(Aristotle doesn’t win) = 0.3/0.6 = 1/2, and similarly for the other horses. Equivalently, I might realize that conditionalization requires that my updated degrees of belief must sum to 1 while preserving their relative weights before updating, which would also lead me to the correct updated credences of Cr(Bacon wins) = 0.5, Cr(Confucius wins) = 0.25, and Cr(Descartes wins) = 0.25. 
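The arithmetic in this example can be made fully explicit in a few lines of code. The sketch below is merely illustrative (the dictionary of credences and the helper function are labels of my own, not part of any formal proposal); it simply applies the ratio formula for conditional credences to the horse race credences just described, where, because the outcomes are mutually exclusive, Cr(h & Aristotle doesn't win) reduces to Cr(h) for every horse other than Aristotle.

# Illustrative sketch: conditionalizing the horse race credences on the
# evidence that Aristotle won't win, using Cr(h | not-A) = Cr(h & not-A) / Cr(not-A).

credences = {"Aristotle": 0.4, "Bacon": 0.3, "Confucius": 0.15, "Descartes": 0.15}

def conditionalize_on_loser(credences, loser):
    # Credence in the evidence, e.g. Cr(Aristotle doesn't win) = 1 - 0.4 = 0.6.
    cr_evidence = 1.0 - credences[loser]
    return {horse: (0.0 if horse == loser else cr / cr_evidence)
            for horse, cr in credences.items()}

print(conditionalize_on_loser(credences, "Aristotle"))
# {'Aristotle': 0.0, 'Bacon': 0.5, 'Confucius': 0.25, 'Descartes': 0.25}

As the output shows, the updated credences sum to 1 while preserving the relative weights of the remaining horses, exactly as described above.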
If Harman were right, then I would have needed to have the relevant conditional credences all along in order to be able to update my credences in this way, even before I had even considered the possibility of Aristotle definitely not winning. Yet it seems very implausible that I had these conditional credences all along. Moreover, it is also implausible that I needed to have these credences all along in order to be able to update my credences by conditionalization. As the example shows, I can simply generate the credences I need for updating once I encounter the relevant evidence. While this is an example in which the unconditional probabilities logically determine the conditional probabilities, there could also be cases in which this is not so. Suppose I am watching a Formula 1 race, which is currently in the 34 th round, and, based on the drivers’ current 11 In their paper “On the provenance of judgments of conditional probability”, Zhao, Shah & Osherson elicit judgments of conditional probability from subjects in different kinds of experiments. One way they do this is by making subjects directly estimate the conditional probability of some unfamiliar event. Given that the subjects in their experiments seem to be readily able to do this, it seems not unreasonable to think that people can generate conditional credences ‘on the fly’ when they are needed for updating. 12 As Alan Hájek has pointed out to me, coming up with the relevant conjunctions of propositions that figure in the ratio formula for conditional probability might often be a lot simpler than Harman assumes, for example when they can be determined by applying some kind of indifference principle. This further undermines his claim that employing conditionalization in reasoning is too complicated for human beings. 18 performance and positions, I have some specific credence distribution regarding the places in which each driver will finish. Then I hear that Jenson Button, who is currently in second place, is warned by his team that he must slow down lest he will run out of fuel. Prior to receiving this information, I had a 0.8 credence that he would finish in second place. Yet, upon learning about his fuel problems, my credence that he will finish in second place drops to 0.05. If Harman were right, then I would have needed to have the relevant conditional credence all along in order to be able to update my credences in this way. Even before I considered the possibility of Button’s fuel problems, I would have needed to have a conditional credence of 0.05 that Button would finish in second place given that he had to slow down from the 34 th round on because of a fuel shortage. And I would have had to have analogous conditional credences for all the other drivers and possible problems they might encounter. Yet, it seems very implausible to assume that I already had all the relevant credences. Moreover, there seems to be no reason to deny that I can make up the needed conditional credences on the fly once I learn that Jenson Button has fuel problems. However, the relevant conditional credences might not be so straightforwardly logically determined by my unconditional credences as they were in the previous example. Yet, I can come up with the relevant credences by drawing on some very general information about how Formula 1 racing works, from which I can easily reason to the relevant conditional credences once I need them for updating. 
Given that I know how many rounds the race has left, that the race cars behind Button will have a speed advantage over him, that it is unlikely that all drivers behind him will have to give up, and that it is unlikely that the race will be finished behind the safety car (which would prohibit anybody from overtaking), I can easily see that my confidence that Button will finish in second place given his fuel problems should be very low. In order to come to this conclusion, I certainly need some kind of capacity to apply general knowledge to a particular case, but since it is hard to deny that humans possess this skill, it seems unproblematic to appeal to it in my argument.

Footnote 13: As an anonymous reviewer points out, it is not entirely clear whether this is a case in which we make up conditional credences on the fly, or a case in which we make up the posterior (i.e. updated) credences on the fly. Yet, since the agent would have to reason from her general knowledge about car racing to the relevant credences in either case, I don't need to take a stand on whether such reasoning proceeds always in one of these ways rather than the other.

This is not to say that it is always possible to generate a conditional credence on the fly when it is needed. There might be cases in which a reasoner simply lacks the relevant knowledge needed to figure out the appropriate conditional credence for the predicament she is in. Yet, the recognition that figuring out conditional credences on the fly might not produce ideal results in every case does not tell against the idea that this could be the strategy humans often employ when they update their degrees of belief. This is of course also not to say that we always have to generate conditional credences on the fly. My argument does not preclude the possibility that people are prepared for certain kinds of evidence in Harman's sense. Rather, I claim that it is possible for reasoners to generate conditional credences on the fly, which means that we should not accept Harman's preparedness assumption, which is his main support for the second premise of the complexity argument. I have shown that there is a way in which humans could update their degrees of belief via conditionalization that does not require the vast numbers of representations that Harman thinks we need.

We have seen that both premises of Harman's complexity argument are problematic. His first premise, which states that subjects who reason with degrees of belief would have to make extensive use of conditionalization, does not take into account that humans who reason with degrees of belief might do so by employing heuristics and shortcuts instead of the conditionalization rule. It also fails to acknowledge the possibility of reasoning with degrees of belief that doesn't appeal to conditionalization because it is not based on new evidence. His second premise, which claims that making extensive use of conditionalization would be too complicated for humans, rests on at least two problematic assumptions: the assumption about computational capacity and the assumption about being 'prepared' for conditional updating. My discussion has shown that our resources for computation are not as limited as Harman assumes, because System 1 processes can operate with degrees of belief in ways that are not constricted by the limits of our working memory.
Furthermore, even if we had to use the conditionalization rule in updating, this would not be as problematic as Harman assumes, because we could generate the relevant credences on the fly, rather than carrying them around with us all the time in order to be prepared for all sorts of possible incoming evidence. Thus, even though we can concede to Harman that human agents don’t have the cognitive capacities necessary to reason with degrees of belief in an ideally rational manner, this does not mean that degrees of belief cannot play a role in human reasoning at all. 20 2.3 The Intentionality Argument In the previous section, I argued that we should not underestimate the ability of the mind to execute complex reasoning processes. I pointed out that human cognitive processing can either operate in a conscious, controlled way (System 2), or in an unconscious, automatic manner that is not constricted by working memory (System 1). The latter mode of processing can handle vastly more data than the former, and has the capacities needed for processing degrees of belief. However, some philosophers think that anything carried out by System 1 should not be dignified with the name reasoning. A number of philosophers who have offered accounts of reasoning claim that it is an intentional, active process (e.g. Grice, 2001; Broome, 2009; Raz, 2010). For example, Paul Grice holds a view of reasoning according to which the reasoner intends the production of his conclusion to be based on her premises in some particular rule-governed way: [...]reasoning is typically an activity, with goals and purposes, notably the solution of problems. [...] we may think of the reasoner as intending his production of the conclusion to be the production of something which is an informal consequence of his premiss (premisses), a state of affairs which is evidently distinguishable from merely thinking that a certain proposition is, somehow or other, informally derivable from a given set of propositions. (Grice, 2001, p.27) A relevantly similar view of reasoning is defended by John Broome in his book manuscript Rationality through Reasoning. He rejects what he calls the “jogging model” of reasoning, because he thinks that it is incompatible with his view that reasoning is an active process. According to the jogging model, one can call some premise-attitudes to mind, which then sets off an automatic process that produces a conclusion. He states that if reasoning worked like this, it “would scarcely be an act of yours. Most of it would not be done by you, all you would do is call the premises to mind. Reasoning would mostly be a passive process, which sometimes needs a jog. But intuitively there is more to reasoning than that.” (Broome, 2009, p. 232) Instead, he endorses a view of reasoning according to which it is “[...] a process in which you say to yourself the contents of your premise-attitudes, you operate on them by applying a rule to construct a conclusion, which is the content of a new attitude of yours that you acquire in the process.” (2009, p. 290, my emphasis, see also a slightly different version of the definition on p. 241) It is evident that both of these views of reasoning require that reasoning is an active process in which the reasoner intends to produce a particular conclusion in a particular way. 
21 However, this is hardly compatible with System 1 processing, since mental processes that work in this way don’t need to be intentionally initiated by the subject, and the subject does not monitor or have access to the way the conclusion is generated. We can capture Grice’s and Broome’s line of thinking in the following argument: 1. Genuine reasoning is an active, intentional process. 2. If so-called "reasoning with degrees of belief" were carried out by System 1, it would not be an active, intentional process. 3. Therefore, if so-called "reasoning with degrees of belief" were carried out by System 1, it would not constitute genuine reasoning. I will argue that there are strong reasons to reject both premises of this argument. The problem with the first premise is that it is not plausible that all reasoning is an active, intentional process, if we mean by this that it can’t be automatic. There are simply too many examples that we would intuitively classify as cases of reasoning, but that would be excluded by the account in question. It often happens that we learn something new, for example by testimony or by observation, and we automatically infer certain new beliefs from what we’ve just learned without intending to draw, or initiating these inferences. Here’s just one case involving outright beliefs to illustrate this type of case: Suppose you just spoke to your friend Waltraud, who told you that her fiancé Gottlob is out of town for a business trip for a few days. The next day you happen to talk to your mutual friend Franz on the phone, who mentions in passing that he saw Gottlob the night before with a woman who wasn’t Waltraud in a dingy little restaurant a few hours outside the city. Based on your friend’s testimony, you form the belief that Gottlob was at the restaurant with another woman, and you immediately infer from this that he is lying to Waltraud. You also infer that the “business trip” was just an excuse Gottlob made up to spend time with the other woman. It seems very natural to think that your inferences constitute reasoning. You start out from an initial belief – that Gottlob was at the restaurant with another woman – and the beliefs you form subsequently are inferred from it and some other background information. However, the actual inferences were drawn automatically. Upon acquiring the initial belief based on testimony, your mind simply “spat out” the inferred beliefs. It seems wrong to say that your inferences were 22 intentional activities in the sense employed in the first premise. You drew these inferences automatically, without monitoring or initiating the application of some inference rule or strategy. There is no sense in which you “set out” to draw these inferences from your original belief, and you didn’t form the intention to do so. That your friend’s fiancé is lying was just a natural thing to conclude when you came to believe that he was at the restaurant with another woman, but the inference was not something you needed to initiate. This example illustrates the more general observation that it often happens that we learn some proposition p from observation or testimony, and we infer some proposition q from p (or p and some background beliefs) without ever asking ourselves whether q, or intending to infer q from p. Yet, according to the view that all reasoning is an active, intentional process, the mental processes in the example don’t constitute reasoning, and neither do any other inferences that work similar to those in the example. 
On this view, reasoning is something we rarely do, because it is an active process in which the reasoner intends to produce a particular conclusion in a particular way. But this latter view is in conflict with our ordinary views of what reasoning is, and moreover, it leaves us with the puzzle of how to classify those ubiquitous automatic inferences that surely look like cases of reasoning, but aren’t reasoning according to this view. This is not to say that reasoning is never an active, intentional process. For example, I might be executing a proof in a new proof system whose rules I have just learned, and in drawing each inference, I deliberately set out to apply a certain rule of the system to reach a particular conclusion. The important point here is that not all reasoning processes are intentional in the relevant sense, because some of them involve inferences that are drawn automatically. Claiming that none of these automatic processes constitute reasoning leads to an untenable view according to which we very rarely engage in reasoning processes. It is therefore implausible to characterize reasoning as an active, intentional process in the sense that it can’t be an automatic process. The second premise of the argument is questionable as well. The authors mentioned above endorse the second premise of the argument because they have a very specific view of what it means to be an active, intentional process. They think that automatic processes of the kind executed by System 1 don’t fit this description. However, it is not clear that this is the correct way of understanding what it means for a process to be active and intentional. For example, it seems very natural to describe speaking and driving as active, intentional processes. Yet, when we speak and drive, much of what we do is executed automatically, and does not need to be initiated by 23 forming a particular intention. In order to be able to describe these processes as intentional activities, we could plausibly adopt a wider conception of what an active, intentional process is. Then we could have an account of reasoning according to which reasoning can be both automatic and intentional, which would be compatible with the possibility of reasoning with degrees of belief. I am sympathetic to this view, but I won’t defend it here. Conclusion I started my paper by pointing out that currently there is no worked-out theory of reasoning with degrees of belief to be found in the philosophical literature. Such an absence would make sense if reasoning simply couldn’t involve degrees of belief. After presenting the case in favor of the possibility of reasoning with degrees of belief, I discussed several arguments for the conclusion that degrees of belief cannot play a role in reasoning. Harman’s explicitness argument turned out to be flawed because it relies on an implausible account of the nature of degrees of belief. His complexity argument is based on three assumptions: 1) the no-heuristics assumption, 2) the computational capacity assumption, and 3) the ‘preparedness’ assumption about updating. None of these assumptions turned out to be plausible. The intentionality argument, which was supposed to show that automatic (System 1) processes involving degrees of belief can’t be genuine reasoning, turned out to rest on an implausible notion of what constitutes an active, intentional process. Moreover, even granted this notion, the argument failed to correctly capture certain processes that intuitively constitute reasoning. 
Thus, at least as far as these arguments are concerned, it seems like there is no good reason why the topic of reasoning with degrees of belief has received so little attention. Any plausible theory of reasoning needs to include degrees of belief among the attitudes that can be involved in reasoning processes, and it needs to explain which principles govern reasoning with degrees of belief. 24 Chapter Two: Is Subjective Bayesianism a Theory of Reasoning? Introduction Many philosophical discussions that are concerned with logic and reasoning are not careful in distinguishing between a logical system and a theory of reasoning. However, as Gilbert Harman has pointed out in his book Change in View, there are important differences between a logical system like classical deductive logic, which is a formalization of truth-functional relations between sets of statements, and a theory of reasoning, which is a theory of how agents should rationally go about forming and revising beliefs. As Harman shows, not every inference licensed by deductive logic is a rational way for an agent to reason, and not every rationally permissible belief revision or formation is captured by classical deductive logic. While Harman’s discussion mainly concerns the relationship between the system of classical logic and norms of reasoning for full beliefs, I will focus in this paper on the relationship between the probability calculus and norms of correct reasoning for partial beliefs, or degrees of belief. If we try to answer the question what the correct norms of reasoning are that apply to reasoning with degrees of belief, it seems to be a plausible strategy to look towards the probability calculus for answers. In the philosophical literature, the theory of subjective Bayesianism seems to be executing just this kind of project, namely generating norms of reasoning and rationality by applying the probability axioms to degrees of belief. In discussing subjective Bayesianism, I will argue for two claims: First, there is an interesting parallel concerning the relationship between deductive logic and norms of reasoning for full belief on the one hand, and the probability calculus and norms of reasoning for partial belief on the other hand. I will show that the same kinds of reasons for which deductive logic cannot be a theory of reasoning also apply to the subjective Bayesian interpretation of the probability calculus. Secondly, I will argue that these observations raise some interesting questions about exactly what kind of theory subjective Bayesianism is intended to be. In the literature, some authors argue that it is a theory of reasoning and epistemic rationality for ideal agents, whereas other authors see it as more akin to a logical system. My discussion will bring out the difference between these two views, and I will argue that subjective Bayesianism can be seen at best as an incomplete theory of reasoning and rationality for ideal agents. Viewing subjective Bayesianism as a kind of logical system seems like an interesting alternative view; however, the proponents of 25 such a view would have to supplement it with an explanation of the precise sense in which subjective Bayesianism can be seen as a logical system. I will begin by explaining the rules of the probability calculus, and how one might generate norms of reasoning on their basis. 
Then I show that the familiar problems that drive a wedge between classical deductive logic and a theory of reasoning for full belief equally apply to the connection between the degree of belief interpretation of the probability calculus that constitutes subjective Bayesianism, and a theory of reasoning for degrees of belief. I go on to show that even if we take subjective Bayesianism to be a theory of reasoning that only applies to ideally rational agents, it still cannot give us everything we want from a theory of reasoning with degrees of belief. 1. Basics: The Probability Calculus In this section, I will briefly state the central tenets of the probability calculus, and explain how they could be used to generate norms of reasoning for degrees of belief. The probability axioms can either be applied to events or statements. For the purposes of my discussion, I will choose the latter framework. To set up the probability calculus, we begin with a set of atomic statements {Ai}, and we will combine it with the standard sentential logical operators to define a language L. We will also assume that the relation of logical entailment |= is defined in the classical way. A probability function on L must satisfy the following axioms: Normalization: For any statement A, if A is a tautology, P (A) = 1 Non-Negativity: For any statement A, P (A) ≥ 0 Finite Additivity: For any two mutually exclusive statements A, B, P (A ∨ B) = P (A) + P (B) Conditional Probability: For any two statements, P (B|A) P(A) = P (B & A) Moreover, there is a rule for updating probabilities in light of new evidence. Conditionalization: When new evidence A becomes available, the new probability assigned to any statement B is the previous probability of B conditional on A. 26 So: When A has been added to the body of evidence, Pnew (B) = Pold (B|A) 14 Many further rules of probability can be derived from the standard axioms, for example Bayes’ theorem, and rules for computing the probabilities of logically complex statements. These axioms have often been seen to play a crucial role in finding an answer to the question what constraints there are on rational degrees of belief. In a recent survey article on different interpretations of probability, Weisberg offers the following characterization of the degree of belief interpretation of the probability axioms: On this view, the force of the probability axioms [...] is that of constraints on rational belief. [...] Many notable thinkers in this tradition have thought that this rational force is on a par with the rational force of the rules of deductive logic – that the probability axioms provide the rules for degrees of belief just as the laws of deductive logic provide the rules of full belief.” (Weisberg, 2011, p. 14) One early example of a thinker who endorses the view that the probability axioms provide norms of rationality for degrees of belief is Keynes. In the first chapter of his Treatise on Probability, he says: The Theory of Probability is logical, therefore, because it is concerned with the degree of belief which it is rational to entertain in given conditions, and not merely with the actual beliefs of particular individuals, which may or may not be rational. Given the body of direct knowledge which constitutes our ultimate premisses, this theory tells us what further rational beliefs, certain or probable, can be derived by valid argument from our direct knowledge. (Keynes, 1921, p. 
3) Keynes states a very natural idea here: a rational agent who has degrees of belief in a set of propositions should draw inferences according to the rules of the probability calculus. For example, an agent who has a degree of belief of 0.3 in the proposition that her friend John is 14 Some philosophers propose to replace the standard version of the conditionalization principle with Jeffrey Conditionalization. The standard conditionalization rule assumes that when new evidence is learned, it becomes certain, i.e. it is assigned probability 1. However, it has often been thought that this condition is too strong, because one doesn’t always become certain of evidence propositions. An alternative variant of the conditionalization rule takes this consideration into account, since it allows for evidence that is not certain: Jeffrey Conditionalization: When an observation bears directly on the probabilities over a partition {A i}, changing them from P old (A i) to P new(A i), the new probability for any proposition B should be: P new (B) = ∑ P old (B | A i) P new (A i) Thus, Jeffrey conditionalization, unlike standard conditionalization, can accommodate situations where an agent’s evidence shifts, but no evidence proposition comes to be known with certainty. Conveniently, standard conditionalization is a special case of Jeffrey conditionalization, where the relevant partition is an evidence proposition and its negation, and the probability of the evidence proposition shifts to 1. Nothing in my arguments rests on whether we adopt standard conditionalization or Jeffrey conditionalization. 27 currently in San Francisco, and a degree of belief of 0.3 that John is in Los Angeles, may rationally infer a degree of belief of 0.6 in the proposition that John is either in San Francisco or in Los Angeles. Similarly for cases in which new evidence is learned: if our agent has a conditional credence of 0.9 in the proposition that John is in Los Angeles, given that John says on Facebook that he is in Los Angeles, then it would be rational for him to update his degree of belief in the proposition that John is in Los Angeles to 0.9 upon reading a facebook post by John that says so. These examples seem to suggest that the probability axioms, when combined with an agent’s current degrees of belief, provide the agent with norms that prescribe how the agent should reason, i.e. form new credences or change them in light of new evidence. We can capture this idea in its general form by proposing the following simple bridge principle, which says that rational agents should only make probabilistic inferences that are in agreement with the rules of probability: (SPBP) Simple Probabilistic Bridge Principle: It is rationally permissible for an agent to draw some probabilistic inference I if and only if the inference I follows the rules of the probability calculus. An agent draws a probabilistic inference just in case she has credences that serve as the starting point of reasoning, and she forms a new credence or changes a credence in some proposition on the basis of these initial credences. An inference follows the rules of the probability calculus if the agent’s existing credences together with the probability axioms prescribe (or at least permit) the new credence that the agent adopts. 
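As a minimal sketch, with invented numbers, the arithmetic behind the two examples above about John's whereabouts can be written out explicitly (all names and figures here are purely illustrative):

# Toy credence function over a mutually exclusive and jointly exhaustive
# partition of John's whereabouts; the numbers are invented for illustration.
prior = {"SF": 0.3, "LA": 0.3, "elsewhere": 0.4}
assert abs(sum(prior.values()) - 1.0) < 1e-9   # credences over a partition sum to 1

# Finite Additivity licenses the disjunctive inference:
p_sf_or_la = prior["SF"] + prior["LA"]          # 0.6

# Conditionalization: on learning the evidence E (John's post), the new
# credence in "John is in LA" is the old conditional credence P(LA | E).
p_la_given_E = 0.9
posterior_la = p_la_given_E                      # Pnew(LA) = Pold(LA | E) = 0.9
print(p_sf_or_la, posterior_la)

The point is simply that the agent's existing credences, together with the axioms and the conditionalization rule, fix (or at least constrain) the new credences she may adopt, and this is the idea that (SPBP) captures in general form. 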
The formulation is supposed to capture both inferences that involve learning something new, as well as inferences by means of which agents assign degrees of belief to propositions they hadn’t considered before, which can happen in the absence of learning anything new. This seems like a natural way of generating principles of reasoning for degrees of belief. In essence, the idea is that the probability calculus, applied to degrees of belief, just is a theory of norms of reasoning for degrees of belief. However, there is reason to be worried about this sort of theory, because it has been forcefully argued that a similar view is not plausible in the case of deductive reasoning. Gilbert Harman, in his book Change in View, has made a convincing case that 28 we cannot generate principles of deductive reasoning by combining standard deductive logic with a simple bridge principle that is analogous to (SPBP). I will present his arguments in the next section, and consider whether similar arguments apply to the probabilistic case. 2. Logic vs. Reasoning Just like in the probabilistic case, it is a natural thought that there is a close connection between deductive logic and deductive reasoning. For example, in discussing the question what logic is, Graham Priest says: “But just as with dynamics, so with logic, one needs to distinguish between reasoning, or better, the structure of norms that govern valid/good reasoning, which is the object of study, and our logical theory, which tries to give a theoretical account of this phenomenon.” (Priest, 1987, pp. 257-8, cited after Tanaka, 2003, p.31) This view makes it seem like logical systems tell us how we should reason, by giving a formal account of the rules of good reasoning. However, as a quick look at a standard way of defining a system of classical logic shows, logic is not directly about reasoning. A logical system does not contain any norms of the form “It is rationally required/permitted for an agent to reason in ways X, Y and Z if and only if such and such is the case.” Rather, it contains rules that define well-formed formulas in the system, and it also defines what a proof is and what a permissible transition in a proof must look like. However, we can capture the hypothesis that a logical system provides norms of reasoning by proposing a simple bridge principle, which is analogous to the simple bridge principle for the probabilistic (SPBP). (SDBP) Simple Deductive Bridge Principle: It is rationally permissible for an agent to draw some deductive inference I if and only if the inference I is valid according to the rules of deductive logic. An agent draws a deductive inference just in case she has beliefs that serve as the starting point of reasoning, and she then forms or revises a belief in some proposition on the basis of these initial beliefs. An inference follows the rules of deductive logic if the agent’s existing beliefs together with the rules of deductive logic permit the derivation of the belief that the agent adopts. Stating the principle as a biconditional captures the idea that a deductive logical system just is a theory that provides norms of deductive reasoning, rather than only partially contributing to an account of 29 the correct norms of deductive reasoning. The passage from Keynes that I cited in the previous section makes it clear that Keynes endorsed it both for the probabilistic and the deductive case, because he talks about probabilistic as well as certain inferences. 
However, as Gilbert Harman has shown, a simple bridge principle like (SDBP) is hopelessly implausible. (Harman, 1986) There is a much wider gap between a logical system and a theory of reasoning than (SDBP) assumes, and there is no easy way of stating the connection between the former and the latter. We can find counterexamples to both directions of the biconditional: there are logically valid inferences that make for rationally impermissible inferences, and there are also rationally permissible inferences that don’t constitute logically valid inferences. Generally speaking, there are three conditions that give rise to mismatches between epistemically permissible and logically valid inferences: 1) agents having irrational or inconsistent beliefs, 2) agents having limited cognitive capacities, and 3) cases of rational non-rigidity in an agent’s beliefs. For each of these conditions, I will first explain how it applies to the deductive case, drawing on Harman’s discussion, and then I will show how there is a similar phenomenon in the probabilistic case – a parallel which, to my knowledge, has not been pointed out. Harman illustrates the first condition with the following case: One way of having irrational beliefs is to have beliefs in contradictory propositions. It is a theorem of classical logic that anything follows from a contradiction. If (SDBP) were correct, we should expect it to be rationally permissible to infer any belief whatsoever from such a pair of contradictory beliefs. However, it is obviously not a rule of good reasoning to infer whatever we want from a pair of contradictory beliefs. If an agent has contradictory beliefs on some issue, she should probably refrain from making any inferences at all based on them, and try to resolve the conflict. (Harman, 1986, p. 6) Alternatively, an agent can also have irrational beliefs that are logically consistent. Suppose, for example, that an agent irrationally believes that she has been abducted by aliens, and that anyone who has been abducted by aliens should assassinate the president. Her other beliefs, which may or may not be irrational, are consistent with these beliefs. The standard deductive logical system has a built-in cumulativeness assumption, which means that once a set of premises has been determined for a proof, the rules of logic don’t allow that any of the premises get dropped or revised. The rules only allow for inferences to be drawn from the initial set of premises, and thus beliefs can only be added, never subtracted. Thus, even though it seems most 30 rational for the agent to revise her irrational beliefs, the rules of deductive logic don’t contain any subtraction rules that would allow her to do so. However, it is important to notice in this context that not all cases in which an agent has logically inconsistent beliefs are cases in which the agent’s beliefs are rationally defective. There also seem to be cases where it is rationally permissible to have inconsistent beliefs, even though it is not permitted by the rules of logic. A famous example of this kind is the preface paradox: suppose an author has written a long and detailed book about some historical event. She has researched all the facts very diligently, and she rationally believes every individual claim she makes in the book. However, at the same time, it seems perfectly rational for her to believe that her book contains some errors. 
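A rough calculation, with invented numbers and the simplifying assumption that the individual claims are probabilistically independent, shows why the author's modest attitude seems reasonable:

# Illustration only: even if each individual claim is very probably true,
# a long book very probably contains at least one error.
n_claims = 300
p_each_true = 0.99

p_no_errors = p_each_true ** n_claims    # roughly 0.049
p_some_error = 1 - p_no_errors           # roughly 0.951
print(round(p_no_errors, 3), round(p_some_error, 3))

Even with very high confidence in each individual claim, confidence that the book is entirely error-free should be low. 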
Having this combination of beliefs is not permitted by the rules of deductive logic, because from a logical point of view, the belief about the book containing errors is inconsistent with the set of individual beliefs she has. Yet it is widely accepted that it is rational for her to believe that her book contains errors. Thus, it is not always the case that an agent who holds logically inconsistent beliefs is rationally defective (Harman, 1986, p. 15; Christensen, 2004). Let's now consider whether we can find the same type of counterexamples to the Simple Probabilistic Bridge Principle (SPBP). In the deductive case, we had examples of agents being irrational in virtue of having logically inconsistent beliefs, and agents being irrational in virtue of having individual beliefs that were irrational. We can find the same two kinds of cases in the probabilistic case – agents who are irrational in virtue of having incoherent degrees of belief, and agents who have irrational degrees of belief without thereby being incoherent. (Footnote 15: While the deductive bridge principle is also subject to the preface paradox, we don't find the same problem in the probabilistic case. See Christensen (2004) for a discussion of why the preface paradox arises in the deductive, but not the probabilistic case.) For an example of the former kind, suppose an agent has degrees of belief that violate the probability axioms in some way. For example, an agent may have a credence of 0.4 in each of three mutually exclusive and jointly exhaustive propositions. These credences are incoherent, because they don't sum to 1, as credences in a partition of propositions should according to the axioms of probability. Plausibly, there is some rationally permissible reasoning strategy by which the agent may adjust her credences in order to remove the incoherence. However, the probability calculus doesn't apply to this case, because it is silent about how to adjust incoherent credence functions. Thus, we have a case of a rationally permissible (or even required) revision of an agent's credence function, for which there is no corresponding rule that can be derived from the probability axioms. We can also give examples of the latter kind, where an agent has irrational, but coherent credences. The probability calculus, just like standard deductive logic, assumes that all probabilistic inferences are cumulative, even though it is often rationally permitted or required to reason in a non-cumulative way. One way in which the system is cumulative concerns changes to the agent's initial credence assignments, or priors. Once the priors are fixed, the Bayesian framework allows no changes to them other than via the conditionalization principle, on pain of irrationality. However, it sometimes seems reasonable to change one's priors without using conditionalization, for example when one's priors are irrational (though not necessarily incoherent). Suppose that, for some reason, I have an initial conditional credence of 0.99 in the proposition that I am an iced cappuccino, given that my neighbor says so. If my neighbor then happens to tell me that I am an iced cappuccino, updating by conditionalization would lead me to be almost certain that I am an iced cappuccino. But the rational thing for me to do would certainly not be to become almost certain that I am an iced cappuccino. Rather, I should revise my conditional credence and not update on it. The same thing can happen for assignments of unconditional credences. 
If an agent realizes that one of her unconditional credences is irrational or unjustified, the rational thing for her to do might be to change it, even though this is not a permitted revision procedure according to the rules of the probability calculus. A second source of problems for (SDBP) that is implied by Harman’s discussion concerns the fact that human agents have limited cognitive capacities, and so certain entailment relations are difficult for them to see. For example, a difficult mathematical truth, like Fermat’s last theorem, is logically entailed by some very basic axioms, but we would consider it irrational if an agent believed Fermat’s last theorem simply on the basis of believing these basic axioms, without really understanding the relationship between the theorem and the axioms. Rather, it would be rational for the agent in this case to withhold belief in Fermat’s last theorem, even though it is entailed by the agent’s other beliefs. This example points to a more general problem with formulating principles of reasoning. The fact that agents (at least agents who are not ideally rational) only have limited computational resources leads many philosophers to think that principles of reasoning have to “fit” the capacities of these agents in ways that a logical system doesn’t need to concern itself with. If the capacities of real agents are significantly more restricted than the capacities that are required to reason according to rules that directly correspond to logical principles, and principles of reasoning need 32 to fit those limited capacities, then these principles might end up looking very different from the principles that correspond to some logical system. Again, while Harman is concerned with the case of deductive reasoning, it is easy to see that the same concerns apply to the case of reasoning with degrees of belief. The probability axioms require that an agent must assign credence 1 to tautologies, and credence 0 to logical contradictions. Yet, there seem to be many cases where doing so would not seem rational, because the agent may lack evidence that determines the logical status of a formula. For example, before having come up with a proof for some formula that she suspects to be a theorem, it might be rational for a logician to have a credence lower than 1 in that theorem. In fact, it would be irrational for her to assign the theorem a credence of 1 before proving it, given that she has no basis for being certain that it is in fact a theorem. And even after coming up with a proof, the logician may still rationally be less than fully certain of the theorem, since she cannot completely rule out that there is a mistake in the proof. The reason why it seems unreasonable for human agents to meet the standards set by the probability calculus is because they have limited cognitive powers. They are not in a position to make complicated calculations easily, and they don’t have an immediate grasp of complex logical relations and properties. Thus, it seems rationally permissible for people to assign non-extreme credences to certain logical formulas, because it reflects the fact that they have a limited grasp of them. Similar arguments can be made with respect to other calculations that are too complicated for humans to execute, but that are required according to the probability calculus. 
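To give a sense of how quickly such requirements outrun finite computational resources, here is a toy sketch (the helper function is made up for illustration): checking whether a sentence is a tautology by brute force requires inspecting every assignment of truth values to its atomic sentences, and the number of assignments doubles with each additional atom.

# Toy brute-force tautology check: 2**n truth assignments for n atoms.
from itertools import product

def is_tautology(formula, atoms):
    """formula maps a dict of truth values to True/False."""
    return all(formula(dict(zip(atoms, values)))
               for values in product([True, False], repeat=len(atoms)))

print(is_tautology(lambda v: v["A"] or not v["A"], ["A"]))    # True: a tautology
print(is_tautology(lambda v: v["A"] or v["B"], ["A", "B"]))   # False: not a tautology
print(2 ** 50)   # number of assignments to inspect for a 50-atom sentence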
More generally, it has often been taken to be a decisive objection against generating principles of reasoning by applying the probability axioms to degrees of belief that the resulting norms of rationality don’t “fit” the cognitive capacities of human agents, since the norms are way too demanding for them. Another well-known problem that arises because of the cognitive limitedness of humans is the so-called problem of old evidence. 16 In a nutshell, the problem is that according to the rules of the probability calculus, evidence that is already known cannot confirm anything, whereas in reasoning, evidence that is already known can play an important role in supporting a theory. The reason why evidence that is already known cannot confirm anything according to the rules of probability is that if the probability of some evidence proposition E is already 1, the probability of 16 For this take on the problem of old evidence, see Glymour (1980), Garber (1983), and Eels (1985). 33 any hypothesis H conditional on E is equal to the unconditional probability of H. This means that conditionalizing on E leaves the probability of H unchanged. However, in actual human reasoning there are many examples where previously known evidence provides strong support for a theory. In these cases, an agent updates her credences based on previously known evidence, and thereby induces a rational shift in her credences that violates the rules of probability. A famous example of this is Einstein’s general theory of relativity. When Einstein showed that his theory of general relativity correctly predicted the observed perihelion shift of Mercury, this was taken to lend major support to Einstein’s theory. His theory was able to account for a well-known fact that had caused major problems for previous accounts of planetary motion. In this case, even though the facts about the perihelion shift were old evidence, they lent crucial evidential support to Einstein’s theory, contrary to the rules of probability. The underlying cause that put Einstein in this position was that – because of his imperfect computational capacities – he was unable to recognize when he first came up with the theory of general relativity that it entailed the perihelion shift. If he had recognized this right away, he could have taken it into account when he first assigned a credence to the view, and not just later when he actually became aware of this logical relation. Thus, cognitive limitations can have the effect that agents don’t immediately see the logical relation between a theory and a piece of evidence, which means that once they recognize the connection, they must shift their credences in a non-probabilistic way. The third condition that generates mismatches between rationally permissible and logically valid inferences is constituted by instances of what may be called “rational non-rigidity”. These are cases in which an agent rationally doesn’t hold certain things fixed that the logical system assumes to be fixed. I already mentioned that standard deductive logic does not allow for revisions to premises once they have been accepted. In the example discussed earlier, the agent should have revised one of her premise-beliefs because it was irrational. However, there are also cases where it is rational for an agent not to stick to a premise-belief, and where that premise- belief is not initially irrational. 
For example, I might see a person at the farmer's market who looks like my friend Ben, so I form the belief that Ben is at the farmer's market. Forming a belief in this way based on visual evidence seems rationally unobjectionable. Then my cellphone rings, and Ben is on the phone. He tells me that he is on vacation in Hawaii. It is certainly an open possibility at this point for me to conclude that my farmer's market is in Hawaii, or that Ben is lying. However, I am not rationally obligated to make either one of these inferences. Rather, I can rationally revise my earlier belief that Ben is at the farmer's market, and I conclude that the person I saw was not him but someone who looks very similar. Thus, based on learning new information, I reject a belief that I accepted earlier as a premise. Reasoning in this way doesn't seem rationally objectionable, yet it is not cumulative like a logical deduction (Harman, 1986, p. 4). We can also find examples of rational non-rigidity in the domain of reasoning with degrees of belief. These are cases where an agent rationally updates her credences in a way that deviates from the prescribed probabilistic procedure, which requires that the agent doesn't update her credences except by conditionalization or Jeffrey conditionalization. As Jonathan Weisberg (2009) has pointed out, the standard Bayesian updating procedures are unable to capture how an agent should update her credences when she encounters an undercutting defeater for her evidence. He illustrates this problem with the following example: Suppose you, having standard background beliefs, enter a room, and you see what seems to be a red jellybean on the table. Then you notice that the room appears to be illuminated by red-tinted light. The rational way of updating your credences in this case plausibly goes as follows: first, when you enter the room, you should become more confident that the jellybean is red, since that is what your experience tells you. Moreover, you should consider the color of the jellybean to be independent of the lighting conditions in the room. (Footnote 17: Of course, there could be a case in which you have antecedent information about some connection between the color of the jellybean and the condition of the lighting. But this is not such a case.) After all, what color the jellybean actually is has nothing to do with the lighting conditions of the room it happens to be stored in. Furthermore, if you considered them to be dependent, then increasing your credence that the jellybean is red would also lead you to increase your confidence that the lighting is normal, which would be an objectionable form of bootstrapping. So, becoming more confident that the jellybean is red should leave your confidence that the light is normal unchanged. Then, when you notice the red-tinted light, you are faced with an undercutting defeater for your initial evidence. Noticing the light undermines the support you have for thinking the jellybean is red, without directly being evidence against the jellybean being red. Upon noticing the light, your credence in the jellybean being red should go back down to where it was before you saw it. As Weisberg points out, this change in your credences cannot be achieved when updating with conditionalization or Jeffrey conditionalization (see footnote 18 below), because these updating procedures are rigid. That means that they always preserve the agent's conditional credences. 
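The rigidity point can be checked with a small numerical example; the numbers below are invented, but they mirror the structure of the jellybean case (the prior treats the jellybean's color and the lighting as independent, as condition (I) below requires). A Jeffrey shift on the partition concerning the lighting rescales the credences within each cell of that partition, so the conditional credence in the jellybean being red, given red-tinted light, comes out exactly as it was before the shift.

# Invented joint prior over (jellybean red?, light red-tinted?); color and
# lighting are independent in the prior: P(red | tinted) = P(red) = 0.6.
old = {("red", "tinted"): 0.06, ("red", "normal"): 0.54,
       ("not_red", "tinted"): 0.04, ("not_red", "normal"): 0.36}

def p_light(p, light):
    return sum(v for (color, l), v in p.items() if l == light)

def p_red_given_tinted(p):
    return p[("red", "tinted")] / p_light(p, "tinted")

# Jeffrey conditionalization: shift the credence that the light is red-tinted
# from 0.1 to 0.8, rescaling the credences within each cell of the partition.
new_light = {"tinted": 0.8, "normal": 0.2}
new = {(color, l): v / p_light(old, l) * new_light[l]
       for (color, l), v in old.items()}

print(round(p_red_given_tinted(old), 3))   # 0.6
print(round(p_red_given_tinted(new), 3))   # 0.6: the conditional credence is preserved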
However, in order to correctly model undercutting defeat, the agent must first consider the color of the jellybean and the lighting to be probabilistically independent of each other, but then consider them probabilistically dependent, so that discovering the red lighting can lower the agent’s credence that the jellybean is red. Formally, the following two conditions would have to hold in order to model undercutting defeat as in the jellybean example, where Pt1 is the credence function the agent has before seeing the jellybean, and Pt2 is the credence function the agent has after seeing the jellybean, but before noticing the trick lighting: (I) Pt1 (red jellybean | red light) = Pt1 (red jellybean) (II) Pt2 (red jellybean | red light) < Pt2 (red jellybean) However, since both conditionalization and Jeffrey conditionalization are rigid, and hence preserve independence, the agent’s conditional credence cannot change in the way condition (II) assumes. As Weisberg shows, if (I) holds, then it follows that Pt2 (red jellybean | red light) = Pt2 (red jellybean), which contradicts condition (II). Thus, by the standard Bayesian updating procedure, discovering the red-tinted light cannot lower the agent’s credence to where it was before seeing the jellybean. Thus, we have a class of cases of rational updating in which an experience provides defeasible evidence for the truth of some proposition, and a second experience serves as an undercutting defeater for the evidential support provided by the first experience, and this class of cases cannot be modeled by the standard Bayesian updating rules. It is an open question whether there is a different updating rule that could be used instead of conditionalization that might solve the problem. Dmitri Gallow (forthcoming) has proposed such an alternative updating rule that seems to be able to model undercutting defeat. However, adopting Gallow’s rule does not avoid, and even exacerbates the problem of conceptual omniscience that I discuss in the last section. My overall argument therefore does not depend on whether or not we adopt Gallow’s rule. 18 To be precise, there is technically a way this sequence of credences can be achieved with Jeffrey conditionalization, but only if we thereby completely trivialize the way in which updating works. Weisberg discusses this in more detail. 36 As I explained in this section, Harman’s arguments show that a simple bridge principle like (SDBP) is implausible. That means that there is a much wider gap between a logical system and a theory of reasoning than philosophers like Priest and Keynes acknowledge, and it is an open question how norms of reasoning are informed by logical principles. But, as I have argued in this section, we encounter the same gap between subjective Bayesianism and a theory of reasoning for degrees of belief. I have thus established the first one of my argumentative goals: to show that the same kinds of reasons for which deductive logic cannot be a theory of reasoning also apply to the subjective Bayesian interpretation of the probability calculus. This observation suggests some interesting conclusions. First, when we state what the nature of a logical system is, we should characterize it in a way that leaves open its relationship to reasoning. 
Moreover, the parallels between deductive logic and the probability calculus suggest that it might be fruitful to view the subjective Bayesian interpretation of the probability calculus as a kind of logical system whose relationship to a theory of reasoning is at most indirect, rather than as a system that directly provides norms of reasoning and rationality, as many subjective Bayesians seem to assume. Different interpretations of the probability calculus are even sometimes called "the logic of probability", which shows that at least some philosophers conceive of it as having the characteristics of a logical system. (Footnote 19: To be precise, there are several things that are called the "logic of probability", or "probability logic", and some, but not all of them refer to the standard probability calculus. For example, Adams' theory of the "logic of probability" studies the transmission of probability from the premises to the conclusion in valid inferences. Moreover, there is also an interpretation of the probability axioms known as "the logical interpretation of probability", which is a particular view of what probabilities are. For a good survey that disentangles the various uses of these notions, see Hájek's article "Probability, Logic, and Probability Logic.") It has also been argued that the probability calculus shares certain important features with standard logical systems, such as concerning relations between statements, being formal, and being sound and complete (cf. Howson, 2002). We should also expect there to be important similarities between deductive logic and probability because the probability calculus is built on a standard logical language, so it inherits certain basic features of deductive logic. (Footnote 20: However, it should be noted that there are alternative axiomatizations, for example by Popper, that start with probabilistic axioms from which deductive logic can be derived. For a good explanation and references, see Hájek (2001), and Miller (2012).) In fact, the probability calculus contains the rules of deductive logic as a special case when all propositions are assigned values of 1 or 0. However, viewing the degree of belief-interpretation of the probability axioms as a type of logical system is not the only way in which philosophers have responded to the arguments in this section. Another strategy that has been adopted is to claim that subjective Bayesianism is a theory of reasoning and rationality, but that it only applies to ideal agents. My next goal will be to discuss the merits of each of these approaches. 3. Idealization to the Rescue? The arguments in section two have shown that a theory of reasoning cannot be obtained by combining the probability calculus with a simple bridge principle like (SPBP). In the recent philosophical literature, the following two responses have been adopted: One possible response is to reject (SPBP) and to claim that applying the probability axioms to degrees of belief leads to a theory that should properly be understood as a logic of degrees of belief. 21 If we don't take the resulting view to be a theory of reasoning, but a logical system, then we cannot criticize the view for providing implausible norms of reasoning, because that is not what the theory is meant to do. Under this interpretation, the theory is no more problematic or implausible than, for example, standard deductive logic. This position has been endorsed by Howson and Urbach (2006). 
They characterize their view in the following way: What these last objections reveal, we believe, is not that the standard formalism of personal probability is insufficiently human-centric, but that it is misidentified as a model, even a highly idealising one, of a rational individual’s beliefs. [...] The formalism of epistemic probability suggests, we believe, a similar conclusion: that formalism is a model of what makes a valid probabilistic inference, not of what ideally rational agents think or ought to think in conditions of uncertainty. (p. 61) Howson and Urbach base their argument for taking the probability axioms to provide a logic for degrees of belief (rather than a theory of reasoning) mostly on the fact that this interpretation takes the force out of the argument that humans are too cognitively limited to have perfectly coherent credences. It is easy to see that their view also addresses the other problems for (SPBP) that I pointed out. Each of these problems pointed to a mismatch between what constitutes rationally permissible reasoning, and what is required by the probability axioms. Since their view commits them to rejecting (SPBP), it is of course not subject to any of the arguments I put forth against (SPBP). While this response is an attractive way of avoiding the problems I discussed for viewing subjective Bayesianism as a theory of reasoning and rationality, it cannot be adopted, in my view, 21 Again, this view should not be confused with what is standardly called the “logical interpretation” of the probability axioms, or Adam’s probability logic. 38 without providing further arguments for the claim that subjective Bayesianism is best interpreted as a type of logical system, or a model of valid probabilistic inference. In support of their view, Howson and Urbach highlight the similarities they see between deductive logic and the probability calculus, and argue that we can view the latter as a system of valid probabilistic inference, for which soundness and completeness proofs can be given (p. 70). They also highlight formal similarities between consistency constraints that apply to sets of sentences, and to sets of credence assignments. While their observations about the analogies between deductive logic and the probability calculus are interesting, I think they don’t amount to a conclusive argument that this is how we should view the Bayesian system. Their discussion leaves me with two concerns: First, Howson and Urbach don’t specify what they generally consider to be the defining features of a logical system. Pointing out that a system has certain features that make it similar to classical logic doesn’t amount to a convincing argument that this system is best viewed as a logical system unless we have reason to think that these features are defining characteristics of a logical system. Secondly, subjective Bayesianism, unlike deductive logic, purports to be about norms on agents’ attitudes. Howson and Urbach don’t seem to deny this – they refer to the attitudes the probability axioms are applied to as “personal probabilities.” But if this is correct, then there is an obvious disanalogy between deductive logic and subjective Bayesianism: one is about attitudes, and the other one is not. 
It is not difficult to defend the claim that deductive logic isn’t directly about norms of reasoning and rationality, precisely because the standard way of defining a classical logical system only makes reference to sentences or propositions, but not to agents, or attitudes held by agents. Subjective Bayesianism, by contrast, explicitly identifies its objects as degrees of belief, and the probability axioms as norms that constrain these degrees of belief. Hence, Howson and Urbach’s claim that subjective Bayesianism is best identified as a logic of valid probabilistic inference is puzzling, since they are effectively claiming that the best interpretation of subjective Bayesianism is in direct conflict with what the system purports to be about, namely the attitudes of (rational) agents. They also owe an explanation of what they mean by an “inference”, since they surely can’t take an inference to be a psychological process, as it is often understood. However rejecting (SPBP) in favor of a logical interpretation of Bayesianism is not the only strategy that has been adopted in response to the arguments that call the principle into 39 question. Another response is to attempt to save (SPBP) by claiming that it is a plausible theory of reasoning and rationality, but only for ideally rational agents. This response seems to be a common background assumption in many discussions of subjective Bayesianism, and it has been explicitly endorsed, for example, by David Christensen (Christensen, 2004). Howson and Urbach also mention this response, but they don’t explicitly argue against it. In the last chapter of Putting Logic in its Place, Christensen discusses the worry that any norms of rationality derived from the probability axioms are too demanding for real agents. He agrees with the common criticism that “attaining probabilistic coherence is far beyond the capacity of any real human being.” (p. 151) Yet, he argues that this is not a problem, because the resulting view is a theory that directly applies only to perfectly rational agents. Having no cognitive limitations, perfectly rational agents can execute arbitrarily complex computations, so they can reason according to the principles generated by combining the probability calculus with (SPBP). It should be added that Christensen doesn’t think that a similar principle for deductive logic along the lines of (SDBP) can be rescued via the idealization strategy, since for example the preface paradox could not be avoided, and I think that he is right. In order to find out whether the idealization strategy can save (SPBP), we need to examine the three types of counterexamples, and evaluate whether they can be avoided if we assume that our theory of probabilistic reasoning only applies to ideally rational agents. The first two sources of counterexamples were: 1) agents having incoherent beliefs, and agents having coherent, but irrational beliefs, and 2) agents having limited cognitive capacities. The idealization strategy clearly avoids these types of counterexamples. An ideally rational agent would not assign jointly incoherent credences to propositions, so there is no need for norms of reasoning that specify how the agent should reason in this case. Moreover, an ideally rational agent would not assign irrational credences to individual propositions, because she would proportion her credence assignments correctly to the available evidence. 
Thus, there can’t be any cases in which she has an irrational credence that needs to be changed in a way that is not compatible with the probability axioms. Furthermore, since ideal agents don’t have limited cognitive capacities, the idealization strategy avoids any counterexamples of this type, such as the problem of old evidence. At this point in the discussion, it seems like both responses to the arguments against (SPBP) – rejecting the principle, and the idealization strategy – are on a par. However, I haven’t 40 yet considered cases of rational non-rigidity. The problem of modeling updates of one’s credences in the face of undercutting defeaters cannot be solved by the idealization strategy, since examples of this kind don’t exploit any sort of rational deficiency of the agent. Even an ideally rational agent could find herself in the jellybean example discussed above. However, adopting a different updating rule, as suggested by Gallow, might avoid this problem. But these are not the only problematic counterexamples to (SPBP) that are cases of rational non-rigidity. There is a further argument against (SPBP) that cannot be circumvented by claiming that subjective Bayesianism only applies to ideally rational agents. I will call the argument the argument from conceptual omniscience. The problem that gives rise to the argument can be seen as a version of what is sometimes called “the problem of new theories” in the literature. Here’s how the problem arises: suppose an agent has credences in a specific set of theories about some subject matter that are not ruled out by the currently available evidence. If a previously unknown, competing theory is introduced, this can have the effect that the original theories are no longer considered as well-supported as they were before the introduction of the new theory. It seems perfectly rational that an agent who becomes aware of a new, plausible theory that she was previously unaware of should become less confident the old theories, because now there are n+1 theories that all deserve to be assigned positive credence, and she cannot rationally assign a positive credence to the new theory without becoming less confident in the old theories (since her credences in all available hypotheses must sum to 1). Yet, there is no probabilistic mechanism that would warrant a decrease in confidence in the original theories, and an increase in confidence in the newly introduced theory. In cases like this, there is a shift in the probability space, and the agent must redistribute her credences across the new probability space in a way that is not licensed by the rules of probability. One might try to respond to this problem by stating that a perfectly rational agent should always leave some room in her credence distribution for a “none of the above”-option, so that when she encounters a previously unknown theory, she can shift some of her “none of the above”-credence to the new theory, without having to decrease her credences in the old theories. But even if we adopt this response, we cannot avoid the problem of there being a credence update that is not licensed by the probability axioms. The probability axioms are silent about how the agent should redistribute her credences by assigning some of the “none of the above”-credence to the new theory. 41 It is important to recognize here that this is not simply a case in which the agent has the conceptual resources to formulate the new theory, but hasn’t yet assigned it a new credence. 
In these cases, the agent can simply fill in the “gap” in her credence function according to the laws of probability. Rather, this is a case in which the agent was not even able to formulate the new theory, since she lacks the conceptual resources to entertain the relevant proposition(s). That means that upon being able to formulate the new theory, the number of propositions the agent can entertain has increased, and so she must now distribute her credences over a larger algebra than before. This shift in credences cannot be captured by the probability axioms. 22 Howson and Urbach’s view – that we can generate a logic of degrees of belief by applying the probability axioms to credences, but that we cannot generate a theory of reasoning in this way – is not threatened by the problem of new theories. Since they reject (SPBP), the fact that the problem of new theories is a counterexample to (SPBP) creates no additional worries for them. This is different in the case of Christensen’s idealization strategy, however. The problem of new theories cannot be solved by claiming that (SPBP) only applies to ideally rational agents. In order for this problem to disappear, ideally rational agents would have to assign credences to all possible theories when they initially assign priors, because in that case, they could never come to be in a position where they have to assign a credence to a theory that they previously didn’t know about. That means they would never have to confirm a newly invented theory based on old evidence, or shift their credences to a new probability space in a non-probabilistic way. But for an ideally rational agent to assign credences to all possible theories about absolutely everything at the very beginning of their cognitive life, they would have to be equipped with very substantial knowledge. They would have to know every single way of conceptualizing all the phenomena they might ever encounter. Even though they wouldn’t have to be literally omniscient, they would have to be what one might call “conceptually omniscient”: they would need to have the conceptual resources to articulate all possible theories about everything they might ever encounter, even about phenomena and technologies that will only begin to exist in the far future. Yet, it is very implausible that conceptual omniscience is a necessary condition for being an ideally rational agent. When we conceive of ideally rational agents, we usually conceive of a person who is like us, but with enhanced cognitive capacities. An ideally rational agent is 22 For a discussion of the problem of new theories, see, for example, Earman (1992), Ch. 8. There is also an attempt by Maher to capture the adoption of new theories within the constraints imposed by conditionalization, but he ends up not endorsing this attempt because he persuasively argues that it rests on untenable assumptions (Maher, 1995). 42 someone who is able to always monitor and keep her attitudes coherent, who is able to always weigh evidence correctly, who never makes mistakes in reasoning, and who can execute very complicated reasoning processes and computations. By contrast, we don’t take it to be a matter of being rational to have certain specific concepts, just like having factual knowledge is not usually considered relevant for being rational. When someone does not possess the concept ‘up quark’, we don’t consider this a failure of rationality. Someone could be ideally rational and still be ignorant in these ways. 
23 Yet if it is correct that an ideally rational agent doesn’t need to possess every possible concept, then the idealization strategy is vulnerable to the argument from conceptual omniscience. If an ideally rational agent does not need to be conceptually omniscient, then it is possible that she acquires a new concept that allows her to formulate a new theory that she previously hadn’t assigned a credence to. And if this is possible, then an ideally rational agent can still encounter the problem of new theories, because once she has conceived of a new theory that utilizes the new concept she has learned, her probability space has changed and she must redistribute her credences in a way for which there is no rule that can be derived from the probability calculus via (SPBP). Thus, (SPBP) cannot be saved by claiming that it only applies to ideally rational agents, because even ideally rational agents can encounter the problem of new theories. It is open to defenders of (SPBP) to respond to these arguments by claiming that (SPBP) should not be rejected, but that it only applies to agents who are ideally rational, as well as conceptually omniscient. However, it is not clear that there is any independent motivation for this view. The common notion of what it takes to be epistemically rational don’t seem to support the idea that one can become more rational by acquiring new concepts, or that lacking some concept is a failure of rationality. But if it is not independently motivated that conceptual omniscience is a condition for ideal rationality, then it seems like a proponent of (SPBP) who endorses this view is tailoring her notion of an ideal agent to fit her theory of reasoning, rather than the other way around. This seems problematic in terms of getting the direction of explanation right. Moreover, it seems unhelpful, given that we are ultimately interested in finding the right theory of reasoning for human agents, who can’t be tailored to the needs of our theories. 23 Christensen himself argues that factual omniscience is different from ideal rationality, and the former is not required for the latter. 43 While the argument from conceptual omniscience shows that subjective Bayesianism cannot be seen as a theory that by itself provides a complete theory of reasoning for ideal agents, it is of course open to proponents of the idealization strategy to maintain that subjective Bayesianism provides us with necessary conditions on reasoning and rationality that agents have to fulfill in order to count as ideally rational. However, this response does not provide us with much of an insight into how subjective Bayesianism should inform the reasoning of non-ideal agents, for whom full compliance with the probability axioms is out of reach. Conclusion In this paper, I argued for two claims: First, I showed that there are significant parallels in how deductive logic and the probability calculus relate to theories of reasoning with full and partial beliefs, respectively. Harman’s arguments for the claim that there is a gap between deductive logic and a theory of reasoning turn out to carry over directly to the domain of partial belief, especially when we consider the reasoning processes of non-ideal agents. This means that we should be careful to distinguish between theories that are more akin to logical systems, and theories that are meant to capture norms of reasoning. These two kinds of theories have very different aims, and are vulnerable to very different kinds of counterexamples. 
Secondly, I argued that the Simple Probabilistic Bridge Principle (SPBP) cannot be saved by arguing that the resulting theory of reasoning only applies to ideal agents. That is because there are still instances of rationally permissible probabilistic inferences that are not captured by (SPBP) even if we restrict it to ideally rational agents. I argued that in order to retain (SPBP), we would have to assume that ideally rational agents must be conceptually omniscient – an assumption that is neither attractive nor independently motivated. However, it is still open to proponents of subjective Bayesianism to maintain that its norms provide necessary conditions on reasoning and rationality that agents have to fulfill in order to count as ideally rational. I also discussed an alternative response to the arguments against (SPBP), namely to view the degree of belief-interpretation of the probability axioms as a kind of logical system instead, and to deny (SPBP). While this response obviously avoids the problems with (SPBP), there are some worries about the claim that subjective Bayesianism is best interpreted as a logical system of probabilistic inference. Howson and Urbach, the main proponents of this view, base their claims mostly on similarities between the probability calculus and classical logic. However, it is not 44 obvious that these similarities are sufficient to establish the status of the probability calculus as a logical system, especially since this interpretation seems to be at odds with the fact that the subjective Bayesianism is usually stated as a theory that constrains agents’ rational degrees of belief. More discussion of the nature of a logical system is needed in order to settle the question of whether this is a tenable interpretation of subjective Bayesianism. Whichever response to the arguments against (SPBP) one favors, we are left with the question of how the rules of subjective Bayesianism inform how non-ideal agents should reason. Whether we view the Bayesian system as providing a logic of probabilistic inference, or as providing necessary conditions for ideal reasoning and rationality, there is plausibly some relationship between the Bayesian rules and the norms that govern how non-ideal agents should go about reasoning with degrees of belief. Yet, it is an open question what this relationship looks like. We saw that there are many ways in which non-ideal agents need to reason that simply don’t come up in the ideal case, for example revising irrational degrees of belief. The answer to the question of how non-ideal agents should go about revising irrational degrees of belief may turn out to be informed by Bayesian constraints, but we cannot expect Bayesianism to deliver the whole answer. Moreover, I will show in the last chapter of the dissertation that it cannot be taken for granted that principles of reasoning that are suitable for ideal agents are equally applicable to non-ideal agents. This is the case not just because ideal rules are sometimes impossible to follow for non-ideal agents, but also because when these agents can follow ideal rules, doing so sometimes turns out to be non-optimal. 45 Chapter Three: Formulating Principles of Reasoning Introduction In order for our reasoning processes to produce justified beliefs, it is crucial that we reason well. But what constitutes good reasoning? Prima facie, it might seem like we can get the answer by simply consulting logic or probability theory. 
But as we saw in the previous chapter, these theories are ill-suited for directly providing prescriptions or norms for good reasoning, especially when these prescriptions are meant to apply to non-ideal agents. In a nutshell, these systems either contain rules such that following them would constitute bad reasoning, or they contain rules that would be too difficult for non-ideal agents like us to follow. An example of the former comes up in classical logic, which contains the theorem that a contradiction entails everything. Yet, the prescription for reasoning that we might directly extract from this – “If you have contradictory beliefs, infer whatever you like from them” – is certainly not a rule reasoners should follow. The latter phenomenon can be illustrated with an example from probability theory. The probability axioms require that one is at least as confident in the logical consequences of a claim as one is in that claim itself. However, there are many mathematical claims, such as Goldbach’s conjecture, of which we may never know whether they or their negations are entailed by the axioms of Peano arithmetic, because proving this has turned out to be very difficult. Yet, anyone who is more confident in the Peano axioms than in some claim that is entailed by them is incoherent. However, it seems unreasonable to prescribe that non-ideal agents who have a middling credence in Goldbach’s conjecture or other complicated mathematical claims that haven’t been proven must therefore keep also have a middling credence in the Peano axioms in order to avoid this kind of incoherence. In response to the difficulties I just sketched there have been multiple attempts in the literature to formulate norms or prescriptions of reasoning that rely upon, but are distinct from, the rules of logic and probability. In this paper, I will present two different views of how to select and evaluate prescriptions, or sets of prescriptions for good reasoning, which I will call the “criterial view” and the “tradeoff view.” According to the criterial view, the way to select prescriptions for good reasoning is by first establishing a catalog of criteria that these prescriptions must fulfill. Once these criteria have been established, they are used to rule out norms of 46 reasoning that don’t fulfill them. By contrast, the tradeoff view doesn’t look for necessary conditions that prescriptions for good reasoning must fulfill. Instead, it specifies dimensions of evaluation on the basis of which different prescriptions can be evaluated. Different prescriptions might rank differently along each dimension, and in order to determine how to evaluate a prescription, it may be necessary to weigh these different rankings along the different dimensions of evaluation against each other. The important difference between the two views is that the tradeoff view uses a graded approach for evaluating prescriptions for good reasoning along different dimensions, and the results of these evaluations must then be weighed against each other, whereas the criterial view uses a binary approach – it determines whether or not a prescriptions for good reasoning possesses various qualities to a sufficient degree to not be ruled out by one of the specified criteria. I will argue that, when applied to evaluating and selecting prescriptions for good reasoning, the tradeoff view is superior to the criterial view. 
This means that we should reject the positions that have been presented in the literature, because they are best interpreted as versions of the criterial view. 1. The Criterial View When we ask how we should reason, we encounter at least two competing considerations: on the one hand, a norm or prescription that tells us how to reason should not be such that complying with it leaves us in a position that is undesirable, epistemically speaking. On the other hand, it must be feasible for limited beings like us to comply with such a norm or prescription, so it can’t be too computationally demanding. It is not difficult to see that these two considerations often pull in different directions. Reasoning that is optimal from an epistemic point of view can be very difficult, and reasoning procedures that are easy for us to execute don’t always produce results that live up to ideal epistemic standards. If both of these considerations are taken to be criteria for selecting prescriptions for good reasoning in the way required by the criterial view, they may rule out any potential prescription in one way or another. 24 24 In what follows, I will use the expressions “prescriptions for good reasoning” and “norms of reasoning” interchangeably. There are different ways one might state such prescriptions or norms. One might state them in the form of an imperative, such as “Reason in ways x, y, and z!”, or in the form of a declarative sentence, such as “You ought to reason in ways x,y, and z.” The exact phrasing does not matter for my purposes. 47 We can see very clearly how this problem arises in John MacFarlane’s paper “In What Sense (If Any) Is Logic Normative For Thought?” (2004) MacFarlane’s aim is to find a bridge principle that connects a claim about logical entailment with a norm of reasoning. 25 He starts out by considering thirty-six possible ways of spelling out the following principle: If A, B |= C, then (normative claim about believing A, B, and C). He locates three different ways of placing a deontic operator in the principle: (i) In the consequent: If A, B |= C, then if you believe A, and you believe B, you ought to believe C. (ii) In the antecedent and consequent: If A, B |= C, then if you ought to believe A and believe B, you ought to believe C. (iii) Scoping over the whole conditional: If A, B |= C, then you ought to see to it that if you believe A and believe B, you believe C. More version of the principle are generated by (a) replacing ‘believe’ with ‘not disbelieve’, (b) replacing ‘ought’ with ‘may’ or ‘has defeasible reason for’, and (c) by changing the antecedent to ‘if you know that A, B |= C, then (...).’ This yields thirty-six different versions of the principle in total. He then formulates five criteria that an acceptable bridge principle would have to fulfill in order to provide us with a norm of reasoning. The first criterion, called “Excessive Demands”, is supposed to rule out principles that leave us with overly demanding norms of reasoning, for example ones that require agents to believe all of the logical consequences of their beliefs. Then there are three criteria that rule out principles following which would be undesirable from an epistemic perspective. The “Strictness Test” is supposed to rule out principles following which would leave the agent “not entirely as she ought to be” (MacFarlane, 2004, p. 12). The criterion called the “Priority Question” rules out principles following which would require agents only to draw inferences that the agent knows how to perform. 
He argues that this makes good reasoning too easy for agents – they just need to be ignorant enough about how to draw certain inferences. The “Logical Obtuseness” criterion rules out principles that allow agents to refuse to take a stand 25 I will use the expression “norm of reasoning” for what MacFarlane is looking for, even though strictly speaking agents don’t always have to reason in order to comply with the norms in question, depending on how the bridge principle is spelled out, and depending on the epistemic situation the agent is in. However, MacFarlane himself thinks that the resulting norms are best described as “norms of reasoning” or “norms of thought”, so I think this terminology is appropriate. 48 on obvious logical consequences of their beliefs. Then there is a fifth criterion that rules out principles that generate the preface paradox. 26 When MacFarlane sums up which of the thirty-six versions of the bridge principle are ruled out by his criteria, he finds that all of the initially plausible candidates generate norms that are either too demanding, or leave the reasoner in an epistemically undesirable position. (There turn out to be no principles that are ruled out purely based on the condition that they shouldn’t generate the preface paradox.) This result should not surprise us. Using criteria that pull in different directions can easily result in ruling out all the available options. By imposing various selection criteria that rule out principles that fall below a certain mark from an epistemic perspective, the only principles that pass the “epistemic tests” are ones that generated norms that are too hard to comply with. Before I get to MacFarlane’s proposed solution, I want to point out more generally how this conflict has been resolved by other philosophers. In the literature, two main strategies have been adopted to solve the problem of conflicting selection criteria for norms of reasoning. Both of these strategies operate within the criterial view. That means that they make the assumption that when we evaluate an entity of interest, such as a prescription for good reasoning, we rely on criteria that specify necessary conditions that the entity has to meet in each of various dimensions of evaluation. The problem of criteria pulling in opposite directions, and hence ruling out all possible entities of interest, is met by dropping or weakening some of these criteria. One strategy that avoids MacFarlane’s result consists in allowing that norms of reasoning might sometimes be too demanding for human agents to comply with. This approach amounts to either weakening or dropping the criterion that requires that norms of reasoning must make feasible demands on non-ideal agents like us. The second strategy consists in taking the epistemic limitations of human agents seriously, and denying that good reasoning requires agents to do things they can’t do. This approach amounts to either weakening or dropping the criteria that specify the minimum epistemic standards that an acceptable prescription or norm of good reasoning must meet. 26 The preface paradox arises in the following kind of case: “You have written an authoritative book about sea turtles. You believe each claim you make in the book. Yet, you also believe, on general inductive grounds, that at least one of these claims is false. So you don’t believe the conjunction of the claims in the book. Indeed, you disbelieve it, despite the fact that it is a (known) logical consequence of other things you believe. 
And your position seems quite rational.” (MacFarlane, 2004, p. 12) The ‘preface paradox’ criterion is meant to rule out bridge principles that say that the combination of attitudes just described is irrational, contrary to what seems to be the intuitively correct judgment about the case. 49 An example of a philosopher who takes the first approach is Ralph Wedgwood. His work on reasoning focuses on formulating theories of ideal, non-defective reasoning. He develops an account of what reasoning consists in, and he formulates norms of reasoning that don’t take into account the cognitive limitations of actual agents (Wedgwood, 2006, 2012). While he acknowledges that it is “unlikely” that a human agent could ever comply with all the prescriptions made by his theory, he maintains that it is in some weak sense possible for human agents to be perfectly rational, and that therefore the norms he proposes make adequate prescriptions for how we ought to reason (Wedgwood, 2012, p. 289). This strategy amounts to adopting a feasibility criterion that is so minimal that none of the epistemically desirable norms of reasoning are ruled out by it. Such an approach also appears to be assumed in the psychological literature on heuristics and biases, arising out of the work of Kahneman & Tversky, among others. For here it is assumed that norms of reasoning are ideal norms, and it is shown that humans often don’t reason in the ways that are prescribed by these norms (Kahneman & Tversky, 1982). 27 From this it is concluded that humans are generally irrational, and that their reasoning is flawed. The second route has been taken by philosophers and psychologists who deny that principles of ideal rationality are normative for human agents. They maintain that prescriptions that concern human reasoning must be defined within the limits of what humans are capable of. Some philosophers who have gone this route are Hacking (1967), Harman (1986), and Field & Milne (2009). They all propose principles of reasoning that limit epistemic obligations to making inferences that are “known” (Hacking, 1967, p. 320), “recognized” (Harman, 1986, p. 19), or “obvious” (Field & Milne, 2009, p. 259). In psychology, a prominent view of this kind is the theory of “bounded rationality,” which has been prominently endorsed by Gerd Gigerenzer. Accounts of bounded rationality take reasoning principles to be normative only relative to an 27 Famous examples of experiments that are meant to show this are the Wason selection task, and the research on base-rate fallacies. In the Wason selection task, subjects are asked to point out how to check whether a certain conditional statement is true, and it is shown that they select incorrect ways of doing so. The research on base-rate fallacies shows that subjects ignore relevant information about base-rates when calculating the probability of an event. It is concluded from studies of this kind that human reasoning employs certain shortcuts and exhibits certain biases that make it diverge from being perfectly logical or probabilistic, and that therefore humans are fundamentally irrational. 50 agent and an environment. Gigerenzer explicitly denies that rules of logic and probability are normative for human reasoning (Gigerenzer 2006, p. 123). 28 This is also the strategy that MacFarlane adopts. 
He proposes to drop the “Strictness Test,” which is meant to rule out principles of reasoning that leave the agent in a less than optimal epistemic state, and he also drops the criterion that rules out principles that lead to the preface paradox. Given the other criteria that are still in place, this leads him to accept the combination of the following two bridge principles: If A and B |= C, then you have reason to see to it that if you believe A and you believe B, you believe C. If A and B |= C, then you ought to see to it that if you believe A and you believe B, you don’t disbelieve C. It is easy to see that each of these two responses to MacFarlane’s problem has some advantages. 29 The first strategy captures the idea that reasoning should be performed in an epistemically optimal way: agents who reason well are thereby “entirely as they ought to be.” The second strategy captures the idea that norms of reasoning have to fit the capacities of the agents who are supposed to comply with them. Yet at the same time, each of these strategies is importantly incomplete. The first strategy fails to address the question what reasoners should do under non-optimal conditions. Prescriptions for good reasoning that can be employed only by ideal agents under optimal conditions clearly can’t be used to guide the reasoning processes of actual human beings in the real world. Nor are they very useful in evaluating such reasoning processes, since they don’t allow us to discriminate between the better and worse ways in which human agents can reason. 28 According to the theory of bounded rationality, a certain principle is rational to employ for an agent in a certain environment if it produces good enough results in that environment while being easy to employ, or computationally cheap, for the reasoner. Gigerenzer says with respect to the normative role of logic that “logical thinking is not central to human reasoning about these problems, as well as that that truth-table logic is an inappropriate norm.” (Gigerenzer 2006, p. 123) 29 Some authors try to go both ways, and claim that there are two different senses of being rational: rationality 1 and rationality 2. For example, Evans and Over draw this distinction, and they define principles of rationality 1 as the kind of principles that help agents achieve acceptable results within their limited resources, whereas principles of rationality 2 are understood as principles that implement formal rules of logic and probability. However, Evans & Over fail to address the question how the two kinds of rationality are related. (Evans & Over, 1996) 51 The second strategy also has several problems. The versions of the strategy that generate norms that only require agents to draw inferences that they know, or recognize, or that are obvious to them, fall prey to one of MacFarlane’s objections. Recall that MacFarlane objected to norms of this kind by arguing that they made good reasoning too easy. Here’s what he says: The more ignorant we are of what follows logically from what, the freer we are to believe whatever we please—however logically incoherent it is. But this looks backwards. We seek logical knowledge so that we will know how we ought to revise our beliefs: not just how we will be obligated to revise them when we acquire this logical knowledge, but how we are obligated to revise them even now, in our state of ignorance. (MacFarlane, 2004, p. 12) I think there are two separate problems that MacFarlane brings up in this quote. 
First, it seems rather strange that being terrible at recognizing inference patterns and entailments is no hindrance on this view to being a good reasoner. Quite the contrary – if an agent is bad at recognizing entailments, then there are very few inferences she is required to draw, and hence it is easy for her to comply with all the prescriptions for good reasoning. The second problem concerns the fact that these norms make the demands on a reasoner dependent on what the reasoner knows. But this seems to be in conflict with how we usually conceive of norms of reasoning. I might wonder how I ought to revise my beliefs. According to the norms in question, the answer is that I don’t need to revise them at all. The fact that I am wondering how to revise them shows that I don’t know, or recognize, or find it obvious how to revise them, which means that I am under no obligation to revise them. This is clearly an absurd consequence, which shows that these norms of reasoning cannot be correct. Recall that MacFarlane also adopts a version of the second strategy. Dropping the “Strictness Test” and the criterion that rules out principles that generate the preface paradox leaves him with the following combination of bridge principles: If A and B |= C, then you have reason to see to it that if you believe A and you believe B, you believe C. If A and B |= C, then you ought to see to it that if you believe A and you believe B, you don’t disbelieve C. In combination, these principles ensure that agents are forbidden from having inconsistent beliefs, and that agents have reasons to believe the logical consequences of what they believe. While this 52 solution avoids the problem of tying the norms to the knowledge of the agent, it still generates norms that are very weak. For example, suppose John believes that he has a brother, and he believes that he has a sister. According to the first principle, he has reason to see to it that if he believes these things, then he believes that he has a sister and a brother. This is a very weak requirement, since reasons are defeasible, and agents are usually not taken to be subject to criticism for not doing things that they have some reason to do. But intuitively, it seems like, given that he believes that he has a brother and he believes he has a sister, John has more than just a defeasible reason to believe that he has a sister and a brother. If he were to consider each of his beliefs, but refused to take a stance on the conjunction of them, then this would intuitively be a rational failing. However, given MacFarlane’s formulation of the principle, he would not be open to criticism, since people are not rationally obligated to do things they have some reason to do. Hence, one might criticize MacFarlane’s way of implementing the strategy of dropping some of the epistemic criteria on the grounds that the norms of reasoning he proposes are too weak. The problems I just pointed out with the different implementations of the second strategy all concerned the particular norms that were generated by dropping or weakening some of the epistemic selection criteria for prescriptions of good reasoning. But one might worry that this doesn’t show that there is something generally wrong with the idea that we just need to adjust the epistemic selection criteria in order to find appropriate norms of reasoning. In order to see the more general problem with this strategy, it helps to consider the ways in which the criterial view is limited in comparison to the tradeoff view. 
Recall that, in evaluating candidates for norms of reasoning along some dimension, the main difference between criterial view and the tradeoff view is that the criterial view makes us check whether the norm in question meets a certain mark or cutoff that qualifies it as being good enough. By contrast, on the tradeoff view, we record how highly the norm ranks along this dimension, and whether or not this makes the norm acceptable is determined by weighing its ranking in this and other relevant dimensions of evaluation. By selecting norms of reasoning that are good enough, and telling reasoners that these are the norms they must comply with in order to be good reasoners, the criterial view thereby has a hard time distinguishing between better and worse reasoners. We can of course distinguish between reasoners who comply with the prescriptions made by the theory, and reasoners who don’t. However, it seems very natural to think that if a theory prescribes norms of reasoning that are fairly easy to comply with, then there could be various reasoners that actually reason much 53 better than prescribed by the theory. In other words, they could be seen as complying with much stricter or more demanding prescriptions than the ones given by the criterial view. At the same time, there are also differences between reasoners who fall short of complying with the norms put forth by the theory. Yet, the criterial view lacks the explicit resources to make these distinctions, because it simply selects prescriptions for good reasoning that meet the relevant criteria, and doesn’t distinguish any further between other candidate norms of reasoning that agents might comply with. By contrast, the idea that we should be able to distinguish and rank better and worse norms of reasoning according to various dimensions of evaluation is explicitly built into the tradeoff view. Notice that the idea that we can distinguish between prescriptions for good reasoning that are better or worse is definitely present in the background of the criterial view. Otherwise, it would be hard to determine whether a norm of reasoning is actually “good enough”, and meets the specified necessary conditions, for example the condition of whether a norm is easy enough to comply with. Yet, the criterial view does not seem to base these judgments on an explicit theory of how to rank reasoning prescriptions along relevant dimensions of evaluations, for example feasibility or epistemic goodness. In sum, it turns out that both responses to MacFarlane’s problem actually suffer from the same type of problem. Recall that the strategy that involves dropping the feasibility criterion is problematic, because the norms it generates are not helpful for guiding or evaluating the reasoning of agents who aren’t ideal. The second strategy, which involves weakening the epistemic criteria in favor of a strong feasibility criterion, also has problems with guiding and evaluating non-ideal agents’ reasoning in cases where agents are complying with norms that are stricter or weaker than the norms of reasoning the view prescribes. The view is simply not designed to make these kinds of distinctions. By contrast, the tradeoff view is designed to do exactly that. I will explain its advantages over the criterial view in the next section. 2. The Tradeoff View In this section, I will explain why we should adopt the tradeoff view as our basis for evaluating prescriptions for good reasoning instead of the criterial view. 
The tradeoff view specifies certain dimensions of evaluation that are relevant to selecting principles for good 54 reasoning, and principles can be evaluated with regard to how highly they rank on any of these dimensions. These results can be weighed against each other when giving overall evaluations of norms of reasoning. Unlike with the criterial view, there are no fixed cutoff points that a principle must meet on any of these dimensions. There can, however, be constraints on how to generate an overall evaluation of a reasoning principle based on the degrees to which it possesses various qualities. The tradeoff view is thus explicitly designed to distinguish between candidates for norms such that complying them would constitute better reasoning, and other potential candidate norms such that complying with them would constitute worse reasoning. That means the tradeoff view avoids the problems I pointed out with the criterial view in the last section. In spelling out the tradeoff view for selecting norms of reasoning more clearly, we must ask what the relevant dimensions of evaluation are. The considerations we encountered in discussing the criterial view are helpful here. There must be one or more dimensions of evaluation that specify how difficult it is to comply with a particular norm or prescription, and there also must be one of more dimensions of evaluation that specify how good it is to comply with a norm or prescription from an epistemic point of view. The answer to the question of how difficult it is to comply with some norm must be informed by facts about mathematical complexity, as well as by empirical facts about how well humans are able to perform various kinds of reasoning tasks. Specifying dimensions of evaluation that help us evaluate how good a norm of reasoning is from an epistemic point of view is a project that is more traditionally philosophical, since it can arguably be done from the armchair. The key to developing scales according to which we can evaluate the epistemic goodness of a way of reasoning is to recognize that there are better and worse ways of reasoning, and that we tend to judge epistemic goodness by comparing a reasoning strategy to the optimal way of reasoning. Judging from what epistemologists take to be good epistemic states to be in, prescriptions for good reasoning should promote epistemic values such as the reasoner having true beliefs, the reasoner having coherent beliefs, and the reasoner having beliefs that are well-supported by her evidence. In the next chapter, I take on the project of developing a graded account of probabilistic coherence, which I then put to use in the subsequent chapter to evaluate norms of reasoning. But before I go into the details of my proposal, I will use a little toy example to further clarify the 55 difference between the criterial view and the tradeoff view, and the merits of the latter compared to the former. Suppose you want to figure out how confident you should be that it will rain during the Soccer World Cup final. You are pretty well informed about the likelihoods of rain for different relevant cities, but you are uncertain about where the final will take place. You are 45% confident that it will be in Berlin, 45% confident that it will be in Munich, 5% confident that it will be in Cologne, and 5% confident that it will be in Hamburg. 
Suppose you know that Berlin has a 40% chance of rain, Munich has a 60% chance of rain, Cologne has an 80% chance of rain, and Hamburg has a 10% chance of rain on the relevant day. The correct of way of computing the rational credence that it will rain at the final involves using the total probability theorem, and it is arguably too complicated to do quickly without a pencil and paper: P(rain at final) = P(Ber) × P(rain|Ber) + P(Mu) × P(rain|Mu) + P(Col) × P(rain|Col) + P(Ham) × P(rain|Ham) P(rain at final) = 0.45 × 0.4 + 0.45 × 0.6 + 0.05 × 0.8 + 0.05 × 0.1 = 0.495 However, you might simplify the calculation by ignoring the options you have little confidence in. Now, you can reason as follows: I am equally confident that the match is in Berlin or Munich, Berlin has a 40% chance of rain, and Munich a 60% chance, so by taking half of each and adding them, I get the result that I should be 50% confident that it will rain during the final. This way of reasoning is simpler, but it also introduces rounding errors. The simplified calculation yields a result of 0.5, whereas the full calculation yields a result of 0.495. We can associate each of these strategies with a prescription that tells us how to reason in cases like this. The “ideal” prescription says “Compute the answer according to the rules of probability theory, and include all relevant data.” The simplified prescription, which we might call “guesstimating”, says “Compute the answer according to the rules of probability theory, but leave out marginal values.” How good is guesstimating in this case? Let’s first consider what the criterial view would say. The criterial view only has the resources to give one of two answers. The “ideal reasoning” version of the view would have to reject guesstimating as a good norm of reasoning, since it allows the agent to diverge from the correct calculation. It thus lacks the resources to recognize that this simplified strategy delivers a good approximation to the correct result. By contrast, the “lowering the epistemic bar” version of the criterial view might accept guesstimating as a prescription for 56 good reasoning, yet doesn’t have the explicit resources to say that even though this way of reasoning is acceptable, there is another norm such that complying with it would be better from a purely epistemic perspective. In comparison, the tradeoff view fares much better. Since it provides dimensions of evaluation that help us evaluate reasoning strategies according to their complexity, and according to their epistemic goodness, we can use it to rank the two strategies in each dimension. We can recognize that guesstimating in our toy example is less complex than the full calculation. We can also capture the idea that the result generated by the simplified calculation is incorrect, but very close to the optimal result. Yet, these results by themselves don’t yet deliver a verdict on the guesstimating norm, because we must decide how to weigh them. Whether or not guesstimating is acceptable depends not only on its complexity and the goodness of its result, but also on the specifics of the situation in which it is employed. In a situation with little time, where perfect accuracy is not crucial, it seems like a good prescription for the agent to comply with. Yet, if perfect accuracy matters, taking the extra time to do the long calculation might be what the agent ought to do. More generally, the tradeoff view does not force us to single out certain norm of reasoning as uniquely correct. 
Rather, it ranks different norms according to their complexity and epistemic goodness, and it is a matter of the capacities of the reasoner and the specific features of the situation which norm or prescription would be most appropriate in a given case. In contrast to the criterial view, the tradeoff view sacrifices neither epistemic goodness nor feasibility in evaluating principles of reasoning. Furthermore, it can explain how prescriptions that make for good human reasoning are related to norms of ideal reasoning. Norms of ideal reasoning simply score higher on the epistemic goodness scale, but they might also score lower on the feasibility scale. Conclusion In this paper, I contrasted two different views regarding how to select and evaluate norms of reasoning: the criterial view and the tradeoff view. I showed how the criterial view fails to select any norms if it employs criteria that pull in opposite directions and thereby rule out every candidate norm. I then argued that solving this problem by weakening or dropping some of the criteria still leaves us with the problem that the resulting views are ill-suited to distinguish between 57 different prescriptions following which would constitute better and worse reasoning. The strategy that involves dropping the feasibility criterion cannot distinguish between better and worse norms of reasoning that non-ideal agents can comply with. The second strategy, which involves weakening the epistemic criteria in favor of a strong feasibility criterion, also has problems with guiding and evaluating non-ideal agents’ reasoning in cases where agents are complying with norms that are stricter or weaker than the norms of reasoning the view prescribes. The tradeoff approach fares much better in comparison, because it is explicitly designed to rank norms of reasoning along different dimensions of evaluation. Hence, it can easily capture how well an agent is doing by complying with one prescription rather than another. Moreover, as the example in the last section demonstrates, the tradeoff approach exhibits a certain desirable flexibility: if we assume that a reasoning process needs to be very simple to execute, but doesn’t need to be very accurate, some prescription S for how to reason might look most suitable. But if we assume instead that simplicity is not very important, but the reasoning process should be very accurate, another prescription T might be appropriate for the task at hand. Thus, the account can accommodate the natural thought that different ways of reasoning might be best in different situations, because the relevant desiderata should be weighed differently in different circumstances. In the remaining two chapters, I will take steps towards developing a tradeoff view that helps us select norms of reasoning. In chapter four, I will develop a graded account of probabilistic coherence, which we can then use to evaluate how various norms of reasoning promote coherence among an agent’s degrees of belief. 58 Chapter Four: Degrees of Incoherence and Dutch Books Introduction Many philosophers hold that the probability axioms constitute the norms of rationality governing degrees of belief. This view is widely known as subjective Bayesianism. While this view is the foundation of a broad research program, it is also widely criticized for being too idealized. It is claimed that the norms on degrees of belief postulated by subjective Bayesianism cannot be followed by human agents, and hence that these norms have no normative force for beings like us. 
This problem is especially pressing since the standard framework of subjective Bayesianism only allows us to distinguish between two kinds of credence functions – coherent ones that obey the probability axioms perfectly, and incoherent ones that don’t. My goal in this paper is to extend the framework of subjective Bayesianism in such a way that we can capture differences between incoherent credence functions. Being able to measure to what degree a credence function is incoherent will help us model the degrees of belief of non-ideal agents. Further, such a measure would enable us to explain how the ideal rules of Bayesianism can be approximated by non-ideal agents, and hence to explain how such rules can be normative for non-ideal agents. And such a measure of degrees of incoherence will also give us an important tool for evaluating reasoning processes of non-ideal agents. In order to reach this goal, I will rely on a well-known class of arguments that have been put forth in support of various norms on rational degrees of belief: Dutch book arguments. Dutch book arguments are commonly taken to dramatize incoherence in an agent’s credence function by showing that an incoherent agent is vulnerable to a guaranteed monetary loss. Hence, since Dutch books draw a connection between incoherence and something quantifiable, they are a natural resource to turn to for a measure of probabilistic incoherence. I argue that we can use a type of Dutch book to measure degrees of incoherence, by setting up betting arrangements in such a way that the degree to which a credence function is incoherent is reflected in the amount of guaranteed loss to which the arrangement leads. The structure of my discussion is as follows: In the first section of the paper, I explain in more detail why we need a measure of degrees of incoherence. In the second section, I turn to the first of two proposals that have been made in the literature for measuring incoherence, which has been put forth by Lyle Zynda. I argue that the measure is unsuccessful, and I explain what lessons 59 we can learn from its failure for setting up a better measure. In the third section, I review how Dutch book arguments work, and I explain why we should expect them to be a good starting point for developing a measure of incoherence. In the fourth section, I turn to the second proposal for a measure of incoherence, which is a Dutch book measure that has been put forth by Schervish, Seidenfeld, and Kadane. I argue that it is unsuitable as a measure of the total incoherence of a credence function, and hence unsuitable for the tasks laid out in section one. Finally, in sections five and six, I propose my own measure, the maximum Dutch book measure. This measure avoids the problems that befall the earlier two proposals, and gives us the way of comparing incoherent agents that we need. 1. Why We Need a Measure of Degrees of Incoherence On the standard subjective Bayesian view, any rational credence function, which is an assignment of degrees of belief to propositions, must obey the axioms of probability. To set up the probability calculus, we begin with a set of atomic statements {Ai}, and we combine it with the standard sentential logical operators to define a language L. We also assume that the relation of logical entailment is defined in the classical way. 
A probability function P on L must satisfy the following axioms:
Normalization: For any statement A, if A is a tautology, P(A) = 1
Non-Negativity: For any statement A, P(A) ≥ 0
Finite Additivity: For any two mutually exclusive statements A, B, P(A ∨ B) = P(A) + P(B)
An agent's degrees of belief, or credences, are considered coherent if and only if they obey these probability axioms. 30
30 These three axioms are the standardly accepted synchronic requirements for having coherent degrees of belief. There are also other requirements that concern how the agent should change her credences over time, but I won't be concerned with them in this paper (see Weisberg, 2011).
While the standard Bayesian system doesn't let us distinguish any further between various incoherent credence functions, it is easy to see that, intuitively, some incoherent credence functions are more incoherent than others. Suppose, for example, that there are two agents, Sally and Polly, whose credence functions are the same, except for their respective credences in some tautology T. According to the probability axioms, the correct credence to assign to T is 1. Sally's credence in T is 0.99, whereas Polly's credence in T is 0.2. Intuitively, Polly's credence in T displays a greater failure of coherence than Sally's. Yet, we cannot capture this difference in the standard Bayesian framework. Being able to extend the Bayesian framework so as to capture these differences is a desirable goal in itself, but there are further reasons to think that a measure of probabilistic incoherence would be desirable. One reason is that having such a measure would help us respond to one of the standard objections to Bayesianism, which claims that Bayesianism has no application to real agents (e.g. Hacking, 1967; Harman, 1986; Christensen, 2004). This is because the Bayesian norms are so demanding that no actual agent could be expected to have a credence function that complies with all of them. Human beings, having limited cognitive capacities, are not in a position to make complicated calculations easily, and they don't have an immediate grasp of complex logical relations and properties, which is arguably needed in order to keep one's credences coherent at all times. For example, the probability axioms require that one is at least as confident in the logical consequences of a claim as one is in that claim itself. However, there are many difficult mathematical claims, such as Goldbach's conjecture, of which we may never know whether they or their negations are entailed by the axioms of Peano arithmetic. Yet, anyone who is more confident in the Peano axioms than in some claim that is entailed by them is incoherent. As Lyle Zynda (1996) has argued, it is hard to see how an ideal norm can apply to non-ideal agents if there is no meaningful sense in which these agents can come closer to complying with the norm. We can capture this worry in a general principle:
Norm Governance: For any agent A and end E, if A could never achieve E, then E can serve as a norm for A just in case (i) we can grade how closely A approximates E, and (ii) relative to this gradation, it is within A's power to approximate E to varying degrees.
To see why Norm Governance is plausible, consider the example of a glassblower who tries to make a spherical Christmas ornament. A perfect sphere serves as the norm here, but of course the glassblower can't possibly make a perfect sphere.
Yet it makes sense to say that a perfect sphere serves as an ideal norm, because the glassblower can evaluate her attempts based on how much they approximate a perfect sphere. However, if there were simply no way of assessing how close a given object is to being a perfect sphere, it would be hard to see how it could serve as a norm. 61 If we apply Norm Governance to the standard Bayesian framework, the problem is immediately obvious: the standard Bayesian view doesn’t allow us to measure degrees of incoherence. But if we can’t measure degrees of incoherence, then Norm Governance tells us that probabilistic coherence is not an end that can serve as a norm for human agents. However, if we have a measure of probabilistic incoherence, then we can explain how agents can approximate perfect coherence, and hence we can explain how the ideal Bayesian norms apply to non-ideal agents. Another reason why it would be helpful to have a measure of probabilistic incoherence is that we can use it to evaluate the methods by which non-ideal agents might revise their degrees of belief. Normal agents, who never comply with all the ideal Bayesian norms, clearly revise their degrees of belief routinely. It would be nice to have a way of evaluating the reasoning strategies or other methods by which these revisions are carried out. One dimension along which we can evaluate reasoning strategies is with respect to their effect on the agent’s degree of incoherence. Surely, even if an agent has credences that are already incoherent, it would be epistemically bad for her to reason in ways that make her even more incoherent. Yet, if all we can say is that non- ideal agents always move from one incoherent state to another, we can’t provide any useful assessment along these lines. By contrast, if we can measure how incoherent an agent’s credences are before and after she employs a certain reasoning strategy, we have a tool by which reasoning strategies can be assessed. There are many other applications of a measure of degrees of incoherence, but the ones I just listed should suffice for now as a motivation for the usefulness of such a measure. In the philosophical literature, two different proposals have been put forth for how to set up such a measure. In the next section, I will discuss one of them. 2. First Proposal: Zynda’s Measure of Incoherence The claim that we need a measure of degrees of incoherence in order to explain how the ideal Bayesian norms can apply to real agents is what motivates Lyle Zynda (1996) in developing his measure. The basic idea behind Zynda’s measure is that we can compare incoherent credence functions by comparing their maximal coherent restrictions. He defines a credence function as a set of ordered pairs of propositions and their assigned credences, where the propositions must 62 form a Boolean algebra. 31 A maximal coherent restriction of an incoherent credence function can be generated by removing the smallest possible number of proposition/credence pairs from the agent’s credence function, such that the remaining credences can be extended to a coherent credence function over a Boolean algebra. Often, there will be more than one way of doing so, in which case all possible ways of creating a maximal coherent restriction must be considered. 
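To make the notion of a maximal coherent restriction concrete, here is a minimal brute-force sketch (the function names and the sample credences are mine, not Zynda's), restricted for simplicity to the tiny algebra generated by a single atomic proposition p, i.e. {p, ~p, T, ⊥}. On that algebra a probability function is fixed by a single number x = P(p), which makes it easy to check whether a partial assignment can be extended to a probability function and to enumerate the largest extendable parts:

```python
from itertools import combinations

def extendable(partial):
    """Can this partial assignment over {p, ~p, T, bot} be extended to a
    probability function? Such a function is fixed by x = P(p) in [0, 1],
    with P(~p) = 1 - x, P(T) = 1, and P(bot) = 0 ('bot' is the contradiction)."""
    constraints_on_x = set()
    for prop, credence in partial.items():
        if not 0 <= credence <= 1:
            return False
        if prop == 'T' and credence != 1:
            return False
        if prop == 'bot' and credence != 0:
            return False
        if prop == 'p':
            constraints_on_x.add(credence)
        if prop == '~p':
            constraints_on_x.add(1 - credence)
    return len(constraints_on_x) <= 1  # all constraints on P(p) must agree

def maximal_coherent_restrictions(credences):
    """Largest subsets of the credence function that can be extended to a
    probability function, i.e. obtained by dropping as few pairs as possible."""
    props = list(credences)
    for size in range(len(props), -1, -1):
        found = []
        for combo in combinations(props, size):
            restriction = {prop: credences[prop] for prop in combo}
            if extendable(restriction):
                found.append(restriction)
        if found:
            return found

# A hypothetical incoherent credence function: the credence in the tautology is off.
f = {'p': 0.5, '~p': 0.5, 'bot': 0.0, 'T': 0.5}
print(maximal_coherent_restrictions(f))
# [{'p': 0.5, '~p': 0.5, 'bot': 0.0}] -- drop the credence in T and the rest coheres
```

On this toy algebra the exhaustive search over subsets is cheap; for richer algebras the number of subsets grows exponentially, but the example suffices to illustrate the definition.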
In comparing two different credence functions f and g, one can arrive at one of four different outcomes: 1) f and g have the same maximal coherent restrictions, 2) for all maximal coherent restrictions of f and g, each of the maximal coherent restrictions of f is a proper subset of one of the maximal coherent restrictions of g, 3) for all maximal coherent restrictions of f and g, each of the maximal coherent restrictions of g is a proper subset of one of the maximal coherent restrictions of f, or 4) none of the above. In the first case, f and g are equally coherent; in the second case, g is more coherent than f; in the third case, f is more coherent than g; and in the fourth case, f and g are incommensurable. We can use this method to order credence functions that are commensurable with respect to how incoherent they are. I will now argue that Zynda's measure is problematic, for three reasons: (i) it orders incoherent credence functions in ways that are intuitively incorrect, (ii) credence functions that are intuitively comparable are incommensurable in his framework, and (iii) Zynda's measure is ill-suited to evaluate reasoning processes of incoherent agents. I will first show that the measure doesn't capture intuitive differences between incoherent credences, and hence produces counterintuitive orderings of incoherent credence functions. For example, consider two credence functions f and g, which are both defined over the same set of propositions {p, ~p, ⊥, T}. They assign credences as follows:
f(p) = 0.5     g(p) = 0.5
f(~p) = 0.5    g(~p) = 0.5
f(⊥) = 0       g(⊥) = 0
f(T) = 0.5     g(T) = 0.99
31 A set of propositions has the structure of a Boolean algebra just in case it contains every logically distinct proposition that can be expressed by combining the atomic propositions in the set with the standard logical connectives.
The two functions have the same maximally coherent restrictions, namely the following set of proposition/credence pairs {<p, 0.5>, <~p, 0.5>, <⊥, 0>}. This means that according to Zynda's measure, they are equally incoherent. However, this is not an intuitive result at all. Given that f assigns a degree of belief of 0.5 to the tautology, whereas g assigns it a degree of belief of 0.99, and that is their only difference, it seems much more natural to think that f is more incoherent than g. Hence, Zynda's measure fails to capture the intuitive difference in incoherence between two credence functions. And since capturing intuitive differences between incoherent credence functions is one important goal of a measure of incoherence, this is a significant problem for this measure. 32
32 Zynda is in fact aware of this kind of result of his measure, and he comments on it in a footnote of his paper: "Consider, for example, a person whose degree of belief function f is thoroughly incoherent but is everywhere numerically close to a probability function. […] Intuitively, there is a sense in which such a person's state of opinion is very 'close' to being coherent, but it would come out very badly on my account, since very little of it is actually coherent. […] This is a distinct sense of comparative coherence from the one offered above; in my view, both senses are interesting and worth developing in greater detail." (Zynda, 1996, p. 215) Interestingly, Zynda acknowledges that his measure does not capture the very intuitive idea that the degree of incoherence of a credence function depends on numerical closeness to a probability function. Yet, he claims that there is a different graded notion of incoherence, which only depends on which parts of a credence function are actually coherent. I don't think that this notion is what we are aiming for when we try to find a measure of probabilistic incoherence. If we want to know how much an agent diverges from being perfectly coherent, it seems natural and important to take numerical differences between agents' credences into account. Zynda's measure may still be of technical interest, but I think it fails to capture our most natural and interesting judgments about degrees of incoherence.
The second problem with Zynda's measure concerns cases in which two credence functions can intuitively be compared, but are incommensurable according to Zynda's measure. Suppose again that we are comparing credence functions defined over the set of propositions {p, ~p, ⊥, T}. We want to compare two credence functions f' and g':
f'(p) = 0.5     g'(p) = 0.9
f'(~p) = 0.51   g'(~p) = 0.9
f'(⊥) = 0       g'(⊥) = 0
f'(T) = 1       g'(T) = 1
Notice that the sum of the credences in p and ~p is 1.01 for f', whereas it is 1.8 in the case of g'. Since rationality requires that any agent's credences in two propositions p and ~p sum to 1, it seems intuitively obvious that g' displays a greater failure of coherence than f'. However, Zynda's measure doesn't let us compare the two credence functions. For each credence function, we can create two maximally coherent restrictions, either by removing the credence in p, or by removing the credence in ~p. Doing so reveals that they neither have the same maximally coherent restrictions, nor do their maximally coherent restrictions stand in a subset relation, which means that they are incommensurable. This is an undesirable result, since it seems intuitively unproblematic to compare the two functions. The last problem concerns a serious limitation of Zynda's framework. He must assume that every credence function is defined on a full Boolean algebra of propositions, i.e. contains every logically distinct proposition that can be formed from some set of atomic propositions and the standard logical operators. However, this is an undesirable idealizing assumption, because real agents most likely have "gaps" in their credence functions, which are propositions they have never entertained, and thus don't have a credence in. Filling in gaps in one's credence function is actually one important form of reasoning agents can engage in: an agent may consider some proposition p that she had not previously entertained, and wonder what credence to assign to it based on the credences she already has. Yet, the Boolean algebra requirement prevents us from evaluating whether an agent has successfully executed this kind of augmentative reasoning. Since it involves adding a credence to one's existing credence function, it follows that either the agent's initial credences, or her resulting credences, or both, cannot be defined over a full Boolean algebra. Of course, this requirement only presents this kind of problem for Zynda's measure if it cannot be relaxed. And it unfortunately turns out that Zynda needs this assumption, because otherwise his measure clearly gives incorrect results, as we can see from the following example. Suppose an agent reasons in the following way, augmenting her existing credences by adding a credence in ~(p ∨ ~p):
f’’(p) = 0.5 f’’(p) = 0.5 f’’(~p) = 0.5 ⇒ f’’(~p) = 0.5 f’’(p ∨ ~p) = 1 f’’(p ∨ ~p) = 1 f’’(~(p ∨ ~p)) = 0.1 65 In evaluating this instance of augmentative reasoning, it is immediately obvious that the agent’s new credence in ~(p ∨ ~p) makes her credences incoherent, even though her credences were coherent initially. Hence, an adequate measure of incoherence should tell us that the agent increased her degree of incoherence by reasoning in a way that made her incoherent. With Zynda’s measure, however, we cannot evaluate how incoherent the initial credence function f’’ is, because it is not a full Boolean algebra. It only becomes a Boolean algebra when ~(p ∨ ~p) is added. If Zynda allowed his measure to apply to f’’ before and after ~(p ∨ ~p) is added, both credence functions would turn out to be equally incoherent, because they have the same maximal coherent restriction. However, f’’ is initially coherent, but with f’’(~(p ∨ ~p)) = 0.1 added, it is incoherent, so this cannot be true. It is precisely to avoid this sort of result that Zynda’s measure requires probability functions to be defined over complete Boolean algebras of propositions. In sum, we saw that Zynda’s measure has three major problems. The first problem is that his measure does not take into account numerical differences between incoherent agent’s credences, which leads to counterintuitive ways of ordering various credence functions according to their degree of incoherence. The second problem is that the measure renders credence functions incommensurable that can intuitively be easily compared. The third problem stems from the indispensable requirement that a credence function must be defined over a Boolean algebra of propositions, which makes the measure unsuitable for evaluating very common ways of reasoning. These problems are important to keep in mind, because they give us guidance for developing a better measure of incoherence. A better measure should take into account numerical differences between credence functions, it should not render credence functions incommensurable that can intuitively be compared, and it should be applicable to credence functions that are not Boolean algebras. In the next section, I will review how Dutch book arguments work, and I will argue that they can provide the basis of a measure that can meet these conditions. 3. Incoherence and Dutch Books Dutch book arguments are one of the standard ways of arguing that a rational agent’s credences must obey the probability axioms. They show that an agent whose credence function violates the probability axioms is vulnerable to a guaranteed betting loss from a set of bets that are 66 individually sanctioned as fair by her credences. By contrast, a coherent agent faces no such guaranteed loss. But having degrees of belief that sanction as fair each bet in a set of bets that, by the laws of logic, jointly guarantee a monetary loss is rationally defective. Therefore, since only probabilistic credences avoid sanctioning as fair individual bets that lead to a sure loss when combined, only probabilistic credences avoid being rationally defective. Dutch book arguments rest on the basic assumption that there is a connection between an agent’s degrees of belief in a proposition and the cost the agent is willing to incur for a bet on that proposition. As David Christensen (2004) argues, this connection is normative, in the sense that one’s credence in a proposition justifies, or sanctions as fair, paying a specific cost for a bet on that proposition. 
If one’s degree of belief in some proposition p is x, then one should consider it fair to pay a cost whose utility is xY in order to get a reward whose utility is Y if p is true, and nothing if p is false. In this scenario, we will say that an agent who takes part in this sort of transaction is buying a bet on p. Likewise, one should consider it fair to be on the other end of a gamble of this kind, so that one receives a payment whose utility is xY, and one must pay out a reward whose utility is Y just in case p is true. In this case, we will say that an agent who takes part in this sort of transaction is selling a bet on p. Hence, the agent’s credence marks a point that determines a fair price for both buying and selling a bet. Of course, an agent should also consider it fair to buy the same bet at a lower price, or sell it for a higher price, but the indifference point marked by the agent’s credence is special, because it is the highest buying price, and the lowest selling price justified by the agent’s credence. And since, in a Dutch book, we are trying to make the agent lose as much as possible from a set of bets, each of which she considers fair, we must take advantage of the highest buying prices and lowest selling prices that are justified by her credences. It is common practice to represent these gambles as actual monetary gambles, even though it is obviously an idealizing assumption that utility can be represented linearly in terms of dollar amounts. For ease and familiarity of exposition, I will from now on frame my arguments in terms of monetary gambles. Of course, as Christensen (2004) has emphasized, the reason why Dutch books indicate rational defectiveness is not that the agent is actually in danger of being impoverished by a clever bookie. Being cheated out of one’s money is a practical problem, not an epistemic one. Vulnerability to Dutch book loss indicates an epistemic defect, because the evaluation of the bets in 67 an unfair betting situation as fair derives its justification directly from the agent’s credences. Each of the bets in the Dutch book is fair in light of the agent’s credences, yet the logically guaranteed outcome of the combination of these bets ensures an unfair advantage for the bookie. Yet, the credences of a rational agent should not justify regarding as fair each bet in a set of bets that logically guarantees an unfair advantage for the bookie. Hence, since having incoherent credences puts the agent in such a situation, incoherent credences are rationally defective. 33 But if it is irrational for an agent to regard individual bets as fair that jointly form a betting situation that guarantees a loss for her, then it seems natural to think that the higher the guaranteed loss that the agent is bound to incur from a set of bets, the more problematic are the credences that justify regarding these bets as fair, and vice versa. In short, if a guaranteed loss indicates incoherence, then it is plausible that the higher the guaranteed loss, the worse the incoherence it indicates. This idea fits very naturally with the interpretation of Dutch book arguments according to which they dramatize a kind of epistemic incoherence. Plausibly, the agent’s incoherence increases the greater the monetary disadvantage that is logically guaranteed by a set of bets, each of which is justified by her credences. Thus, it makes sense to assume that there is a pretty straightforward connection between the agent’s degree of incoherence and the guaranteed loss she is vulnerable to. 
34 We can further underpin this idea by looking at some simple examples. Consider Penny and Jenny, and their credence in the tautology ~(p&~p). Penny’s credence in it is 0.2, but Jenny’s is 0.99. Neither of them is assigning the correct credence of 1, but intuitively, Jenny’s credence is 33 For a good overview of the ongoing debate about Dutch book arguments, see Hájek, 2008. 34 One might worry that instead of focusing on the loss an incoherent agent is guaranteed to incur, we should instead focus on the loss the agent expects to incur. Even though a Dutch book always leads to a guaranteed loss, there can be betting scenarios in which the agent loses more than that, depending on which world is actual. Hence, one might suggest that the agent should evaluate how unfair the betting arrangement is based on how much she expects to lose, rather than how much she is guaranteed to lose. For example, if an agent had a credence of 1 in the proposition (p&q), and a credence of 0.4 in the proposition (~p&~q), she is guaranteed to lose $0.40 in a Dutch book in which she buys a $1 bet on (p&q) for $1 and a $1 bet on (~p&~q) for $0.40, since she can win at most one of those bets. But if she happens to be in a world in which p is true and q is false, or a world in which p is false and q is true, then she will win neither bet, and her total loss will be $1.40. Hence, the fact that the agent loses more than the minimum of $0.40 in some worlds might suggest that we should take into account her expected loss, rather than her guaranteed loss. Yet, a closer look at the agent’s credences shows that asking what an incoherent agent expects to happen does not make a whole lot of sense. According to the agent’s credence in (p&q), she is certain to be in a world where both p and q are true, which, taken by itself, means that she doesn’t expect to be in a world where she could lose more than $0.40. Yet, according to the agent’s credence in (~p&~q), her degree of confidence in each of ~p and ~q is at least 0.4, which suggests that she doesn’t consider it impossible to be in one of the worlds in which she loses more than $0.40. Hence, it seems like her credences, taken together, suggest that she both expects and doesn’t expect to be in a world in which she loses more than the minimum, which leaves us with no clear idea of what she expects. Hence, we cannot rely on how much the agent expects to lose in order to measure how incoherent she is. 68 more coherent than Penny’s. This is reflected in the Dutch book loss to which they are vulnerable. If we make each of them sell us a ticket that says “Pay $1 to the owner of this ticket if ~(p&~p)”, Penny’s credence justifies a transaction that leads to a sure loss of $0.80 for her, whereas Jenny’s credence justifies a transaction that leads to a sure loss of $0.01 for her. Or consider Sally and Polly, who have credences in only two propositions, p and ~p. They both have a credence of 0.5 in the proposition p. Sally’s credence in ∼p is 0.49, whereas Polly’s credence in ∼p is 0.4. Both Sally and Polly have incoherent credences, because their credences in p and ∼p don’t sum to 1. However, intuitively, Polly’s credences are worse than Sally’s. This intuitive difference in incoherence is reflected in the Dutch book loss to which they are vulnerable. Suppose each of them is presented with two tickets. 
Ticket A says: “Pay $1 to the owner of this ticket if p.” Ticket B says: “Pay $1 to the owner of this ticket if ~p.” Since both of them assign a credence of 0.5 to p, a selling price of $0.50 for ticket A would be fair according to both of them. Given their credences in ~p, a fair selling price for ticket B would be $0.49 for Sally, and $0.40 for Polly. Thus, if Sally and Polly both sold the tickets at the prices determined by their credences, Sally would receive a total of $0.99, whereas Polly would receive a total of $0.90. Of course, one of the tickets is guaranteed to win, which means that the seller of the bets must definitely pay out $1 to the buyer. This would leave Sally with a guaranteed loss of $0.01, and Polly with a guaranteed loss of $0.10, which reflects our intuitive judgment that Polly’s credences are more incoherent than Sally’s. Given that there is an intuitive connection between the degree to which an agent’s credences are incoherent, and the Dutch book loss that these credences make her vulnerable to, Dutch books seem like a promising tool we can use to build a measure of incoherence. Measuring incoherence in some way by noticing the Dutch book loss to which the incoherent agent is vulnerable also seems like a good strategy to avoid the problems with Zynda’s measure. In setting up Dutch books, numerical differences in agents’ credences lead to different guaranteed losses, as we saw in the examples just discussed. Hence, a Dutch book based measure can avoid Zynda’s measure’s problem of being insensitive to such numerical differences. Moreover, since incoherence would be measured in terms of guaranteed monetary loss, a Dutch book measure would not face the incommensurability problem that besets Zynda’s measure. Lastly, an agent’s credences don’t have to be defined over a Boolean algebra of propositions in order to make her vulnerable to a Dutch book if her credences are incoherent. Hence, a Dutch book measure of 69 incoherence will avoid the Boolean algebra requirement that created problems for Zynda’s measure. These features seem like promising advantages of a Dutch book measure. Yet, there is an important problem that any such measure must find a solution to, which I will call the normalization problem. It is easy to see how the problem arises: the standard way in which Dutch book arguments are formulated makes no prescriptions about the sizes of the bets involves in a Dutch book. We are told which buying or selling price would be justified for a given bet by the agent’s credence, but nothing constrains the amount of the payout. For example, if Sally and Polly both have a credence of 0.6 in some tautology T, I could make Sally sell a $1 bet on T for $0.60, thereby making her lose $0.40, and I could make Polly sell a $10 bet on T for $6, thereby making her lose $4. In this scenario, Polly would lose ten times as much as Sally. But of course, this difference in monetary loss does not reflect any difference in incoherence in this case. The only difference between Sally and Polly is that I chose to make them sell bets of different sizes. Hence, without any way of normalizing Dutch book loss, we can’t use Dutch books to measure incoherence, because there simply won’t be a way of determining which Dutch book indicates how incoherent a credence function is. In order to be able to formulate any Dutch book measure of incoherence, we need to build some kind of normalization of the bet sizes into the measure in order to rule out scenarios like the one just described. 
Otherwise, the same incoherent credences can lead to wildly different Dutch book losses, which makes these losses useless for measuring degrees of incoherence. In effect, any Dutch book measure can just be viewed as a particular way of solving the normalization problem. As I mentioned earlier, there have so far been two proposals for measures of incoherence in the literature. The second measure is in fact a Dutch-book-based measure, which has been proposed by Schervish, Seidenfeld, and Kadane, and which is built on a particular solution to the normalization problem. Being a Dutch book measure, it predictably avoids the problems that beset Zynda’s proposal, but unfortunately, the measure’s solution to the normalization problem prevents it from serving as a measure of the total incoherence of a credence function. I will explain and discuss this measure in the next section. 70 4. Schervish, Seidenfeld, and Kadane’s Dutch Book Measure of Incoherence The second measure of incoherence that I will discuss is a Dutch book-based measure that has been put forth by Mark Schervish, Teddy Seidenfeld, and Joseph Kadane (2000, 2002a, 2002b – henceforth S, S&K). In their work, S, S&K discuss the formal properties of a class of Dutch-book- based measures of incoherence. The different measures in the class differ with respect to how they solve the normalization problem. S, S&K don’t explicitly endorse any measure in the class as being preferable to the others. 35 However, in the informal sections of their work, they devote much of their discussion to motivating the intuitive appeal of the measures based on so-called “sum”-normalizations. Yet, I will argue that these measures have certain features that make them unsuitable as measures of the total incoherence of an agent’s credence function. I will focus here on the measure that employs a normalization that S, S&K call the “neutral/sum” normalization. For ease of presentation, I will henceforth call this measure simply ‘S, S&K’s measure.’ They also discuss other normalizations from the subclass of “sum”-normalizations, especially two normalizations that they call the “bookie’s escrow” and the “agent’s escrow”. However, since these two normalizations face some additional problems, and are subject to the same problems as the neutral/sum normalization, I don’t discuss them separately here. I will keep the discussion here as informal as possible to make it easy to follow. The interested reader may consult the appendix for the formal details of their measure. S, S&K recognize that an adequate Dutch book measure must be normalized in order to accurately reflect differences in incoherence between credence functions. They choose to control for possible variations in the size of bets by stipulating that a credence function’s degree of incoherence must be determined by a particular ratio, which relates the Dutch book loss from a set of bets to the size of the bets involves in the Dutch book. When this is done in terms of the “neutral/sum” normalization, the result is a measure that works in the following way: the degree of incoherence of a credence function is determined by looking at the worst Dutch book that can be made against someone with that credence function. The worst Dutch book can be found by determining which set of bets makes the agent lose the most money relative to the sum of the stakes of all the bets involved in this Dutch book. 
In order to find this set of bets, the bookie may include bets on or against as many propositions in the agent’s credence function as necessary, as long as each proposition is used for no more than one bet. In this sort of arrangement, bets of any 35 They express some preference for the mathematical properties of measures based on “neutral” normalizations, but they don’t endorse any particular measure from the class of “neutral” measures (S, S&K, 2002, p. 7). 71 size are permissible, but since the guaranteed loss from these bets has to be divided by the sum of the total stakes of the bets involved in order to determine the degree of incoherence, the resulting degree of incoherence is normalized. We can illustrate this with a simple example. Suppose, for example, that I have the following credence function: f(p) = 0.6 f(~p) = 0.5 f(T) = 0.9 We’ll assume for simplicity that all bets we can use for Dutch books have $1 stakes. This is not required by S, S&K’s measure, but in this case, we can safely make this simplifying assumption, because it does not change the result. Now, if these are my credences, then my degree of incoherence is determined by the Dutch book that creates the highest guaranteed loss relative to the sum of the stakes of the involved bets. There are three prima facie plausible candidates for being the worst Dutch book: (I) Buy one $1 bet on each of p, ~p, costing $0.60 and $0.50, respectively. Result: guaranteed loss of $0.10, hence the loss ratio is 0.1/2 = 1/20 (II) Sell one $1 bet on T for $0.90. Result: guaranteed loss of $0.10, hence the loss ratio is 0.1/1 = 1/10 (III) Make all of the bets in I) and II) at the same time. Result: guaranteed loss of $0.20, hence the loss ratio is 0.2/3 = 1/15 As we can easily see, the Dutch book that results in the highest loss ratio is (II), and so, according to S, S&K’s measure, this is the Dutch book that actually determines my degree of incoherence, which is 1/10. The interesting result to notice is that in order to find the worst Dutch book that can be made against an agent (in S, S&K’s sense), it is not necessarily optimal to include bets on or against all of the propositions in which I have incoherent credences. The Dutch book that does this, which is (III), in fact leads to a lower loss ratio than Dutch book (II), which includes only one of my incoherent credences. And this is exactly the aspect of S, S&K’s measure that makes it 72 problematic, because it makes the measure unsuitable to determine the total incoherence of a credence function. We can see in this example that S, S&K’s measure suffers from what I will call a swamping problem. Since only the worst incoherent credences in the agent’s credence function, which lead to the highest loss ratio, determine the agent’s degree of incoherence, these credences wind up swamping any other incoherent credences the agent might have, which then never get reflected in the degree of incoherence. As I will show now, the swamping problem is the reason why S, S&K’s measure gives us unintuitive results if we try to use the it to compare credence functions with respect to their total incoherence. First, suppose there is an agent whose credences are defined over the propositions in the following set: {p, ~p, q, ~q}. 
The agent can adopt one of two credence functions f and g: f(p) = 0.6 g(p) = 0.6 f(~p) = 0.6 g(~p) = 0.6 f(q) = 0.5, g(q) = 0.6 f(~q) = 0.5 g(~q) = 0.6 Intuitively, f is less incoherent than g, because f contains incoherent credences only for the partition {p, ~p}, whereas g additionally contains incoherent credences for the partition {q, ~q}. However, since S, S&K’s measure only looks at the Dutch book that delivers the worst loss ratio, it doesn’t deliver this intuitive result. Instead, it judges f and g to be equally incoherent, because in both cases, the worst Dutch book that can be made leads to a loss ratio of 0.1. This is because both credence functions allow us to generate a guaranteed loss of $0.20 from two $1 bets, and there is no betting arrangement that leads to a worse loss ratio. This result is clearly an instance of the swamping problem, because the credences that don’t contribute to the worst Dutch book get swamped in S, S&K’s measure, even though, intuitively, they should be taken into account. The fact that f is coherent on the partition {q,~q}, but g isn’t, makes no difference to S, S&K’s measure, whereas it clearly makes a difference to our intuitive judgments about the total incoherence of the two credence functions. Hence, if two credence functions have the same loss ratio from their worst Dutch book, then the remaining credences cannot influence the degree of incoherence. And, as the example shows very clearly, 73 the measure therefore does not agree with intuitive judgments about the total incoherence of different credence functions. In fact, because of the swamping problem, S,S&K’s measure can even give us the opposite of what is intuitively the correct ranking of credence functions in order of their total incoherence. Assume again that there is an agent whose credences are defined over the propositions in the following set: {p, ~p, q, ~q}. The agent can adopt one of two credence functions g and h: g(p) = 0.6 h(p) = 0.600001 g(~p) = 0.6 h(~p) = 0.6 g(q) = 0.6 h(q) = 0.5 g(~q) = 0.6 h(~q) = 0.5 Intuitively, g is more incoherent than h, because g is much more incoherent than h with respect to the partition {q, ~q}, and g is only minutely less incoherent than h with respect to the partition {p, ~p}. However, because the worst Dutch book that can be made against h leads to a loss ratio of 0.200001/2, which is very slightly higher than the loss ratio of 0.2/2 from the worst Dutch book that can be made against g, the measure predicts that h is more incoherent than g. Again, the credences that don’t contribute to the worst Dutch book get swamped, which means that the differences in coherence on the partition {q, ~q} have no effect on the measure. And, as a result, the measure’s verdicts are directly opposed to those of intuition. The fact that S, S&K’s measure faces a swamping problem is important, because it makes it unsuitable for one of the main applications of a measure of incoherence, the evaluation of reasoning processes. Recall that one way in which we might evaluate the goodness of a reasoning process is by checking how it affects the degree to which the agent’s credences are incoherent. However, the swamping problem has the effect that S, S&K’s measure is not sensitive to relevant differences in an agent’s credences, as long as the reasoning process does not change the worst Dutch book that can be made against the agent. The following example demonstrates this problem. 
Suppose the agent begins with the following credences:
f(p) = 0.6
f(T) = 0.9
Then she forms a new credence in ~p. Clearly, it would be better for her to assign a credence of 0.4 to ~p than, for instance, a credence of 0.5, since the former, but not the latter, would be coherent with her credence in p. However, S, S&K's measure doesn't distinguish between these two choices for a new credence in ~p, since the agent's degree of incoherence continues to be determined solely by her credence in the tautology T either way. Hence, S, S&K's measure cannot be used to evaluate the reasoning processes of incoherent agents.
Where does this leave us? We have seen that S, S&K's measure is subject to a swamping problem that is caused by their choice of normalization. However, it actually doesn't suffer from any of the problems that we encountered with Zynda's measure. Hence, while S, S&K's measure isn't suitable for measuring the total incoherence of an agent's credence function, it does give us evidence that a different kind of Dutch book measure could give us what we need. In the next two sections, I will develop a Dutch book measure that avoids the swamping problem by solving the normalization problem in a different way.
5. Two Conditions of Adequacy for Measures of Incoherence
In principle, there are many different ways in which one might try to use Dutch books to measure degrees of incoherence, but of course not all of these measures will be equally well suited for our purposes. Based on the lessons from the two failed measures, I will propose two basic principles that any measure of degrees of incoherence must meet in order to be adequate.
The first principle I propose is called the Proportionality Principle (PP). It articulates the obvious idea that any measure of degrees of incoherence should correctly capture intuitive judgments about comparative incoherence. The Proportionality Principle does not assume, however, that we have intuitions about the degree of incoherence of every credence function compared to every other credence function. The very reason why we need a measure is that it is often difficult to determine degrees of incoherence on an intuitive basis, especially when considering larger credence functions. What PP is meant to capture is the idea that an adequate measure must agree with our intuitions in cases in which we have clear intuitive judgments.
Proportionality Principle: An adequate measure of incoherence should capture intuitive judgments about differences and similarities between the degrees of incoherence of different credence functions.
Applied to Dutch book measures, this means that an adequate Dutch book measure of incoherence should measure credence functions in such a way that intuitively greater levels of incoherence correspond to greater Dutch book losses, and intuitively equal levels of incoherence correspond to equal Dutch book losses.
There are various ways in which a measure can fail to comply with the Proportionality Principle. First, the measure might be inapplicable to credence functions that can intuitively be compared, or deem these credence functions incommensurable. This is why Zynda's measure fails to comply with PP: we saw that it can't be used for credence functions that aren't defined over Boolean algebras. Moreover, credence functions that seem easy to compare are judged to be incommensurable by his measure.
The second way in which a measure can fail to comply with PP is by ordering credence functions in ways that are in conflict with our intuitive judgments. We saw this problem with both Zynda’s and S, S&K’s measures. For each measure, we found examples where the measure judges credence functions to be equally incoherent that intuitively differ in incoherence. For S, S&K’s measure, there are even examples where the measure delivers the reverse of the intuitively correct ordering. The second principle I propose is motivated by the idea that we are trying to measure the total incoherence of a credence function. In order to do so, we must make sure that all of the agent’s incoherent credences are taken into consideration equally, so that no swamping can occur. Hence, an adequate incoherence measure must obey the Equality Principle: Equality Principle: A measure of degrees of incoherence should take all of the agent’s incoherent credences into consideration equally, regardless of the subject matter of the propositions to which the credences are assigned. To see why the subject matter of the propositions in which the agent has credences should not play a role in the measure, consider the following example: Suppose k is the proposition that kohlrabis are green, and l is the proposition that lychees are pink. Let f be Sally’s and g be Polly’s credence function: 76 Sally: Polly: f(k) = 0.5 g(k) = 0.5 f(~k) = 0.4 g(~k) = 0.3 f(l) = 0.5 g(l) = 0.5 f(~l) = 0.3 g(~l) = 0.4 Intuitively, Sally and Polly have equally incoherent credences. The reason the two credence functions seem equally incoherent is that one can be transformed into the other by switching around the logically contingent propositions. That kind of transformation should not change the degree of incoherence. And the only natural way in which a measure can guarantee this result is by giving equal weight to all propositions, regardless of what they are about. With regard to a Dutch book measure, the Equality Principle captures the idea that an adequate Dutch book measure of incoherence must control or normalize the size of the bets involved in some way, and it also captures the idea that such a measure must avoid leaving out any of the agent’s incoherent credences, or giving them different weights. We saw that S, S&K’s measure violates the Equality Principle, because its choice of normalization makes it vulnerable to the swamping problem. Their idea is to allow for bets of any size initially, but then divide the resulting guaranteed loss by the sum of the stakes in the bets involved. The Dutch book resulting in the worst ratio of these two quantities is the one that determines the degree of incoherence. But, as we saw, this choice of normalization has the result that incoherent credences that don’t affect the worst Dutch book go undetected by the measure; they get swamped. Hence, the resulting measure is a poor reflection of the total incoherence of a credence function. In the next section, I will explain how we can design a Dutch book measure that complies with PP and EP by adopting a different solution to the normalization problem. Instead of first allowing bets of any amount, and then controlling for their size, I suggest that we fix the size of each bet that can be used in a Dutch book from the outset. 6. 
The Maximum Dutch Book Measure In the last section, we used the lessons we learned from discussing Zynda’s and S, S&K’s measures in order to formulate constraints on an adequate Dutch book measure of degrees of 77 incoherence, which I called the Proportionality Principle and the Equality Principle. My goal in this section is to find an alternative measure that complies with both principles, and that really captures the total incoherence of a credence function. We saw that the swamping problem in S, S&K’s measure is caused by their choice of normalization. We can avoid this problem by taking a different approach to normalization. Instead of initially allowing bets of any size, and then making up for differences in guaranteed loss caused by this, we can instead put a restriction on the size of permissible bets right away. If we antecedently fix the size of the bets that can enter into the Dutch book measure, we can easily avoid the implication that some incoherent credences swamp others. Recall that the Equality Principle requires that all of the agent’s incoherent credences are given equal consideration in determining how incoherent she is. But if the Dutch book used to measure incoherence is permitted to include bets with higher stakes on some incoherence- generating propositions than on others, then it will give more weight to the incoherence arising from the credences in the propositions where the stakes are higher than to the incoherence generated by the credences in the propositions where the stakes are lower. The way to avoid this problem is to require that the total stakes of each bet to be the same arbitrary amount. I propose that the stakes of each bet (i.e. the sum of the possible net gain and loss per bet) be $1, though this number is arbitrary, and any other dollar amount would yield an equivalent measure. Fortunately, nothing is lost by fixing the bet size in this way. Whenever there is a Dutch book exhibiting incoherence, there is a fixed-stakes Dutch book exhibiting the same incoherence, and so this restriction creates no problem for our ability to set up Dutch books against incoherent agents. 36 36 I chose to set the stakes for each bet at $1, which means that the sum of the possible net gain and net loss from each individual bet is $1. Any agent whose credences violate the probability axioms is vulnerable to a Dutch book, given this choice of normalization: Normalization: For any statement A, if A is a tautology, P (A) = 1 Dutch book: Suppose for some tautology T, Cr(T) = x < 1. Then, we can make the agent sell a bet on T for $x, and she will have to pay out $1 in every state of the world. She will thus be guaranteed to lose $(1-x). Now suppose for some tautology T, Cr(T) = y > 1. In this case, the agent can be made to buy a bet on T for $y. In every state of the world she will gain $1 from the bet, leaving her with a guaranteed loss of $(y-1). Non-Negativity: For any statement A, P (A) ≥ 0 Dutch book: Suppose, for some statement A, Cr(A) = x < 0. In that case, if the agent sells the bet for $x, x being negative, the agent essentially pays the bookie $-x to take the bet off her hands. In worlds where A is false, nothing more happens. In worlds where A is true, the agent must pay out another $1. Thus, the agent is guaranteed to lose at least $-x. Finite Additivity: For any two mutually exclusive statements A, B, P (A ∨ B) = P (A) + P (B) 78 Of course, one might wonder whether there are other ways of fixing the size of each individual bet. 
In appendix 1, I discuss three other natural candidates for ways of fixing the size of each bet: (i) fixing the product of the potential net gain and net loss to be $1 (or any other amount), (ii) fixing the potential net gain from each bet to be $1 (or any other amount), or (iii) fixing the potential net loss from each bet to be $1 (or any other amount). I show that each of these alternative normalizations has difficulties accommodating bets on or against propositions in which the agent has credence 0 or 1, while my proposed normalization does not have this problem, and is therefore a better choice. Having chosen a way of normalizing the bet size, we now need to find a way of setting up a Dutch book measure that captures the agent’s total incoherence in the way prescribed by PP and EP. Given that we are trying to capture all of the ways in which the agent’s credences are incoherent, a prima facie plausible suggestion might be to simply set up a Dutch book that includes bets on or against all of the propositions in the agent’s credence function. However, it turns out that this is not a good way of capturing the degree of total incoherence of a credence function. This is because such a Dutch book might include bets on propositions that don’t contribute to loss-generating combinations of bets, and that can instead produce gains in some possible worlds that mask the losses from the agent’s incoherent credences. In order to avoid bets on those propositions, we should instead allow each proposition to be bet on or against at most once by the Dutch book that determines the agent’s degree of incoherence. This way, we can make sure that all and only the propositions are included in the Dutch book that contribute to the credence function’s total incoherence. Hence, by allowing, but not requiring that each proposition be included, we can avoid including bets that could lead to accidental gains, which would decrease the agent’s guaranteed loss, and thereby mask how incoherent the agent really is. We can illustrate this idea with a simple example: Suppose an agent has credences in only three propositions: f(p) = 0.4, f(~p) = 0.4, f(q) = 0.5. Clearly, the agent is incoherent, and can be Dutch booked by making her sell bets on p and ~p. However, if we required that a maximum Dutch book must include bets on every proposition Dutch book: Suppose an agent’s credence in (A∨B) is x, her credence in A is y, and her credence in B is z, and that A and B are mutually exclusive statements. Now assume first that her credences are such that x > y + z. In this case, the agent is made to buy a bet on (A∨B) for $x, and sell bets on A and B for $y and $z, respectively. In any world in which (A∨B) is false, all bets lose, and the agent is guaranteed to lose $(x-(y+z)). In any world in which (A∨B) is true, the agent wins $1 on this bet, but must also pay out $1 for either the bet on A or the bet on B, thus leaving her again with a guaranteed loss of $(x-(y+z)). Secondly, assume instead that x < y + z. The argument in this case is analogous, except that the pattern of buying and selling is reversed. 79 in the agent’s credence function, we would have to include a bet on or against q, and the agent would not be guaranteed to lose money, since she would win the q-bet in some possible worlds. Thus, in order to capture the agent’s true degree of incoherence, we must include only those credences that can be exploited to generate guaranteed losses, but not credences that can mask guaranteed losses by producing accidental gains. 
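The point about leaving out masking bets can be checked by brute force. The sketch below (all names are mine; it simply assumes the $1-stakes normalization adopted above) tries every way of buying (+1), selling (-1), or omitting (0) a bet on each proposition and reports the largest loss that is guaranteed in every state of the world. For the example just given it confirms that the worst the bookie can do is a guaranteed $0.20, achieved by having the agent sell the bets on p and ~p while leaving q out.

```python
from itertools import product

def max_guaranteed_loss(credences, worlds):
    """Brute-force search over betting combinations: each proposition is
    bet on (+1), bet against (-1), or skipped (0); all bets have $1 stakes."""
    props = list(credences)
    best_loss, best_bets = 0.0, None
    for alphas in product((-1, 0, 1), repeat=len(props)):
        # Agent's net payoff in world w is sum_i alpha_i * (truth_i - credence_i).
        payoffs = [sum(a * (w[A] - credences[A]) for a, A in zip(alphas, props))
                   for w in worlds]
        loss = round(-max(payoffs), 9)   # the loss the agent cannot avoid, if positive
        if loss > best_loss:
            best_loss, best_bets = loss, dict(zip(props, alphas))
    return best_loss, best_bets

# The example from the text: f(p) = 0.4, f(~p) = 0.4, f(q) = 0.5.
# States of the world assign truth values consistently ("not_p" is 1 - "p").
credences = {"p": 0.4, "not_p": 0.4, "q": 0.5}
worlds = [{"p": p, "not_p": 1 - p, "q": q} for p in (0, 1) for q in (0, 1)]
print(max_guaranteed_loss(credences, worlds))
# -> (0.2, {'p': -1, 'not_p': -1, 'q': 0}): the q-bet is omitted, as argued above.
```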
37 How do we know which propositions we should make the incoherent agent buy bets on, which ones we should make her sell bets on, and which ones should be left out? We can find the answer by simply selecting a combination of bets that results in the greatest guaranteed loss. And once the greatest guaranteed loss has been determined, we have a straightforward indication of the degree to which the agent’s credence function is incoherent. The resulting measure avoids the swamping problem, and complies with the both the Proportionality Principle and the Equality Principle. The key to success of this Dutch book measure of incoherence is that, in defining the maximum Dutch book, we consider every possible betting arrangement in which the bets on all the propositions have equal stakes. Doing this ensures that the agent’s credence in all propositions are given equal weight. Moreover, choosing the Dutch book that leads to the maximum guaranteed loss guarantees that all of the agent’s incoherent credences are captured by the measure. We can present the proposal as a step-by-step process: 1. Where S is the agent whose credences are being evaluated, and Cr is her credence function, let BA be the smallest Boolean algebra such that every proposition in which the agent has a well-defined credence belongs to this algebra. Let n be the number of atomic propositions in BA, and let BA& be the subset of propositions in BA in which S has well- defined credences, and let m be the number of propositions in BA&. 2. Suppose that, for each proposition Ai in BA&, S has three options: she can bet on Ai, she can bet against Ai, or she can make neither bet. And suppose that S’s bet on or against Ai will have the following payoff structure: a. If S bets on Ai, then S will pay $ Cr(Ai) to buy the bet on Ai, and S will win $1 if Ai is true. Hence, if Ai is true, her net gain will be $(1 – Cr(Ai)), and if Ai is false, her net loss will be $Cr(Ai). b. If S bets against Ai, then S will receive $Cr(Ai) for selling a bet on Ai, and S will be required to pay $1 if Ai is true. Hence, if Ai is true, her net loss will be $(1 – Cr(Ai)), and if Ai is false, her net gain will be $Cr(Ai). 37 Another way to avoid this problem of accidental gains would be to require that the agent’s credence function be defined over a Boolean algebra of propositions, for in this case, the bets can be set up so that they either lead to a guaranteed loss or cancel each other out. But the credence functions of real agents are generally not defined over full Boolean algebras, and so it’s better to avoid this idealizing assumption. 80 Construct a table in which the columns represent the possible states of nature defined by BA, and the rows represent all of the possible combinations of bets for all the propositions in BA&. This table will thus have 2 n columns (corresponding to the 2 n possible combinations of truth values for the atomic propositions in BA, and 3 m rows (corresponding to the possible ways in which, for each of the m propositions in BA&, the agent can either buy a bet on it, sell a bet on it, or do neither). 3. Now fill in the values of every cell in this table, indicating, for each state of nature and each combination of bets, how much the agent would lose if this state of nature obtained and the agent made this combination of bets. 
If making the bets in question in the state of nature in question would result in a loss of $x, then the value of the cell is x; if it would result in no win or loss for the agent, the value of the cell is zero (and if it would result in a net gain of $x, the value is -x).
4. For each row (representing a given combination of bets), identify the minimum value among the cells in that row. This will correspond to the guaranteed Dutch book loss for this combination of bets, that is, the smallest amount of money that could be lost by making this combination of bets.
5. Finally, identify the row whose minimum value is greatest. This will correspond to the combination of bets with the greatest guaranteed Dutch book loss. This maximal guaranteed Dutch book loss represents the degree of incoherence of the credence function Cr.
We can also express this method as one long formula. Suppose Cr is defined over {A1, …, An}, I is the indicator function, which returns 1 if the relevant proposition is true in a given world and 0 if it is false, and W is the set of all states of nature. Then DOI(Cr) gives the degree of incoherence for any credence function Cr:

\[
\mathrm{DOI}(Cr) = \max_{\alpha_i \in \{0,\,1,\,-1\}} \Bigl[ -\max_{w \in W} \sum_{i=1}^{n} \alpha_i \bigl( I_{A_i}(w) - Cr(A_i) \bigr) \Bigr]
\]

In a nutshell, the measure determines the degree of incoherence of an agent's credence function by finding the Dutch book that can be made against her that leads to the highest guaranteed loss, given that all bets have $1 stakes, and each proposition in her credence function can be used for at most one bet. 38
Since the measure determines the degree of incoherence of an agent's entire credence function, regardless of how many propositions she has credences in, it is best suited to make comparative judgments about credence functions that are roughly equal in size. Fortunately, these are the kinds of cases we want to apply the measure to, insofar as we are evaluating modes of reasoning. For when we are comparing the levels of incoherence that would result from different ways of forming or revising an agent's credences in a given proposition, we are always comparing credence functions that are similar in size.
38 According to Teddy Seidenfeld (personal communication), the measure I develop here can be shown to be a special case of the "neutral/max" measure, which is part of the broader class of measures discussed by S, S&K. However, S, S&K don't single it out for special consideration, nor do they discuss the advantages it possesses compared to the "neutral/sum" measure I criticized earlier.
I now want to briefly show the results of the measure when applied to the cases that I discussed earlier.
Jenny and Penny: In the example of Jenny and Penny, Jenny has a credence of 0.99 in the tautology ~(p&~p), whereas Penny's credence in it is 0.2. In both cases, we have the agent sell a $1 bet on ~(p&~p) at the price determined by her credence. The measure gives a degree of incoherence of 0.01 for Jenny, and of 0.8 for Penny. Thus, the measure gives the intuitively correct result that Jenny is less incoherent than Penny. More generally, for any tautology T, the closer an agent's credence in T is to 1 in this example, the less incoherent she is according to the measure, which is, intuitively, the correct result.
Sally and Polly: In the example of Sally and Polly, they both have a credence of 0.5 in p, Sally has a credence of 0.49 in ~p, and Polly has a credence of 0.4 in ~p. The maximum Dutch book measure results in a degree of incoherence of 0.01 for Sally, and of 0.1 for Polly. This result can be achieved by making each agent sell both bets.
This means that the measure gives the intuitively correct result that Sally is less incoherent than Polly. More generally, where an agent’s credence in a proposition p is x, the more her credence in ~p diverges from (1-x), the more incoherent she is according to the measure, which is, intuitively, the correct result. Counterexamples to Zynda: In the first example, we considered two credence functions f and g, which are both defined over a one-atom Boolean algebra {p, ~p, ⊥, T}. They assign credences as follows: 82 f(p) = 0.5 g(p) = 0.5 f(~p) = 0.5 g(~p) = 0.5 f(⊥) = 0 g(⊥) = 0 f(T) = 0.5 g(T) = 0.99 They are equally incoherent according to Zynda’s measure, which is intuitively incorrect. The maximum Dutch book measure, by contrast, returns the intuitive result that f is more incoherent that g, since the guaranteed loss of f is 0.5, and the guaranteed loss of g is 0.01. In the second example, we again compared credence functions defined over a one-atom Boolean algebra {p, ~p, ⊥, T}, and we considered two credence functions f’ and g’: f’(p) = 0.5 g’(p) = 0.9 f’(~p) = 0.51 g’(~p) = 0.9 f’(⊥) = 0 g’(⊥) =0 f’(T) = 1 g’(T) = 1 Intuitively, f’ seems less incoherent than g’, which was not predicted by Zynda’s measure. By contrast, the maximum Dutch book measure delivers the correct result, since f’ leads to a guaranteed loss of 0.01, and g’ leads to a guaranteed loss of 0.8. In the third example, we assumed that agent A would reason in the following way, by augmenting her coherent credences: f’’(p) = 0.5 f’’(p) = 0.5 f’’(~p) = 0.5 ⇒ f’’(~p) = 0.5 f’’(p ∨ ~p) = 1 f’’(p ∨ ~p) = 1 f’’(~(p ∨ ~p)) = 0.1 This example requires assuming gappy credence functions, which was not possible for Zynda, but is unproblematic for the maximum Dutch book measure. This measure assigns to the agent’s credence function a degree of incoherence of 0 before the credence in ~(p ∨ ~p) has been added, and a degree of incoherence of 0.1 once this credence has been added. 83 Counterexamples to S, S &K: In the first example, we assumed that there is an agent whose credences are defined over the propositions in the following set: {p, ~p, q, ~q}. The agent can adopt either of two credence functions f and g: f(p) = 0.6 g(p) = 0.6 f(~p) = 0.6 g(~p) = 0.6 f(q) = 0.5, g(q) = 0.6 f(~q) = 0.5 g(~q) = 0.6 S, S&K’s measure incorrectly judges the two credence functions to be equally incoherent. The Dutch book measure returns a degree of incoherence of 0.2 for f, and of 0.4 for g. Thus, the measure correctly judges g to be more incoherent than f. For the second example, we assumed again that there is an agent whose credences are defined over the propositions in the following set: {p, ~p, q, ~q}. g(p) = 0.6 h(p) = 0.600001 g(~p) = 0.6 h(~p) = 0.6 g(q) = 0.6 h(q) = 0.5 g(~q) = 0.6 h(~q) = 0.5 The problem with S,S &K’s measure was that it judges h to be more incoherent than g, although intuitively, it should be the other way around. The maximum Dutch book measure, by contrast, gets the correct ordering. It assigns a degree of incoherence of 0.4 to g, and a degree of incoherence of 0.200001 to h. The third example concerned a case of reasoning. We assumed that the agent begins with the following credences: f(p) = 0.6 f(T) = 0.9 84 Then she forms a new credence in ~p. Clearly, it would be better for her to assign a credence of 0.4 to ~p than, for instance, a credence of 0.5, since the former, but not the latter, would be coherent with her credence in p. 
S, S&K’s measure has the problematic consequence that there is no difference between these credence assignments for ~p, because neither one changes the worst Dutch book that can be made against the agent. The maximum Dutch book measure, by contrast, can make this distinction. Before assigning a credence in ~p, the agent’s degree of incoherence is 0.1. Assigning f(~p) = 0.4 preserves this degree of incoherence, whereas assigning f(~p) = 0.5 would worsen the degree of incoherence, and make it 0.2. As we can see from all three examples discussed here, the maximum Dutch book measure, unlike S, S&K’s measure, avoids the swamping problem. Conclusion Recall that, at the outset of this paper, I argued that there are three aims to be served by a measure of incoherence: (i) capturing intuitive differences and similarities between incoherent credence functions, (ii) explaining how the ideal Bayesian rules can serve as norms for non-ideal agents, and (iii) evaluating reasoning processes of incoherent agents. As my discussion of numerous examples illustrates, the maximum Dutch book measure does a good job at capturing intuitive judgments. It gives us the intuitively correct results in all the cases that Zynda’s and S, S&K’s measures get wrong. My measure also presents an adequate solution to the normativity problem. Since it allows us to measure how successful an agent is in approximating the ideal Bayesian rules, we can explain how these ideal norms can exert normative force over non-ideal agents. One important advantage of my measure compared to Zynda’s measure with respect to addressing this problem is that my measure avoids postulating widespread incommensurability between credence functions. If a measure implies that most changes that result from reasoning give rise to incommensurability, then it won’t make sense of how reasoning can get us closer to coherence. Hence, since my measure, unlike Zynda’s, avoids rendering many credence functions incommensurable, it is better suited to solve the normativity problem. Lastly, my measure is superior to both Zynda’s and S, S&K’s measure when used to evaluate reasoning strategies of incoherent agents. A measure that can be used for this task must exhibit two features: it must be applicable to agents with credence functions that are not defined 85 over complete Boolean algebras, and it must track the contribution of each individual credence to the overall incoherence of the agent’s credence function. Neither Zynda’s nor S, S&K’s measure possesses both of these features, whereas the maximum Dutch book measure does. It can be applied to incomplete credence functions, and it takes into account all of the agent’s incoherent credences. Therefore the maximum Dutch book measure is well suited to track changes of incoherence that are the result of reasoning. Appendix 1: Justifying the Choice of Bet Normalization In this appendix, I will justify my choice of normalization for the size of the bets involved in the maximum Dutch book. Obviously, making the stakes of each bet $1 is not the only possible way of fixing the size of each bet. There are many other ways in which one might be able to normalize the bets. Here, I discuss three other options for fixing the size of a bet that might seem natural, and I show why my method is preferable to these alternatives. The first possibility is to make all bets cost $1, so that the agent’s buying and selling price for each bet is constant, and the payoff varies with the agent’s credence. 
The second possibility is to make the payoff of each bet be $1, so that the net amount the agent can gain if she wins the bet is constant. The third possibility is to make the product of the possible net gain and net loss equal to $1 for every bet. The problem that arises for these three possibilities, but not for my chosen normalization, is that they cannot handle bets on or against propositions to which the agent assigns credences of 1 or 0. Suppose Sally assigns a credence of 0 to some proposition q. According to my chosen normalization, this means that Sally would only accept a bet on q if she could bet for free, winning $1 if p turns out to be true. If we choose the first alternative, making each bet cost $1, we cannot capture the bet Sally would be willing to make in this case. The payoff, or net gain for a bet that costs $1 is (1-1/Cr(q), which is undefined if Sally’s credence is 0. Or, in other words, if Sally is only willing to accept bets on q that are free, this is incompatible with stipulating that all the bets she is willing to accept cost $1. Thus, there are certain acceptable bets that are incompatible with this normalization. If we choose the second alternative, making the net gain from each bet $1, we get a similar problem. Suppose Sally assigns credence 1 to some proposition r. According to my chosen normalization, this means that Sally is willing to pay $1 for a bet that pays nothing if r is correct. But according to the second alternative, there cannot be any bets that involve a net gain of $0, because it stipulates that every bet that Sally is willing to accept has a net gain of $1. Thus, there are bets that Sally is willing to make that cannot be captured by this normalization. Alternatively, if we made the bet pay out $1, it might mean that Sally would be willing to pay any price for it, which would also be problematic for the measure, since in that case her credence would not determine a price for the bet. A further problem with the two options just discussed is that they do not guarantee that there is a Dutch book to be made in every case in which an agent has incoherent credences. For example, if an agent has credence 0.8 in some proposition p, and 0.1 in ~p, neither one of the two previous normalizations leads to a guaranteed loss. 86 If we choose the third alternative, we stipulate that the product of the net loss and the net gain of any bet must equal $1. This is obviously incompatible both with bets that cost $0 and with bets that have a net gain of $0, because in each case the product of net loss and net gain would be 0. However, since both of these types of bets can occur, the third alternative is not a suitable normalization either. Thus, we are left with my chosen normalization, which assumes that the sum of the net loss and net gain for each bet is $1. This normalization does not have a problem with bets based on credence assignments of 0 or 1. Of course, we could choose any other amount to be the sum of the net loss and net gain for each bet, which would result in a measure that it equivalent to mine. There may be other, more complicated normalizations that would work, but since my chosen normalization works just fine, there is no clear benefit from choosing a more complicated, alternative normalization. Appendix 2: Schervish, Seidenfeld, and Kadane’s Dutch Book Measure In a series of papers, Schervish, Seidenfeld and Kadane (S, S&K in the following) have developed a class of measures of degrees of incoherence based on Dutch books. 
(2000, 2002a, 2002b) They prove a variety of theorems that apply to this class of measures, and they specifically single out three measures for closer consideration, which are based on what they call the “neutral/sum”, “gambler’s escrow”, and “bookie’s escrow” normalizations. I focus there on the “neutral sum” measure, but my arguments also apply to the other two normalizations they highlight. 39 To see how this Dutch book measure works, suppose there is an agent who has a credence function P that assigns credences to a set of propositions {A1,..., An}. We can represent a bet on or against one of these propositions according to the agent’s credences in the following way: Bet: α (I(Ai) – P(Ai)) In this case, I(Ai) is the indicator function of Ai, which assigns a value of 1 if Ai is true and a value of 0 if Ai is false. P(Ai) is the credence the agent assigns to the proposition Ai. The coefficient α determines both the size of the bet, as well as whether the agent is betting on or against Ai. If α > 0, then the agent bets on the truth of Ai, whereas if α < 0, the agents bets against the truth of Ai. In the following, it will be assumed that an agent who assigns a precise credence to a proposition thereby evaluates as fair the bet on and the bet against that proposition at the price that is fixed by her credence. An agent is incoherent if there is a collection of gambles she evaluates as fair that together guarantee a loss. Formally, we can represent this as follows: Let A1,..., An be the propositions that some agent assigns credences to, let Cr be the agent’s credence function, which may or may not be probabilistically coherent, and let S be the set of possible world states. If there is some choice of coefficients α1,..., αn, such that the sum of the payoffs of the bets on or against A1,..., An is negative for every world state s ∈ S, then the agent is vulnerable to a Dutch book. Thus, there is a Dutch book iff 40 39 Their incoherence measure is defined in terms of upper and lower previsions and it uses random variables instead of propositions. The version I present here is somewhat simplified, because I use indicator functions instead of random variables, and I take credences to determine both the buying and selling price of a bet. That means that the measure I discuss is strictly speaking a special case of their more general measure. My criticisms of the measure are independent of these simplifying assumptions. 40 The function “sup” picks out the least upper bound of a set. In this context, it selects the highest value from the combined payoffs in all worlds in S. Thus, if the highest possible payoff is still negative, the agent can be Dutch- booked. 87 sup s∈S α i i=1 n ∑ (I A i (s)−Cr(A i ))<0 This formula tells us how to determine whether a Dutch book can be made against an agent who has a given credence function in a given set of propositions. We can capture the guaranteed loss an agent faces from a collection of gambles of the form Yi = α (I(Ai) – Cr(Ai)) as follows: 41 G(Y)=−min{0,sup s∈S α i i=1 n ∑ (I A i (s)−Cr(A i ))} In order to normalize the guaranteed loss to be able to measure an agent’s degree of incoherence, we can divide the guaranteed loss by the sum of the coefficients of the individual bets. This normalization is called the “neutral/sum” normalization by S, S&K. 
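Restated in standard notation (nothing here goes beyond the definitions just given), the Dutch book condition, the guaranteed loss, and the neutral/sum rate of loss are:

\[
\sup_{s \in S} \sum_{i=1}^{n} \alpha_i \bigl( I_{A_i}(s) - Cr(A_i) \bigr) < 0
\quad \text{for some choice of } \alpha_1, \dots, \alpha_n,
\]
\[
G(Y) = -\min \Bigl\{ 0,\ \sup_{s \in S} \sum_{i=1}^{n} \alpha_i \bigl( I_{A_i}(s) - Cr(A_i) \bigr) \Bigr\},
\qquad
H(Y) = \frac{G(Y)}{\sum_{i=1}^{n} \alpha_i}.
\]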
They also single out and discuss two other possible normalizations from the class of potential normalizations their theorems apply to, but since they also suffer from the swamping problem, and have some additional disadvantages, we may restrict our discussion to what I consider the most plausible normalization of those three. 42 We can thus compute the rate of loss H(Y):

$$H(Y) = \frac{G(Y)}{\sum_{i=1}^{n} |\alpha_i|}$$

The degree of incoherence can be determined for a set of propositions and a credence function over these propositions by choosing the coefficients α1,..., αn in such a way that H(Y) is maximized. To maximize H(Y), it may be necessary not to include certain propositions in the Dutch book, which can be achieved by setting the relevant coefficients αi to 0.

We can illustrate how the measure works with an example. Suppose an agent has credences in two propositions, q and ∼q. Her credence assignment f is incoherent, since she assigns f(q) = 0.5 and f(∼q) = 0.6. In order to measure her rate of incoherence, we will first set up two bets with her, one for each proposition, and sum them in order to determine their combined payoff:

$$Y = \alpha_1 \bigl(I_q(s) - 0.5\bigr) + \alpha_2 \bigl(I_{\neg q}(s) - 0.6\bigr)$$

Since we can either be in a world where q is true or in a world where q is false, we can get two values for Y:

If q, then Y = 0.5α1 − 0.6α2
If ∼q, then Y = 0.4α2 − 0.5α1

Thus, we can calculate G(Y) as follows, where α1, α2 > 0: 43 If α2 ≥ 1.25α1 or α1 ≥ 1.2α2, then the second term in the braces in the G(Y) equation 44 is non-negative, which means that G(Y) = 0; otherwise,

$$G(Y) = -\sup_{s \in S}\bigl\{\alpha_1 \bigl(I_q(s) - 0.5\bigr) + \alpha_2 \bigl(I_{\neg q}(s) - 0.6\bigr)\bigr\}$$

Thus, when a Dutch book can be made (i.e., when G(Y) > 0), we can measure the rate of incoherence by choosing the coefficients in such a way that H(Y) is maximized:

$$H(Y) = \frac{-\sup_{s \in S}\bigl\{\alpha_1 \bigl(I_q(s) - 0.5\bigr) + \alpha_2 \bigl(I_{\neg q}(s) - 0.6\bigr)\bigr\}}{\alpha_1 + \alpha_2}$$

The rate of incoherence is maximized in this example if we choose α1 = α2, which results in a rate of incoherence of 0.05.

41 The “min” function is used here to select the smallest of the numbers in a set. It ensures that if no Dutch book can be made against an agent, the guaranteed loss she faces is 0. If a Dutch book can be made, the “min” function selects it, and the negative sign in front guarantees that we end up with a positive number that indicates the agent’s guaranteed loss.
42 As Seidenfeld, Schervish and Kadane point out, there are different normalizations one might choose. They especially highlight three choices of normalizations, which they call the “bookie’s escrow”, the “gambler’s escrow” and the “neutral/sum” normalization. However, the “bookie’s escrow” and the “gambler’s escrow” normalizations have certain problematic features in dealing with bets on contradictions or tautologies, which is why the “neutral/sum” normalization is better suited for the task at hand.
43 If we set α1, α2 < 0, the combined payoff would be guaranteed to be positive.

I will now move on to the problems with the measure discussed in the main text. The first counterexample involved a comparison of the following two credence functions:

f(p) = 0.6    g(p) = 0.6
f(~p) = 0.6   g(~p) = 0.6
f(q) = 0.5    g(q) = 0.6
f(~q) = 0.5   g(~q) = 0.6

We noted that intuitively, g is overall more incoherent than f. However, this is not the result we get from S, S&K’s measure. According to their measure, the agent would be equally incoherent in both cases.
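This behavior is easy to reproduce numerically. The sketch below is my own brute-force approximation of the neutral/sum rate of incoherence, using a grid search over the coefficients and exploiting the fact that H(Y) is unchanged when all coefficients are rescaled; it is not S, S&K’s formulation, and the function names and grid resolution are illustrative assumptions only.

```python
import itertools

def g_loss(alphas, credences, indicators):
    """Guaranteed loss G(Y) from the gambles alpha_i * (I_Ai(s) - Cr(Ai))."""
    sup = max(sum(a * (ind[w] - c) for a, c, ind in zip(alphas, credences, indicators))
              for w in range(len(indicators[0])))
    return max(0.0, -sup)

def rate_of_incoherence(credences, indicators, steps=10):
    """Neutral/sum rate: maximize G(Y) / sum_i |alpha_i| over the coefficients."""
    grid = [i / steps for i in range(-steps, steps + 1)]
    best = 0.0
    for alphas in itertools.product(grid, repeat=len(credences)):
        denom = sum(abs(a) for a in alphas)
        if denom > 0:
            best = max(best, g_loss(alphas, credences, indicators) / denom)
    return best

# Cr(q) = 0.5, Cr(~q) = 0.6 over the worlds {q, ~q}: rate 0.05, as derived above.
print(round(rate_of_incoherence([0.5, 0.6], [[1, 0], [0, 1]]), 3))

# The credence functions f and g just introduced, over the worlds (p, q) in {TT, TF, FT, FF}:
ind4 = [[1, 1, 0, 0],   # p
        [0, 0, 1, 1],   # ~p
        [1, 0, 1, 0],   # q
        [0, 1, 0, 1]]   # ~q
print(round(rate_of_incoherence([0.6, 0.6, 0.5, 0.5], ind4), 3))   # f: 0.1
print(round(rate_of_incoherence([0.6, 0.6, 0.6, 0.6], ind4), 3))   # g: also 0.1 (the swamping problem)
```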
Here’s how that result comes about. First, consider the case in which the agent adopts f. The formula to calculate the degree of incoherence is the following (as the reader can easily verify, including bets on q and ~q couldn’t possibly lead to a higher guaranteed loss, so they can and should be left out):

$$H(Y) = \frac{-\sup_{s \in S}\bigl\{\alpha_1 \bigl(I_p(s) - 0.6\bigr) + \alpha_2 \bigl(I_{\neg p}(s) - 0.6\bigr)\bigr\}}{\alpha_1 + \alpha_2}$$

As before, the agent’s rate of loss is maximized when we set α1 = α2 > 0. We can thus simplify the calculation as follows:

$$H(Y) = \frac{0.2\alpha_1}{2\alpha_1} = 0.1$$

Thus, if the agent adopts f as her credence function, her rate of loss is 0.1. Let us now compare this to what happens if the agent adopts g. If her credence function is g, we can calculate the rate of loss as follows: 44

$$H(Y) = \frac{-\sup_{s \in S}\bigl\{\alpha_1 \bigl(I_p(s) - 0.6\bigr) + \alpha_2 \bigl(I_{\neg p}(s) - 0.6\bigr) + \alpha_3 \bigl(I_q(s) - 0.6\bigr) + \alpha_4 \bigl(I_{\neg q}(s) - 0.6\bigr)\bigr\}}{\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4}$$

44 Recall that this is the relevant equation: $G(Y) = -\min\{0,\ \sup_{s \in S} \sum_{i=1}^{n} \alpha_i (I_{A_i}(s) - Cr(A_i))\}$

If we try to find values for α1–α4 that maximize H(Y), we get an interesting result. There is no way of picking values for α1–α4 that lead to a higher rate of incoherence than 0.1. Rather, we get exactly the same rate of incoherence we had before as long as we choose α1 = α2 and α3 = α4, and we choose α1 > 0 and/or α3 > 0. However, this result is in tension with the intuition that an agent who adopts g is more incoherent than an agent who adopts f.

We can see more easily how this result arises if we strip our example down to the essential parts. In the calculation of the rate of loss of g, we are essentially combining two normalized Dutch books of the same kind into one, by adding the numerators and adding the denominators of each one. The two normalized Dutch books are the same as the one Dutch book we made against the agent who adopts f. Thus, to make it simple, the calculation for the rate of loss of the agent who adopts g goes as follows:

$$H(Y) = \frac{0.2\alpha_1 + 0.2\alpha_3}{2\alpha_1 + 2\alpha_3}$$

If two fractions are combined in such a way that the numerators and the denominators are added, the value of the resulting fraction is always in between or equal to the two original fractions. Thus, if we combine two normalized Dutch books with the same rate of loss in S, S&K’s measure, the resulting rate of loss is the same as before. 45

Moreover, it can even be beneficial in S, S&K’s measure not to make certain Dutch books at all in order to maximize the rate of loss. Remember that we are allowed to choose αi = 0 if necessary to maximize the rate of loss. In a case in which there are two Dutch books that can be made against an agent, but one of them leads to a greater rate of loss on its own, the total rate of loss can be maximized by setting the relevant coefficients to 0. For example, suppose an agent has the credence function f, defined as follows: f(p) = 0.6, f(~p) = 0.5, f(T) = 0.9, where T is a tautology. An agent who adopts f can be Dutch booked in two ways: on her incoherent credences in the partition {p, ~p}, and on her credence of less than 1 in the tautology T. In this case, the rate of loss comes down to:

$$H(Y) = \frac{0.1\alpha_1 + 0.1\alpha_3}{2\alpha_1 + \alpha_3}$$

45 Here’s a proof of the result (thanks to Kenny Easwaran for pointing this out to me): We want to prove that if we combine two fractions by adding their respective numerators and denominators, the resulting fraction is going to lie in between the two original fractions. Suppose that a/b < c/d, with a, b, c, d being positive. Then it is the case that ad < bc.
First, we show that the combined fraction (a+c)/(b+d) is greater than a/b:

a/b = (ab + ad)/(b(b+d))
(a+c)/(b+d) = (ab + cb)/(b(b+d))

If you compare the two terms on the right side of each equation, you notice that they are the same except for the right summand in the numerator. And since ad < bc, we can conclude that a/b < (a+c)/(b+d). Then we show that the combined fraction is smaller than c/d:

c/d = (cb + cd)/(d(b+d))
(a+c)/(b+d) = (ad + cd)/(d(b+d))

If you compare the two terms on the right side of each equation, you notice that they are the same except for the left summand in the numerator. And since ad < bc, we can conclude that c/d > (a+c)/(b+d).

The rate of loss in this case reaches its maximum value of 0.1 if we set α1 = 0. This amounts to only Dutch booking the agent on her credence in the tautology, but refraining from Dutch booking her on her incoherent credences in the partition {p, ~p}. This feature of the measure’s normalization is the source of the swamping problem. Since only the worst Dutch book determines the agent’s degree of incoherence, incoherencies in other parts of the credence function get swamped and are not reflected in the agent’s degree of incoherence. This also gives rise to the second counterexample, in which the measure orders two credence functions in a way that seems intuitively exactly backwards, and it makes the measure unsuitable to evaluate reasoning processes.

Chapter Five: Should I Pretend I’m Perfect?

Introduction

Ideal agents are often used in philosophy to illustrate ideal norms – norms which humans can never fully comply with because of their limitations and imperfections. We can think of ideal agents as role models, in a certain sense, whose perfection we try to approximate. However, which form this striving should take is a substantive question. On a very simple and intuitive view, non-ideal agents should take their ideal counterparts as guides, and reason and act according to the same rules as ideal agents. Yet it is a familiar observation from moral philosophy that this is not always a good idea with respect to practical norms. If non-ideal agents try to follow the same rules as their ideal counterparts, the results can be disastrous. Subjective Bayesianism, an important strand of contemporary epistemology concerned with the rationality of degrees of belief, draws heavily on the idea of ideal norms and ideal agents. Therefore it might seem very natural to expect a fair amount of discussion among Bayesians about the question of how ideal norms bear on what non-ideal agents should do. However, this question has in fact received little attention. My goal in this paper is to take the first step in tackling this question by pursuing what I hope will be a particularly illuminating special case: I will ask whether the simple view, according to which non-ideal agents can rationally follow the same rules of reasoning as ideal agents, is true for a particular type of reasoning with degrees of belief that involves augmenting one’s credence function by assigning a new credence to a proposition that one had not previously considered. If the rules of reasoning that can be employed by ideal agents turn out to produce bad outcomes when followed by non-ideal agents, then we can conclude that the simple view is false, just as in the practical case.
However, if reasoning according to the same rules as ideal agents turned out to be rationally advisable for non-ideal agents, then this would be an interesting disparity between the theoretical and the practical realms that would justify the absence of Bayesian worry about this problem. I am going to show that the simple view is not true in epistemology, and argue that we need a more sophisticated picture of the way in which ideal norms inform what non-ideal agents should do. In the first section of the paper, I will explain what it means for an ideal agent to reason by Bayesian rules, and I pose the question of how these ideal norms of reasoning inform how non- 92 ideal agents should reason. In the second section of the paper, I will point out that a similar question has already been discussed in the practical realm, resulting in the insight that non-ideal agents should not follow the same rules of practical reasoning that are suitable for ideal agents. In the third section, I explain how non-ideal agents can reason with degrees of belief according to the same rules as ideal agents. In the fourth section, I will lay out a method for evaluating the results of different ways of reasoning for non-ideal agents. This method will use Dutch books to measure the degree of incoherence of an agent’s credence function. In the fifth section, I put this Dutch book measure to use in evaluating alternative reasoning strategies available to the non- ideal agent. I show that, just as there are ideal practical rules such that, if they were followed by non-ideal agents, disastrous results would sometimes be guaranteed, we can find a similar phenomenon in the epistemic case. 1. Reasoning with Degrees of Belief for Ideal Agents Formal epistemology is the subfield of epistemology that is concerned with the question of what the norms on rational degrees of belief are. A common answer to this question is that an ideally rational agent’s degrees of belief must obey the probability axioms. This view is known as subjective Bayesianism. To set up the probability axioms, we begin with a set of atomic statements {Ai}, and we combine it with the standard sentential logical operators to define a language L. We also assume that the relation of logical entailment is defined in the classical way. A probability function P on L must satisfy the following axioms: Normalization: For any statement A, if A is a tautology, P (A) = 1 Non-Negativity: For any statement A, P (A) ≥ 0 Finite Additivity: For any two mutually exclusive statements A, B, P (A ∨ B) = P (A) + P (B) Any rational credences had by an agent must obey these axioms. For the purposes of this discussion, I won’t assume that it is a necessary condition of rationality that an agent must have a well-defined credence in every proposition of L. Her credence function may have gaps, i.e. there may be propositions that the agent can entertain, but hasn’t (yet) assigned a credence to. For an 93 agent’s credences to count as fully rational it is sufficient that her existing credences can be extended to a coherent credence assignment on all propositions of L. 46 Usually, the probability axioms are taken to provide us with synchronic norms on degrees of belief: they prescribe what it takes to have a coherent credence function at any given time. But we may also ask: what rules should we follow in forming new credences? 
While the axioms cannot themselves be rules of reasoning – for example, they do not prescribe how to revise a set of incoherent credences to render it coherent – we will see that they provide constraints on what good reasoning strategies should look like. 47 One way in which an agent can form a new credence is by assigning a credence to a proposition in which she didn’t antecedently have a well-defined credence. I will call this kind of reasoning augmentative reasoning. In augmentative reasoning, as I understand it, the agent holds her existing credences fixed, and forms a new credence on the basis of her existing credences. Of course there are also other ways of reasoning with degrees of belief, for example types of reasoning that involve revising one’s existing credences. Updating one’s credences in response to learning new information falls under this type. However, my focus in this paper will be exclusively on augmentative reasoning. Let’s consider an example of augmentative reasoning, which will help us understand how the probability axioms can provide guidance in how an ideal agent should fill a gap in her credence function. Suppose that Ideal Ingrid has the following credences about the next presidential election: Cr (Smith will be the next president) = 0.1 Cr (Jones will be the next president) = 0.5 Cr (Murphy will be the next president) = 0.4 46 If two expressions of the language L are logically equivalent, but syntactically different, such as ‘A&B’ and ‘B&A’, I will count them as the same proposition for the purposes of this discussion. 47 Gilbert Harman (1986) forcefully makes this point about the rules of deductive logic. I show in Chapter 2 “Is Subjective Bayesianism aTheory of Reasoning?” that it equally applies to the axioms of probability, and I discuss the relationship between a theory of reasoning and a logical system. There are also other legitimate concerns about the possibility of reasoning with degrees of belief, but I unfortunately don’t have the space to address them here. For a defense of reasoning with credences, see Chapter 1 “Can There Be Reasoning with Degrees of Belief?” For a defense of a graded approach to evaluating principles of reasoning, see Chapter 3 “Formulating Principles of Reasoning”. 94 Ingrid currently lacks a credence in the proposition (Smith or Jones will be the next president). What credence should she assign to this disjunction, given her existing credences? If her existing credences remain unchanged, the only credence that is probabilistically coherent with her existing credences is Cr (Smith or Jones will be the next president) = 0.6. But of course, the probability axioms by themselves don’t prescribe what Ingrid’s initial credences have to be. The axioms only provide norms concerning the relationship between her credences in the atomic propositions and in the disjunction. Yet, if the agent has good reasons to stick with her existing credences, then the probability axioms tell us which new credences are coherent augmentations of the agent’s credence function. Plausibly, there are many cases in which it makes sense not to revisit one’s existing credences, even in cases in which one cannot be sure that they are precisely the credences one ought to have. Often, reassessing one’s existing credences would be complicated and time- consuming, and one doesn’t always have the time and cognitive resources to do so, especially when one’s credences are based on a large and diverse body of evidence. 
For example, suppose Ideal Ingrid thinks that Smith would be a terrible president, and so would Jones, terrible enough that she wouldn’t want to live in the United States anymore if they were elected. One day, she gets a call from her real estate agent, who tells her that she can buy an incredibly cheap house in Germany, but she has to decide immediately. In order to decide whether she should purchase the house, Ingrid must quickly figure out how confident she is that either Smith or Jones will be president, since she would only decide to buy the house if her confidence in this disjunction were high. This is a situation in which it doesn’t make sense for Ingrid to take the time to reexamine whether her existing credences are ideally proportioned to her evidence. The house would be no longer available if she did. Rather, it makes sense for her to hold her existing credences fixed, and form a credence in the relevant disjunction on the basis of them. We can capture this way of reasoning, where the agent looks towards some or all of her existing credences to form a new credence on their basis, in the following general rule of reasoning: 95 Augmentative Inference Rule (AIR): Suppose it makes sense to hold your existing credences fixed, and you are considering what credence to assign to some gap proposition p. For any subset of your credences S, and any proposition p, if, given your credences in S, x is the only credence you can have in p, that is consistent with the probability axioms, then assign Cr(p) = x. Notice that if an ideal agent like Ingrid augments her credence function by applying AIR, it doesn’t matter which relevant subset of her credence function she bases her new credence on. For any agent with coherent credences, any subset of her credence function that prescribes a precise credence in the new proposition p will prescribe the same value for p. Hence, if Ingrid also had a credence in the proposition (Neither Smith nor Jones will be the next president), her credence in this proposition would have to be 0.4 in order to be coherent with her existing credences. And if Cr (Neither Smith nor Jones will be president) = 0.4, then, by applying AIR, Ingrid would again arrive at the result that she should assign Cr (Smith or Jones will be president) = 0.6. We have seen that AIR is an inference rule that ideal agents can use to fill in gaps in their credence function. It generates unique recommendations for new credence assignments, and it keeps the agents’ credences perfectly coherent. The rule relies on the probability axioms, which shows that these axioms not only provide norms of coherence, but also constrain how ideal agents may reason. Yet, while we have now seen how ideal agents can augment their credence function, we don’t know in which way the Bayesian norms apply to non-ideal agents like us. Can we simply follow the same rules that work for our ideal counterparts? It seems natural to think that this question has already been investigated by epistemologists. Yet, surprisingly, it turns out that it hasn’t. Instead, people have focused on a different issue. Various authors have pointed out that the ideal norms of subjective Bayesianism might be too demanding to be fully complied with by human agents (e.g. Hacking, 1967; Harman, 1986; Christensen, 2004). However, a closer look at their discussions reveals that they are not concerned with the same question we are interested in. 
The limitations of real agents are mostly brought up by these authors with the intention of criticizing a particular set of norms for being impossibly demanding. But here we will be asking whether the ideal epistemic norms face a different problem: if a non-ideal agent can follow an 96 ideal rule of reasoning in a particular situation, is it ever inadvisable for her to do so? In the next section, I will turn to the practical analogue of this question, and point out that it has received a fair bit of attention in ethics. 2. Imitating Ideal Agents in the Practical Domain There are important similarities between epistemology (especially formal epistemology) and ethics (broadly construed so as to include the theory of practical reason). Both are normative disciplines, aiming to formulate the norms governing their respective domains. Ethics is largely concerned with norms governing intentions and actions, whereas epistemology deals with norms governing cognitive states such as beliefs and levels of confidence. We should not be surprised, therefore, if similar problems arise in the two disciplines. In ethics, we can ask the same question we asked about theoretical reasoning in the last section: Should a non-ideal agent follow the same rules as an ideal agent in a situation in which she can do so? If the answer to this question were positive, then the following thesis would be true: Strong Imitation Thesis (Practical Version): For any rule of practical reasoning R, if R is a foolproof rule to follow for ideal agents, then R is a foolproof rule to follow for non-ideal agents. By a ‘foolproof rule’ I mean a rule such that any application of this rule would be rational. While this thesis might initially seem plausible, it has been discovered in a variety of different contexts that it is false. One way of understanding foolproof practical rules for ideal agents is that they are the rules that ideal agents would follow in an ideal world – a world in which everyone is perfectly moral, or a “kingdom of ends.” And some of the main ethical theories can be seen as telling us to follow, in the actual world, the rules that would be followed in such an ideal world: Kant’s categorical imperative tells us to act on maxims we could will that everyone follow, and some versions of rule consequentialism tell us to follow the rules which would have the best consequences if everyone followed them. A well-known problem for such theories is that many rules that are ideal in this sense are such that if they were followed by a given agent in the actual world, where not all agents are ideal and follow this rule, they would lead to disastrous 97 consequences. For example, if there were an ideal rule requiring pacifism, someone following it in the actual world might thereby fail to stop a horrendous atrocity, because doing so would require a very minor act of violence. 48 Another way of understanding ideal moral rules is as the rules that would be followed by ideal agents in the actual world (a world where not all agents are ideal). An example of such an ideal rule is possibilism. Possibilism: It is, as of time t, permissible for an agent S to perform some action A iff at least one of the optimal maximal sets of actions that is, as of t, possible for S involves S performing A. 49 Possibilism guarantees ideal results if agents obey it perfectly at all times. However, it can have very bad consequences if it is followed at a given time by an imperfect agent who does not follow it perfectly at other times. 
This is shown by the well-known example of Professor Procrastinate (Jackson & Pargetter, 1986). Procrastinate is asked to referee a journal article that he is exceptionally well suited to judge. And while he could write the referee report, he knows that if he were to accept the invitation he’d simply procrastinate and never end up writing the report. Since the only optimal set of actions involves accepting the invitation and writing the report, possibilism implies that the only permissible action now available to Procrastinate is accepting the invitation. And yet, given his imperfections, his acting in this way would result in the worst possible outcome, namely the report never being written. We can also find an earlier example of a similar kind in a paper by Gary Watson (1975, p. 210). In the context of a discussion of free agency, Watson gives the example of an angry squash player who has just lost a match. The ideal course of action would be to walk over to her opponent, and shake her hand. Hence, Possibilism prescribes that the squash player should walk over to her opponent, because doing so is part of the ideal course of action. However, if the angry squash player walked over, she would succumb to the desire to smash her opponent in the face with her squash racquet. Hence, instead of leading to the best possible action, following the same rule as her ideal counterpart to determine what to do would lead to the worst possible action. In 48 See, for example, Sidgwick (1884), p. 485; Parfit (2011), p. 312. 49 See Portmore (2011), p. 209. For other discussions of possibilism, see for example Feldman (1986), and Zimmerman (1996). 98 this situation, it would be better for the agent to do something entirely different, namely just walk away. Both the case of Professor Procrastinate and the case of the angry squash player are cases in which the agent can follow a rule that is foolproof for ideal agents, namely Possibilism. Yet, both agents fail to achieve an optimal result, because they don’t always follow this rule; either because of imperfections of his future self (in Procrastinate’s case), or because of her current imperfections (in the squash player’s case). While it is a matter of controversy whether Possibilism can be defended in light of these examples, it is worth pointing out that the same kinds of examples also pose problems for a common way of formulating the demands of virtue ethics: “Do what the virtuous person would do.” Just like the possibilist rule, this rule is unproblematic if followed perfectly at all times, but gives bad results if followed imperfectly. The virtuous person would accept the invitation to referee, and also walk over to her opponent after the squash match, and the imperfect agent can do these things as well. However, following the rule in these instances sets the imperfect agent up for disaster, since she won’t do what the virtuous person would do next, as we just saw in the two examples (see e.g. Brown, 2011). All these cases illustrate that the Strong Imitation Thesis in its practical version is false. There are various examples of rules that are unproblematic when followed by ideal agents, but that are suboptimal when followed by non-ideal agents. Given that these problems with the applicability of ideal rules to non-ideal agents have been discovered in several different areas of moral philosophy, one might expect to find a similar discussion in areas of epistemology that are concerned with ideal norms, like subjective Bayesianism. 
However, as I pointed out in the previous section, Bayesians have not considered a theoretical analogue of the Strong Imitation Thesis. 3. Reasoning with Degrees of Belief for Non-Ideal Agents Now that we’ve seen that the Strong Imitation Thesis is false in the practical domain, we’re ready to return to theoretical reasoning, and we’ll consider the thesis’ theoretical analogue: Strong Imitation Thesis (Theoretical Version): For any rule of theoretical reasoning R, if R is a foolproof rule to follow for ideal agents, then R is a foolproof rule to follow for non-ideal agents. 99 As before, I take a foolproof rule to be a rule such that any application of it is rational. We will test the Strong Imitation Thesis in its theoretical version by considering cases in which non-ideal agents engage in augmentative reasoning. In the first section of the paper, I showed that ideal agents can use the rule AIR to find new credences in gap propositions, so we’ll investigate whether it is advisable for non-ideal agents to reason according to AIR. Remember that AIR is the following rule: Augmentative Inference Rule (AIR): Suppose it makes sense to hold your existing credences fixed, and you are considering what credence to assign to some gap proposition p. For any subset of your credences S, and any proposition p, if, given your credences in S, x is the only credence you can have in p, that is consistent with the probability axioms, then assign Cr(p) = x. AIR is a foolproof rule for ideal agents for finding a new credence in a gap proposition, because the rule produces a unique recommendation for the new credence, and it keeps the agent’s credences perfectly coherent. Let’s call any credence that is formed via an application of AIR an ideal-method-based credence, or IMB credence. We can then combine the Strong Imitation Thesis and AIR to generate the following thesis: Strong IMB Thesis: For any non-ideal agent N, and any proposition p, if a credence of x in p would be an IMB credence for N, then it would be rational for N to form a credence of x in p. In order to see whether the Strong IMB Thesis is true, we need to examine what happens if AIR is used by a non-ideal agent, who has incoherent credences, to find a credence assignment for a gap proposition. We will start with a simple example. Let A stand for the proposition that Jones will be the next president, and let B stand for the proposition that Jones is honest. Suppose some 100 agent, Sally, has the following credence function, which lacks a well-defined credence in the proposition that Jones won’t be the next president. Cr(A&B) = 0.2 Cr(A&~B) = 0.2 Cr(A) = 0.5 Sally’s credence function is incoherent, since her credences in (A&B) and (A&~B) don’t sum to the same number as her credence in A, even though A is equivalent to (A&B) ∨ (A&~B), and the probability axioms require that equivalent propositions must be assigned the same credence. Now, suppose Sally won’t revise her existing credences before assigning a new credence to ~A. As a result, there is more than one way to form a credence in ~A by applying AIR. Sally can do so by making her new credence cohere with her old credence in A, or by making it cohere with her old credences in (A&B) and (A&~B). The former leads to a credence of 0.5, and the latter to a credence of 0.6 for ~A. Both of these assignments are IMB credences according to our definition. So, if the Strong IMB thesis were true, then choosing either one of them would be rational. 
This example illustrates an important difference between ideal and non-ideal agents who are engaged in augmentative reasoning. While AIR always yields unique recommendations for ideal agents, there will often be more than one way for a non-ideal agent to apply AIR, because there will be different subsets of her credence function that recommend different credences for the new proposition. In the following, we will have to find out if all the ways in which a non-ideal agent can follow AIR are in fact rational, as is claimed by the Strong IMB Thesis. One might suggest at this point that instead of directly trying to find a credence assignment for a new proposition, non-ideal agents should instead follow a two-step strategy when they augment their credence function. In the first step, the agent revises some or all of her credences and makes them coherent, so that there is then in the second step a straightforward way of finding the correct credence for the new proposition by applying AIR. However, it is far from clear that this is always a better strategy than directly augmenting an incoherent credence function. Surely, the incoherent agent should not just become coherent in some arbitrary way. Rather, if the agent were to revise her credences, she should consider which revisions would best fit her evidence. However, we saw earlier in the case of Ingrid that one’s credences are often based on a large and diverse body of evidence, and reevaluating one’s credences can be time- consuming and cognitively demanding. Hence, there can be situations in which it makes sense for 101 an incoherent agent to stick with her existing credences, because she doesn’t have the time or resources to make non-arbitrary revisions. It would be very limiting if an agent’s inability to revise her existing credences would preclude her from engaging in any reasoning at all. Hence, I think it is a worthwhile question to ask how such an agent can rationally augment her existing credence function, even if she can’t become coherent first. And as we will see later, there are many cases in which it is in fact okay to apply AIR without revising one’s credences first. 4. Measuring Incoherence 4.1 A Desideratum for Evaluating Rules of Reasoning In the previous section, we formulated the Strong IMB Thesis, which claims that the rule AIR, which ideal agents can use to generate IMB credences for gap propositions, can also be used by non-ideal agents, because the IMB credences it generates are rational for non-ideal agents to adopt. However, it is far from obvious that the Strong IMB Thesis is true. Recall that, in the practical domain, it is often not advisable for non-ideal agents to follow rules that are foolproof for ideal agents. In order to evaluate and compare different credence assignments that result from augmentative reasoning, we need a tool that helps us measure the epistemic goodness and badness of different answers. Since the non-ideal agents we’ve been considering differ from ideal agents insofar as they have incoherent credence functions, a natural way to evaluate the outcomes of different reasoning processes is by checking how they affect the degree to which an agent’s credences are incoherent. In evaluating methods for forming and revising our credences, it seems that a plausible desideratum for such methods is that using them should minimize increases in incoherence. 
If an ideal agent begins with perfectly coherent credences, and augments her credences in light of the constraints imposed by the probability axioms, then her credences will remain perfectly coherent, and so her level of coherence will be preserved. It might not be reasonable to expect reasoning methods to enable agents with initially incoherent credence functions to achieve perfectly coherent credence functions. But we should expect such methods to at least minimize any increases in incoherence. 50 50 I am not advocating the view that this is the only relevant desideratum in evaluating the outcomes of various reasoning strategies. Other relevant considerations might include, for example, how well the resulting credences are supported by the agent’s evidence. However, I claim that, all other things being equal, we can evaluate reasoning strategies by checking whether they minimize increases in incoherence. 102 In order to test whether a reasoning method satisfies this condition, we need some way of measuring degrees of incoherence. The Bayesian rules by themselves won’t suffice: they allow us to determine whether an agent’s credences are incoherent, not how incoherent they are. Fortunately, in other work, I have developed and defended a measure of probabilistic incoherence in detail. 51 In the next section, I will give a brief motivation for the measure and explain how it works. 4.2 The Maximum Dutch Book Measure of Degrees of Incoherence In some simple cases, it is obvious how two credence functions compare with respect to their degrees of incoherence. If we imagine that Sally and Polly have credences in only two propositions, R and ~R, and if Sally’s credences in these propositions are 0.5 and 0.49, respectively, whereas Polly’s credences in these propositions are 0.5 and 0.4, respectively, then it seems clear that Polly’s credences are more incoherent than Sally’s. But once we consider agents with larger credence functions, we can’t simply see intuitively which agent is more incoherent. What we need is a measure of degrees of incoherence that we can apply to any incoherent credence function, and that will provide us with the rankings that we need. One way in which philosophers have argued that rational credences should be probabilistically coherent is via Dutch book arguments. Dutch book arguments are taken to dramatize incoherence, by showing that incoherent credences can be exploited by setting up betting scenarios that lead to a guaranteed loss for the incoherent agent. Coherent credences, by contrast, don’t lead to any such guaranteed loss. Since Dutch book arguments thus draw a connection between incoherence and a quantifiable outcome, namely a guaranteed betting loss, they are a natural place to look for a way of measuring incoherence. More specifically, Dutch book arguments show that an agent whose credence function violates the probability axioms is vulnerable to a guaranteed betting loss from a set of bets that are individually sanctioned as fair by her credences. The argument rests on the assumption that if one’s degree of belief in p is x, then one should consider it fair to buy or sell a bet for $xY that pays out $Y if p is true, and nothing if p is false. With this information, we can determine what the net payout is in each world. For example, if an agent has a credence of 0.2 in some proposition p, then she should consider it fair to buy a bet for $0.20 that pays her $1 if p is true 51 See Ch. 4 “Degrees of Incoherence and Dutch books.” 103 and nothing if p is false. 
Thus, in worlds in which p is true, the agent gains a net amount of $0.80, and she loses her buy-in of $0.20 in worlds where p is false. She should also consider it fair to sell this bet for $0.20, obligating her to pay out $1 if p is true and nothing if p is false. If she is the seller, she gains the selling price of $0.20 in worlds where p is false, and she must pay out $1, and hence loses $0.80 in worlds where p is true. Of course, the fact that an agent is prone to losing money is not itself an epistemic failure, but should instead be thought of as a dramatization of the problematic commitments that stem from credences that violate certain epistemic norms (see Christensen, 2004). 52 Being vulnerable to a Dutch book, consisting of bets that are individually supported as fair by one’s credences, is a sign of having incoherent credences. And since incoherence is reflected in vulnerability to monetary loss, it seems very plausible to expect differences in the degree to which credence functions are incoherent to be reflected in differences in possible Dutch book loss. The question now is how we can develop a general measure of incoherence in terms of Dutch books. In principle, there are many ways in which we could set up such a measure, but some of them would be very obviously bad. For example, we could allow that Dutch books with arbitrarily sized bets can be used to determine the degree of incoherence of a credence function. Yet, this would evidently lead to implausible results. If Sally and Polly both have a degree of belief of 0.5 in some tautology T, this proposal would allow us to make Sally sell a $1 bet on T for $0.50, and to make Polly sell a $10 bet on T for $5. Sally would be guaranteed to lose $0.50, and Polly would be guaranteed to lose $5. But obviously Polly is not ten times more incoherent than Sally, because they have the same credence in T. Hence, we need a way of normalizing the size of the bets that can be involved in Dutch books to ensure that the losses guaranteed by different Dutch books are commensurable. For a similar reason, we must also restrict how many times a bet on a proposition can be bought or sold as part of a Dutch book that is supposed to measure incoherence in terms of Dutch book loss. Suppose we limit the stakes of each bet to $1, but we allow each proposition to be bet on more than once. Then I could sell Sally one $1 bet on T for $.50, making her lose $0.50, and I could sell two of these bets to Polly, making her lose $1. But again, Polly is not twice as incoherent as Sally, even though she would lose twice as much money. Thus, an important 52 There is some debate about how exactly Dutch book arguments should be construed, but for our purposes, the details of this discussion don’t matter. A very good overview and bibliography can be found in Hájek, 2008. 104 constraint on a Dutch book measure of incoherence is that each proposition in the agent’s credence function cannot be used for a bet more than once. Lastly, a plausible Dutch book measure of the total incoherence of a credence function should not miss any ways in which the agent is incoherent. Thus, it is important that the measure considers all of the propositions in the agent’s credence function, and selects the Dutch book that exploits all of the credences the agent has in an optimal way to achieve the highest possible Dutch book loss. 
The Dutch book measure of degrees of incoherence that I defend in my paper “Degrees of Incoherence and Dutch books” is the simplest way of implementing these constraints. I propose to measure the total degree of incoherence in the following way: assume that each bet has the same stakes, say $1, and assume furthermore that each proposition in the agent’s credence function can be used for a bet at most once. Then the degree of incoherence of an agent’s credence function is the highest guaranteed loss a clever bookie can achieve by using any betting strategy that is a combination of making the agent buy and sell bets on some or all of the propositions in her credence function. Formally, we can express this measure in this way: Suppose a credence function Cr is defined over a set of propositions {A1,…, An}, W is the set of all states of nature, and I is the indicator function, which returns a value of 1 if a proposition is true in a given state of nature, and 0 if it is false. Then DOI(Cr) gives the degree of incoherence for any credence function Cr:

$$DOI(Cr) = \max_{\alpha_i \in \{-1,\,0,\,1\}} \Bigl[ -\max_{w \in W} \sum_{i=1}^{n} \alpha_i \bigl(I_{A_i}(w) - Cr(A_i)\bigr) \Bigr]$$

The key to the success of this Dutch book measure of incoherence is that, in defining the maximum Dutch book, we consider every possible betting arrangement in which the bets on all the propositions have equal stakes. Doing this ensures that the agent’s credences in all propositions are given equal weight. Moreover, choosing the Dutch book that leads to the maximum guaranteed loss guarantees that all the ways in which the agent’s credences are incoherent are captured by the measure.

We can see how the measure works by applying it to the comparison between Sally and Polly I made at the beginning of this section. Recall that both have incoherent credences in the propositions R and ~R: they both have a credence of 0.5 in R, Sally has a credence of 0.49 in ~R, and Polly has a credence of 0.4 in ~R. According to our measure, all the bets used to measure an agent’s degree of incoherence must have $1 stakes, which we can implement by supposing that Sally and Polly each have two tickets: Ticket A says: “Pay $1 to the owner of this ticket if R is true.” Ticket B says: “Pay $1 to the owner of this ticket if ~R is true.” The Dutch book measure now instructs us to find the combination of bets on these tickets that leads to the highest guaranteed loss for the incoherent agent. Given Sally’s credences, she would regard the fair prices for these two tickets to be $0.50 and $0.49, respectively. We can achieve the highest possible loss for her if we make her sell the tickets at these prices, which results in a guaranteed loss of $0.01. Given Polly’s credences, she would regard the fair prices for the tickets to be $0.50 and $0.40, respectively. Selling the tickets at these prices would result in a guaranteed loss of $0.10, which is the most we can make Polly lose from any combination of permissible bets. Thus, the intuitively discernible difference in incoherence between their credences is reflected in the degrees of incoherence that the Dutch book measure assigns to each of their credence functions.

Given that the maximum Dutch book measure captures the total incoherence of an agent’s credence function, independently of the size of the set of propositions over which it is defined, it is best suited to compare the incoherence of credence functions that are roughly equal in size. And this is exactly the kind of task we will be using it for in the next section.
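A brute-force rendering of this definition makes the Sally/Polly comparison easy to check. The Python sketch below is only an illustration under the assumptions just stated (finitely many states of nature, $1 stakes, each proposition bet on at most once); the function name `doi` is my own choice.

```python
from itertools import product

def doi(credences, indicators):
    """Maximum guaranteed loss from $1-stake bets; alpha_i = 1 (buy), -1 (sell), or 0 (no bet)."""
    num_worlds = len(indicators[0])
    best = 0.0
    for alphas in product((-1, 0, 1), repeat=len(credences)):
        # The agent's net payoff in each state of nature; a sure loss shows up as a negative maximum.
        payoffs = [sum(a * (indicators[i][w] - credences[i]) for i, a in enumerate(alphas))
                   for w in range(num_worlds)]
        best = max(best, -max(payoffs))
    return best

# Propositions R and ~R over the two states of nature {R, ~R}.
ind = [[1, 0], [0, 1]]
print(round(doi([0.5, 0.49], ind), 2))   # Sally: 0.01
print(round(doi([0.5, 0.40], ind), 2))   # Polly: 0.10
```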
Other Dutch book measures of incoherence that have been proposed lack the virtue of capturing a credence function’s total incoherence. For example, Schervish, Seidenfeld, and Kadane, who discuss a variety of different ways of setting up Dutch book measures of incoherence, have proposed a measure on which the bets are normalized by determining the ratio between the guaranteed loss and the size of the bets involved. As a result, a credence function’s degree of incoherence is measured by the “worst” Dutch book that can be made against the agent. The worst Dutch book is taken to be the betting arrangement in which the agent can be made to lose the greatest amount of money relative to the sum of the stakes of the bets involved in the Dutch book. The Dutch book in question may involve assigning bets of different sizes to different propositions, and it may involve assigning no bets whatsoever to certain propositions in which the agent has incoherent credences. In fact, in order to capture the “worst” Dutch book, which is taken to determine the degree of incoherence of a given credence function, the measure often must exclude some of the agent’s incoherent credences from the betting arrangement. Hence, it can fail to take into account some of the agent’s incoherent credences, and so it fails to provide an adequate measure of the overall incoherence of the agent’s credence function. 53

5. Should Non-Ideal Agents Imitate Ideal Agents’ Reasoning?

In this section, I will employ the measure of incoherence presented in the last section to answer the question of whether a non-ideal agent should pretend to be perfect, and employ the AIR rule to find new credences in gap propositions. Hence, we will determine whether the Strong IMB Thesis is true, which claims that any application of the rule AIR by a non-ideal agent, which results in assigning an IMB credence to a gap proposition, is rational. The answer depends on whether using the AIR rule to assign new credences to gap propositions minimizes increases in incoherence, as demanded by the desideratum I introduced in section four. If we know how incoherent the agent’s credences are before assigning a credence to some new proposition p, we can then use the Dutch book measure to check which credence assignment for p increases the agent’s degree of incoherence the least. If it turns out that all IMB credences are optimal in a given case, then the Strong IMB Thesis is true. By contrast, if it turns out that the optimal credence is not always an IMB credence, then both the Strong IMB Thesis and the Strong Imitation Thesis are false, and there is a parallel to the practical case. I will first use the simple example of Sally to demonstrate how to think about the kinds of cases I am considering, and to introduce a way of representing them graphically. I will then explain why one might think it’s a plausible hypothesis that IMB credences for gap propositions are always the optimal choice. However, I will subsequently present a counterexample to this hypothesis, and discuss what kinds of conclusions we can draw from it.

5.1 An Easy Example: The Case of Sally

We can see how augmentative reasoning via AIR works for incoherent agents by considering an example that was introduced earlier. Sally has the following incoherent credences: 53

53 The problem can be illustrated best with the aid of a toy example. Suppose there is an agent whose credences are defined over the propositions in the following set: {p, ~p, q, ~q}.
The agent can adopt one of two credence functions F and G: F(p) = 0.6, F(~p) = 0.6, F(q) = 0.5, F(~q) = 0.5, and G(p) = 0.6, G(~p) = 0.6, G(q) = 0.6, G(~q) = 0.6. It is easy to see that if the agent adopts F, she is incoherent with respect to her credences in the partition {p, ~p}, whereas if the adopts G, she is incoherent with respect to her credences in the partition {p, ~p} and the partition {q,~q}. Intuitively, if we consider the agent’s total incoherence in each case, it seems pretty obvious that adopting G would make the agent more incoherent than adopting F, because if the agent adopts G, there is an additional partition on which she has incoherent credences. However, this is not the result we get from S, S &K’s measure. According to their measure, the agent would be equally incoherent in both cases, because the worst Dutch book that can be made is the same in each case. For a more detailed discussion of S, S&K’s view, I invite the reader to consult their work, as well as my paper “Degrees of Incoherence and Dutch Books.” 107 Cr(A&B) = 0.2 Cr(A&~B) = 0.2 Cr(A) = 0.5 She wants to assign a credence to ~A, based on her existing credences. There are two subsets of her credence function that give rise to IMB credences: First, the subset of her credence function that forms the partition {A, ~A}. For her credences in this partition of propositions to be coherent, they must sum to 1, so Sally must assign ~A a credence of 0.5. Secondly, there is also another subset of her credence function that forms a partition, which is {A&B, A&~B, ~A}. To make her credence in ~A coherent with this set, Sally must assign ~A a credence of 0.6. There are no other subsets of her credence function that prescribe a precise credence for ~A, so the only IMB credences for ~A are 0.5 and 0.6. 54 Sally’s initial credences are incoherent to degree 0.1. We can see this by Dutch booking her with the following betting strategy: Cr(A&B) = 0.2 → make her sell a bet on (A&B) for $0.20 Cr(A&~B) = 0.2 → make her sell a bet on (A&~B) for $0.20 Cr(A) = 0.5 → make her buy a bet on A for $0.50 Thus, by making Sally sell the first two bets, and buy the third one, she starts out by spending a net amount of $0.10. If she happens to be in a world in which A is false, all of the bets lose, so she is stuck with the loss of $0.10. If she happens to be in a world in which A is true, she wins $1 on her bet on A, but must pay out $1 for either the first or the second bet. Thus, either way, she is stuck with a loss of $0.10. Since there’s no other available betting strategy that leads to a higher loss, her degree of incoherence is 0.1. Now, in order to see what new credence she should optimally assign to ~A, we must investigate how different credences in ~A affect what kind of a Dutch book she is vulnerable to, and how this affects her degree of incoherence. Notice that no matter what degree of belief she assigns to ~A, Sally can never become less incoherent as a result. That is because if an agent who lacks any credence in some proposition p is vulnerable to a given Dutch book, then she will continue to be vulnerable to this Dutch book upon forming a credence in p. When one forms a credence in a new proposition, the initial set of Dutch books to which one is vulnerable is always a subset of the set of Dutch books to which one will be vulnerable after forming the credence. 54 This has been confirmed via an exhaustive search of the subsets of the agent’s credence function. 108 And so the maximal Dutch book loss can never decrease. 
Nor, therefore, can one’s degree of incoherence on the measure we have proposed. Thus, the best Sally can do is assign a credence to ~A that preserves her degree of incoherence of 0.1. Let’s see what happens if Sally chooses one of the IMB credences of 0.5 or 0.6 for ~A, holding all of her other credences fixed. First, suppose she chooses Cr(~A) = 0.5, giving her the following credence function: Cr(A&B) = 0.2 Cr(A&~B) = 0.2 Cr(A) = 0.5 Cr(~A) = 0.5 In order to achieve the maximum guaranteed loss from this credence function, we cannot simply take the combination of bets on the first three propositions we used initially and add a bet on or against ~A, because this combination would not lead to a guaranteed loss in every world. Rather, in order to achieve the maximum Dutch book loss here, we can either leave ~A out and simply go with the initial Dutch book, or we can instead have the agent make the following bets: Cr(A&B) = 0.2 → make her sell this bet for $0.20 Cr(A&~B) = 0.2 → make her sell this bet for $0.20 Cr(A) = 0.5 → no bet Cr(~A) = 0.5 → make her sell this bet for $0.50 In this scenario, the agent receives $0.90 from selling all three bets. However, one of them is guaranteed to win, so she must pay out $1 no matter which world is actual, which means she will lose $0.10. This is the same guaranteed loss she faces from the initial Dutch book, before she had a credence in ~A. Hence, we have learned that choosing the IMB credence Cr(~A) = 0.5 is an optimal choice, because it allows Sally to preserve her initial degree of incoherence. There is no other available betting strategy that would lead to a higher guaranteed loss, given these credences. What if she instead chooses the other IMB credence for ~A, which is 0.6? In this case, we can again stick with the original Dutch book, or alternatively make Sally bet as follows to achieve the maximum guaranteed loss: Cr(A&B) = 0.2 → no bet Cr(A&~B) = 0.2 → no bet Cr(A) = 0.5 → make her buy this bet for $0.50 109 Cr(~A) = 0.6 → make her buy this bet for $0.60 In this case, Sally will spend $1.10 to buy both bets, but only one of them can win and pay her $1, so she will be stuck with a guaranteed loss of $0.10. This is again the same loss, and the same degree of incoherence she had initially, so choosing the IMB credence of 0.6 for ~A is also an optimal choice. What happens if Sally chooses a credence other than one of the IMB credence for ~A? In order to see this more easily, I will now introduce a way of representing betting strategies graphically, that shows us Sally’s guaranteed loss for any credence she might assign for ~A. We will consider two-dimensional graphs, where the x-axis represents the credences between 0 and 1 that the agent can assign to the new proposition, and the y-axis represents the guaranteed loss the agent faces. We will assume throughout that all of the agent’s credences are fixed except for the value of the new credence that is to be assigned. Any betting strategy, which is a specific way of selling and buying bets on some or all of the propositions in the agent’s credence function, can be represented as a line in the coordinate system. A strategy that does not include a bet on or against the new proposition p will be a horizontal line, because its guaranteed loss does not vary with the credence we assign to p. The y-value of that line represents the guaranteed loss the agent faces from that betting strategy. 
Any strategy that includes selling a bet on p is represented by a downward sloping line, because the guaranteed loss produced by that strategy decreases as the selling price of the bet on p increases. Any strategy that includes buying a bet on p is represented by an upward sloping line, because the guaranteed loss produced by that strategy increases as the buying price of p increases. I will also use vertical orientation lines that remind us where on the x- axis the IMB credences for the new propositions lie. For a credence function defined over n propositions, there will be 3 n betting strategies, and thus just as many horizontal or diagonal lines in our diagram. However, not all of the lines are interesting, since a lot of them represent betting strategies that don’t give us the highest guaranteed loss for any credence in p. Let’s look at a diagram that represents the interesting betting strategies based on Sally’s credences to get a better idea of what’s going on: 110 Every diagram contains a horizontal line that represents the betting strategy that leads to the highest Dutch book loss before a credence is assigned to the new proposition, which is ~A in Sally’s case. For Sally’s credence function, this is the thick horizontal line that lies at y = 0.1. The two thin vertical lines are the orientation lines that remind us where on the x-axis the IMB credences lie that the agent can find by applying AIR. The two diagonal lines represent betting strategies that involve bets on the new proposition p. They are of particular interest, because they show the maximum guaranteed loss Sally is vulnerable to for particular intervals her new credence in ~A might lie in. The downward sloping line corresponds to the betting strategy that involves selling bets on ~A, A&B, and A&~B, and the upward sloping line corresponds to the betting strategy that involves buying bets on ~A and A. The final aspect of the diagram that is important is the line that tracks the maximum guaranteed loss for each possible credence in the new proposition. This is the line that traces the highest y-values overall in the diagram. In our case, it is combined out of part of the downward sloping line, the horizontal line, and the upward sloping line, and it is shaped like a V with the tip cut off. It shows for each credence assignment to ~A how high the guaranteed loss is the agent faces, and which strategy must be used to achieve this loss. It also shows which credences for ~A are optimal, i.e. lead to the minimum guaranteed loss. In this case, we can see that both of the IMB credences, marked by the vertical orientation lines, are optimal, as well as any credence in between. We can see this because these are the x- values that preserve Sally’s initial degree of incoherence. Any credence for ~A that is outside this interval leads to a higher guaranteed loss. That means that, in Sally’s case, there are some non- 111 IMB credences that are optimal besides the two IMB credences, but there are also many non- IMB credences that would worsen Sally’s incoherence. We can now reformulate our initial question about the truth of the Strong IMB Thesis as a question about diagrams that illustrate betting strategies. We want to know whether the available IMB credences are always among the optimal choices for an incoherent agent assigning a new credence to a gap proposition. 
We can now reformulate our initial question about the truth of the Strong IMB Thesis as a question about diagrams that illustrate betting strategies. We want to know whether the available IMB credences are always among the optimal choices for an incoherent agent assigning a new credence to a gap proposition. In terms of our diagrams, we want to know whether the top curve that tracks the maximum guaranteed loss for all values of the new credence always has minima in the places where the IMB credences lie.

5.2 A Counterexample to the Strong Imitation Thesis

We saw that the example of Sally confirmed the prediction of the Strong IMB Thesis. All IMB credences turned out to be optimal, and it makes sense to suspect that this is generally true. The reason why this hypothesis seems initially plausible is that we know that incoherence generates Dutch book losses, whereas coherence doesn’t. If an agent assigns an IMB credence to a new proposition, this means that her new credence is coherent with at least part of her existing credence function, which means that she thereby restricts the ways in which her credences can be exploited to generate guaranteed losses. By contrast, when she assigns a non-IMB credence, she thereby makes herself potentially vulnerable to worse ways of being Dutch-booked. Yet, it turns out that there are also cases in which only one of several IMB credences is optimal. Consider the following credence function:

Cr(A&B) = 0.2
Cr(A&~B) = 0.2
Cr(~A&B) = 0.1
Cr(~A&~B) = 0.1
Cr(~A) = 0.7
Cr(A) = ?

In this case, the agent’s initial degree of incoherence is 0.5, which is revealed by the betting strategy that involves buying the bet on ~A and selling the bets on ~A&B and ~A&~B. The agent has the following options for assigning IMB credences (an exhaustive search confirms that these are the only IMB credences that can be found via AIR):

(a) {A, ~A} → Cr(A) = 0.3 makes the credences in the set coherent and is thus an IMB credence.
(b) {A, A&B, A&~B} → Cr(A) = 0.4 makes the credences in the set coherent and is thus an IMB credence.
(c) {A, ~A&B, ~A&~B} → Cr(A) = 0.8 makes the credences in the set coherent and is thus an IMB credence.

Interestingly, there is only one credence in this case that the agent can assign that will preserve her degree of incoherence, namely Cr(A) = 0.4, which is an IMB credence based on the set {A, A&B, A&~B}. If Cr(A) is higher or lower than 0.4, this can be exploited by the maximum Dutch book measure, and the guaranteed loss the agent faces increases by exactly the amount by which Cr(A) diverges from 0.4. What this means is that choosing one of the other two IMB credences leads to very bad results, results that are in fact worse than for many non-IMB credences. If the agent assigns Cr(A) = 0.3, her degree of incoherence increases to 0.6, and if she assigns Cr(A) = 0.8, it even rises to 0.9. The diagram that shows this result looks as follows.

[Diagram not reproduced in this transcript.]

As before, the horizontal line represents the initial Dutch book that reveals the agent’s degree of incoherence before assigning a credence to A. The vertical orientation lines mark where the IMB credences lie. The downward sloping line represents the betting strategy that involves selling bets on A, ~A&B, and ~A&~B, and buying bets on A&B, A&~B, and ~A, and the upward sloping line represents the betting strategy that involves selling bets on A&B, A&~B, ~A&B, and ~A&~B, and buying bets on A and ~A. There is no other betting strategy we could include in the diagram that would lead to a higher guaranteed loss for any given credence assignment to A. We can see here that the top curve, which traces the maximum guaranteed loss for each credence we can assign to A while holding the other credences fixed, has a minimum at 0.4. Hence, one of the IMB credences is an optimal choice, but the other two available IMB credences significantly increase the agent’s degree of incoherence.
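The algebra behind the diagram can be set out compactly. The following summary is a reconstruction from the two strategies just described, writing x for the candidate value of Cr(A) and L_↓, L_↑ for the guaranteed losses of the downward and upward sloping strategies; these symbols are my notation for the illustration. In each strategy the bought and sold bets pay off in matched pairs in every world, so the guaranteed loss is simply the agent’s net stake:

\begin{align*}
L_{\downarrow}(x) &= (0.2 + 0.2 + 0.7) - (x + 0.1 + 0.1) = 0.9 - x,\\
L_{\uparrow}(x) &= (x + 0.7) - (0.2 + 0.2 + 0.1 + 0.1) = x + 0.1,\\
L(x) &= \max\{\,0.9 - x,\ x + 0.1\,\}.
\end{align*}

The upper envelope L(x) never falls below the initial degree of incoherence of 0.5, and it reaches that value only at x = 0.4; at the other two IMB credences it equals L(0.3) = 0.6 and L(0.8) = 0.9.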
This example shows that the Strong IMB Thesis is false. It is not the case that all ways of following AIR to find an IMB credence are optimal for non-ideal agents. It follows from this that the Strong Imitation Thesis is false as well: it is not the case that any rule of reasoning that is foolproof for ideal agents is also foolproof for non-ideal agents. However, given the examples we have considered so far, it looks like at least one IMB credence is always optimal, which means it is still possible that the following weaker version of the Imitation Thesis is true.

Weak Imitation Thesis (Theoretical Version): For any rule of theoretical reasoning R, if R is a foolproof rule to follow for ideal agents, then R is a decent rule to follow for non-ideal agents.

By a ‘decent rule to follow’, I mean that there is always some way of applying the rule in a given situation that would be rational. If we combine the Weak Imitation Thesis with AIR, we get the following thesis:

Weak IMB Thesis: For any non-ideal agent N, and any proposition p, if there are any IMB credences N could form in p, then at least one of these IMB credences would be rational for N to assign to p.

Both of the examples we have considered so far confirm the prediction of the Weak IMB Thesis, so for all we know at this point, it might be true. We can furthermore learn from the two examples that it is not generally necessary for an incoherent agent to make her credences coherent before she can assign a new credence. We discussed earlier that having to do so would be a very limiting constraint on reasoning processes aimed at augmenting the agent’s existing credences. We have now seen that there are definitely cases in which the agent can augment her credences directly by assigning the correct IMB credence, without thereby becoming more incoherent. Yet, as we will see in the next section, there are also cases in which only non-IMB credences are optimal.

5.3 A Counterexample to the Weak Imitation Thesis

It unfortunately turns out that even though there seem to be many cases in which at least one IMB credence is optimal, we can find counterexamples to the Weak Imitation Thesis. Suppose an agent has the following credence function:

Cr(A) = 0.6
Cr(~A&B) = 0.4
Cr(~A&~B) = 0.4
Cr(A∨B) = 0.5
Cr(~A∨~B) = 0.7
Cr(~A) = ?

Before assigning any credence to ~A, the agent’s degree of incoherence is 0.4. This guaranteed loss can be achieved by a Dutch book that makes the agent buy bets on A, ~A&B, and ~A&~B. The agent spends a total of $1.40 on these three bets, but only one of them wins in any given world, so the agent, winning back $1, is still short of $0.40. There is no betting strategy that achieves a higher guaranteed loss that includes bets on or against A∨B and ~A∨~B. They are not mutually incoherent, and if we try to exploit the fact that the credence in A∨B is incoherent with the agent’s credence in A, and her credence in ~A∨~B is incoherent with her credences in ~A&B and ~A&~B, we cannot create a Dutch book that creates a higher guaranteed loss than $0.40.

We now need to figure out what credence assignment for ~A is optimal. The agent’s existing credences yield three IMB credences (as in the previous examples, this has been confirmed via an exhaustive search):

(a) {A, ~A} → Cr(~A) = 0.4 makes the credences in the set coherent and is thus an IMB credence.
(b) {~A&B, ~A&~B, ~A} → Cr(~A) = 0.8 makes the credences in the set coherent and is thus an IMB credence.
(c) {A∨B, ~A&B, ~A} → Cr(~A) = 0.9 makes the credences in the set coherent and is thus an IMB credence.

We can represent the agent’s initial Dutch book loss and the three IMB credences in the following diagram:

[Diagram not reproduced in this transcript.]

In order for our initial hypothesis to be true that at least one IMB credence is always optimal, it would have to be true that, once we add the strategies to the diagram that give us the maximum guaranteed loss after assigning a credence to ~A, the curve that traces the maximum guaranteed loss for each credence in ~A has a minimum at x = 0.4, x = 0.8, or x = 0.9. However, this is not the case. Once a credence has been assigned to ~A, we can use the following strategies to extract the maximum Dutch book loss from the agent:

Highest downward sloping line:
Cr(A) = 0.6 → make the agent buy this bet for $0.60
Cr(~A&B) = 0.4 → make the agent buy this bet for $0.40
Cr(~A&~B) = 0.4 → make the agent buy this bet for $0.40
Cr(A∨B) = 0.5 → make the agent sell this bet for $0.50
Cr(~A∨~B) = 0.7 → no bet
Cr(~A) = x → make the agent sell this bet for $x

OR

Cr(A∨B) = 0.5 → make the agent sell this bet for $0.50
Cr(~A&B) = 0.4 → make the agent buy this bet for $0.40
Cr(~A) = x → make the agent sell this bet for $x

Highest upward sloping line:
Cr(A) = 0.6 → make the agent buy this bet for $0.60
Cr(~A&B) = 0.4 → make the agent buy this bet for $0.40
Cr(~A&~B) = 0.4 → make the agent buy this bet for $0.40
Cr(A∨B) = 0.5 → no bet
Cr(~A∨~B) = 0.7 → make the agent sell this bet for $0.70
Cr(~A) = x → make the agent buy this bet for $x

If we add these strategies to the diagram, we end up with the following picture:

[Diagram not reproduced in this transcript.]

It is easy to see that the line that traces the maximum guaranteed loss for every value of x does not have minima in the places where the IMB credences are. The top curve has minima between x = 0.5 and x = 0.7, but the IMB credences are at 0.4, 0.8, and 0.9. Therefore, we have found a counterexample to the hypothesis that there is always at least one IMB credence that an agent can assign to a new proposition that is optimal.

The way in which the counterexample works is that the agent’s initial credence function contains some propositions that are not used in the initial maximum Dutch book, because they don’t contribute to achieving the highest guaranteed loss. However, once the agent assigns a new credence, there is now a way to augment the initial Dutch book by including the new proposition and one of the left-out propositions. In doing so, it turns out that the line that traces the maximum loss doesn’t have minima in the same places where the IMB credences lie. We have now shown that even the Weak IMB Thesis is false: it is not always the case that at least one of the IMB credences is optimal. And hence, the Weak Imitation Thesis is false as well. It is not the case that any rule of theoretical reasoning that is foolproof for ideal agents is also at least a decent rule for non-ideal agents. Notice that for this result to follow, it is not necessary for me to assume that an ideal agent would in fact follow AIR to augment her credence function. What matters is only that the ideal agent could use AIR without any problem, whereas the non-ideal agent can’t.
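For completeness, here is the algebra behind the two diagonal lines in this last diagram, again a reconstruction in my own notation, with x for the candidate value of Cr(~A). The compact version of the downward sloping strategy (sell A∨B, buy ~A&B, sell ~A) nets the agent x + 0.1 up front but leaves her paying out $1 net in every world, and the upward sloping strategy costs her x + 0.7 up front while returning only $1 net even in the worlds that are best for her:

\begin{align*}
L_{\downarrow}(x) &= 1 - (0.5 - 0.4 + x) = 0.9 - x,\\
L_{\uparrow}(x) &= (0.6 + 0.4 + 0.4 + x - 0.7) - 1 = x - 0.3,\\
L(x) &= \max\{\,0.4,\ 0.9 - x,\ x - 0.3\,\}.
\end{align*}

The top curve L(x) therefore sits at its minimum value of 0.4 exactly for x between 0.5 and 0.7, while the IMB credences yield L(0.4) = 0.5, L(0.8) = 0.5, and L(0.9) = 0.6.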
5.4 Conditions Under Which IMB Credences are Optimal

We have seen in the previous sections that even though there is not always an IMB credence that is optimal, there are many cases in which choosing an IMB credence is the best way for an agent to augment her existing credence function. One would ultimately like to be able to characterize necessary and sufficient conditions for there to be an incoherence-minimizing IMB credence, and to do so in a way that could potentially be useful to imperfect reasoners. In this section, I’ll take an important step in this direction by giving a sufficient condition for a situation in which an IMB credence is optimal.

Let’s begin by describing the simplest case in which the sufficient condition holds. Suppose the initial maximum Dutch book to which the agent is vulnerable before assigning a credence to the new proposition p contains bets on or against all propositions in the agent’s credence function, except propositions q1 – qn. Assume, moreover, that q1 – qn form a set together with p that determines an IMB credence for p in the sense defined above, i.e. a set that contains no superfluous propositions that don’t contribute to prescribing a precise value for p. If there is more than one way of setting up a maximum Dutch book against the agent, then the claims I will make in the following hold as long as there is one way of setting up the maximum Dutch book in the way just described, and all other ways of achieving the maximum loss are such that the propositions used for betting are a subset of the set of propositions used in the maximum Dutch book just described. (The result holds in these scenarios because, if there are subsets of the agent’s credence function that are coherent, then it is optional to include them in the maximum Dutch book. Hence, there might be several ways of achieving the maximum Dutch book loss, but some of them can leave out bets whose gains and losses simply cancel out.)

If the agent’s credence function meets these conditions, some particular IMB credence is the optimal credence assignment for p, because it will preserve the agent’s degree of incoherence. Here’s why: Since the initial maximum Dutch book does not contain bets on or against the propositions q1 – qn, we know that including bets on some or all of these propositions would have lowered the maximum guaranteed loss. If including them had increased the maximum Dutch book loss, then they would not have been left out. Now, since q1 – qn prescribe a precise credence for p, we know that if p is assigned this IMB credence, then no guaranteed loss can be achieved by making bets on or against the propositions in the set {p, q1 – qn}. In this scenario, the best the clever bookie can do is set up the bets in such a way that their respective gains and losses cancel each other out in every world. In such a betting setup, the p-bet has the reverse payoff of the bets based on q1 – qn. This also means that if we reverse either the sign of the p-bet, or the signs of the q1 – qn-bets (from buying to selling or vice versa, depending on how they were set up to cancel each other out), then the p-bet and the q1 – qn-bets have the same payoff structure in every world. Hence, the newly acquired IMB credence in p results in betting opportunities for the bookie that were already available based on the agent’s credences in q1 – qn. The bookie therefore does not gain any new ways of Dutch-booking the agent when she forms an IMB credence in p. Also, since the new credence in p is by assumption coherent with the agent’s old credences in q1 – qn, no new opportunity for achieving a guaranteed loss arises from the set {p, q1 – qn} either. Thus, we can conclude that, if the agent’s credences are structured in the way stipulated above, then an IMB credence is optimal for the new proposition p.

If there are multiple options for choosing IMB credences, the agent should choose the IMB credence for p that is prescribed by the propositions not included in the initial maximum Dutch book against her. As the reader can easily verify, the example discussed in section 5.2 fulfills this sufficient condition. The agent in this example has a choice between three different IMB credences, and the one that is optimal is the IMB credence that is based on the set of propositions that is not included in the original maximum Dutch book against the agent.
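A worked instance may make the cancellation point vivid. The following is a minimal sketch based on the example from section 5.2, where the initial maximum Dutch book bets only on ~A, ~A&B, and ~A&~B, and the left-out propositions A&B and A&~B together with A prescribe the IMB credence Cr(A) = 0.4. At that credence, the best the bookie can do with bets on {A, A&B, A&~B} at the agent’s prices is to make the gains and losses cancel in every world, for instance by having the agent buy the bet on A and sell the bets on A&B and A&~B:

\begin{align*}
\text{net stake:}\quad & 0.4 - 0.2 - 0.2 = 0,\\
\text{world } A\&B{:}\quad & +1 - 1 = 0, \qquad \text{world } A\&\lnot B{:}\quad +1 - 1 = 0,\\
\text{worlds in which } \lnot A{:}\quad & 0.
\end{align*}

So the new credence creates no betting opportunities beyond those already available against the old credences, and the agent’s degree of incoherence remains 0.5.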
Conclusion

I began my discussion with a comparison between practical reasoning and reasoning with degrees of belief. I pointed out the well-known fact that in moral philosophy, the Strong Imitation Thesis, according to which non-ideal agents can follow the same rules of practical reasoning as their ideal counterparts, has been found to be subject to a variety of counterexamples. I then asked whether the same is true in reasoning with degrees of belief. If an agent with incoherent credences tries to assign a new credence to a proposition, should she try to imitate what an ideal agent might do, and follow the rule AIR to find the new credence? While this idea seemed initially plausible (after all, choosing an IMB credence seems to avoid at least some ways of being Dutch booked), we found that AIR provides us with a counterexample to the general claim that rules of theoretical reasoning that are foolproof for ideal agents are also foolproof for non-ideal agents. We then investigated whether a weaker version of the thesis might be tenable, namely the thesis that rules of theoretical reasoning that are foolproof for ideal agents are at least decent rules to follow for non-ideal agents. We found that even the Weak Imitation Thesis was subject to counterexamples, because sometimes there was no way in which a non-ideal agent could find an optimal augmentation of her credence function by employing AIR.

Hence, there is a parallel to the case of practical reasoning, because in both domains, following the same rules as an ideal agent is not always optimal. This is an important result, because it shows that care needs to be taken over this point when we ask what we ought to believe in light of ideal agents. Since it is not advisable for non-ideal agents to simply reason according to the same rules as ideal agents, we need to develop a more sophisticated account of how ideal norms inform what non-ideal agents should believe. We need to be careful more generally about the scope of conclusions that are drawn by looking at ideal agents, or ideal models. These concerns apply, for example, to discussions about updating one’s credences in light of new information and revising existing credences, discussions about how to react to disagreement, and discussions of judgment aggregation. However, we also saw that there are cases in which the simple view gives us the correct result, where an agent can directly assign an IMB credence and thereby preserve her degree of incoherence.
The fact that agents can augment their existing credence functions without becoming more incoherent also shows that it is not always necessary for an agent to make her existing credences coherent before assigning a new credence. In further research, I hope to answer the question of whether there is a kind of heuristic a non-ideal agent can use in order to find out how best to assign a new credence. More specifically, it would be good to know how agents can determine whether they are in the kind of situation in which it makes sense to pretend that they are perfect. Then we can ask what an agent should do in a situation in which multiple IMB credences are available. How should the agent choose between them in order to make sure, or at least make it highly likely, that she will end up with the optimal choice of credence assignment from among the available IMB credences? Lastly, it would also be interesting to know what strategy an agent should follow to assign a new credence in a situation in which imitating an ideal agent is not optimal. I hope to have provided an attractive framework that will help us answer these questions, by proposing a well-motivated measure of the overall incoherence of a credence function, and by formulating in a precise manner what it would mean for an incoherent agent to reason according to the same rules as an ideal agent in cases of augmentative reasoning.
Abstract
In this dissertation, I lay the groundwork for developing a comprehensive theory of reasoning with degrees of belief. Reasoning, as I understand it here, is the mental activity of forming or revising one’s attitudes based on other attitudes. I argue that we need such a theory, since degrees of belief, also called credences, play an important role in human reasoning. Yet, this type of reasoning has so far been overlooked in the philosophical literature. Discussions of reasoning, understood as a mental activity of human beings, focus almost exclusively on the traditional notion of outright belief, according to which an agent can believe, disbelieve, or suspend judgment about a proposition. The philosophical literature on degrees of belief, on the other hand, acknowledges that this model of belief is too coarse grained: agents can have different levels of confidence in a proposition, and hence we should think of belief as a graded notion. Yet, the literature on degrees of belief is hardly concerned with the question of how agents should reason. The leading research paradigm is subjective Bayesianism, a theory according to which the probability axioms constitute the norms of rationality for degrees of belief. However, the norms of subjective Bayesianism should not be construed as principles of reasoning, and so this theory does not provide an account of reasoning with degrees of belief.

One important constraint on a comprehensive theory of reasoning with degrees of belief is that it must apply to non-ideal reasoners, who have incoherent degrees of belief. This constraint provides one of the reasons why subjective Bayesianism cannot be viewed as a theory of reasoning: it is widely criticized for applying at best to ideal agents, since the coherence norms on degrees of belief it postulates seem impossibly demanding for human agents.

I argue that when we try to establish principles of reasoning that apply to incoherent agents, a condition of adequacy on such principles is that they should minimize increases in incoherence. In order to evaluate whether a principle of reasoning meets this condition of adequacy, we need to be able to measure the degree to which an agent’s credence function is incoherent. Yet, the standard Bayesian theory provides no way of measuring degrees of incoherence. This theory allows us to distinguish between coherent and incoherent credence functions, but it does not allow us to distinguish between credence functions with higher and lower degrees of incoherence. I propose a way of extending the standard Bayesian framework by developing and defending a formal measure of such incoherence. I then show how this measure can be applied to formulate constraints on adequate rules of reasoning for incoherent agents. In particular, I use the measure to ascertain whether it is advisable for non-ideal agents to follow the same reasoning strategies as their ideal counterparts. I show that this is not always a good idea, because doing so can sometimes make an agent more incoherent than following some alternative reasoning strategy.