Marketing strategies with superior information on consumer preferences
(USC Thesis Other)
MARKETING STRATEGIES WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES

by Zibin Xu

A Dissertation Presented to the Faculty of the Graduate School, University of Southern California, in Fulfillment of the Requirements for the Degree of Doctor of Philosophy (Business Administration)

Degree Conferral Date: August 2017

DEDICATION

I dedicate this dissertation to my parents, for their endless support and sacrifice.

ACKNOWLEDGMENT

I owe my deepest gratitude to my adviser, Anthony Dukes, for his continuous guidance, unwavering patience and the countless meals he bought me. This dissertation would not have been possible without his inspiration and encouragement. I am also indebted to my committee members: Matt Selove, Dina Mayzlin, and Guofu Tan, for their helpful insights and suggestions during the process.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENT
ABSTRACT
CHAPTER ONE: PERSONALIZED PRICING WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES
1.1 Introduction
1.2 The Model
1.3 Uninformed Preferences, Superior Information, Suspicion
1.4 Equilibrium
1.5 Firm’s Incentive for Price Discrimination
1.6 Welfare Implications
1.7 Conclusion
CHAPTER TWO: PRODUCT LINE DESIGN WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES
2.1 Introduction
2.2 The Model
2.3 A Numerical Example
2.4 Equilibrium
2.5 Analysis
2.6 Data Collection without Learning Superior Information
2.7 Conclusions
CHAPTER THREE: INTERMEDIARY CURATION WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES
3.1 Introduction
3.2 The General Model
3.3 An Auction-based Product Listing Platform
3.4 A Commission-based Marketplace
3.5 Conclusion
REFERENCES
APPENDICES
Appendix A: Proofs to Chapter One
Appendix B: Proofs to Chapter Two

ABSTRACT

The welfare implications of marketers’ statistical power to learn superior information about consumers’ preferences are of clear economic interest, especially now that modern data technologies enable marketers to observe vast information about consumers’ usage records and behavior patterns. Whereas consumers observe only their own usage experiences, marketers can observe more consumers, aggregate their data to isolate environmental noise, and make better inferences about consumers’ intrinsic preferences. This situation reverses the standard economic models, which assume that consumers have private information on their preferences. I explore the consequences of this reversal in preference information between consumers and an informed marketer (a seller, a manufacturer, or an intermediary curator). A countervailing force exists whenever the incentives of the consumers and the marketer are not perfectly aligned. Marketers argue that collecting superior information helps them serve consumers better by offering more relevant products. Consumer protection advocates, by contrast, worry that these methods subject consumers to exploitation through better price discrimination. [1] Consumers are also concerned that the marketer may trick them into purchasing over-priced products or accepting irrelevant recommendations.
For these reasons, we need a better understanding of the interaction between marketers with superior information and sophisticated consumers with rational suspicion. This dissertation provides theoretical insight into the nature of this interaction and into how it affects personalized pricing schemes, product line design, and recommendation matching.

[1] See “Big Data & Differential Pricing,” February 2015, from the U.S. President’s Council of Economic Advisers.

Chapter one (coauthored with Anthony Dukes) examines the implications of superior consumer information in first-degree price discrimination. Superior information occurs when consumer data aggregation enables the firm to learn more than consumers themselves know about their willingness to pay. Since consumers may suspect that a firm with superior information is overcharging them, effective price discrimination requires the use of a list price to convince consumers of their value. Because list pricing incurs a signaling cost, the firm, even with price discrimination, is unable to appropriate all consumer surplus. Sometimes the firm may be worse off with superior consumer information. However, there also exist conditions under which price discrimination with superior information is a strict Pareto improvement for the firm and every consumer.

Chapter two (coauthored with Anthony Dukes) examines two additional factors: product design and consumer choices in second-degree price discrimination. In this chapter, superior information is defined as consumers’ marginal utility for product quality. There are two primary research questions in this chapter: Do consumers receive better fitting products or simply have more surplus extracted? Is learning superior information ever unprofitable? Our analysis suggests that data aggregation creates superior information on consumers’ preferences beyond consumers’ prior knowledge.
Consumers’ rational suspicion, however, may confound the firm’s ability to price discriminate using the superior information. Consistent with the results in Chapter one, product line design with superior information may lead to a strict Pareto improvement for both the firm and every consumer. Another contribution of this chapter is to show that, to communicate effectively with the uninformed consumers, the firm may lower the price of the high-quality product and raise the quality of the low-quality product. This tends to reverse the classic quality distortion and thus restore efficiency in product line design.

In Chapters one and two, consumer preferences are vertically differentiated to capture the potential conflict of interest with a monopoly seller. In Chapter three, by contrast, I examine a situation in which consumer preferences are horizontally differentiated. Whereas the first two chapters focus on pricing strategies, Chapter three examines the recommendation strategy of an intermediary that obtains superior information on consumer preferences. The intermediary can then match the uninformed consumer with competing sellers and share the sellers’ profits. Therefore, the superior information may affect the matching strategy, the sellers’ competition, and the sophisticated uninformed consumers’ choices. I illustrate two potential applications: an auction-based product listing platform and a commission-based marketplace. In either application, I show that superior information on consumer preferences can lead to more efficient economic outcomes and improve every player’s payoff, but it requires a careful design of the matching strategy, one that convinces consumers that the intermediary is not trying to trick them by exploiting their uninformed preferences.

All three chapters make use of a common modeling framework, which is the fundamental innovation of this dissertation.
Rather than knowing their preferences ex-ante, consumers only observe private signals of their preferences, with correlated errors due to a random market state. This setting enables the marketer who aggregates the consumers’ signals to partition out the errors and obtain superior information. It also implies that a marketing strategy based on the superior information may credibly communicate with the uninformed consumers, convincing them to update their beliefs. When a marketer learns superior information about consumer preferences, the informational advantage may both facilitate its ability to exploit consumer surplus and, because of the signaling cost, confound that ability.

CHAPTER ONE: PERSONALIZED PRICING WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES

1. Introduction

In typical situations of price discrimination, consumers are fully aware of their willingness-to-pay, while firms have only imprecise estimates. But often the informational circumstances are reversed. For example, with a wealth of actuarial data, an insurance agent may obtain better estimates of a client’s true value of insurance, while the client may misestimate her risk of an accident. Experienced car dealers who have served many buyers can identify novice buyers with imperfect valuations and tailor prices. And with the advancement of data collection and analysis, firms may predict consumers’ willingness to pay when consumers are still assessing their valuations of products and services. [2] In light of these examples, we consider personalized pricing when the firm possesses superior information on consumer preferences that exceeds consumers’ own knowledge.

The pricing implications of superior information are unclear. Although it enables the firm to identify consumers and price discriminate, a consumer with an imprecise valuation may be suspicious of being overcharged by the firm.
For example, consumer advocates in the popular press warn novice buyers that car sellers may take advantage of their naivety. [3] However, while most of this discussion concerns consumers’ over-paying, one must also wonder about consumers who underestimate their valuation. How, then, could a firm convince these consumers to accept a price that exceeds their estimated valuation? In addition, privacy advocates raise worries that superior information from data collection may subject consumers to further exploitation through personalized pricing. [4] But this argument ignores the consumers’ ability to make inferences from the firm’s actions. Can personalized pricing with superior information ever benefit consumers? If so, under what conditions?

[2] See “Anthropology Inc.,” The Atlantic, March 25, 2013.
[3] See “A Former Car Salesman Reveals 4 Tricks Dealers Use to Get You to Spend,” Business Insider, 2015; http://www.businessinsider.com/car-salesman-tricks-2015-7.

To examine the above questions, we develop a model with a monopolist and two types of consumers, each of whom may be uninformed of her true value (H or L) for the firm’s product. Every consumer observes a private noisy signal about her value. The signals, which we call the “consumer data,” are subject to uncertainty about the state of the market. We call a consumer “uninformed” if she cannot learn her value by observing her signal. The firm, by contrast, has access to a broader set of observations than any consumer does. By aggregating consumer data, the firm may partition out the noise from the state of the market and identify consumers’ types. Therefore, data collection not only facilitates the objectives of price discrimination but may also create superior information for the firm. Because uninformed consumers may reject their personalized price out of suspicion that the firm is exploiting its superior information, the firm’s ability to price discriminate can be confounded.
We show that effective price discrimination with superior information requires the use of a list price, that is, a publicly posted upper bound on the prices charged by the firm. Because a list price informs consumers of the ceiling of all personal prices, it credibly signals the firm’s superior information and convinces some consumers to raise their estimated valuation. More importantly, if the firm personalizes prices without posting a list price, consumers may rationally update their beliefs by inferring “no list price” as the firm’s deviation strategy. Therefore, consumer suspicion of the superior information ensures that the firm includes a list price in its personalized pricing scheme. We show that a list price is an essential component of the Perfect Bayesian equilibrium that survives the D1 criterion (Banks and Sobel, 1987; Cho and Kreps, 1987).

[4] See “Big Data & Differential Pricing,” February 2015, from the U.S. President’s Council of Economic Advisers.

This result offers a novel explanation for the practice of posting list prices when consumers are quoted personalized prices, such as for industrial products, automobiles, or heavy equipment. [5] We argue that list pricing is relevant in price discrimination, even when firms can perfectly identify consumers, because it communicates the firm’s additional knowledge of consumers’ values.

If list pricing convinces uninformed consumers of their type, then the firm can effectively price discriminate. Interestingly, even with effective price discrimination, the firm is unable to capture the entire consumer surplus. In order to convince uninformed H-types of their value, the firm must set the list price well below their value, which leaves them with positive surplus. This signaling cost can be substantial when there is a high prior probability that H-types are uninformed.
Thus the firm may prefer uniform pricing over personalized pricing to avoid the substantial signaling cost, since L-types’ overestimation can be a valuable source of profits. The possibility that a firm can prefer no price discrimination arises from the feature that consumers’ beliefs may be adversely affected when the firm acquires superior information.

Superior information implies another novel result regarding the welfare effects of price discrimination. Under the conventional model of price discrimination, H-types are always worse off (Maskin and Riley, 1984; Varian, 1985; Bergemann et al., 2015). But in our model, H-types may be better off with personalized pricing than with uniform pricing, because the list price may be even lower than the uniform price. In addition, list pricing enables the separating equilibrium that prevents uninformed L-types from overpaying. Therefore, we identify situations in which price discrimination with superior information simultaneously improves the expected surplus for both H-types and L-types, and is nevertheless profitable, helping the firm serve additional consumers: the informed L-types who may not buy under uniform pricing. That is, first-degree price discrimination can be a strict Pareto improvement over uniform pricing.

[5] While list prices have been explored in prior literature (Horowitz, 1992; Knight et al., 1994), their role is restricted to single-unit sales, such as in the housing market. However, list prices are often used when a firm sells multiple units to differentiated consumers. For example, furniture, insurance, and industrial products typically state a list price, though selected buyers are quoted lower prices. In addition, behavioral theories suggest that a list price may have psychological impacts on consumers who receive a price discount (Kahneman and Tversky, 1979; Lichtenstein et al., 1990). But these arguments do not apply when consumers are rational.
Our paper is most related to the literature on the economics of price discrimination and the collection of consumer information. Varian (2002) suggests that consumers may want to hide their willingness-to-pay from the firm to prevent welfare loss. Most of the literature concludes that price discrimination always harms at least some consumers (Pigou, 1924; Robinson, 1933; Maskin and Riley, 1984; Varian, 1985; Taylor, 2004; Acquisti and Varian, 2005; Calzolari and Pavan, 2005; Aguirre et al., 2010). Bergemann et al. (2015) show that price discrimination may raise average consumer surplus without reducing the firm’s profit, but it is not a Pareto improvement. By contrast, our results indicate that price discrimination may enable a strict Pareto improvement. The novel source of incremental consumer surplus in our setting is that list pricing helps communicate the firm’s superior information to uninformed consumers, which is not seen in earlier literature.

Because pricing schemes in our setting serve as signals to consumers, our paper also relates to the classic literature on an informed firm’s price signaling and contract design (Milgrom and Roberts, 1984; Bagwell and Riordan, 1991; Maskin and Tirole, 1990; Beaudry, 1994), which emphasizes a one-way communication of the firm’s private information to consumers. Our work departs in two ways. First, in previous price signaling models, the payoffs of both the sender and the receiver depend directly on the sender’s private information (e.g. product quality or efficiency level). But in our model, the firm’s superior information affects only the consumers’ beliefs: holding actions fixed, that information affects neither the consumers’ utility nor the firm’s profit. This distinction implies a single-crossing condition that is driven by endogenous consumer beliefs. Second, classic price signaling models typically assume one receiver. By contrast, the signaling process in our model involves different receivers.
The increase in the number of receiver types is nontrivial, because an uninformed consumer needs to update her belief about her type in an iterative process by assessing the firm’s incentives under all possible consumer beliefs.

2. The Model

In this section, we develop a model in which the firm designs personalized pricing schemes for consumers who are uninformed of their value. A monopolist firm markets to a continuum of risk-neutral consumers (normalized to unitary measure), each of whom has imperfect knowledge of her value for the firm’s product. There are two types, [6] indexed by \(i \in \{H, L\}\). A fraction \(\lambda \in (0,1)\) of consumers are H-types with a value \(\alpha_H\) and the others have a value \(\alpha_L\), where \(\alpha_L < \alpha_H\). Consumer \(i\)’s utility from purchase is \(U_i = \alpha_i - p_i\), where \(p_i\) is her price. Consumer \(i\) does not directly learn \(\alpha_i\); instead she observes a private signal that is correlated with \(\alpha_i\). We denote this signal by \(\theta_i(m)\) and refer to it as consumer data. We assume that the consumer data function is monotonic and differentiable. Specifically, define \(\theta_i(m) \equiv \alpha_i + \beta_i m\), where \(m \sim U[0,1]\) represents an unobserved market state and \(\beta_i > 0\). Consumers cannot observe others’ data.

The consumer data \(\theta\) is a key component of the model. To interpret \(\theta\), consider the example of car insurance. The number of traffic tickets (\(\theta_i\)) may serve as a noisy signal for the client’s risk type (\(\alpha_i\)), and it is also a function of the unobserved level of traffic control (\(m\)). For simplicity of the analysis we further assume that consumer data preserve the same order as values. In other words, an H-type receives more traffic tickets than an L-type does under the same level of traffic control. The assumed properties of the consumer data are reflected in the following assumption:

Assumption 1: \(\theta_H(m) > \theta_L(m)\) for all \(m \in [0,1]\).

As will become clear in Section 3.1, Assumption 1 allows a consumer’s individual data to be uninformative of the consumer’s type.
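The signal structure and Assumption 1 can be made concrete with a short Python sketch. The function names and the parameter values below are illustrative assumptions, not part of the original model statement; the check uses the fact that both signals are linear in the market state, so comparing them at the two endpoints is enough.

```python
def theta(alpha, beta, m):
    """Consumer data theta_i(m) = alpha_i + beta_i * m for a market state m in [0, 1]."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("market state m must lie in [0, 1]")
    return alpha + beta * m


def satisfies_assumption_1(alpha_h, beta_h, alpha_l, beta_l):
    """Check theta_H(m) > theta_L(m) for all m in [0, 1].

    The difference of two linear functions is linear, so it is positive on
    the whole interval exactly when it is positive at m = 0 and m = 1.
    """
    return all(
        theta(alpha_h, beta_h, m) > theta(alpha_l, beta_l, m) for m in (0.0, 1.0)
    )


# Hypothetical parameter values, chosen only for illustration.
ALPHA_H, BETA_H = 2.0, 3.0
ALPHA_L, BETA_L = 1.0, 2.0

print(satisfies_assumption_1(ALPHA_H, BETA_H, ALPHA_L, BETA_L))
print(theta(ALPHA_H, BETA_H, 0.25), theta(ALPHA_L, BETA_L, 0.25))
```

With these values an H-type always shows the higher signal in a given market state, yet the ranges of the two signals overlap across states, which is what lets an individual signal be uninformative.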
A monopoly firm observes consumer data from both types. [7] Returning to the insurance example, the agent, with access to all clients’ data (\(\{\theta_L, \theta_H\}\)), can deduce the unobserved variable (\(m\)) and learn each consumer’s type. Moreover, we assume that the agent can identify each consumer and therefore offer a personalized discount. More generally, the firm chooses whether to charge consumer \(i\) a personalized price, \(p_i\), which only she observes, and whether to use a list price, \(\bar{p}\), for everyone in the market to observe. Because consumers can purchase at any price they observe, the list price is the ceiling of all personalized prices in the pricing scheme: \(\bar{p} \ge p_i\) for each \(i\). In principle, we allow the firm to set \(\bar{p} = \infty\) to represent the case in which the firm chooses not to set a list price. Formally, we denote the firm’s pricing scheme as a triple \(S \equiv (p_L, p_H, \bar{p}) \in \mathbb{R}^2 \times \mathbb{R}\). Since a list price is the maximum price that the firm charges any consumer, it may become a credible signal of the firm’s profit.

[6] The restriction to only two types keeps the exposition clear, but is not essential for the results.

The timing of the game proceeds as follows. In period 1, nature randomly draws \(m\), each consumer \(i\) observes her own data \(\theta_i\), and the firm chooses either to observe \(\{\theta_L, \theta_H\}\) or to commit to no data collection. In period 2, the firm chooses a pricing scheme \(S\) and each consumer \(i\) decides whether to purchase at the lower price available to her: \(\min\{\bar{p}, p_i\}\).

3. Uninformed Consumers, Superior Information, Suspicion

In this section, we first define the relevant notions regarding how a consumer’s individual data is uninformative of the consumer’s type. Then we discuss how data aggregation may create superior information for the firm that exceeds consumers’ prior knowledge. Finally, we use a numerical example to illustrate how list pricing in the personalized pricing scheme can become an optimal strategy by alleviating consumer suspicion of being overcharged.
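The firm's deduction described in the model (rank the pooled signals, then back out the market state) can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the firm is assumed to know \(\alpha_H\) and \(\beta_H\), and the pooled data are assumed to contain exactly two distinct signal levels, one per type.

```python
def infer_market_state(signal_levels, alpha_h, beta_h):
    """Recover (theta_H, theta_L, m) from the pooled signals of all consumers.

    Assumes exactly two distinct signal levels are present (one per type) and
    that the firm knows alpha_H and beta_H; both are assumptions of this sketch.
    """
    levels = sorted(set(signal_levels))
    if len(levels) != 2:
        raise ValueError("expected exactly two distinct signal levels")
    theta_l, theta_h = levels          # Assumption 1: the higher signal belongs to H-types
    m = (theta_h - alpha_h) / beta_h   # invert theta_H(m) = alpha_H + beta_H * m
    return theta_h, theta_l, m


def classify(signal, theta_h):
    """Label a consumer by comparing her signal with the H-type signal level."""
    return "H" if signal == theta_h else "L"


# Hypothetical pooled data, generated with alpha_H = 2, beta_H = 3,
# alpha_L = 1, beta_L = 2 and market state m = 0.25.
pooled = [1.5, 2.75, 2.75, 1.5, 1.5]
theta_h, theta_l, m = infer_market_state(pooled, alpha_h=2.0, beta_h=3.0)
print(m, [classify(s, theta_h) for s in pooled])
```

A single consumer who sees only her own signal cannot run this ranking step, which is the source of the firm's informational advantage.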
[7] We show that by aggregating \(\theta_i\), the firm can learn the market state and, correspondingly, identify each consumer’s type \(\alpha_i\) and perform price discrimination.

3.1 Uninformed Consumers

Because some values of \(\theta_i\) may correspond to both types of consumers, it is possible that consumers do not learn their types from their observed noisy signal. We call such consumers uninformed.

Definition 1: Consumer \(i \in \{L, H\}\) is uninformed after observing her data \(\theta_i(m)\) if and only if there exists another possible market state \(m' \in [0,1] \setminus \{m\}\) such that \(\theta_i(m) = \theta_j(m')\) for \(i \ne j\). Otherwise, consumer \(i\) is informed.

Definition 1 implies that a consumer is uninformed whenever her data does not reveal her type. Denote by \(M(m) \equiv \theta_H^{-1} \circ \theta_L(m)\) an uninformed L-type’s inferred market state given her belief that she is an H-type when observing the data \(\theta_L(m)\). \(M(\cdot)\) is well defined for all market states, since linearity of \(\theta_H\) ensures that \(\theta_H^{-1}\) exists for all consumer data. Similarly, \(M^{-1}(m)\), the inverse function of \(M(m)\), is an uninformed H-type’s inferred market state, given her observed signal \(\theta_H(m)\) and the belief that she is an L-type.

From Definition 1, H-types are uninformed if and only if the market state \(m\) satisfies \(M^{-1}(m) \in [0,1]\). Conversely, L-types could believe that they are possibly H-types if and only if \(M(m) \in [0,1]\). Lemma 1 specifies conditions under which there may exist a market with uninformed consumer preferences.

Lemma 1: The following statements are equivalent.
(i) \(M(1) \ge 0\);
(ii) H-types are uninformed if and only if \(m \in [0, M(1)]\);
(iii) \(M^{-1}(0) \le 1\);
(iv) L-types are uninformed if and only if \(m \in [M^{-1}(0), 1]\).

The proof of Lemma 1 is straightforward and relegated to the appendix. Note that our framework can incorporate the classic case in which consumers know their value ex-ante: by Lemma 1, when \(M(1) < 0\), then \(M^{-1}(0) > 1\), so both types of consumers are informed under any \(m\).
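Under the linear signal \(\theta_i(m) = \alpha_i + \beta_i m\), the map \(M\) and its inverse have closed forms, so Definition 1 and Lemma 1 can be checked numerically. The sketch below is illustrative only: the `Params` container, the function names, and the parameter values are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    alpha_h: float
    beta_h: float
    alpha_l: float
    beta_l: float


def M(m, p):
    """M(m) = theta_H^{-1}(theta_L(m)): state an L-type infers if she believes she is an H-type."""
    return (p.alpha_l + p.beta_l * m - p.alpha_h) / p.beta_h


def M_inv(m, p):
    """M^{-1}(m) = theta_L^{-1}(theta_H(m)): state an H-type infers if she believes she is an L-type."""
    return (p.alpha_h + p.beta_h * m - p.alpha_l) / p.beta_l


def uninformed_types(m, p):
    """Return the set of types whose signal at state m is also consistent with the other type."""
    types = set()
    if 0.0 <= M_inv(m, p) <= 1.0:
        types.add("H")
    if 0.0 <= M(m, p) <= 1.0:
        types.add("L")
    return types


# Illustrative parameter values (an assumption of this sketch).
EXAMPLE = Params(alpha_h=2.0, beta_h=3.0, alpha_l=1.0, beta_l=2.0)

print(M(1.0, EXAMPLE), M_inv(0.0, EXAMPLE))
for state in (0.2, 0.4, 0.8):
    print(state, uninformed_types(state, EXAMPLE))
```

For these values the thresholds are \(M(1) = 1/3\) and \(M^{-1}(0) = 1/2\), so the three sample states fall in the region where only H-types are uninformed, the region where both types are informed, and the region where only L-types are uninformed, as statements (ii) and (iv) of Lemma 1 describe.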
Since our focus is on superior information over uninformed consumers, we assume for the remainder of the analysis that uninformed consumers may exist:

Assumption 2: \(M(1) \ge 0\), i.e., \(\beta_L \ge \alpha_H - \alpha_L\).

Figure 1 illustrates uninformed consumers in every market state.

[Figure 1: Market State and Uninformed Consumers]

3.2 Superior Information

An informed firm observes consumer data from both types, \(\{\theta_L, \theta_H\}\). By Assumption 1, the firm can identify each consumer’s type based on the ranking of her data. Furthermore, the firm can deduce the market state \(m\), which is informative of the uninformed consumers’ prior beliefs (whether they over-estimate or under-estimate their value). Therefore, data aggregation creates superior information that exceeds consumers’ prior knowledge.

The superior information enables the firm to identify each consumer, but uninformed consumers may also draw inferences from the firm’s actions and update their beliefs. Since the superior information depends on the market state \(m\), the firm may design its pricing scheme as a function of \(m\), i.e., \(S(m) = \{s_H, s_L\}\), where \(s_i(m) \equiv (p_i, \bar{p}) : [0,1] \to \mathbb{R} \times \mathbb{R}\) are the prices that consumer \(i\) observes. Therefore, an uninformed consumer \(i\) may infer the market state \(m\) from the observed pricing scheme \(s_i(m)\) whenever possible. For convenience, we denote by \(\mu(s_i)\) her updated belief of the probability that she is an H-type. Consequently, her posterior estimate of her value is \(E[\alpha \mid \mu(s_i)] = \mu(s_i)\alpha_H + [1 - \mu(s_i)]\alpha_L\).

3.3 Alleviating Suspicion with a List Price: A Numerical Example

To illustrate the role of list pricing in alleviating consumer suspicion, we consider a simple numerical example of the above model. Suppose that \(\lambda = 0.5\), \(\alpha_H = \$2\), \(\alpha_L = \$1\), \(\beta_H = 3\), \(\beta_L = 2\). By Section 3.1, H-types are uninformed if and only if \(m \in [0, \tfrac{1}{3}]\), and L-types are uninformed if and only if \(m \in [0.5, 1]\). An uninformed consumer’s estimate should be

\[ \alpha_L + \frac{\lambda\left(\tfrac{1}{3} - 0\right)}{\lambda\left(\tfrac{1}{3} - 0\right) + (1 - \lambda)(1 - 0.5)}\,(\alpha_H - \alpha_L) = \$1.4. \]
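The arithmetic behind the $1.4 estimate can be reproduced exactly with rational numbers. The sketch below follows the weighting formula as stated in the text (prior share times the length of the interval of market states in which that type is uninformed); the function name and return layout are illustrative assumptions.

```python
from fractions import Fraction


def uninformed_estimate(lam, alpha_h, alpha_l, beta_h, beta_l):
    """Return (M(1), M^{-1}(0), mu, estimate) for an uninformed consumer.

    mu is the weight on being an H-type, computed as in the text:
    lam * |[0, M(1)]| against (1 - lam) * |[M^{-1}(0), 1]|.
    """
    m_1 = (alpha_l + beta_l - alpha_h) / beta_h   # M(1): H-types uninformed on [0, M(1)]
    m_inv_0 = (alpha_h - alpha_l) / beta_l        # M^{-1}(0): L-types uninformed on [M^{-1}(0), 1]
    weight_h = lam * (m_1 - 0)
    weight_l = (1 - lam) * (1 - m_inv_0)
    mu = weight_h / (weight_h + weight_l)
    estimate = alpha_l + mu * (alpha_h - alpha_l)
    return m_1, m_inv_0, mu, estimate


# Parameter values from the numerical example in the text.
result = uninformed_estimate(
    lam=Fraction(1, 2),
    alpha_h=Fraction(2),
    alpha_l=Fraction(1),
    beta_h=Fraction(3),
    beta_l=Fraction(2),
)
print(result)  # thresholds 1/3 and 1/2, belief 2/5, estimate 7/5 (= $1.4)
```

Using `Fraction` avoids floating-point rounding at the interval endpoints, so the thresholds 1/3 and 1/2 and the estimate 7/5 match the text exactly.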
We summarize the consumers’ updated estimate of their value, \(E[\alpha \mid \theta_i(m)]\), in Table 1.

Table 1: Distribution of Consumers’ Updated Beliefs

Market State            L-types    H-types
m ∈ [0, 1/3]            $1         $1.4
m ∈ (1/3, 1/2)          $1         $2
m ∈ [1/2, 1]            $1.4       $2

If data is collected, the firm can identify the uninformed consumers. Suppose the firm charges each consumer a price exactly equal to her estimated value. Specifically, when \(m \in [0, \tfrac{1}{3}]\), the firm charges L-types $1 and H-types $1.4; when \(m \in (\tfrac{1}{3}, \tfrac{1}{2})\), the firm charges L-types $1 and H-types $2; and when \(m \in [\tfrac{1}{2}, 1]\), the firm charges L-types $1.4 and H-types $2. In this case, the firm can fully extract consumers’ expected surplus given their beliefs.

However, the firm may improve its profit by posting a list price to communicate with the uninformed consumers and change their beliefs. Specifically, when \(m \in [0, \tfrac{1}{3}]\), the firm knows that H-types are uninformed and wants to convince them of their type. The firm may communicate with H-types by posting a list price of $1.5. Because this incentive is absent when \(m > \tfrac{1}{3}\), upon observing the list price uninformed H-types should believe that \(m \in [0, \tfrac{1}{3}]\) and thus be willing to pay this price even without any personalized discount. Therefore, it is a dominant strategy for \(m \in [0, \tfrac{1}{3}]\) to set a list price of $1.5 and give L-types a personalized discounted price of $1.

Now, since the firm does not charge the uninformed H-types $1.4 under \(m \in [0, \tfrac{1}{3}]\), if an uninformed L-type observes a personalized price of $1.4 without a list price, she may be suspicious and infer that the market state is more likely to be \(m \in [\tfrac{1}{2}, 1]\). Therefore, the uninformed L-type updates her estimated value to be less than $1.4 and rejects the price of $1.4. Consequently, the firm’s personalized pricing strategy of charging H-types $2 and L-types $1.4 at \(m \in [\tfrac{1}{2}, 1]\) is suboptimal.

The above example illustrates the central challenge of price discrimination with superior information.
The pricing scheme of an informed firm can influence uninformed consumers’ estimates of their value. Departing from the standard pricing model, the problem with superior information must consider both the optimization based on consumers’ prior beliefs and how the pricing may lead consumers to update their beliefs. On the other hand, the consumers’ beliefs must assess both the firm’s profits and its estimates of their belief updates. This belief interaction incurs endogenous signaling costs that may impact the design of pricing schemes and confound the welfare implications of price discrimination.

4. Equilibrium

In this section, we first solve for the equilibrium of a basic model, in which we impose the condition that under any market state at least some consumers are informed (i.e. \(M(1) < M^{-1}(0)\)). We briefly study, at the end of this section, the general case in which both types of consumers may be uninformed. The basic model, however, is sufficient for illustrating the central results and is the focus of the analysis in Sections 5 and 6.

4.1 Equilibrium of the Basic Model

In this section, we suppose that, for convenience of analysis, there is a positive probability that both types are informed, i.e., \(M(1) < M^{-1}(0)\). This restriction is relaxed in Section 4.2. We examine the pricing scheme and uninformed consumers’ beliefs using Perfect Bayesian Equilibrium (PBE). As in many signaling games, there is a multitude of PBE in our model. Most of these equilibria involve unreasonable out-of-equilibrium consumer beliefs. To eliminate these beliefs, we employ the D1 criterion to refine any equilibrium (Banks and Sobel, 1987; Cho and Kreps, 1987; Cho and Sobel, 1990).

By applying D1, we specify “reasonable” beliefs \(\mu(s_i')\) for uninformed consumers when the firm chooses an out-of-equilibrium pricing scheme \(s_i'\).
In particular, upon observing a deviation, an uninformed consumer assigns zero probability to a given market state if and only if she believes that whenever the firm has a weak incentive to deviate in that state, it has a strong incentive to deviate in a different state. [8] Formally, consider a PBE \(\{s_i^*, r_i^*, \mu(s_i^*)\}_{i=H,L}\), where \(r_i \in \{0,1\}\) is consumer \(i\)’s purchase decision, and \(\{0,1\}\) corresponds to “No Purchase” and “Purchase,” respectively.

[8] Formal details of the equilibrium refinement process are relegated to the appendix.

Denote by \(\mathcal{M}_0 \equiv (M(1), 1]\) and \(\mathcal{M}_1 \equiv [0, M(1)]\) the sets of market states in which H-types are informed and uninformed, respectively. When consumer \(i\) is uninformed, she infers the actual market state \(m\) from \(\{m_H, m_L\}\), [9] in which \(m_H\) or \(m_L\) is her inferred market state given that she is an H-type or an L-type, respectively. Clearly, we must have \(m_H \in \mathcal{M}_1\) and \(m_L \in \mathcal{M}_0\) for any \(i \in \{H, L\}\). [10] For an out-of-equilibrium pricing scheme \(s_i' \equiv (p_i', \bar{p}') \ne s_i^*\), denote by \(\mu(s_i')\) the posterior consumer belief upon observing \(s_i'\); denote by \(\hat{\Pi}(m, s_i', r(s_i', \mu))\) the deviation profit that she attributes to the firm in market state \(m\), where \(r(s_i', \mu)\) is her best response to \(s_i'\) given a particular belief \(\mu\); and denote the firm’s equilibrium profit by \(\Pi^*(m)\).

Using the above notation, the firm’s pricing problem is posed as follows:

\[ \max_{s \in \{p_H,\, p_L,\, \bar{p}\}} \; \lambda p_H + (1 - \lambda) p_L, \]

such that
(i) \(\mu(s_i)\alpha_H + (1 - \mu(s_i))\alpha_L \ge p_i\);
(ii) \(p_i \le \bar{p}\);
(iii) \(\mu(s_i)\) is sequentially rational, Bayesian whenever possible, and survives the D1 criterion (Banks and Sobel, 1987; Cho and Kreps, 1987; Cho and Sobel, 1990).

We first establish the existence of a unique PBE surviving D1. Proposition 1 shows that the firm’s equilibrium pricing strategy separates \(\mathcal{M}_0\) from \(\mathcal{M}_1\), rather than separating the sets of market states in which L-types are informed and uninformed.
Proposition 1: Suppose $M(1) < M^{-1}(0)$. Then there exists a PBE, $\{s_i^*, r_i^*, \mu(s_i^*)\}_{i=H,L}$, with the following properties:
(i) The firm sets a pricing scheme $s_i^* = (p_i^*, \bar{p}^*)$ according to $m$:
$p_H^* = \alpha_H$ if $m \in \mathcal{M}_0$, and $p_H^* = \lambda\alpha_H + (1-\lambda)\alpha_L$ if $m \in \mathcal{M}_1$;
$p_L^* = \alpha_L$, and $\bar{p}^* = p_H^*$, for all $m \in [0,1]$.
(ii) All consumers purchase: $r_i^* = 1$, $i = H, L$.
(iii) Uninformed consumer $i$ updates her belief for any pricing strategy $s_i'$: $\mu(s_i') = 1$, if $\hat{\Pi}(m_L, s_i', r(s_i', \mu : m = m_H)) \le \Pi^*(m_L)$; and $\mu(s_i') = 0$ otherwise.
(iv) (Uniqueness) This is the unique PBE surviving the D1 criterion.¹¹
Proof: See the appendix.

⁹ $\theta_i(m) = \theta_H(m_H) = \theta_L(m_L)$, thus $(m_H, m_L) = (m, M^{-1}(m))$ or $(M(m), m)$, if $i = H$ or $L$, respectively.
¹⁰ Since $i$ is uninformed, we must have $m_H \in \mathcal{M}_1$, and $m_L = M^{-1}(m_H) \in [M^{-1}(m_H), 1] \subset \mathcal{M}_0$.

First note from Proposition 1 that this PBE is a separating equilibrium since $s_L^* \ne s_H^*$ for all $m \in [0,1]$. Therefore, any uninformed consumer can always infer her type from her personalized pricing scheme in equilibrium. Such learning is sustained by the belief updating rule in (iii). For example, suppose H-types are uninformed (i.e. $m \in \mathcal{M}_1$); then their belief update can be intuitively expressed in the following speech: “If the firm has a strong incentive to deviate by tricking me into believing that I am an H-type with the pricing scheme that I observe, then I shall assess that I am an L-type by observing that pricing scheme.” This belief is rational and survives D1 because, as we show in the appendix, the firm may deviate more often when the uninformed consumers are the L-types; the uninformed consumers should thus reduce the estimates of their value and, iteratively, assign zero probability to being H-types. In this way, equilibrium beliefs ensure an uninformed consumer maintains a level of suspicion that prevents the firm from misleading her.
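The pricing rule in Proposition 1 can be sketched in a few lines of Python. This is only an illustration: the function names, the argument `M1` (standing for the threshold M(1)), and the parameter values in the usage lines are assumptions made for the example, not values from the analysis.

```python
def equilibrium_scheme(m, M1, lam, alpha_H, alpha_L):
    """Return (p_H, p_L, list_price) from Proposition 1 at market state m.

    M1 stands for M(1): H-types are informed when m > M1 (m in M_0)
    and uninformed when m <= M1 (m in M_1).
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("market state m must lie in [0, 1]")
    if m > M1:
        p_H = alpha_H                                # informed H-types pay their value
    else:
        p_H = lam * alpha_H + (1 - lam) * alpha_L    # discounted price that signals the state
    p_L = alpha_L                                    # L-types always pay their value
    return p_H, p_L, p_H                             # the list price equals p_H


def firm_profit(m, M1, lam, alpha_H, alpha_L):
    """Per-state profit when every consumer buys at her personalized price."""
    p_H, p_L, _ = equilibrium_scheme(m, M1, lam, alpha_H, alpha_L)
    return lam * p_H + (1 - lam) * p_L


# Hypothetical parameters: M(1) = 0.6, lambda = 0.5, alpha_H = 2, alpha_L = 1.
print(equilibrium_scheme(0.9, 0.6, 0.5, 2.0, 1.0))
print(equilibrium_scheme(0.3, 0.6, 0.5, 2.0, 1.0))
```

In the low state the H-type price falls below her value, which is the surplus that uninformed H-types retain in equilibrium.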
Since $p_L^* = \alpha_L$, we see that such beliefs prevent the firm from overcharging uninformed L-types. In contrast to L-types, uninformed H-types can obtain positive surplus from the personalized pricing scheme, even though they can correctly infer their value. Specifically, in order to convince uninformed H-types that the market state is low ($m \in \mathcal{M}_1$), the firm sets the list price, $\bar{p}^*$, no higher than what it could earn, even under most favorable beliefs, when the market state is high ($m \in \mathcal{M}_0$). Furthermore, by setting the list price $\bar{p}^* \le \Pi^*(m \in \mathcal{M}_0)$, the firm expresses to any uninformed H-type: “I am not charging anyone more than I am charging you, so that you know I would never use this pricing scheme if you were not an H-type.” Therefore, in equilibrium, H-types receive positive expected surplus and L-types avoid paying more than their value. The inability of the firm to either extract the full surplus of H-types or overcharge L-types in equilibrium constitutes the cost of signaling the firm’s private information. As we see in the next section, such signaling costs can be, in some circumstances, a significant burden to the price discriminating monopolist.

¹¹ Formally, $\mu$ survives the D1 criterion if and only if, for any out-of-equilibrium pricing strategy $s' \ne s^*$, $\mu(s') = 0$ whenever the following condition holds:
$\bigcup_{\mu}\{r \mid \Pi^*(m_H) \le \hat{\Pi}(m_H, s', r)\} \subsetneq \bigcup_{\mu}\{r \mid \Pi^*(m_L) < \hat{\Pi}(m_L, s', r)\}$,
and $\mu(s') = 1$ whenever the following condition holds:
$\bigcup_{\mu}\{r \mid \Pi^*(m_L) \le \hat{\Pi}(m_L, s', r)\} \subsetneq \bigcup_{\mu}\{r \mid \Pi^*(m_H) < \hat{\Pi}(m_H, s', r)\}$,
where $\bigcup_{\mu}\{\cdot\}$ is a set of arbitrary beliefs $\mu \in [0,1]$ such that the condition inside the braces holds, and $r$ is a best response to $s'$ given the arbitrary belief $\mu$, i.e., $r = BR(s', \mu)$.

In addition, the proof of Proposition 1 leads to the following corollary:

Corollary 1: Any personalized pricing equilibrium without a list price, $s_i^* = (p_i^*, \bar{p}^* = \infty)$, fails the D1 criterion.
The intuition of Corollary 1 is that uninformed H-types are convinced of their type only if they observe more credible information from the personalized pricing scheme, such as a list price. Suppose that the market state is low (𝑚 ∈𝓜 𝟏 ), and the firm charges the uninformed H-types the same price as in equilibrium (𝜆 𝛼 𝐻 +(1−𝜆 )𝛼 𝐿 ) without posting a list price, then the H-types may be induced to believe that they are in fact the L-types, because the firm may charge the other segment a higher price 𝛼 𝐻 , it has stronger incentives to deviate to the “no list pricing” strategy when the market state is high (𝑚 ∈𝓜 𝟎 ) than low. Thus, the uninformed H-types should reject the equilibrium price (𝜆 𝛼 𝐻 +(1−𝜆 )𝛼 𝐿 ) without a list price under a D1-belief. 22 4.2 General Model The welfare results of Sections 5 and 6 are illustrated in the basic model of Section 4.1. But, as we argue here, this is without loss of generality. In addition, by considering a more general model, which relaxes the restriction that 𝑀 (1)<𝑀 −1 (0) , we can better illustrate the consumer’s belief updating process. In the basic model, there are always some informed consumers in every market state. In the general model, we consider a case when both types may be uninformed. The general model thus studies the situation when consumers’ biases are not in the same direction: some consumers overestimate their value, while others underestimate their value, under the same market state. Recall that the condition of Proposition 1 implied an equilibrium partition of the market state space by two sets 𝓜 𝟎 and 𝓜 𝟏 . Uninformed H-type consumers in some state 𝑚 ∈𝓜 𝟏 updated their equilibrium beliefs by a one-step “thought experiment” if the state were in 𝓜 𝟎 . In absence of this condition, we allow for a finer partition of the market state space 𝓜 =[0,1] , which implies a more iterative thought experiment for the uninformed consumer to update her beliefs. 
To study the general model, first denote $M^{(N)} \equiv \underbrace{M \circ M \circ \dots \circ M}_{N}$. By Assumption 1, there exists a unique¹² $N \in \mathbb{N}$ satisfying the following inequality: $M^{(N+1)}(1) < 0 < M^{(N)}(1)$. The equilibrium result presented in Proposition 1 is a special case when $N = 1$, because the restriction in the basic model that $0 < M(1) < M^{-1}(0)$ is equivalent to $M^{(2)}(1) < 0 < M(1)$. In a general parameter space where $N > 1$, we define a partition of the state space consisting of $N+1$ intervals of the form, from right to left,
$\mathcal{M}_0 = (M(1), 1]$, $\mathcal{M}_n \equiv (M^{(n+1)}(1), M^{(n)}(1)]$ for $n = 1, \dots, N-1$, and $\mathcal{M}_N = [0, M^{(N)}(1)]$.

¹² Because $\theta_L(m) < \theta_H(m)$ and $\theta_H^{-1}$ is linear, $M(m) < m$ for all $m$; consequently $M^{(N+1)}(1) < M^{(N)}(1)$ for any $N$. Since $M(0) < 0$, there must exist an $N$ such that $M^{(N+1)}(1) < 0$. By the monotonicity of $M^{(n)}(1)$ in $n$, there must exist a unique $N$ such that $M^{(N+1)}(1) < 0 < M^{(N)}(1)$.

H-types are informed if and only if $m \in \mathcal{M}_0$, whereas L-types are informed only if $m \in \mathcal{M}_N$, by Definition 1. To better understand the meaning of this partition, consider an arbitrary $n$, where $m \in \mathcal{M}_n$. In this case, L-types observe $\theta_L(m)$, which is the same for an H-type under some state in the interval $\mathcal{M}_{n+1}$. With this basic structure of consumers’ information sets, we can establish the existence of an equilibrium for an arbitrary $N$ in Proposition 2.

Proposition 2: (General Result) For any $m \in \mathcal{M}_n$, where $n \in \{0, 1, \dots, N\}$, there exists a unique perfect Bayesian equilibrium $(s_i^*(M), r^*, \mu_i)$ that survives D1 for any $i = H, L$:
$p_H^* = \lambda^n \alpha_H + (1 - \lambda^n)\alpha_L$, $p_L^* = \alpha_L$, $\bar{p}^* = p_H^*$;
$r_i^* = 1$: purchase at $p_i^*$;
$\mu(s_i^*) = 1$, if $\hat{\Pi}(m_L, s_i^*, r_i^*(s_i^*, m_H)) \le \Pi^*(m_L)$, and $\mu(s_i^*) = 0$ otherwise.

One interpretation of $n$ is the maximum number of iterations in an uninformed H-type’s belief updating process needed to convince her that she must be an H-type. Therefore, the higher $n$ is, the more difficult it is to convince uninformed H-types of their value.
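As a rough sketch of how the partition and the H-type price of Proposition 2 fit together, the Python below iterates a map M starting from 1 to find the interval boundaries, locates the interval containing a given state, and evaluates the price. The linear map `M(m) = m - 0.3` is a hypothetical choice, picked only because it satisfies M(m) < m and M(0) < 0; the function names and parameter values are likewise assumptions for illustration.

```python
def partition_thresholds(M, max_iter=10_000):
    """Return [M^(1)(1), ..., M^(N)(1)]: the positive iterates of M starting at 1."""
    thresholds = []
    x = M(1.0)
    for _ in range(max_iter):
        if x <= 0:
            return thresholds          # N == len(thresholds)
        thresholds.append(x)
        x = M(x)
    raise ValueError("iterates of M did not fall below zero")


def state_index(m, thresholds):
    """Index n with m in M_n: the number of thresholds that lie at or above m."""
    return sum(1 for t in thresholds if m <= t)


def price_H(n, lam, alpha_H, alpha_L):
    """H-type price (and list price) in Proposition 2 for a state in M_n."""
    return lam**n * alpha_H + (1 - lam**n) * alpha_L


M = lambda m: m - 0.3                  # hypothetical linear map, for illustration only
thresholds = partition_thresholds(M)
print("N =", len(thresholds))
for m in (0.95, 0.5, 0.05):
    n = state_index(m, thresholds)
    print(m, n, price_H(n, 0.5, 2.0, 1.0))
```

With these assumed inputs the price falls from alpha_H toward alpha_L as the state moves into higher-numbered intervals, mirroring the idea that deeper suspicion requires a larger discount.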
The signaling cost equals $\alpha_H - \bar{p}^* = (1 - \lambda^n)(\alpha_H - \alpha_L)$, which is an increasing function of $n$. It implies that the firm has lower incentives to use personalized pricing when it is more difficult to dispel the consumers’ suspicions, especially since unconvinced consumers adversely update their beliefs upon observing their personalized price. This intuitive result leads to our discussion in Section 5. For simplicity, we use the basic model ($N = 1$) in the following static comparisons.¹³

¹³ The results from the static comparisons can be generalized with any arbitrary $N$.

5. Firm’s Incentive for Price Discrimination

In this section, we examine the profit implications when the firm obtains superior information and designs a personalized pricing scheme. In Section 5.1, we analyze the benchmark case when the firm does not observe consumer data and uses uniform pricing. In Section 5.2, we use static comparison to assess when the firm has incentives to collect data for price discrimination.

5.1 Uninformed Firm

An uninformed firm commits in period 0 not to observe consumer data. For example, a firm may publicly announce in its privacy statement that no data will be collected or analyzed. Each consumer, as before, observes her own data and updates her belief accordingly. By Lemma 1, H-types are uninformed if and only if $m \le M(1)$, and L-types are uninformed if and only if $m \ge M^{-1}(0)$. Henceforth, we denote $\rho_i$ as the initial probability that consumer $i$ is uninformed:
$\rho_H \equiv \int_0^{M(1)} f(m)\,dm$, and $\rho_L \equiv \int_{M^{-1}(0)}^{1} f(m)\,dm$.
Denote $w \equiv \rho_L / \rho_H = \beta_H / \beta_L$. Let $\Theta$ be the random variable with realizations $\theta_i$ for uninformed consumer $i$, i.e., $\Theta \in [\theta_H(0), \theta_L(1)]$, and $\Theta|\alpha$ as the conditional random variable with conditional realizations $\theta_i|\alpha_j$.
The conditional probability of being an H-type after observing an uninformative signal $\theta_i$ is
$$\Pr(\alpha_H \mid \theta_i) = \frac{\Pr(\alpha_H)\, f_{\Theta|\alpha}(\theta_i \mid \alpha_H)}{\sum_k \Pr(\alpha_k)\, f_{\Theta|\alpha}(\theta_i \mid \alpha_k)} = \frac{\lambda}{\lambda + (1-\lambda)w},$$
where the density function is $f_{\Theta|\alpha}(\theta_i \mid \alpha_j) = \left|\frac{d}{d\theta_i}\theta_j^{-1}(\theta_i)\right| f(m) = \frac{1}{\beta_j}$. Therefore, uninformed consumers have the same posterior Bayesian estimate:
$$e \equiv E[\alpha \mid \theta_i] = \alpha_L + \frac{\lambda(\alpha_H - \alpha_L)}{\lambda + (1-\lambda)w}.$$

Without possessing data on consumers, the firm cannot identify the consumers and use personalized pricing. That is, an uninformed firm posts only a uniform price, $p_U$, to all consumers. The firm, however, anticipates the uninformed consumers’ estimates $e$. Therefore, the optimal uniform price must be one of three levels: $p_U^* \in \{\alpha_L, e, \alpha_H\}$. Denote the expected profit for the firm conditional on price $p$ as $\Pi_U(p; \lambda)$. Then we express the firm’s problem as:
$$\max_{p \in \{\alpha_L, e, \alpha_H\}} \Pi_U(p; \lambda),$$
where $\Pi_U(\alpha_H; \lambda) = \lambda(1 - \rho_H)\alpha_H$, $\Pi_U(e; \lambda) = [\lambda + (1-\lambda)\rho_L]\,e$, and $\Pi_U(\alpha_L; \lambda) = \alpha_L$.

Proposition 3: The unique equilibrium uniform price for the uninformed firm is characterized as follows:
(i) $p_U^* = \alpha_H$, if and only if $\rho_H \le \frac{1}{2}\left(1 - \sqrt{\alpha_L/\alpha_H}\right)$ and $\lambda \in [\max\{\lambda_{HE1}, \lambda_{HL}\}, \lambda_{HE2}]$, where $\lambda_{HE1} < \lambda_{HE2}$ both solve $\Pi_U(\alpha_H; \lambda) = \Pi_U(e; \lambda)$,¹⁴ and $\lambda_{HL} \equiv \frac{\alpha_L}{\alpha_H(1 - \rho_H)} \in (0,1)$.¹⁵
(ii) Otherwise, there exists a unique $\lambda_{LE} \in (0,1)$ that solves $\Pi_U(e; \lambda) = \Pi_U(\alpha_L; \lambda)$. The optimal uniform price is then $p_U^* = e$, if $\lambda \ge \lambda_{LE}$; and $p_U^* = \alpha_L$, if $\lambda < \lambda_{LE}$.

Proposition 3 provides the benchmark used to assess the impact of price discrimination on welfare. But it also directly illustrates a novel consequence of uninformed consumer preferences. When the probability that H-types are uninformed is below the threshold, $\frac{1}{2}\left(1 - \sqrt{\alpha_L/\alpha_H}\right)$, the equilibrium uniform price is non-monotonic in $\lambda$, the portion of H-types.
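The non-monotonicity can be checked numerically with a short Python sketch that evaluates the three candidate profits from the firm's problem above and picks the best price. The parameter values follow the numerical example the text gives alongside Figure 2 (rho_H = rho_L = 0.1, alpha_H = 2, alpha_L = 1); the function names are assumptions made for this illustration, and ties between candidates are broken arbitrarily.

```python
def bayes_estimate(lam, alpha_H, alpha_L, w):
    """Posterior estimate e = alpha_L + lam*(alpha_H - alpha_L) / (lam + (1 - lam)*w)."""
    return alpha_L + lam * (alpha_H - alpha_L) / (lam + (1 - lam) * w)


def optimal_uniform_price(lam, alpha_H, alpha_L, rho_H, rho_L):
    """Return (price, profit) maximizing Pi_U over the candidates {alpha_L, e, alpha_H}."""
    w = rho_L / rho_H                                  # w is defined as rho_L / rho_H
    e = bayes_estimate(lam, alpha_H, alpha_L, w)
    candidates = [
        (alpha_L, alpha_L),                            # everyone buys at alpha_L
        (e, (lam + (1 - lam) * rho_L) * e),            # H-types and uninformed L-types buy
        (alpha_H, lam * (1 - rho_H) * alpha_H),        # only informed H-types buy
    ]
    return max(candidates, key=lambda c: c[1])


for lam in (0.3, 0.6, 0.8):
    price, profit = optimal_uniform_price(lam, 2.0, 1.0, 0.1, 0.1)
    print(f"lambda={lam}: price={price:.3f}, profit={profit:.3f}")
```

Under these values the chosen price rises from alpha_L to alpha_H and then drops to e as lambda grows, which is the pattern the proposition describes.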
The conventional wisdom, under the assumption that consumers are fully informed about their type, suggests that a non -discriminating monopoly will (weakly) raise the uniform price as 𝜆 increases. However, in our model t he uninformed 14 In Lemma A2 of the Appendix we show that 𝜌 𝐻 ≤ 1 2 (1−√𝛼 𝐿 /𝛼 𝐻 ) implies 𝜆 𝐻𝐸 1 < 𝜆 𝐻𝐸 2 ∈(0,1) exist. 15 In the proof of Proposition 2 we show that 𝜌 𝐻 ≤ 1 2 (1−√𝛼 𝐿 /𝛼 𝐻 ) implies that 𝜆 𝐻𝐿 ∈(0,1) . 26 firm may prefer to lower the uniform price when there are more H-types. To understand this result, note first that 𝜌 𝐻 , the probability for H-types to be uninformed, is independent of 𝜆 . However, uninformed consumers ’ estimate 𝑒 is an increasing function of 𝜆 , since the expectation of value is higher if there are more H-types. Therefore, the firm may prefer to lower the uniform price from 𝛼 𝐻 to 𝑒 to exploit this increased over-estimation from uninformed L-types. Figure 2 depicts the optimal uniform price as a function of 𝜆 and 𝜌 𝐻 . 16 Figure 2: Uniform Pricing (𝜶 𝑯 =𝟓 𝜶 𝑳 ,𝒘 =𝟎 .𝟑 ) Note: The boundary curves in Figure 2 are solved in Proposition 3 The curve between the light and darker region is 𝜆 𝐿𝐸 , the iso-profit curve along which 𝛱 𝑈 (𝑒 ,𝜆 )=𝛱 𝑈 (𝛼 𝐿 ,𝜆 ) Similarly, 𝜆 𝐻𝐿 is the iso- profit curve along which 𝛱 𝑈 (𝛼 𝐻 ,𝜆 )=𝛱 𝑈 (𝛼 𝐿 ,𝜆 ) ; and 𝜆 𝐻𝐸 1 and 𝜆 𝐻𝐸 2 are the two iso-profit curves along which 𝛱 𝑈 (𝛼 𝐻 ,𝜆 )=𝛱 𝑈 (𝑒 ,𝜆 ) The roots exist if and only if 𝜌 𝐻 ≤ 1 2 (1−√𝛼 𝐿 /𝛼 𝐻 ) 5.2 When is Having Consumer Data Profitable? We compare the firm’s profits from the equilibrium in Proposition 1 and Proposition 3. That is, we examin e whether data collection improves the firm’s profit by facilitating price discrimination with superior information relative to no data collection and uniform pricing. 16 The numerical example is 𝜌 𝐻 =𝜌 𝐿 =0.1; 𝛼 𝐻 =2𝛼 𝐿 =2. We can solve that 𝑒 =1+𝜆 ; Π(𝛼 𝐻 ;𝜆 )= 1.8𝜆 ; Π 𝑈 (𝑒 ;𝜆 )=(0.9𝜆 +0.1)(1+𝜆 ) ; Π 𝑈 (𝛼 𝐿 ;𝜆 )=1. Note: The numerical values are 𝜌 𝐻 =𝜌 𝐿 =0.1; 𝛼 𝐻 =2𝛼 𝐿 =2. 
Note that the list price is reduced when $\lambda > \lambda_{HE2} \approx 0.75$.

Conventional wisdom suggests that acquiring the ability to discriminate in price always improves the firm’s profit. Therefore, a monopoly always prefers personalized pricing under the standard assumption. This need not be the case with superior information. Specifically, the firm prefers not to price discriminate at least under some conditions.

Before the firm decides whether to collect data on consumers, it solves the expected profit given the distribution of the market states. From the analysis above, the expected profit for the informed firm is
$$\Pi_I^* = [\lambda^2\alpha_H + (1 - \lambda^2)\alpha_L]\rho_H + [\lambda\alpha_H + (1 - \lambda)\alpha_L](1 - \rho_H) = \alpha_L + \lambda(1 - \rho_H + \lambda\rho_H)(\alpha_H - \alpha_L).$$
From Proposition 3, the expected profit for the uninformed firm is
$$\Pi_U^* = \max\{\Pi_U(\alpha_L), \Pi_U(e), \Pi_U(\alpha_H)\},$$
where we drop the $\lambda$ from the arguments of $\Pi_U$. Note that, for all $\lambda$, $\Pi_I^* > \lambda(1 - \rho_H)\alpha_H = \Pi_U(\alpha_H)$. That is, acquiring consumer data and price discriminating is always more profitable than skimming the high end of the market with a uniform price, $\alpha_H$. Furthermore, since $\rho_H \in (0,1)$, we have $\Pi_I^* > \alpha_L = \Pi_U(\alpha_L)$. Again, observing consumer data and price discriminating is more profitable for the firm than serving the entire market with a low uniform price, $\alpha_L$. Thus, the firm prefers uniform pricing over price discrimination only if it can exploit L-types’ over-estimation by charging $p_U^* = e$. To assess the precise condition, we only need to compare $\Pi_I^*$ with $\Pi_U(e)$. Proposition 4 suggests that the firm prefers no price discrimination if and only if $\rho_H$, the probability of H-types being uninformed, exceeds a threshold.

Proposition 4: Let
$$\rho_1 \equiv \frac{w[(\alpha_H - \alpha_L)\lambda + (1 - \lambda)\alpha_L] + \lambda\alpha_L}{(\alpha_H - \alpha_L)\lambda^2 + \lambda w[(1 - \lambda)(\alpha_H - \alpha_L) + \alpha_H] + w^2\alpha_L(1 - \lambda)} > 0.$$
(i) If $\rho_H \ge \rho_1$, then it is unprofitable for the firm to collect data on consumers. The firm will set a uniform price $p_U^* = e$.
(ii) Otherwise, if 𝜌 𝐻 <𝜌 1 , then it is profitable for the firm to collect data and price discriminate according to Proposition 1. 28 Collecting consumer data can be unprofitable because it forces the firm to distort prices from the full information optimum to signal the market state – and therefore inform consumers of their types. As in most signaling models, such dist ortion reveals the firm’s signaling costs. This cost is increasing in 𝜌 𝐻 : As the probability of H-types being uninformed increases, the firm must compromise larger margins on H-types to convince them of their high value. But, while the informed firm ’s profit is decreasing in 𝜌 𝐻 , the uninformed firm’s profit Π 𝑈 (𝑒 ) is increasing in 𝜌 𝐻 , because uninformed consumers raise their estimates of value when 𝜌 𝐻 increases. As such, the firm prefers not observing consumer data when 𝜌 𝐻 >𝜌 1 to avoid costly signaling. The reverse intuition holds when 𝜌 𝐻 <𝜌 1 . Figure 3: Equilibrium Region for Price Discrimination (𝜶 𝑯 =𝟓 𝜶 𝑳 ,𝑤 =𝟎 .𝟑 ) Note: The boundary curve in Figure 3 is solved in Proposition 4 as 𝜌 𝐻 =𝜌 1 , which is the iso-profit curve along which Π 𝐼 =Π 𝑈 (𝑒 ) 𝑤 ≡ 𝛽 𝐻 𝛽 𝐿 . Finally, it is instructive to consider the comparative statics of profits with respect to 𝜆 . Both the uninformed and informed firms benefit from a larger portion of H-types. The rate of increase, however, is different. For the price discriminat ing firm, increasing 𝜆 raises only the uninformed H-types’ personalized price. By contrast, for the uninformed 29 firm, increasing 𝜆 raises all uninformed consumers’ price, 𝑒 . Therefore, increasing 𝜆 increases Π 𝑈 (𝑒 ) at a faster rate than Π 𝐼 ∗ . Figure 3 depicts the region in which price discrimination is profitable for the firm, as implied by Proposition 4. The impact just described is reflected by the fact that the boundary line is downward sloping. 6. 
Welfare Implications In this section, we examine the e ffect of price discrimination on consumer surplus and total social welfare. Proposition 1 quickly establishes that price discrimination never reduces the total social welfare because the informed firm serves every consumer in equilibrium. In addition, it i s immediate from Proposition 4 that the firm collects data if and only if 𝜌 𝐻 <𝜌 1 . Therefore, we can also assess whether equilibrium price discrimination is a Pareto improvement by examining its impact on consumer surplus, and whether the improvement is strong or weak. When the firm is informed and price discriminates, t he average consumer surplus is 𝐶 𝑆 𝐼 =𝜆 𝜌 𝐻 (1−𝜆 )(𝛼 𝐻 −𝛼 𝐿 ) . When the uninformed firm charges the list price 𝑝 𝑈 ∗ , we denote the average consumer surplus as 𝐶 𝑆 𝑈 (𝑝 𝑈 ∗ ) : 𝐶 𝑆 𝑈 (𝛼 𝐻 )=0; 𝐶 𝑆 𝑈 (𝑒 )= 𝜆 (1−𝜆 )(𝛼 𝐻 −𝛼 𝐿 )𝑤 𝜆 +𝑤 (1−𝜆 ) (1−𝜌 𝐻 ) ; and 𝐶 𝑆 𝑈 (𝛼 𝐿 )=𝜆 (𝛼 𝐻 −𝛼 𝐿 ) . From Proposition 4, the firm always prefers personalized pricing to uniform pricing when 𝑝 𝑈 ∗ =𝛼 𝐻 or 𝑝 𝑈 ∗ =𝛼 𝐿 . In addition, direct comparison of the 𝐶 𝑆 𝑈 (𝑝 𝑈 ∗ ) suggests the ordering 𝐶 𝑆 𝑈 (𝛼 𝐻 )<𝐶 𝑆 𝐼 <𝐶 𝑆 𝑈 (𝛼 𝐿 ) . Therefore, price discrimination improves both average consumer surplus and the firm’s profit when 𝑝 𝑈 ∗ =𝛼 𝐻 . In this case, price discrimination is a weak Pareto improvement, because L-types’ surplus remains zero as they are either unserved by the uninformed firm or fully-exploited by the discriminating firm. This intuition is consistent with the conventional wisdom that price discrimination 30 may help consumers on average only if the total demand increases (Varian 1985, Bergemann et al. 2015). Now suppose the uninformed firm charges a uniform price 𝑝 𝑈 ∗ =𝑒 . Collecting data is profitable if and only if 𝜌 𝐻 <𝜌 1 by Proposition 4. This raises the average consumer surplus if and only if 𝐶 𝑆 𝐼 >𝐶 𝑆 𝑈 (𝑒 ) , which is equivalent to 𝜌 𝐻 > 𝑤 𝜆 +𝑤 (2−𝜆 ) ≡𝜌 2 . 
Larger $\rho_H$ implies (i) a higher uniform price $e$ from the uninformed firm, thereby reducing $CS_U(e)$, as well as (ii) a greater price distortion $p_H^* < \alpha_H$ to signal to H-types, thereby raising $CS_I$. Therefore, when $p_U^* = e$ and $\rho_H \in (\rho_2, \rho_1)$, price discrimination benefits both average consumers and the firm, and thus raises overall economic welfare.

Further, we can establish conditions when collecting data strictly improves every consumer’s surplus. Suppose again that $p_U^* = e$. Uninformed L-types always benefit from the price discrimination, because they avoid over-paying ($e > \alpha_L$). Therefore L-types have strictly higher expected surplus under price discrimination. Uninformed H-types may also be better off if $p_H^* = E[\alpha] < e$. But since informed H-types are worse off by paying a higher price $\alpha_H$, we need to examine whether price discrimination may improve H-types’ expected surplus. Denote each H-type’s expected surplus with price discrimination as $CS_{IH}$, and that with the uninformed firm who charges the list price $p_U^* = e$ as $CS_{UH}(e)$. We have
$CS_{IH} = \rho_H(1 - \lambda)(\alpha_H - \alpha_L)$; $CS_{UH}(e) = \alpha_H - e$.
The condition $CS_{IH} > CS_{UH}(e)$ is equivalent to
$$\rho_H > \frac{w}{\lambda + w(1 - \lambda)} \equiv \rho_3 \in (\rho_2, \rho_1).$$
Therefore, if $\rho_H \in (\rho_3, \rho_1)$, price discrimination strictly raises the expected surpluses of both types as well as the firm’s profit, making it a strict Pareto improvement. Otherwise, if $\rho_H \in (\rho_2, \rho_3]$, price discrimination harms H-types but still benefits both the consumers on average and the firm.¹⁷ Proposition 5 formalizes the above results:

Proposition 5: Price discrimination
(i) is a weak Pareto improvement, if $p_U^* = \alpha_H$;
(ii) is a strict Pareto improvement, if $p_U^* = e$ and $\rho_H \in (\rho_3, \rho_1)$;
(iii) harms H-types but benefits consumers on average, if $p_U^* = e$ and $\rho_H \in (\rho_2, \rho_3]$,
where $\rho_1$ is given in Proposition 4, $\rho_2 \equiv \frac{w}{\lambda + w(2 - \lambda)}$ and $\rho_3 \equiv \frac{w}{\lambda + w(1 - \lambda)}$.

Figure 4 illustrates the conditions described in Proposition 5.
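The two surplus thresholds in Proposition 5 can be checked against the surplus expressions they are derived from. The Python sketch below codes the formulas for the average and H-type surpluses as stated in this section and confirms that each pair is equal exactly at its threshold. The function names are assumptions for illustration; the parameter values (alpha_H = 5, alpha_L = 1, w = 0.4, lambda = 0.5) follow the numerical example in footnote 17, with alpha_L normalized to 1 as an additional assumption.

```python
def rho2(lam, w):
    """Threshold above which price discrimination raises average consumer surplus."""
    return w / (lam + w * (2 - lam))


def rho3(lam, w):
    """Threshold above which price discrimination raises H-types' expected surplus."""
    return w / (lam + w * (1 - lam))


def cs_informed(lam, rho_H, alpha_H, alpha_L):
    """Average consumer surplus CS_I under price discrimination."""
    return lam * rho_H * (1 - lam) * (alpha_H - alpha_L)


def cs_uniform_e(lam, rho_H, w, alpha_H, alpha_L):
    """Average consumer surplus CS_U(e) under the uniform price e."""
    return lam * (1 - lam) * (alpha_H - alpha_L) * w / (lam + w * (1 - lam)) * (1 - rho_H)


def cs_H_informed(lam, rho_H, alpha_H, alpha_L):
    """H-type expected surplus CS_IH under price discrimination."""
    return rho_H * (1 - lam) * (alpha_H - alpha_L)


def cs_H_uniform_e(lam, w, alpha_H, alpha_L):
    """H-type surplus CS_UH(e) = alpha_H - e under the uniform price e."""
    e = alpha_L + lam * (alpha_H - alpha_L) / (lam + (1 - lam) * w)
    return alpha_H - e


lam, w, aH, aL = 0.5, 0.4, 5.0, 1.0
r2, r3 = rho2(lam, w), rho3(lam, w)
print(r2, cs_informed(lam, r2, aH, aL), cs_uniform_e(lam, r2, w, aH, aL))
print(r3, cs_H_informed(lam, r3, aH, aL), cs_H_uniform_e(lam, w, aH, aL))
```

Each printed line should show the threshold followed by two matching surplus values, which is what the equivalences in the text assert.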
The conditions in part (i) of the proposition correspond to the triangular section in the lower-center portion of Figure 4 (Region V) and that of Figure 2. The conditions in part (ii) of Proposition 5 correspond to the second upper region in Figure 4 (Region II). In this region, the uninformed firm is charging the uniform price 𝑝 𝑈 ∗ =𝑒 . On the consumer-side, there is an increase in consumer surplus stemming from the benefit to uninformed L-types who no longer overpay. Price discrimination also imposes a strict gain in states where uninformed H-types pay the informed firm a price lower than 𝑒 . The intuition is that when 𝜌 𝐻 is sufficiently high, i.e., 𝜌 𝐻 >𝜌 3 , then the signaling cost is so high (but not unprofitably so) that H-types need a low-enough price to be convinced of their type. In this case, the gain of H-types when they are uninformed more than compensates the loss when they are informed, thus price discrimination raises H-types’ expected surplus. Otherwise if 𝜌 𝐻 ∈ (𝜌 2 ,𝜌 3 ], H-types are worse-off under price discrimination, but their loss is less than L- types’ gain, thus price discrimination still raises consumer surplus on average. 17 It remains to show that there exists a feasible parameter space in which 𝜌 3 <𝜌 1 . This can be verified by a numerical example: 𝛼 𝐻 /𝛼 𝐿 =5, 𝑤 =0.4, 𝜆 =0.5, then 𝜌 3 =4/7≤𝜌 1 =75/124. 
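The numerical example in footnote 17 can be reproduced exactly with rational arithmetic. The Python sketch below uses the standard-library `fractions` module, codes the threshold from Proposition 4 and the two profit expressions it equates, and takes rho_L = w * rho_H from the definition of w. Function names are assumptions for illustration, and alpha_L = 1 is a normalization added here.

```python
from fractions import Fraction


def profit_informed(lam, rho_H, alpha_H, alpha_L):
    """Expected profit of the informed firm, Pi_I*."""
    return alpha_L + lam * (1 - rho_H + lam * rho_H) * (alpha_H - alpha_L)


def profit_uniform_e(lam, rho_H, w, alpha_H, alpha_L):
    """Profit Pi_U(e) of the uninformed firm at the uniform price e (rho_L = w * rho_H)."""
    e = alpha_L + lam * (alpha_H - alpha_L) / (lam + (1 - lam) * w)
    return (lam + (1 - lam) * w * rho_H) * e


def rho1(lam, w, alpha_H, alpha_L):
    """Threshold rho_1 from Proposition 4."""
    d = alpha_H - alpha_L
    num = w * (d * lam + (1 - lam) * alpha_L) + lam * alpha_L
    den = d * lam**2 + lam * w * ((1 - lam) * d + alpha_H) + w**2 * alpha_L * (1 - lam)
    return num / den


def rho3(lam, w):
    """Threshold rho_3 from Proposition 5."""
    return w / (lam + w * (1 - lam))


lam, w = Fraction(1, 2), Fraction(2, 5)
alpha_H, alpha_L = Fraction(5), Fraction(1)
r1 = rho1(lam, w, alpha_H, alpha_L)
r3 = rho3(lam, w)
print(r3, r1)  # prints: 4/7 75/124
print(profit_informed(lam, r1, alpha_H, alpha_L) == profit_uniform_e(lam, r1, w, alpha_H, alpha_L))
```

Exact fractions avoid rounding noise, so the equality of the two profits at rho_H = rho_1 can be tested with `==` rather than a tolerance.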
Figure 4: Welfare Impacts of Price Discrimination ($\alpha_H = 5\alpha_L$, $w = 0.3$)
Note: The boundary curves in Figure 4 are solved in Proposition 5. Price discrimination helps both the firm and average consumers in Regions II, III and V. The upper boundary curve of Region II is $\rho_H = \rho_1$, which is the iso-profit curve under which $E(\Pi_c) = \Pi_E$ in Figure 3; the left boundary curve is $\lambda = \lambda_{LE}$ in Figure 2; and the lower boundary curve on Region II is $\rho_H = \rho_3$, which is the iso-surplus curve under which $CS_{IH} = CS_{UH}(e)$. The lower boundary curve on Region III is $\rho_H = \rho_2$, which is the iso-surplus curve under which $CS_U(e) = CS_I$. The boundary curves on Region V are $\lambda = \max\{\lambda_{HE1}, \lambda_{HL}\}$ and $\lambda = \lambda_{HE2}$ from Figure 2.

In summary, Proposition 5 implies three distinct economic gains from price discrimination. First, uninformed H-types who may undervalue their reservation value are served under price discrimination at a lower price, due to the firm’s signaling cost. Second, uninformed L-types, who overpay and obtain negative ex-post surplus with uniform pricing, obtain strictly more surplus (zero) with price discrimination. Third, informed L-types, who may be unserved with uniform pricing, will be profitably served due to price discrimination.

These gains are obtainable only when the discriminating firm (1) collects data from uninformed consumers and (2) credibly signals through the pricing scheme. Correspondingly, Proposition 5 implies that consumers can be worse off when a third-party regulator requires that either (1) the firm is prohibited from collecting data on consumers or (2) the firm must disclose the collected data to the consumers. Because market forces alone may enable uninformed consumers to correctly infer their type, mandatory disclosure of data only harms consumers by allowing the firm to avoid the signaling cost and fully exploit their willingness to pay.
This stands in contrast to policies, such as RACAP, that advocate that firms reveal the private information they collect to consumers for the sake of consumer welfare (Kamenica et al., 2011).

7. Conclusion

This work sought to understand the implication of superior information and the role of data collection on the firm’s ability to price discriminate among uninformed consumers. While it is often assumed that consumers have perfect knowledge about their willingness to pay, there are many situations in which they have some degree of uncertainty about their value for a product or service (e.g. insurance). Furthermore, in the digital era, firms have access to a wealth of consumer data, which, through aggregated analysis, can enable the firm to learn about individual consumers better than consumers themselves do. As we demonstrated, this situation creates a challenge for a price discriminating monopolist: How to price discriminate when consumers are less informed of their value? In answering this question, we found that the presence of superior information and uninformed consumers challenges the conventional wisdom about monopolistic price discrimination.

First, we showed that even if the firm has perfect knowledge about consumers’ valuations, it may not be able to extract the entire surplus. Particularly, if consumers know that the firm has superior information, then consumers anticipate the possible exploitation by the firm. Therefore, the firm must signal to consumers their value through its pricing scheme to facilitate exploitation. As is the case in signaling models, the sender must expend costs to convince the receivers.

Second, we showed that the ability to price discriminate, through the collection of consumer data, may not always benefit the firm. We compared equilibrium pricing with and without price discrimination.
When the firm commits no data collection and is restricted to uniform pricing, consumers are not suspicious of being overcharged and thus would not be adversely affected by the firm’s pricing. In addition, the firm can exploit consumers’ (rational) over-estimation of their value. Indeed, we show that the firm ’s uniform price may decrease when the distribution of consumers shifts toward those with higher valuation. If the firm can exploit this form of overpaying with a uniform price, then it may be better off by committing not to collect data for price discrimination. Third, we demonstrated that both consumer surplus and economic welfare can increase with price discrimination. A firm collects data only if it is profitable to do so. What is not so clear is how that can also benefit consumers. We found two mutually exclusive conditions under which this happens. First, if the firm can better exploit the high end of the market through price discrimination, then it may be willing to give up the opportunity to over-charge the low end of the market with a uniform price. Second, the firm can expand the market by serving the low-end of the market via discriminatory prices, as per the usual benefit of price discrimination. However, because of uninformed preferences, the firm incurs a signaling cost to the high-end of the market, which generates positive consumer surplus. These results may have implications for the debate about consumer privacy and the collection of consumer’s personal data. As we showed, through the collection of consumer data, the firm acquires more information than the individual sum of consumer ’s private data. That is, it is through aggregation that the firm assembles its own informational advantage, which then becomes private to the firm. As a result, despite its purpose of price discrimination, data collection is a process of not only one -way information transmission but also knowledge creation and bilateral communication. 
The additional knowledge from data collection helps the firm, not only learn the consumer types, but also whether consumers are informed of their types. Therefore, the definition of private consumer information may be broader than typically assumed: consumers’ private information should 35 include not only their private value but also their private knowledge (prior belief). If consumers are rational, then the firm can inform them of their type. And, as we showed, this information can be communicated through personalized pricing schemes. 36 CHAPTER TWO: PRODUCT LINE DESIGN WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES 1. Introduction The development of modern data technologies has enabled marketers to observe vast information about consumers’ usage records and behavior patterns. For instance, websites and smartphone app developers collect information on consumers’ preferences for digital products and services. Video streaming companies record consumers’ viewing histories and ratings. Even traditional firms are adopting new technologies to better analyze their customers and fine tune their products and prices. Disney, for instance, scans the shoes of visitors to track walking patterns around its theme parks. 18 The Absolute vodka manufacturer, Pernod Ricard USA, uses behavioral data to understand consumers’ preferences beyond what consumers understood themselves. 19 As the data technologies improve, firms have the statistical power to learn superior information about consumers’ preferences beyond consumers’ own knowledge. Whereas consumers observe only their own usage experiences, firms can observe more consumers and aggregate their data to isolate environmental noises and make better inferences of consumers’ intrinsic preferences. Marketers argue that these methods help them serve consumers better by offering more relevant products. 
Consumer protection advocates, by contrast, raise worries that these methods subject consumers to exploitation through better price discrimination.²⁰ In this research, we attempt to shed light on this debate by asking the following questions: How does a firm design its product line using the superior information on consumer preferences? Can consumers benefit from the products designed with superior information? Under what conditions should consumers avoid sharing their data (i.e. opt out)? Is learning superior information ever unprofitable?

¹⁸ See “Look who's walking: Disney wants to track park visitors by scanning their feet,” the International Business Times, July 29, 2016.
¹⁹ See “Anthropology Inc.,” the Atlantic, March 25, 2013.
²⁰ See “Big Data & Differential Pricing,” February 2015, from the U.S. President’s Council of Economic Advisers.

This research focuses on inferences from aggregated anonymous consumer data, which is used by marketing researchers to estimate the distribution of individual level variables such as brand awareness or intrinsic product needs. These variables are especially important for marketers when consumers’ valuation of a product or service is imperfect due to environmental noises. We first show how a firm may learn beyond consumers’ prior knowledge with the data that are imperfect indicators of consumers’ preferences. We then examine the product line design strategies with the superior information and show how they differ from the standard case when consumers are perfectly informed of their preferences. Finally, we analyze whether the superior information benefits consumers and the firm.

Previous research on data collection suggests that consumers are typically exploited when a firm has access to information on their willingness to pay. Therefore, without further inducements, consumers should always “opt-out” of data sharing initiatives (e.g. Acquisti & Varian, 2005, Fudenberg & Villas-Boas, 2004).
By contrast, we show that when data aggregation creates superior information, consumers may sometimes benefit from sharing their data and therefore “opt in.” We identify two potential benefits: first, some consumers obtain better-suited products and greater surplus despite the firm’s improved pricing abilities; second, consumers may avoid over-purchasing by observing the offered product menu and inferring their true type. In other cases, consumers should opt out to retain their information rent.

We build an analytical model of a monopoly firm that specifies a vertically differentiated product line, akin to the standard models of Mussa & Rosen (1978) and Moorthy (1985). The firm faces rational consumers of two types (H and L), which differ in their marginal utility for product quality; we call this marginal utility the “consumer value.” Each consumer receives a private signal from her previous experience. The signals, which we call “consumer data,” are either “clean” indicators (H or L) or uninformative “noise” (∅) of her value, depending on the uncertain market state. If a consumer receives a clean signal, then we call her an “informed consumer”; otherwise, she is an “uninformed consumer.”

Before receiving the signal, consumers decide whether the firm can access their data by either opting in or opting out. If consumers opt in, then the firm has access to consumer data. By collecting data on both types of consumers, the firm can infer the market state and incorporate this information into the design of its product line. The correlation between consumer data and the market state is shown to play an important role in the incentives for data collection and opt-in decisions.

Consider, for example, an online streaming service provider that offers paid subscription options, which differ in the level of access to movies.[21] Users can observe the price and features of each subscription option and even use a free, one-month trial period.
Nevertheless, they may still be a priori uncertain of their true valuation of the service and may misestimate their needs. For example, their usage experience from the free one-month trial period could be subject to noise and therefore be an imperfect indicator of their true value. Noise in their usage experience could be caused by external market factors, such as a recent spate of strong or weak productions. If there were unusually many interesting movie productions recently (a good state), then an L-type user may overestimate her needs and buy a high-end subscription. Similarly, in a bad state, an H-type user may underestimate her needs and subscribe at a lower level of access. By contrast, the streaming service provider can observe and compare multiple users’ data, such as users’ choices of movies as well as the time and location of each view. By aggregating the data across users, the firm may control for the environmental factors and thus obtain better estimates of users’ intrinsic values, which is superior information beyond the uninformed consumers’ prior knowledge.

[21] Some service providers, such as Netflix, differentiate the access levels by allowing users to watch videos on either a single device (such as a laptop) or multiple devices (such as a laptop and a TV).

But the informational advantage from data aggregation is not always beneficial for the firm. If a consumer is rational, she should be suspicious that a firm with superior information may try to take advantage of her uninformed preferences by tricking her into overestimating her value and paying for a more expensive product. As a result, to convince the uninformed H-types to buy a more expensive product, the firm may have to distort the product line in a particular way. That is, to credibly convince the consumer of her type, the firm incurs an endogenous profit loss as a signaling cost (e.g., Milgrom & Weber, 1985), which we show can be beneficial for H-types.
Furthermore, the signaling costs that arise can be so large that the firm would prefer to avoid consumer suspicion by not collecting the data that create superior information.

Without access to consumer data, the firm estimates the distribution of consumer beliefs as if it faces three segments of consumers: informed H-types, informed L-types, and uninformed consumers. Therefore, the firm may provide up to three vertically differentiated products. By offering a “medium”-quality product for the uninformed consumers, the firm can have uninformed L-types overpay in some situations. This turns out to be a silver lining for the firm that does not have access to the data, because consumers cannot be suspicious of being tricked. The downside of offering a long product line with three products is that it induces strong cannibalization concerns. In particular, the firm must maintain additional incentive compatibility constraints to prevent an H-type from buying cheaper alternatives.

Several insights result from our analysis. First, we find that consumers can reap net benefits from opting in. Even though the firm may obtain superior information about consumers through data aggregation, the additional information may confound the firm’s ability to extract surplus from consumers. Specifically, when the firm learns that H-types are uninformed, it must offer a significant price discount on the high-quality product to convince H-types that they are not being tricked into overpaying. This signaling “discount” on the high-quality product raises the incentive for uninformed H-types to opt in. Furthermore, opting in benefits the uninformed L-types ex post, because they can avoid overpaying for products that exceed their needs.

A silver lining of the signaling requirement for the firm is that, by relaxing the standard incentive compatibility constraint, the firm can “undistort” the lower-quality product to improve its profit from serving L-types.
In the classic product line model (Mussa & Rosen, 1978), the firm distorts the low-quality product downward to mitigate cannibalization. However, with uninformed consumer preferences, the firm’s incentive compatibility constraint can be outweighed by its signaling constraint. We show that data collection may reverse the quality distortion caused by adverse selection and restore efficiency to the product line: efficient products may be provided to both consumer types. This result extends the product line design literature (Mussa & Rosen, 1978; Moorthy, 1984; Villas-Boas, 1998, 2004; Desai, 2001), which emphasizes the downward distortion of low-quality products from the first-best level to deter “defection” (i.e., cannibalization) by H-types.

In addition, our findings contrast with the classic product line design literature with respect to the welfare implications of second-degree price discrimination. First, Mussa & Rosen (1978) suggest that second-degree price discrimination by a monopolist is always profitable. Stokey (1979) and Anderson & Dana (2009) show that a monopolist may forgo product versioning only if there exist constraints on technology capacities or product qualities. By contrast, our research suggests that even without these constraints, the monopolist may find it unprofitable to enhance its ability to price discriminate. Second, although the existing models suggest that endowing the firm with the ability to price discriminate can lead to a Pareto improvement (Varian, 1985; Anderson & Dana, 2009), this result occurs only under the condition that the firm serves more consumers as a result of price discrimination. The existing models also suggest that the Pareto improvement is always weak, leaving at least some consumers with no additional surplus. By contrast, our research shows that second-degree price discrimination with superior information can yield a strict Pareto improvement, both ex ante and ex post, even without output expansion.
Our research also extends the literature on consumer preference discovery (Wernerfelt, 1995; Kamenica, 2008; Guo & Zhang, 2012) by demonstrating that uninformed consumers may infer their type from product lines. Specifically, Guo & Zhang (2012) consider consumer deliberation as endogenous effort with exogenous costs. By contrast, our research examines deliberation as rational inference and belief updating, in which the informed firm’s product line becomes a credible message in equilibrium,[22] even though in principle the firm can manipulate consumers’ beliefs. In addition, our model requires fewer restrictions on consumers’ prior knowledge. For example, Wernerfelt (1995) and Kamenica (2008) assume that consumers are uninformed of the absolute value of their willingness to pay but know their relative ranking in the market ex ante. By contrast, we assume that an uninformed consumer may not know her type (her ranking in the market). Furthermore, our model of uninformed preferences provides a micro-foundation for how data aggregation can create superior information.

Finally, our research offers two different insights to the recent discussions about consumers’ private information (Taylor, 2004; Acquisti & Varian, 2005; Calzolari & Pavan, 2005). First, the literature focuses on price exploitation rather than product design, suggesting that data collection strictly reduces consumers’ average welfare by depriving consumers of their information rent. This result holds only when consumers know their preferences ex ante, which implies that the only purpose of data collection is to identify consumers for price exploitation. But when superior information is possible, data collection can also help the firm design products that better fit consumers’ preferences. Therefore, the tradeoff between product design and price discrimination requires formal examination (Varian, 2002).
This research shows that data collection can benefit consumers and improve the efficiency of product line design. Second, unlike the literature that treats data collection as a one-way information transmission, this research demonstrates that data collection may become a two-way information exchange (Samuelson, 1999; Solove, 2007): first, the firm learns from consumers by observing their data; then it processes the data to create superior preference information; finally, it credibly shares the information with the uninformed consumers. The role of data collection in helping uninformed consumers learn their preferences is not seen in the earlier literature.

[22] It is both more profitable for the firm and more socially efficient to signal through a product line than through other signaling strategies that use dissipative spending, such as advertising (Milgrom & Roberts, 1986; Kihlstrom & Riordan, 1984). The proof is available upon request.

2. The Model

The model consists of a monopoly firm selling vertically differentiated products to consumers of two types. The starting point for this model is found in Mussa & Rosen (1978) and Moorthy (1985). We depart from the classic model by assuming that consumers may not have perfect knowledge of their type. Consumers observe private signals from their previous experiences, which only imperfectly reflect their intrinsic preferences. In this section, we present the basic structure of consumer preferences, consumer data, and the corresponding inferences by consumers and the firm.

There are two types of consumers with a total mass of one. Let \(\alpha_i\) be consumer \(i\)’s intrinsic value for marginal product quality, which we call the “consumer value.” A fraction \(\lambda\) of consumers are H-types and have a value of \(\alpha_H\), while the others are L-types and have a lower value \(\alpha_L < \alpha_H\).
Instead of observing her value \(\alpha_i\) ex ante, consumer \(i\) observes a private noisy signal \(s_i\), which we call the “consumer data.” The signals are correlated with an unobserved common random variable, which we call the “market state,” \(m \in \{g, b\}\), where “\(g\)” refers to “good” and “\(b\)” refers to “bad.” Consumers observe only their own signals. Specifically, consumer \(i\) in market state \(m\) observes either a clean signal, \(s_i = \alpha_i\), with probability \(\beta_i^m\), or an uninformative noise, which we denote by \(s_i = \emptyset\).[23]

[23] Though consumers in this model are fully rational, the preference structure is consistent with recent psychological views on inherent preference (Simonson, 2008), in which consumer preference is well-defined and has both a stable, time-invariant component and an unstable, time-variant construction factor. Because consumers may not identify the influence of construction factors, it is difficult for them to retrieve their inherent preference and predict how their future valuation may change. For simplicity, we restrict the construction factors to be market forces that uniformly affect all consumers.

The firm, at the start of the game, can ask consumers to share their data, \(\{s_i\}_{i=H,L}\). If data are collected, then the firm may use the data to design a product line that maximizes its profit. If data are not collected, then it designs products based on its prior knowledge about consumers’ beliefs. The products are differentiated in quality \(q_k\) and carry a corresponding price \(p_k\), where \(k\) refers to the quality rank of the product. We denote the order of quality as \(q_1 > q_2 > q_3 > \cdots\), without loss of generality. For each unit of product \(k\), the firm incurs a production cost of \(\frac{1}{2} q_k^2\).

All consumers observe the product line (the price and quality of all available products) without ambiguity and may choose at most one product to purchase. Specifically, consumer \(i\) may purchase a product \(k\) and obtain the utility \(U_i(q_k, p_k) = \alpha_i q_k - p_k\). Because it is possible that she observes an uninformative noise and consequently is uninformed of \(\alpha_i\), she may choose a less relevant product that does not maximize her utility ex post.

The correlational structure \(\beta_i^m\) needs to satisfy certain conditions to ensure that (1) uninformed consumer preferences may exist, and (2) the firm may learn superior information by collecting data in aggregate. Without loss of generality, we assume a stylized market scenario: \(\beta_H^g = \beta_L^b = 1\) and \(\beta_H^b = \beta_L^g = 0\). In this scenario, either H-types or L-types are uninformed. Data aggregation thus enables the firm to observe \(\{H, \emptyset\}\) in a good state or \(\{\emptyset, L\}\) in a bad state, so the firm can deduce the market state and infer the type of the uninformed consumers, which is superior information that exceeds the uninformed consumers’ prior knowledge. The superior information from data aggregation is valuable for both the firm and uninformed consumers, because it affects the design of the product line and consequently consumers’ purchase decisions.

In Section 6, we relax the second condition above and examine another correlational structure: \(\beta_H^g = \beta_L^g = 1\) and \(\beta_H^b = \beta_L^b = 0\), in which data aggregation enables the firm only to keep up with what consumers learn about their type but not to go beyond any consumer’s knowledge. Therefore, this market scenario serves as a benchmark for analyzing the welfare implications when data aggregation creates superior information in the main model.

The game unfolds over three periods. In the first period, the firm decides whether to collect the data \(\{s_i\}\) on consumers, and consumers decide whether to opt in or opt out. In the second period, nature draws the market state \(m \in \{g, b\}\) with equal probabilities, consumer \(i\) learns \(s_i\), and the firm designs its product line \(\{q_k, p_k\}_{k=1,2,3,\ldots}\). In the third period, each consumer observes the product line and makes a purchase decision.

3.
A Numerical Example

In this section, we use a numerical example to illustrate the problem of designing products when the market has consumers who do not perfectly know their value. This stylized example illustrates the informed firm’s dilemma in exploiting consumers with superior information.

Consider again a movie streaming service provider that offers a menu of subscription plans, which differ in the level of access. Lower-end plans are more restrictive in the number of movies that a consumer can watch in the subscription period, while higher-end plans permit more viewing (e.g., they allow for streaming on multiple digital devices or access to exclusive videos). For convenience, we assume the subscription plans limit the maximum number of movies each consumer can watch. The marginal cost of allowing more consumers to watch more movies is not trivial, because the firm needs to maintain a larger movie library and pay more licensing fees. We assume that the minimum size of the movie library is a linear function of the number of consumers and a quadratic function of their access levels, so that the cost structure in Section 2 applies in this example.

There are two segments of consumers with a total mass of one. Suppose that the two segments are of equal size, H-types are willing to pay $8 per movie, and L-types are willing to pay $4 per movie, i.e., \(\lambda = 0.5\), \(\alpha_H = 8\), \(\alpha_L = 4\). If a consumer is unsure about her type, then she may estimate her value as $6 per movie. Nature randomly chooses either a good or a bad market state with equal probability. In the good state, L-types are uninformed; in the bad state, H-types are uninformed.

Suppose that data are not collected; then the firm cannot learn the market state. The estimated distribution of consumer beliefs based on the prior is that a quarter of consumers are informed H-types, a quarter are informed L-types, and half are uninformed.
We obtain the following optimal strategy: the firm should skip the informed L-types and serve only the informed H-types and uninformed consumers, offering a menu of two differentiated subscriptions: a “premium” plan that allows eight movies for $54, and a “standard” plan that allows five movies for $30. This menu reflects the classic solution, with “no distortion at the top” (\(q_1 = \alpha_H = 8\), \(q_2 = 5 < E[\alpha] = 6\), and \(q_3 = 0 < \alpha_L\)), and the expected profit is \(\frac{54 - 8^2/2}{4} + \frac{30 - 5^2/2}{2} = 14.25\) dollars.

Now suppose that data are collected, and therefore the firm learns the market state and the uninformed consumers’ type. If consumers are naïve and lack rational suspicion, the optimal design of the product line is the standard problem of Mussa & Rosen (1978) and Moorthy (1984). If the firm learns that the market state is good and L-types are overestimating their willingness to pay, then the optimal strategy is to design a premium plan with eight movies for $56 and an “economy-plus” plan with four movies for $24, earning a profit of $20. Alternatively, if the firm learns that the market state is bad and H-types are underestimating their willingness to pay, then the optimal strategy is to design a “premium economy” plan with six movies for $32 and an “economy” plan with two movies for $8, earning a profit of $10. The expected profit from learning superior information is \(\frac{20 + 10}{2} = 15\) dollars, which is greater than $14.25.

However, if uninformed consumers are sophisticated and know that the firm has the data, then they may expect that the firm’s product line design strategy reflects the superior information from data aggregation. Specifically, if they observe that the firm offers the premium and the economy-plus plans, they can rationally infer that the market state is more likely to be good and that the firm finds them to be L-types rather than H-types. As a result, the uninformed L-types should reduce their estimate of their value and reject the economy-plus plan, which is overpriced for L-types.
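The figures in this example can be checked with a short script. The sketch below is only an illustration under stated assumptions: the helper names (`margin`, `two_type_menu`, `menu_profit`) are hypothetical, and `two_type_menu` uses the textbook two-type screening solution (efficient quality for the high type, a binding participation constraint for the low type, and a binding incentive constraint for the high type) rather than anything specific to this chapter.

```python
# Parameter values from the numerical example.
LAM, A_H, A_L = 0.5, 8.0, 4.0
E = LAM * A_H + (1 - LAM) * A_L  # uninformed consumers' estimate: 6


def margin(price, quality):
    """Profit per customer when quality q costs q**2 / 2 to produce."""
    return price - quality ** 2 / 2


def two_type_menu(hi, lo, share_hi):
    """Textbook two-type screening menu: ((q_hi, p_hi), (q_lo, p_lo))."""
    q_hi = hi                                    # no distortion at the top
    q_lo = max(0.0, (lo - share_hi * hi) / (1 - share_hi))
    p_lo = lo * q_lo                             # low type keeps no surplus
    p_hi = hi * q_hi - (hi - lo) * q_lo          # high type's IC binds
    return (q_hi, p_hi), (q_lo, p_lo)


def menu_profit(menu, share_hi, low_buys=True):
    """Firm profit from a two-product menu; optionally the low plan is rejected."""
    (q_hi, p_hi), (q_lo, p_lo) = menu
    low = (1 - share_hi) * margin(p_lo, q_lo) if low_buys else 0.0
    return share_hi * margin(p_hi, q_hi) + low


# No data: a quarter (informed H-types) buy premium, half (uninformed) buy standard.
profit_no_data = 0.25 * margin(54, 8) + 0.5 * margin(30, 5)

# Data collected, naive consumers: uninformed types act on the estimate E.
good_menu = two_type_menu(A_H, E, LAM)   # L-types overestimate their value
bad_menu = two_type_menu(E, A_L, LAM)    # H-types underestimate their value
profit_naive = (menu_profit(good_menu, LAM) + menu_profit(bad_menu, LAM)) / 2

# Sophisticated L-types infer their type and reject an overpriced economy-plus plan.
l_type_rejects = A_L * good_menu[1][0] < good_menu[1][1]
profit_suspicious = (menu_profit(good_menu, LAM, low_buys=not l_type_rejects)
                     + menu_profit(bad_menu, LAM)) / 2

print(good_menu, bad_menu)
print(profit_no_data, profit_naive, profit_suspicious)
```

Under these assumptions the script reproduces the menus quoted above (eight movies for $56 and four for $24 in the good state; six for $32 and two for $8 in the bad state) and the expected profits of 14.25 without data and 15 with naive consumers. When the sophisticated L-types walk away from the economy-plus plan, the same menus yield an expected profit of 11.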
Because of consumers’ rational suspicion and reluctance to purchase, the expected profit of the informed firm with this particular product line is only $11, which is lower than $14.25.

The above example illustrates a conceptual challenge of designing a product line with superior preference information. Departing from the standard model, the product line design problem with superior information must consider both the optimization based on consumers’ beliefs and how the firm’s action may lead consumers to update their beliefs. Similarly, consumers’ beliefs must take into account both the firm’s profits and its estimates of their belief updates. This belief interaction incurs endogenous signaling costs that may affect the design of the product line and the welfare implications for both the firm and every consumer.

4. Equilibrium

4.1. No Data Collection (Consumer Opt-Out)

We start with the benchmark in which the firm does not observe the consumer data \(\{s_i\}\). In this case, the firm learns less than consumers; thus it uses only its prior knowledge of the market distribution to design products. In expectation, the firm anticipates three distinct segments of consumers. A portion \(\lambda/2\) are informed H-types and a portion \((1-\lambda)/2\) are informed L-types. The uninformed consumers, who make up the remaining half, estimate their value as \(e \equiv E[\alpha_i \mid s_i = \emptyset] = \lambda \alpha_H + (1-\lambda)\alpha_L\). The firm’s product design problem is then posed as follows:

\[ \max_{\{p_1, p_2, p_3, q_1, q_2, q_3\}} \; \frac{\lambda}{2}\left(p_1 - \frac{q_1^2}{2}\right) + \frac{1}{2}\left(p_2 - \frac{q_2^2}{2}\right) + \frac{1-\lambda}{2}\left(p_3 - \frac{q_3^2}{2}\right), \]

such that, for every \(k, j = 1, 2, 3\) with \(k \ne j\), and with \(\alpha_1 = \alpha_H\), \(\alpha_2 = e\), \(\alpha_3 = \alpha_L\),
(i) \(\alpha_j q_j - p_j \ge \max\{\alpha_j q_k - p_k, 0\}\), and
(ii) \(p_j, q_j \ge 0\).

The firm is not required to serve all consumers and may skip L-types to mitigate cannibalization. The profitability of serving L-types depends on the relative portion of H-types and the ratio \(\alpha \equiv \alpha_L/\alpha_H \in (0, 1]\), which we call consumer homogeneity. A small \(\alpha\) is interpreted as a large disparity between consumer types.
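The role of consumer homogeneity in the decision to serve L-types can be traced to the problem above. The derivation below is a sketch, assuming the standard screening pattern in which each segment’s downward-adjacent incentive constraint binds and the lowest segment’s participation constraint binds; that pattern is an assumption made here for illustration. Substituting the binding constraints into the objective, each quality equals the segment’s value net of the information rent it creates for the segments above it:

```latex
\begin{align*}
\hat q_1 &= \alpha_H, \\
\hat q_2 &= e - \frac{\lambda/2}{1/2}\,(\alpha_H - e)
          = \alpha_L + \lambda^2(\alpha_H - \alpha_L), \\
\hat q_3 &= \alpha_L - \frac{\lambda/2 + 1/2}{(1-\lambda)/2}\,(e - \alpha_L)
          = \alpha_L - \frac{\lambda(1+\lambda)}{1-\lambda}\,(\alpha_H - \alpha_L),
\end{align*}
% using  e - \alpha_L = \lambda(\alpha_H - \alpha_L)  and
%        \alpha_H - e = (1-\lambda)(\alpha_H - \alpha_L).
\begin{equation*}
\hat q_3 > 0
\;\Longleftrightarrow\;
(1+\lambda^2)\,\alpha_L > (\lambda+\lambda^2)\,\alpha_H
\;\Longleftrightarrow\;
\alpha \equiv \frac{\alpha_L}{\alpha_H} > \frac{\lambda+\lambda^2}{1+\lambda^2}.
\end{equation*}
```

With \(\lambda = 0.5\), \(\alpha_H = 8\), and \(\alpha_L = 4\), the cutoff is \(0.75/1.25 = 0.6 > \alpha = 0.5\), so the low-quality product is dropped and \(\hat q_2 = 4 + 0.25 \cdot 4 = 5\), in line with the standard plan of the numerical example.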
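For small levels of consumer homogeneity, \(\alpha \le \frac{\lambda+\lambda^2}{1+\lambda^2}\), the firm has a stronger incentive to extract surplus from informed H-types and uninformed consumers. Thus, it simply abandons its lowest-quality product to keep \(\hat p_1\) and \(\hat p_2\) high. Lemma 1 characterizes the equilibrium for the different parametric conditions.

Lemma 1 Suppose the firm does not observe consumer data. The equilibrium product line strategy \(\hat\Psi\) without data collection is characterized as follows:

\(\hat\Psi\) | \(0 < \alpha \le \frac{\lambda+\lambda^2}{1+\lambda^2}\) | \(\frac{\lambda+\lambda^2}{1+\lambda^2} < \alpha < 1\)
Informed H-types, \(\hat q_1\) | \(\alpha_H\) | \(\alpha_H\)
Informed H-types, \(\hat p_1\) | \((\alpha_H - \alpha_H\lambda^2 + \lambda^2 e)(\alpha_H - \alpha_L) + e\alpha_L\) | \((\alpha_H - \alpha_H\lambda^2 + \lambda^2 e)(\alpha_H - \alpha_L) + e\alpha_L - \frac{(1+\lambda^2)\alpha_H(e - \alpha_L)}{1-\lambda}\left(\alpha - \frac{\lambda+\lambda^2}{1+\lambda^2}\right)\)
Uninformed consumers, \(\hat q_2\) | \(\alpha_L + \lambda^2(\alpha_H - \alpha_L)\) | \(\alpha_L + \lambda^2(\alpha_H - \alpha_L)\)
Uninformed consumers, \(\hat p_2\) | \(e\alpha_L + \lambda^2 e(\alpha_H - \alpha_L)\) | \(e\alpha_L + \lambda^2 e(\alpha_H - \alpha_L) - \frac{(1+\lambda^2)\alpha_H(e - \alpha_L)}{1-\lambda}\left(\alpha - \frac{\lambda+\lambda^2}{1+\lambda^2}\right)\)
Informed L-types, \(\hat q_3\) | \(0\) | \(\alpha_L - \frac{\lambda(1+\lambda)}{1-\lambda}(\alpha_H - \alpha_L)\)
Informed L-types, \(\hat p_3\) | \(0\) | \(\alpha_L^2 - \frac{\lambda(1+\lambda)}{1-\lambda}\alpha_L(\alpha_H - \alpha_L)\)

Accordingly, the firm earns the profit

\[ \Pi^*_{Out} = \begin{cases} \frac{1}{4}(1+\lambda)\left(\lambda(\alpha_H - e)^2 + e^2\right) + \frac{1}{4}(1-\lambda)\left[\alpha_L - \frac{\lambda(1+\lambda)}{1-\lambda}(\alpha_H - \alpha_L)\right]^2 & \text{if } \alpha > \frac{\lambda+\lambda^2}{1+\lambda^2}, \\ \frac{1}{4}(1+\lambda)\left(\lambda(\alpha_H - e)^2 + e^2\right) & \text{if } \alpha \le \frac{\lambda+\lambda^2}{1+\lambda^2}, \end{cases} \]

and consumers have the surplus

\[ CS^*_{Out} = \frac{\lambda(\alpha_H - \alpha_L)}{2} \times \begin{cases} 2\alpha_L - \frac{\lambda(1+\lambda+3\lambda^2-\lambda^3)(\alpha_H - \alpha_L)}{1-\lambda} & \text{if } \alpha > \frac{\lambda+\lambda^2}{1+\lambda^2}, \\ (1-\lambda)\left[\alpha_L + \lambda^2(\alpha_H - \alpha_L)\right] & \text{if } \alpha \le \frac{\lambda+\lambda^2}{1+\lambda^2}. \end{cases} \]

The equilibrium described in Lemma 1 has some noteworthy properties. Uninformed consumer preferences lead to three expected segments of consumers even though there are only two types. The uninformed consumers may obtain a negative surplus. Specifically, L-types may over-buy and be worse off than if they made no purchase. Furthermore, the firm uses the standard distortions of lower-quality products to mitigate cannibalization, which increases both the product line’s length (number of products) and its degree of differentiation (disparity of quality) relative to the standard case.

4.2.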
Data Collection (Consumer Opt In)

Suppose that the firm collects the consumer data in aggregate; then it can deduce the market state by observing either \(\{H, \emptyset\}\) or \(\{\emptyset, L\}\). In this case, the firm obtains an informational advantage over the uninformed consumers by observing more consumers and isolating the market state. The superior information is valuable for both the firm and consumers, because it informs the firm of the uninformed consumers’ type and consequently affects the product line.

Since uninformed consumers are rational, they may try to interpret the product line chosen by the informed firm to make better purchase decisions. Because the product line design may change the uninformed consumers’ beliefs and consequently their purchase decisions, it implies a signaling game, for which we employ the concept of perfect Bayesian equilibrium (PBE). As in many signaling games, there is a multitude of equilibria, most of which involve unreasonable out-of-equilibrium consumer beliefs. To eliminate these beliefs, we employ the D1 criterion to refine the equilibria (Banks & Sobel, 1987; Cho & Kreps, 1987; Cho & Sobel, 1990). In particular, upon observing a deviation, an uninformed consumer assigns zero probability to a market state if, whenever the firm has a weak incentive to deviate in that state, it has a strong incentive to deviate in the other state.[24] Further, if there exist multiple PBE with the same firm profit, then we select only those with the most efficient outcome.

Before presenting the equilibria, we first introduce some notation. Denote the uninformed consumer \(i\)’s purchase decision as \(r_i \in \{\text{Purchase}, \text{No Purchase}\}\), and the firm’s product line design as \(\Psi\). Let \(\mu_i \equiv \mu(m = b \mid \Psi)\) denote consumer \(i\)’s estimated probability of the \(b\)-state upon observing \(\Psi\). Since H-types are the only uninformed consumers when \(m = b\), the belief \(\mu_i\) can be interpreted as consumer \(i\)’s self-estimated probability that she is an H-type.
Let \((\Psi^*, r_i^*, \mu_i)_{i=H,L}\) denote a PBE, which specifies the equilibrium product line, consumer choices, and beliefs. Denote \(\Psi'\) as the firm’s deviation strategy and \(\mathrm{BR}(\Psi', \mu_i)\) as consumer \(i\)’s best response to that deviation given belief \(\mu_i\). Finally, denote \(\hat\Pi(m, r_i, \Psi')\) as her calculation of the firm’s deviation profit in the \(m\)-state. Consider, for example, the case \(m = b\), in which the firm’s product design problem is posed as follows:

\[ \max_{\Psi \in \{p_H, p_L, q_H, q_L \ge 0\}} \; \lambda\left(p_H - \frac{q_H^2}{2}\right) + (1-\lambda)\left(p_L - \frac{q_L^2}{2}\right), \]

such that
(i) \([\mu_H\alpha_H + (1-\mu_H)\alpha_L]\, q_H - p_H \ge \max\{[\mu_H\alpha_H + (1-\mu_H)\alpha_L]\, q_L - p_L, 0\}\);
(ii) \(\alpha_L q_L - p_L \ge \max\{\alpha_L q_H - p_H, 0\}\);
(iii) \(\mu_H\) is sequentially rational, Bayesian whenever possible, and survives the D1 criterion (Banks & Sobel, 1987; Cho & Kreps, 1987; Cho & Sobel, 1990).[25]

[24] Formal details of the equilibrium refinement process are relegated to the proof of Lemma 2 in the Appendix.
[25] Formally, \(\mu_H\) survives the D1 criterion if and only if, for any out-of-equilibrium pricing strategy \(\Psi' \ne \Psi^*\), \(i \in \{H, L\}\), and \(j \ne k \in \{g, b\}\), we have \(\mu_H(\Psi') = 1\) if \(k = b\) and \(\mu_H(\Psi') = 0\) if \(k = g\), whenever the following condition holds: \(\bigcup_{\mu}\{r_H \mid \Pi^*(m = j) \le \hat\Pi_H(j, \Psi', r_H)\} \subsetneq \bigcup_{\mu}\{r_H \mid \Pi^*(m = k) < \hat\Pi_H(k, \Psi', r_H)\}\), where \(\bigcup_{\mu}\{\cdot\}\) is a set of arbitrary beliefs \(\mu \in [0, 1]\) such that the condition inside the braces holds, and \(r_i\) is a best response to \(\Psi'\) given the arbitrary belief \(\mu\), i.e., \(r_H = \mathrm{BR}(\Psi', \mu \in [0, 1])\).

Lemma 2 Suppose the firm observes consumer data. There exists a unique perfect Bayesian equilibrium \((\Psi^*, r_i^*, \mu_i)_{i=H,L}\), characterized as follows:

(i) The equilibrium product line \(\Psi^*\) is given in the following table.
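\(\Psi^*(m)\) | \(\alpha \le \lambda\), \(\lambda \le \frac{1}{2}\) | \(\alpha \le \lambda\), \(\lambda > \frac{1}{2}\) | \(\alpha > \lambda\), \(\lambda \le \frac{1}{2}\) | \(\alpha > \lambda\), \(\lambda > \frac{1}{2}\)
\(m = g\), informed H-types, \(q^*(g,H)\) | \(\alpha_H\) | \(\alpha_H\) | \(\alpha_H\) | \(\alpha_H\)
\(m = g\), informed H-types, \(p^*(g,H)\) | \(\alpha_H^2\) | \(\alpha_H^2\) | \(\frac{(\alpha_H - \alpha_L)^2}{1-\lambda} + \alpha_H\alpha_L\) | \(\frac{(\alpha_H - \alpha_L)^2}{1-\lambda} + \alpha_H\alpha_L\)
\(m = g\), uninformed L-types, \(q^*(g,L)\) | \(0\) | \(0\) | \(\frac{\alpha_L - \lambda\alpha_H}{1-\lambda}\) | \(\frac{\alpha_L - \lambda\alpha_H}{1-\lambda}\)
\(m = g\), uninformed L-types, \(p^*(g,L)\) | \(0\) | \(0\) | \(\frac{\alpha_L^2 - \lambda\alpha_H\alpha_L}{1-\lambda}\) | \(\frac{\alpha_L^2 - \lambda\alpha_H\alpha_L}{1-\lambda}\)
\(m = b\), uninformed H-types, \(q^*(b,H)\) | \(\alpha_H\) | \(\alpha_H\) | \(\alpha_H\) | \(\alpha_H\)
\(m = b\), uninformed H-types, \(p^*(b,H)\) | \(\frac{(1+\lambda)\alpha_H^2}{2}\) | \(\frac{(1+\lambda)\alpha_H^2}{2}\) | \(\frac{(\alpha_H - \alpha_L)^2}{2(1-\lambda)} + \alpha_H\alpha_L\) | \(\frac{(\alpha_H - \alpha_L)^2}{2(1-\lambda)} + \alpha_H\alpha_L\)
\(m = b\), informed L-types, \(q^*(b,L)\) | \(\alpha_L\) | \(\frac{(1-\lambda)\alpha_H^2}{2(\alpha_H - \alpha_L)}\) | \(\alpha_L\) | \(\frac{(1-2\lambda)\alpha_H + \alpha_L}{2(1-\lambda)}\)
\(m = b\), informed L-types, \(p^*(b,L)\) | \(\alpha_L^2\) | \(\frac{(1-\lambda)\alpha_H^2\alpha_L}{2(\alpha_H - \alpha_L)}\) | \(\alpha_L^2\) | \(\frac{(1-2\lambda)\alpha_H\alpha_L + \alpha_L^2}{2(1-\lambda)}\)

(ii) In any market state \(m\), consumer \(i\)’s decision rule \(r_i^*\) is to purchase the product \(q^*(m, i)\) at the price \(p^*(m, i)\), whenever it is available.
(iii) Consumer \(i\) updates her belief upon observing any product line \(\Psi\) as follows: \(\mu_i = 1\) if and only if \(\hat\Pi(g, r_i, \Psi) \le \Pi_F^*\), where \(r_i \in \mathrm{BR}(\Psi, \mu_i = 1)\) and \(\Pi_F^*\) stands for the firm’s first-best profit when all consumers know their type ex ante. Otherwise \(\mu_i = 0\).
(iv) The equilibrium survives the D1 criterion.

To interpret Lemma 2, consider the equilibrium in each market state. When \(m = g\), H-types are informed, but L-types are uninformed. However, the equilibrium product line does not enable the firm to take advantage of uninformed L-types, who are overestimating their value. Suppose that the firm could offer a product line that would convince the uninformed L-type that she is an H-type. Her equilibrium belief specified in part (iii) implies that she will believe \(m = b\) when observing \(\Psi'\) if and only if the firm has no incentive to deviate with \(\Psi'\) at \(m = g\). Because \(\hat\Pi(g, r_i, \Psi') \le \Pi_F^* = \Pi^*(m = g)\), her suspicions are sufficient to make any deviation to a deceptive strategy sub-optimal for the firm. Therefore, superior information over L-types does not help the firm over-exploit L-types, and the equilibrium product line is the same as the first-best under full information, with the usual distortion: \(\hat q(g, L) < \alpha_L\).

When \(m = b\), however, superior knowledge over uninformed H-types poses a challenge to the firm.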
The firm would like any H-type to know that her intrinsic value is \(\alpha_H\) so that it can extract more surplus from her. Because an uninformed H-type might suspect that the state is \(g\) and that she is an L-type, the firm must signal the \(b\)-state by designing a product line that convinces H-types it is not trying to mislead them into over-buying. This is accomplished by setting \(p^*(b, H)\) sufficiently low. In addition, because the firm may also serve informed L-types, in principle it still needs to consider the cannibalization constraint in the product line design. By (iii) in Lemma 2, the firm’s product line design problem when \(m = b\) is posed as follows:

\[ \max_{\Psi \in \{p_H, p_L, q_H, q_L \ge 0\}} \; \lambda\left(p_H - \frac{q_H^2}{2}\right) + (1-\lambda)\left(p_L - \frac{q_L^2}{2}\right), \]

such that
(i) \(\alpha_H q_H - p_H \ge \alpha_H q_L - p_L\);
(ii) \(\alpha_H q_H - p_H \ge 0\);
(iii) \(\alpha_L q_L - p_L \ge \max\{\alpha_L q_H - p_H, 0\}\);
(iv) \(p_H - \frac{q_H^2}{2} \le \Pi_F^*\).

Because of the additional signaling constraint (iv), the conventional incentive compatibility constraint (i) may no longer bind. As a result, the firm can raise the quality \(q^*(b, L)\) of the low-quality product above the standard distorted level to charge L-types a higher price. Put differently, the burden of signaling to uninformed H-types may reverse the quality distortion of adverse selection and restore efficiency to the product line. Proposition 1 summarizes and formalizes this finding.

Proposition 1 Compared to the full information case (of the standard model), consumers obtain higher-quality products (on average) when the firm has access to the data. In addition, the efficiency of product line design is fully restored (i.e., \(q^*(b, H) = \alpha_H\) and \(q^*(b, L) = \alpha_L\)) if and only if \(\lambda \le \frac{1}{2}\).

Figure 1 illustrates the intuition in Proposition 1. The two U-shaped curves are the indifference curves of the firm’s profit margin from serving each type of consumer. The dashed curve is the signaling constraint \(p_H - \frac{q_H^2}{2} \le \Pi_F^*\), a condition that is absent from the standard model (Mussa & Rosen, 1978).
The two solid lines are the participation constraints, above which consumers prefer not to purchase. The dashed line is the cannibalization constraint, which is equivalent to \(\alpha_H q_H - p_H \ge \alpha_H q_L - p_L\). Above the dashed line, informed H-types may prefer the cheaper alternative (the low-quality product). When H-types are uninformed, the firm needs to incorporate the signaling constraint (the dashed curve). Because the dashed curve may intersect the vertical line \(q = q_H^{\text{Efficient}}\) at a lower point than the dashed line does, the slope between the two equilibrium points \((q_H^*, p_H^*)\) and \((q_L^*, p_L^*)\) may be lower than that of the dashed line, i.e., the cannibalization constraint may not be binding. The intuition is that when the firm lowers the equilibrium price of the premium product, it not only convinces the uninformed H-types of their type but also reduces their incentive to choose the cheaper alternative. When the cannibalization constraint is not binding, the firm upgrades the low-quality product in equilibrium to increase the profit margin from L-types, consequently restoring the efficiency of the product line design.

Figure 1: The Equilibrium Product Line when \(m = b\)

The condition on \(\lambda\) for obtaining the efficient product line design is simply that the firm finds it profitable to raise the profit margin from serving L-types. The effort to signal to the uninformed H-types implies a “corrective” force against the monopoly distortions of product line cannibalization. The intuition behind this corrective force will help explain the incentives for data collection and opt-in decisions in Section 5.

Another implication of Proposition 1 relates to mandatory disclosure laws proposed by privacy advocates (e.g., RCAAP and Kamenica et al., 2011), in which firms are required to share their collected data with consumers through a trusted third party.
The corollary suggests that such regulatory disclosure of firm information may actually reduce total consumer surplus and social welfare. When consumers are credibly informed by the third party, the firm uses a full-information equilibrium to better exploit H-types without the burden of signaling, which reduces consumers' expected surplus. In addition, since data collection enables the firm to raise the quality of the low-quality product closer to the efficient level, total social welfare without mandatory disclosure is higher than in the full-information case.

5. Analyses

5.1. Data Collection in Equilibrium

In this section, we examine the incentives for data collection and start with the consumers' decision to opt in or opt out. Note that although opting in is modeled as a collective decision, an individual consumer has no incentive to deviate from a collective strategy, because she has very little influence on the overall data collection.

We start by identifying the conditions under which data collection does not occur in equilibrium. This is the case whenever either the consumers prefer to opt out or the firm prefers to avoid consumer suspicion by not collecting data. As noted above, there will be no data collection in equilibrium if consumer suspicion so severely constrains the firm that it prefers not having superior information. A comparison of Lemmas 1 and 2 shows the precise condition under which this occurs, which is given in Proposition 2 (i).

To assess when the consumers opt out, we use static comparisons from the results of Lemma 2. Uninformed L-types correctly update their belief and purchase the lower-quality product from the informative product line. They receive zero surplus when opting in. By contrast, without data collection, L-types may overestimate their value and obtain a negative surplus by over-purchasing. Therefore, opting in is ex post preferable for L-types.
Lemma 2 also points to the signaling benefit from opting in for the uninformed H-types. But this benefit accrues only when $m=b$. Otherwise, the firm is subject to the incentive compatibility constraint and the corresponding cannibalization concern. In fact, when $\alpha < \lambda$ and $m=g$, the firm abandons the low-quality product, $q^*(g,L)=0$, so that it does not need to offer H-types any discount to prevent cannibalization. In the extreme case of large $\lambda$ and small $\alpha$, the consumer expects that this possibility is sufficiently detrimental and consequently opts out. Proposition 2 (ii) offers the condition under which this occurs. Outside of these conditions, data collection occurs in equilibrium.

Proposition 2. There is no data collection in equilibrium if either of the following conditions holds:
(i) $\lambda > (3-\sqrt{5})/2$ and $\alpha > \bar{\alpha}$, where $\bar{\alpha} \in (0,1)$ solves the equation $\Pi_{In}^{**} = \Pi_{Out}^{*}$, or
(ii) $\lambda > \sqrt{2}/2$ and $\alpha < \underline{\alpha} \equiv \frac{\sqrt{2\lambda^2-1}-2\lambda^2-1}{2(1-\lambda^2)}$.
Otherwise, data collection occurs in equilibrium.

Figure 2: Equilibrium Data Collection

The conditions in Proposition 2 (i) are depicted in Figure 2 as the upper right region of the $\lambda$-$\alpha$ parameter space. To interpret the results, first consider the region where $\alpha \geq \frac{\lambda+\lambda^2}{1+\lambda^2}$. On either side of the vertical portion of the boundary, $\lambda = \frac{3-\sqrt{5}}{2}$, the informed firm offers two products (since $\alpha \geq \frac{\lambda+\lambda^2}{1+\lambda^2}$ implies $\alpha > \lambda$). In other words, cannibalization concerns are not severe enough in either case to be decisive for data collection. Therefore, as $\lambda$ increases across this vertical line, the incentives for data collection reflect the benefits of over-charging uninformed L-types relative to convincing uninformed H-types to purchase the higher quality product. The intuition for why the firm prefers not to collect data when $\lambda$ increases is that, while larger $\lambda$ generally implies higher profits overall, $\Pi_{Out}^{*}$ and $\Pi_{In}^{*}$ increase at different rates.
To see this, first consider $\Pi_{Out}^{*}$: an increase in $\lambda$ means the uninformed consumer is willing to pay more for the medium product $\hat{q}_2$ because her expected value ($e$) is higher. Now consider $\Pi_{In}^{*}$: the firm's benefit from signaling to an uninformed H-type is an increment of $\alpha_H - e$ in her valuation, which is decreasing in $\lambda$. In other words, as $\lambda$ increases, the incremental benefit of educating H-types decreases. Thus, for large $\alpha$, the firm is better off over-charging uninformed L-types by committing to no data collection (Region I). In this case, uninformed L-types know the firm is uninformed and are not suspicious of being tricked. Therefore, they are willing to pay $e$ for marginal quality.

Finally, consider $\alpha < \frac{\lambda+\lambda^2}{1+\lambda^2}$. By Lemma 1, cannibalization concerns are so strong in this region that the firm, without data collection, offers only two products (informed L-types are not served). The boundary defined by $\alpha(\lambda)$ is everywhere below the threshold $\frac{\lambda+\lambda^2}{1+\lambda^2}$. This is depicted in Figure 2 by the lower boundary of Region I. Suppose the firm faces parameter values just below this boundary. For a firm collecting data, the benefit of signaling to the uninformed H-types is then relatively higher, because the increment in H-types' valuation, $\alpha_H - e$, is decreasing in $\alpha$.

The conditions in Proposition 2 (ii) are depicted in Figure 2 as the lower right region of the $\lambda$-$\alpha$ parameter space. As $\alpha$ decreases and $\lambda$ increases, consumers become more heterogeneous and their prior expectation of being an H-type is greater. Therefore, data collection may reduce consumers' average surplus in this region (Region II), since consumers may lose their information rent by sharing data with the firm. As a result, consumers should opt out in Region II.

In summary, while collecting data is often beneficial, it can sometimes be a burden to the firm. Knowing that the firm has advantageous information makes consumers suspicious that they might be tricked into paying more than they should.
The suspicion requires the firm to undertake costly signaling, which benefits uninformed consumers.

5.2. Welfare Implications

It is an immediate consequence of Proposition 2 that data sharing may lead to an ex ante Pareto optimal outcome. That is, under some conditions, consumers are willing to share their data even though it helps the firm make more profit. The ex ante condition requires only that consumers, on average, are better off by sharing their data. However, one can ask whether both types of consumers are better off ex post. This stronger condition requires that consumers are better off in all possible realizations of the equilibrium. Proposition 3 affirms that this is the case.

Proposition 3. The equilibrium outcome with data collection is ex post strictly Pareto optimal. Specifically, if neither Condition (i) nor (ii) of Proposition 2 holds, then the firm and both types of consumers are strictly better off when data are collected.

The intuition for this result can be seen by first noting that L-types are strictly better off when opting in. Recall that L-types receive a negative expected surplus with no data collection. With data collection, they can avoid over-purchasing and obtain zero surplus. Second, H-types are better off from the firm's signaling efforts when they are uninformed. Recall that when signaling occurs in equilibrium, it supersedes the incentive compatibility constraint and returns more surplus to H-types.

The ex post optimality implied in Proposition 3 is created by the improved efficiency of product line design due to the signaling constraint, which counters the incentive compatibility constraint. In the opt-out condition, the long product line with three products induces strong cannibalization concerns, reducing the efficiency of product line design. In addition, uninformed L-types over-buy and obtain a negative surplus. With data collection, however, the firm possesses private information about the market state.
The consumers know that the firm has private information, and their suspicions of being tricked confound the firm's ability to price discriminate, forcing the firm to use the product line as a credible message to overcome consumers' reluctance to purchase. This signaling allows the firm to raise the quality of its lower quality product back toward the efficient level, $q^*(b,L) \to \alpha_L$. This process adds a surplus to the economy, which is subsequently split across all three agents: the firm, H-types and L-types. Hence, equilibrium data collection shows how market forces can correct the classic inefficiencies (Mussa & Rosen 1978) associated with adverse selection in product line design.

6. Data Collection without Learning Superior Information

In this section, we examine a case in which superior information is not inferable by aggregating preference data. Here we suppose that data collection and aggregation only permit the firm to keep up with what consumers already know about themselves. As we show, the firm's lack of private, superior information eliminates the need for consumers to be suspicious of the firm in its choice of product line. The lack of this suspicion has implications for the firm's profit and consumers' welfare. Thus, by comparing this benchmark with the previous analysis, we highlight the role of superior information in our earlier results.

In this benchmark model, consider a stylized market scenario in which both types of consumers are initially either uninformed or informed ($\beta_H^g = \beta_L^g = 1$, and $\beta_H^b = \beta_L^b = 0$). Consider, for example, new products with different levels of brand awareness in the market. When the market-level awareness is high (good state), all consumers are informed regardless of their type. Otherwise all consumers are unfamiliar with the product and thus uninformed of their value.
The firm, which observes the noisy data of consumers, cannot learn beyond any consumer's knowledge. As a result, data aggregation does not create superior information in this scenario.

We first solve for the product line equilibrium when the firm collects data and infers the market state. When $m=g$, both types are informed. Therefore, the firm's optimal product line design problem is equivalent to the standard model with informed consumers (Mussa & Rosen 1978 and Moorthy 1985). If $\alpha > \lambda$, then it is profitable to serve both types. The firm offers

$$q^{**}(g,H)=\alpha_H, \qquad p^{**}(g,H)=\frac{(\alpha_H-\alpha_L)^2}{1-\lambda}+\alpha_H\alpha_L$$

for H-types, and

$$q^{**}(g,L)=\frac{\alpha_L-\lambda\alpha_H}{1-\lambda}, \qquad p^{**}(g,L)=\frac{\alpha_L^2-\lambda\alpha_H\alpha_L}{1-\lambda}$$

for L-types, where the superscript "$**$" differentiates the equilibrium from "$*$" in the main model. Otherwise, when $\alpha \leq \lambda$, the firm offers only one product, $q^{**}(g,H)=\alpha_H$, $p^{**}(g,H)=\alpha_H^2$, for H-types.

Alternatively, when $m=b$, all consumers are uninformed. Because the firm has no additional information about their preferences either, uninformed consumers cannot learn from the product line and thus continue to use the ex ante estimate. In this case, the firm only needs to design one product $\{q_1, p_1\}$ to maximize $(p_1 - q_1^2/2)$, subject to the individual rationality constraint $e q_1 - p_1 \geq 0$. The equilibrium is $q^{**}(b,H)=q^{**}(b,L)=e$ and $p^{**}(b,H)=p^{**}(b,L)=e^2$. In this case, the firm cannot price discriminate through a differentiated product line, but the upside to the firm is that L-types over-buy, since $U_L=\alpha_L e - e^2 < 0$. Overall, the firm's expected profit when consumers opt in is expressed as follows:

$$\Pi_{In}^{**}=\begin{cases} \dfrac{\lambda(\alpha_H-\alpha_L)^2}{4(1-\lambda)}+\dfrac{\alpha_L^2}{4}, & \alpha \geq \lambda, \\[2ex] \dfrac{\lambda\alpha_H^2}{4}, & \alpha < \lambda. \end{cases}$$

If data are not collected, the equilibrium is the same as indicated in Lemma 1, i.e., $\Pi_{Out}^{**} \equiv \Pi_{Out}^{*}$. The uninformed firm offers three products despite there being only two types of consumers. It offers a "medium" option in case consumers are uninformed.
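As a numerical sanity check of the good-state ($m=g$) product line stated above, the following minimal Python sketch evaluates the closed-form qualities and prices and computes the slack in the H-type's incentive compatibility constraint and in the L-type's participation constraint, both of which should be zero in the standard screening solution. The function name `good_state_menu` and the sample parameter values are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def good_state_menu(alpha_h, alpha_l, lam):
    """Closed-form menu for m = g as stated in the text (cost q^2 / 2).

    Hypothetical helper for illustration; returns (q_H, p_H, q_L, p_L).
    """
    if alpha_l > lam * alpha_h:
        # alpha = alpha_L / alpha_H > lambda: both types are served.
        q_h = alpha_h
        p_h = (alpha_h - alpha_l) ** 2 / (1 - lam) + alpha_h * alpha_l
        q_l = (alpha_l - lam * alpha_h) / (1 - lam)
        p_l = (alpha_l ** 2 - lam * alpha_h * alpha_l) / (1 - lam)
    else:
        # alpha <= lambda: only H-types are served.
        q_h, p_h = alpha_h, alpha_h ** 2
        q_l, p_l = Fraction(0), Fraction(0)
    return q_h, p_h, q_l, p_l


# Assumed sample values: alpha_H = 1, alpha_L = 3/4, lambda = 1/2.
a_h, a_l, lam = Fraction(1), Fraction(3, 4), Fraction(1, 2)
q_h, p_h, q_l, p_l = good_state_menu(a_h, a_l, lam)

# Slack in the H-type incentive compatibility constraint and in the
# L-type participation constraint at the stated menu.
ic_h_slack = (a_h * q_h - p_h) - (a_h * q_l - p_l)
ir_l_slack = a_l * q_l - p_l
print(q_h, p_h, q_l, p_l, ic_h_slack, ir_l_slack)
```

Exact rational arithmetic is used so that binding constraints show up as exact zeros rather than floating-point residue.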
This compromise, while optimal for the firm in expectation, is clearly suboptimal in each market state. Intuitively, it is always beneficial for the firm to collect data: without superior information, consumers have no reason to be suspicious of being tricked, so collecting data does not incur the burden of signaling costs. Lemma 3 indicates that without superior information, data collection always raises the firm's profit:

Lemma 3. When data collection does not lead to superior information, the firm always prefers to collect data, i.e., $\Pi_{In}^{**} > \Pi_{Out}^{**}$.

Now turn to the consumer welfare analysis. From the analysis above, if $\alpha < \lambda$, then consumers obtain zero surplus in expectation in either market state. If $m=g$ and $\alpha \geq \lambda$, H-types benefit from the incentive compatibility constraints. Thus when $\alpha \geq \lambda$, the overall consumer surplus is positive:

$$CS_{In}^{**}=\frac{\lambda(\alpha_H-\alpha_L)(\alpha_L-\lambda\alpha_H)}{2(1-\lambda)}.$$

Generally, consumers benefit from the availability of more products, holding the firm's information constant. With more products, the firm distorts the quality of its lower end products to a lesser extent. Opting in leads to weakly fewer products for consumers, because there are two or three products from the firm when they opt out but only one or two products when they opt in. But the benefit of opting in is that consumers may have more relevant products, despite fewer available options. Lemma 4 formally indicates which consumer types benefit from opting in.

Lemma 4. The impact of sharing data with the firm differs by consumer type. In particular,
(i) Opting in strictly reduces L-types' surplus.
(ii) If $\alpha < \lambda$, then opting in raises H-types' surplus if and only if $\lambda < 0.5$ and $\alpha < \frac{\lambda(1-2\lambda)}{(1-\lambda)(1+2\lambda)}$.
(iii) If $\alpha \geq \lambda$, then opting in raises H-types' surplus if and only if $\lambda > \frac{3-\sqrt{3}}{2} \approx 0.634$ and $\alpha \in \left(\frac{\lambda(4-5\lambda+2\lambda^2)}{1+(1-\lambda)(3-2\lambda)\lambda}, \frac{\lambda(7\lambda-2\lambda^2-2)}{(1-\lambda)^3+(4-\lambda)\lambda^2}\right)$.

Note that opting in is never an ex post Pareto improvement.
This contrasts with the main model (Proposition 3), in which data collection gives the firm superior information. Clearly, informed L-types obtain zero surplus by either opting in or out. But uninformed L-types may lose the incentive-compatibility surplus and obtain negative surplus with an informed firm. Therefore, L-types are worse off in expectation by sharing data in this market setting.

However, opting in has two potential sources of benefits for H-types. These two benefits arise in separate, mutually exclusive conditions, as given in parts (ii) and (iii) of Lemma 4. The conditions in part (ii) imply that the uninformed firm will only offer two products, skipping the informed L-types. In this case, an uninformed H-type in state $m=b$ will under-buy the cheaper product, whose quality is distorted downward ($q < e$) and less relevant to her preference. By contrast, with data collection, an uninformed H-type in the $b$-state will buy the efficient product ($q = e$). The interesting insight in part (ii) is that the firm's attention to cannibalization concerns is not always beneficial to H-types, because uninformed H-types may also suffer a utility loss from the inefficient cheaper product. The conditions of part (iii) are quite the opposite. They imply that the uninformed firm may offer three products, so the uninformed H-types may obtain additional surplus due to the cannibalization constraint by choosing the medium product.

The conventional wisdom of price discrimination is that consumers with higher valuation always suffer from sharing their willingness-to-pay with the firm, but those with lower valuation may benefit from output expansion. Lemma 4 counters this intuition. When consumers opt in, uninformed L-types are worse off by losing their information rent, but uninformed H-types may benefit because a higher-quality product is available to them, i.e., $q^{**}(b,H) > \hat{q}_2$, which better fits their preference.
In addition, informed H-types can also obtain higher surplus from opting in, because the informed firm's product line design reflects the accurate market distribution.

Since the gain of H-types from opting in may more than compensate for the loss of L-types, there exists a region in which consumers would prefer to opt in ex ante. Proposition 4 characterizes these conditions.

Proposition 4. Even if data collection does not create superior information, consumers may still opt in. This occurs if and only if both of the following conditions hold:
(i) $\lambda > 1-2\cos\left(\frac{4}{9}\pi\right) \approx 0.653$, and
(ii) $\alpha \in \left(\frac{1+\lambda(1-\lambda)^2}{2-2\lambda^2+\lambda^3}, \frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}\right)$.

Figure 3: Consumers Opting In when Superior Information is Impossible

Figure 3 illustrates Proposition 4 with a contour plot of the regions in which consumers opt in. The horizontal axis represents the H-type population share $\lambda$ and the vertical axis represents the consumer homogeneity $\alpha \equiv \frac{\alpha_L}{\alpha_H}$. For better illustration, we restrict the range of each parameter to between 0.5 and 1. The white (gray) region represents where consumers prefer opting out (in). Along the left dashed curve we have $\hat{q}_3 = 0$; along the right diagonal dashed line, $q^{**}(g,L)=0$; and along the vertical dashed line, $\lambda \approx 0.653$. The upper boundary curve of the opt-in region is $\alpha = \frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}$ and the lower boundary curve is $\alpha = \frac{1+\lambda(1-\lambda)^2}{2-2\lambda^2+\lambda^3}$. From Figure 3, the region in which consumers prefer to opt in contains two scenarios, regardless of whether the uninformed firm offers two or three products.

7. Conclusion

This paper examined the incentives and implications of collecting consumer data to obtain superior preference information. We sought to understand who benefits when a firm has an informational advantage over consumers who do not know ex ante which product best fits their intrinsic preference.
The focus on superior preference information was intended to capture the modern development of intensive market research and digital technologies and to examine whether the additional information learned by firms can be credibly shared and eventually benefit uninformed consumers. When consumers are perfectly informed of their own preference, previous research suggests that sharing individual data generally diminishes consumer surplus, because the firm is better equipped to price discriminate. With uninformed consumer preferences, however, we showed that this is not always the case.

This research uncovered a novel mechanism that shows how data aggregation can create superior information and benefit both the firm and every consumer. When a consumer's knowledge of her preference is correlated with her type, data aggregation enables the firm to learn superior preference information that exceeds consumers' prior knowledge. While this suggests an informational advantage for the firm, consumers' rational suspicions may confound the firm's ability to price discriminate using the superior information. In fact, to convince consumers of their higher value (H-types in the model), the firm must reverse the classic product line distortions to convince the uninformed H-types to buy the high-end product. As a result, the cannibalization constraint may no longer bind. It is worth noting that despite the additional signaling constraint, the firm may still benefit from collecting data, because it obtains more profit by serving L-types a product at the efficient level.

By contrast, when superior information is not obtainable with data aggregation (Section 6), the firm designs the "second best" product line to extract the usual monopoly rents without incurring the indirect costs of consumer suspicion. If consumers opt out, the firm is forced to provide up to three products, which include a medium-quality product for the uninformed consumer.
The longer product line is beneficial for the consumers, since the firm must attend to increased cannibalization concerns relative to fewer products. Despite the above mechanism, consumer opt-in can also occur in equilibrium, because H-types may benefit ex post by either paying a lower price or purchasing a better-fit product, but only under much narrower circumstances than in the main model with superior information.

The above results challenge some conventional wisdom in the debate on consumer data collection. First, a firm's informational advantage over consumer preference data does not necessarily lead to a higher degree of price exploitation, because consumers' rational inferences can prevent them from being exploited. While this addresses the concerns of consumer advocacy groups on data collection, it also suggests that a firm may have a market interest in communicating its knowledge about consumers through its product line, which can correct a classic monopoly distortion in product line design.

Second, this research challenges the meaning of private consumer information. In the main model, the aggregation of consumer data $\{s_i\}_{i=L,H}$ is more informative than the sum of the individual knowledge. This is because, by accessing data on more consumers than any individual consumer observes, the firm has the advantage of controlling for the environmental noise that affects individual consumers' experiences. In particular, data aggregation enables the firm to acquire private information on consumers' imperfect beliefs about their value. This suggests that definitions of private consumer information may be broader than traditionally assumed. Specifically, if data aggregation creates new knowledge, then one could ask whether the new knowledge is the firm's own intellectual property.

Related to the above point is the issue of forced disclosure of a firm's private information.
Some researchers have suggested that forced disclosure of a firm's consumer data would generally benefit consumers (Kamenica et al. 2011). Our research suggests that this may not always be the case. Whenever the firm has superior information on consumer preferences and consumers know it, consumers' rational suspicions may force the firm to return surplus through its incentive to communicate to high-value consumers. If a credible third party were to communicate the information to consumers directly, without the need to convince them, then the firm would be able to implement its second-best optimum, which can be worse for consumers because it alleviates the firm's burden of communicating its private information.

The provocative implications highlighted here are subject to the conditions of our model. Perhaps most critical is our assumption of sophisticated consumers and their knowledge and understanding of firms' incentives for data collection. To be sure, much of the debate about data collection by commercial interests revolves around the issue of consumers' naïveté. If data are collected without the consumers' explicit consent or knowledge, then the pure-strategy equilibrium from the main model may not hold. Instead, a mixed strategy may be more appropriate when consumers need to infer from the observed product line whether data collection has occurred. It is also clear from previous research that if consumers have perfect knowledge of their preferences, then consumer data collection can reduce their surplus. Therefore, our findings that consumers can benefit from sharing their data with for-profit firms are shown under the specific conditions of uninformed consumer preference. With these caveats in mind, our hope with this research is to show that there is still much research needed to understand the full implications of consumer data collection.

CHAPTER THREE: INTERMEDIARY CURATION WITH SUPERIOR INFORMATION ON CONSUMER PREFERENCES

1.
Introduction

In standard models of online marketplaces or product listing platforms, the consumers involved are assumed to have private information about their product preferences, and the intermediaries only serve as two-sided platforms that reduce transaction costs for sellers and search costs for consumers. With the advancement of data technologies, however, the informational circumstances are reversed. Consumers who may not know their product preferences perfectly may seek help from an intermediary curator for its recommendations of the most relevant product or service. For example, with a wealth of consumer user data, Netflix may obtain better estimates of a consumer's true product preferences and can recommend movies or TV shows to the consumer. Content curators may learn beyond consumers' knowledge of their vodka preference and recommend manufacturers accordingly (Wood, 2013). In these situations, intermediaries with superior information on consumer preferences may help match uninformed consumers with fitting products or sellers.

We explore the consequences of this reversal when an intermediary has asymmetric preference information. Two bare-bones models, one of a commission-based marketplace and one of an auction-based product-listing internet platform, examine how matching uninformed consumers with sellers affects prices and welfare. We first establish that, without superior information, consumers' uninformed preference may harm both consumers and the sellers. Then we examine the conflict of interests between the intermediary curators and uninformed consumers.

These models are not intended to provide a complete analysis of the issue but merely to illustrate a potential countervailing force between the curator's need to acquire superior information for its matching strategies and consumers' possible rational suspicion that the curator may trick them using the superior preference information.
The issue is important because, as consumers become more sophisticated, they may change their responses to the curator's recommendations, and correspondingly change the sellers' prices and the curator's matching strategies.

The remainder of the chapter is structured as follows. The next section presents the general model. Section 3 discusses the application to an auction-based product listing platform. Section 4 examines a similar scenario in a commission-based marketplace. Section 5 concludes with theoretical issues to consider.

2. The General Model

There is a contingent of consumers who are distributed uniformly on a Hotelling line, $i \sim C: U[0,1]$. An intermediary curator has the technology to identify each consumer and personalize a recommendation for the consumer. There are two sellers competing for the recommendation slot, $j \sim \{A, B\}$, which are located at the two ends of the Hotelling line. Consumer $i$ has a higher matching value with the seller located closer to her ideal preference point $i$. The consumers' utility function is $U_{ij}$, which is linear in $m_{ij}$. Sellers' profit functions $\Pi_{ij}$ are also linear in $m_{ij}$ and depend on the consumer's action $d_{ij}$. Based on the profit sharing mechanism that the intermediary imposes, the selected seller and the intermediary share the profit if the consumer chooses the seller.

Clearly, when consumers know their preferences, the game is straightforward: each consumer will select only the most relevant seller and, anticipating that action, the curator will recommend only that seller to the consumer. However, when consumers are uninformed of their matching value $m_{ij}$, the intermediary's matching problem is more complicated. Suppose $i$ is unknown ex ante. Consumers may observe an imperfect signal, which is influenced by a random shock $\gamma$ that captures the environmental noise consumers face when evaluating their matching value with the sellers. Specifically, assume that $\gamma \in (0,1)$ with a uniform distribution.
When Consumer $i$ observes seller $j$, she receives a signal $s_{ij} \equiv m_{ij} + \gamma$. The consumer's action is then based on the noisy signal and the observed recommendation.

The game unfolds in three periods. In period one, sellers make their decisions. In period two, the intermediary observes the random shock $\gamma$. In period three, Consumer $i$ arrives; the intermediary identifies the consumer and assigns $j$; Consumer $i$ observes the signal and decides whether to click. All other information is common knowledge.

3. An Auction-based Product Listing Platform

3.1 Model Specification

Consider a matching platform that allows product listers to bid on cost-per-click [26] rates based on a second-price auction mechanism [27]. Consumer $i$ has a matching value of $(1-i)$ and $i$ with sellers A and B, respectively. The platform has the technology to identify each consumer and personalize the ranking of product sellers for the consumer. The process can be described simply as follows. When a consumer enters, the platform first identifies the consumer from her browsing and search history. It then evaluates both the sellers' bids and the relevance of their products to the consumer's type, and matches the consumer with one seller. Consumers can view the seller and then click. If consumers click, they are transferred to the seller's website. For each click, there is a cost of $\frac{1}{2}$. This cost represents the cost of the time spent exploring the seller's website. The value $\frac{1}{2}$ ensures that the consumer only clicks the most relevant seller.

[26] Cost Per Click (CPC): sellers pay only when their ad is clicked, not each time an ad is shown.
[27] Second-price auctions are commonly used on platforms that charge sellers a CPC fee, such as Google Shopping. The rule is that the highest bidder pays the next highest bid, if that bidder is selected. However, Google Shopping also considers the relevance of the bidder through a matching algorithm; therefore, the highest bidder is not guaranteed to win the auction.
The utility function is therefore $U_{ij} = I[\text{click}]\left(m_{ij} - \frac{1}{2}\right)$, where $I[\cdot]$ is an indicator function that takes the value 1 if the consumer clicks $j$'s ad, and 0 if the consumer does not click.

The sellers' profit function has two components. The first captures the benefit of brand exposure without consumer clicks. Specifically, A's profit is $m_{iA}$, and B's profit is $\beta m_{iB}$, where $\beta \in (0,1)$ represents the sellers' heterogeneous need to show their brand on the website (incentives for brand awareness). Because the platform charges only a cost-per-click fee, the seller does not need to pay anything if the consumer views its advertisement without clicking. The second captures the benefit of potential conversion when consumers click the advertisement. Each seller obtains $m_{ij} - \min\{b_{iA}, b_{iB}\}$, where $b_{ij}$ is the cost-per-click fee $j$ bids to show the advertisement to $i$. Equation (3) summarizes the expected profit function for each seller:

$$\Pi_{iA} = m_{iA} + p_{iA}\left(m_{iA} - \min\{b_{iA}, b_{iB}\}\right),$$
$$\Pi_{iB} = \beta m_{iB} + p_{iB}\left(m_{iB} - \min\{b_{iA}, b_{iB}\}\right),$$

where $p_{ij} \equiv \Pr[\text{click}_{ij}]$ is the probability that Consumer $i$ clicks $j$. The matching platform's objective is to maximize its own profit by selecting the seller $j$:

$$\Pi_{iP} = p_{ij}\min\{b_{iA}, b_{iB}\}.$$

3.2 Informed Consumers

In this section, we examine the benchmark in which consumers already know their preference point ex ante. We then contrast this result with the following sections to examine the role of the superior information that the platform learns about consumers.

When consumers know their $i$, $\gamma$ does not affect their knowledge. Consumers click on A if $i \in [0, \frac{1}{2}]$, and on B if $i \in [\frac{1}{2}, 1]$. Therefore, the platform will match A with $i \in [0, \frac{1}{2})$, and B with $i \in (\frac{1}{2}, 1]$. For $i \in [0, \frac{1}{2})$, B estimates that $p_{iB} = 0$ and therefore bids $b_{iB} = \beta i$. A can win the auction by bidding any price above $b_{iB}$ and obtains the net profit $m_{iA} + (m_{iA} - \beta i) = 2 - (2+\beta)i$.
Similarly, for $i \in (\frac{1}{2}, 1]$, A bids $b_{iA} = (1-i)$ since it estimates that $p_{iA} = 0$, and B wins the auction with the net profit $\beta m_{iB} + (m_{iB} - (1-i)) = (2+\beta)i - 1$. If $i = \frac{1}{2}$, then B's expected gross profit from serving the consumer is $\frac{1+\beta}{2}$, which is less than A's. Therefore, A wins the auction by paying $\frac{1+\beta}{2}$ and obtains $\frac{1-\beta}{2}$. Table 1 summarizes the equilibrium result in this scenario.

Table 1: Auction Outcome with Informed Consumers

| Consumer $i$ | Matched Ads. | Consumer action | Seller A's Profit | Seller B's Profit | Platform's Profit |
| $[0, \frac{1}{2})$ | A | Clicks | $2-(2+\beta)i$ | $0$ | $\beta i$ |
| $\frac{1}{2}$ | A | Clicks | $\frac{1-\beta}{2}$ | $0$ | $\frac{1+\beta}{2}$ |
| $(\frac{1}{2}, 1]$ | B | Clicks | $0$ | $(2+\beta)i-1$ | $1-i$ |

3.3 Uninformed Consumers

In this section, we study the impact of uninformed consumer preference when the platform also does not have superior information on $\gamma$. In this scenario, uninformed consumers rely only on the noisy signal $m_{ij} + \gamma$ to estimate their match value with the advertisement. Our objective is to understand whether consumers can be unambiguously exploited by the matching platform when they are naïve.

We derive the consumer belief updates in this scenario. If Consumer $i$ observes $j$, she updates her belief from the signal $s_{ij}$: $E[i \mid s_{ij}]$. Since $f(\gamma) = 1$, $i$ estimates

$$E[m_{ij} \mid s_{ij}] = \int_0^1 (s_{ij} - \gamma) f(\gamma)\, d\gamma = s_{ij} - \frac{1}{2} = m_{ij} + \gamma - \frac{1}{2}.$$

Since the clicking cost is $\frac{1}{2}$, $i$ clicks A if and only if $E[m_{iA} \mid s_{iA}] \geq \frac{1}{2}$, which is equivalent to $i \leq \gamma$, and clicks B if and only if $E[m_{iB} \mid s_{iB}] \geq \frac{1}{2}$, which is equivalent to $i \geq 1 - \gamma$. But since the sellers do not observe $\gamma$ before bidding, A estimates the probability that $i$ clicks as $p_{iA} = \Pr[i \leq \gamma] = \Pr[\gamma \geq i] = 1 - i$. A's expected gross profit from serving the consumer is therefore $m_{iA} + (1-i)m_{iA}$, and A bids

$$\frac{m_{iA} + (1-i)m_{iA}}{\Pr[\gamma \geq i]} = 2 - i.$$

Similarly, B estimates the probability that $i$ clicks as $p_{iB} = \Pr[i \geq 1-\gamma] = \Pr[\gamma \geq 1-i] = i$, and thus values her at $\beta m_{iB} + i\, m_{iB} = (\beta + i)i$. Therefore, B bids

$$\frac{(\beta + i)i}{\Pr[\gamma \geq 1-i]} = \beta + i.$$
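The bids derived above can be reproduced with a short Python sketch. It assumes, as in the text, $m_{iA} = 1-i$, $m_{iB} = i$, and $\gamma$ uniform on $(0,1)$, so that the click probabilities are $1-i$ and $i$. The function names `naive_bids` and `auction_winner` and the sample value of $\beta$ are illustrative assumptions rather than anything defined in the original.

```python
from fractions import Fraction


def naive_bids(i, beta):
    """Per-click bids of sellers A and B when gamma is unobserved.

    Illustrative helper: each bid is the seller's expected gross profit
    from being shown, divided by the probability of a click.
    """
    m_a, m_b = 1 - i, i   # match values with A and B
    p_a, p_b = 1 - i, i   # Pr[gamma >= i] and Pr[gamma >= 1 - i]
    bid_a = (m_a + p_a * m_a) / p_a if p_a > 0 else None
    bid_b = (beta * m_b + p_b * m_b) / p_b if p_b > 0 else None
    return bid_a, bid_b


def auction_winner(i, beta):
    """Return 'A' when A's bid strictly exceeds B's, else 'B' (interior i only)."""
    bid_a, bid_b = naive_bids(i, beta)
    return "A" if bid_a > bid_b else "B"


beta = Fraction(1, 2)      # assumed sample value in (0, 1)
threshold = 1 - beta / 2   # location where the two bids, 2 - i and beta + i, are equal
for i in (Fraction(1, 4), Fraction(3, 5), Fraction(9, 10)):
    print(i, naive_bids(i, beta), auction_winner(i, beta))
```

With these sample values, a consumer at $i = 3/5$ lies closer to B yet is won by A, which is the kind of mismatch that arises when the allocation follows the bids rather than relevance.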
Clearly, in a second-price auction, A wins the auction if and only if $2 - i > \beta + i$, or equivalently, $i < 1 - \tfrac{\beta}{2}$. Intuitively, we have the following result:

Result 1: The platform matches naïve consumers with A if $i \in [0, 1 - \tfrac{\beta}{2})$, and with B if $i \in [1 - \tfrac{\beta}{2}, 1]$.

Result 1 shows that when consumers are naïve, the platform's dominating matching strategy is to select the winner of the second-price auction. Table 2 summarizes the equilibrium results when consumers are naïve.

Table 2: Auction Outcome with Naïve Consumers

| Consumer $i$ | Matched Ads. | Consumer Action | Total Sellers' Profit | Platform's Expected Profit |
|---|---|---|---|---|
| $[0, 1 - \tfrac{\beta}{2})$ | A | Clicks if $\gamma \le 1 - i$ | $(2 - 4i - \beta)(1 - i)$ | $(1 - i)(\beta + i)$ |
| $[1 - \tfrac{\beta}{2}, 1]$ | B | Clicks if $\gamma \le i$ | $(-2 + 2i + \beta)\,i$ | $i\,(2 - i)$ |

Comparing Table 1 and Table 2, we obtain the following result.

Proposition 1: The consumers and both sellers are worse off when consumers are uninformed and naïve. Specifically,
1.1. The platform matches consumers with the irrelevant seller if $i \in (\tfrac{1}{2}, 1 - \tfrac{\beta}{2})$;
1.2. Consumers do not click on the relevant seller, and the platform loses profit;
1.3. Sellers pay a higher cost;
1.4. The platform may obtain higher profit.

Proposition 1 suggests two important results when consumers have uninformed preferences and the platform has superior information. First, consumers' incorrect beliefs affect the sellers' bidding strategies and consequently lead to mismatch, even though the platform can identify the consumer's preference. This is because the platform faces a conflict of interest: it serves the seller with the higher valuation of the consumer. Second, the platform has an incentive to inform consumers with uninformed preferences in order to convince them to click on the relevant advertisement. In this way, the platform can increase its profit from cost-per-click revenue. However, suspicious consumers may not be convinced, because of the conflict of interest between the platform and consumers.

4. A Commission-based Marketplace

4.1 Model Specification

Again, consider a matching platform that charges a commission, such as a royalty rate on sellers' revenue. [28] The platform has the technology to identify each consumer and recommend a seller to the consumer. Unlike in Section 3, where consumers click on a product lister's link, consumers here only need to observe the seller's price and decide whether to purchase. The utility function is therefore $U_{ij} = m_{ij} - p_j$, where $p_j$ is seller $j$'s price for the market, $m_{iA} = 1 - i$, and $m_{iB} = i\beta$ with $\beta > 1$.

[28] This mechanism is commonly used by online marketplaces such as Amazon and eBay.

The seller's profit function is determined by the price, the expected demand $D_i(p_i, p_j)$, and the platform's commission rate, which is assumed to be a constant $c$:

$$\Pi_{ij} = (1 - c)\, p_j\, D_i(p_i, p_j).$$

The matching platform's objective is to maximize its own profit by selecting the seller $j$ for each $i$ such that

$$\Pi_P = c \sum_j \left[ p_j\, D_i(p_i, p_j) \right].$$

4.2 Informed Consumers

When consumers know their $i$, their purchase decision depends on the match value and each seller's price. For fixed prices $p_A$ and $p_B$, we obtain the following decision rules: $i$ chooses A if $i \le 1 - p_A$, and chooses B if $i \ge \tfrac{p_B}{\beta}$. Clearly, when $1 - p_A < \tfrac{p_B}{\beta}$, the sellers do not compete for the same consumer. If instead $1 - p_A \ge \tfrac{p_B}{\beta}$, the platform may select the higher-priced seller for a consumer located in $(\tfrac{p_B}{\beta}, 1 - p_A)$. However, since the first-best strategy for A is to charge $p_A^* = \tfrac{1}{2}$ and that for B is to charge $p_B^* = \tfrac{\beta}{2}$, neither seller has an incentive to deviate so that $1 - p_A > \tfrac{p_B}{\beta}$. Accordingly, the platform matches consumers $i < \tfrac{1}{2}$ with A and $i \ge \tfrac{1}{2}$ with B. This matching strategy, like that in Section 3.2, is consistent with the consumers' most relevant choice.
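The first-best prices in Section 4.2 can be checked with a small numerical sketch. The Python snippet below is illustrative only: it assumes consumers are uniformly distributed on $[0,1]$, so that the decision rules imply demands $1 - p_A$ and $1 - p_B/\beta$, and it uses a plain grid search with hypothetical function names. The commission rate $c$ is omitted because it rescales profit without changing the maximizer.

```python
import math


def demand_a(p_a):
    """Mass of consumers with i <= 1 - p_A, assuming i ~ Uniform[0, 1]."""
    return min(1.0, max(0.0, 1.0 - p_a))


def demand_b(p_b, beta):
    """Mass of consumers with i >= p_B / beta, assuming i ~ Uniform[0, 1]."""
    return min(1.0, max(0.0, 1.0 - p_b / beta))


def best_price(revenue, upper, steps=10_000):
    """Grid search for the revenue-maximizing price on [0, upper]."""
    grid = [upper * k / steps for k in range(steps + 1)]
    return max(grid, key=revenue)


beta = 2.0  # illustrative value satisfying beta > 1
p_a_star = best_price(lambda p: p * demand_a(p), upper=1.0)
p_b_star = best_price(lambda p: p * demand_b(p, beta), upper=beta)

# Marginal consumers implied by the two prices.
cutoff_a = 1.0 - p_a_star
cutoff_b = p_b_star / beta
print(p_a_star, p_b_star, cutoff_a, cutoff_b)
```

Under these assumptions the search recovers prices close to $1/2$ and $\beta/2$, and both marginal consumers sit at $i = 1/2$, which is why the sellers do not overlap at the first-best prices.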
4.3 Uninformed Consumers

Similarly, since $E[m_{ij} \mid s_{ij}] = m_{ij} + \gamma - \tfrac{1}{2}$, $i$ chooses A if and only if $E[m_{iA} \mid s_{iA}] \ge p_A$, which is equivalent to $i \le \tfrac{1}{2} + \gamma - p_A$. Therefore, A estimates its expected demand as $E[\tfrac{1}{2} + \gamma - p_A] = 1 - p_A$ and sets the price $p_A^* = \tfrac{1}{2}$. Similarly, $i$ chooses B if and only if $i \ge \tfrac{1}{\beta}(p_B + \tfrac{1}{2} - \gamma)$, so B estimates its expected demand as $E[1 - \tfrac{1}{\beta}(p_B + \tfrac{1}{2} - \gamma)] = 1 - \tfrac{p_B}{\beta}$ and sets the price $p_B^* = \tfrac{\beta}{2}$. Hence, if the platform does not have information on $\gamma$, it matches consumers $i < \tfrac{1}{2}$ with A and $i \ge \tfrac{1}{2}$ with B. The matching strategy is equivalent to that in the full-information case. However, two issues arise.

First, when $\gamma < \tfrac{1}{2}$, there is a potential loss of consumer surplus and seller profit. For example, consumers located in the interval $(\gamma, \tfrac{1}{2})$ will not purchase from A even if they are correctly recommended. Similarly, consumers located in the interval $(\tfrac{1}{2}, \tfrac{1}{2} + \tfrac{1}{\beta}(\tfrac{1}{2} - \gamma))$ will not purchase from B. If consumers are sophisticated, they may learn that, given the matching strategy, they should always trust the platform and purchase from the recommended seller.

Second, if consumers change their actions as above, the platform has an incentive to recommend B's product to every consumer, since $p_B^* > p_A^*$. This potential deviation also affects the sellers' pricing decisions and, correspondingly, the sophisticated consumers' choices.

5. Conclusion

Technologies for collecting and processing information allow intermediary platforms to learn more about consumers' preferences than the consumers know themselves. In many cases, this knowledge allows the platform to serve as a curator that matches consumers with the relevant seller or service provider.
However, because the information processing creates an information asymmetry, the platform may abuse the information to exploit consumers' uninformed preferences, especially when there is a conflict of interest such that the platform has incentives to match consumers with an irrelevant seller. The two models presented in this chapter show that superior information on consumer preferences may improve the efficiency of matching and all participants' payoffs. However, because of rational consumer suspicion, this requires a careful design of the matching strategy, such that the recommendations can credibly communicate the superior information to the uninformed consumers.

REFERENCES

Acquisti, A., & Varian, H. R. 2005. Conditioning prices on purchase history. Marketing Science, 24(3), 367–381.
Aguirre, I., Cowan, S., & Vickers, J. 2010. Monopoly price discrimination and demand curvature. The American Economic Review, 100(4), 1601–1615.
Banks, J. S., & Sobel, J. 1987. Equilibrium selection in signaling games. Econometrica: Journal of the Econometric Society, 647–661.
Bagwell, K., & Riordan, M. H. 1991. High and declining prices signal product quality. The American Economic Review, 81(1), 224–239.
Beaudry, P. 1994. Why an informed principal may leave rents to an agent. International Economic Review, 821–832.
Bergemann, D., Brooks, B., & Morris, S. 2015. The limits of price discrimination. American Economic Review, 105(3), 921–957.
Calzolari, G., & Pavan, A. 2005. On the optimality of privacy in sequential contracting. Journal of Economic Theory, 30(1), 168–204.
Cho, I. K., & Kreps, D. M. 1987. Signaling games and stable equilibria. Quarterly Journal of Economics, 102, 179–221.
Cho, I. K., & Sobel, J. 1990. Strategic stability and uniqueness in signaling games. Journal of Economic Theory, 50(2), 381–413.
Guo, L., & Zhang, J. 2012. Consumer deliberation and product line design. Marketing Science, 31(6), 995–1007.
Horowitz, J. L. 1986. Bidding models of housing markets. Journal of Urban Economics, 20(2), 168–190.
Horowitz, J. 1992. The role of the list price in housing markets: Theory and an econometric model. Journal of Applied Econometrics, 7, 115–129.
Hui, K. L., & Png, I. 2003. Piracy and the legitimate demand for recorded music. Contributions in Economic Analysis & Policy, 2(1).
Kahneman, D., & Tversky, A. 1979. Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–290.
Kamenica, E. 2008. Contextual inference in markets: On the informational content of product lines. American Economic Review, 98(5), 2127–2149.
Kamenica, E., Mullainathan, S., & Thaler, R. 2011. Helping consumers know themselves. American Economic Review: Papers & Proceedings, 101(3), 417–422.
Knight, J. R., Sirmans, C. F., & Turnbull, G. K. 1994. List price signaling and buyer behavior in the housing market. The Journal of Real Estate Finance and Economics, 9(3), 177–192.
Larréché, J. C. 2008. The momentum effect: How to ignite exceptional growth. Wharton School Publishing, New Jersey, Pearson Education.
Lichtenstein, D. R., Netemeyer, R. G., & Burton, S. 1990. Distinguishing coupon proneness from value consciousness: An acquisition-transaction utility theory perspective. The Journal of Marketing, 54–67.
Maskin, E., & Riley, J. 1984. Monopoly with incomplete information. The RAND Journal of Economics, 15(2), 171–196.
Maskin, E., & Tirole, J. 1990. The principal-agent relationship with an informed principal: The case of private values. Econometrica, 379–409.
Milgrom, P., & Roberts, J. 1986. Price and advertising as a signal of quality. Journal of Political Economy, 94, 796–812.
Pigou, A. C. 1924. The economics of welfare. Transaction Publishers.
Robinson, J. 1933. The economics of imperfect competition. Macmillan, London.
Samuelson, P. 1999. Privacy as intellectual property. Stanford Law Review, 52, 1125.
Solove, D. J. 2007. I've got nothing to hide and other misunderstandings of privacy. San Diego Law Review, 44, 745.
Taylor, C. R. 2004. Consumer privacy and the market for customer information. The RAND Journal of Economics, 35(4), 631–650.
Varian, H. R. 1985. Price discrimination and social welfare. The American Economic Review, 75(4), 870–875.
Varian, H. R. 2002. Economic aspects of personal privacy. In Cyber Policy and Economics in an Internet Age, 127–137. Springer.
Wernerfelt, B. 1995. A rational reconstruction of the compromise effect: Using market data to infer utilities. Journal of Consumer Research, 21(4), 627–633.

APPENDICES

Appendix A: Proofs to Chapter One

This appendix contains the technical details omitted from the main text, including the proofs of all lemmas and propositions.

Proof of Lemma 1

First, we show that (i) and (ii) are equivalent. By (i), $M(1) \ge 0$ implies that $[0, M(1)]$ is non-empty. Consider an arbitrary $m$ such that $0 \le m \le M(1)$. Since $M^{-1}(\cdot)$ is increasing, we must have $M^{-1}(m) \le 1$. In addition, since Assumption 1 requires that $\theta_L(m) < \theta_H(m)$, we must have $m < M^{-1}(m)$, and therefore $0 \le M^{-1}(m)$. Set $m' = M^{-1}(m)$. We have established that $m' \in [0,1]$ and $m' \ne m$. Since $\theta_L(m') = \theta_H(m)$, by Definition 1, H-types are uninformed under $m$. By contrast, for any $m$ with $M(1) < m$, we must have $1 < M^{-1}(m)$. Thus H-types are informed in this market state. Conversely, when (ii) holds, $[0, M(1)]$ is non-empty, which implies $M(1) \ge 0$.

Next we show that (i) and (iii) are equivalent. Suppose $0 \le M(1)$. Since $M^{-1}(\cdot)$ is increasing, we must have $M^{-1}(0) \le M^{-1}[M(1)] = 1$. Conversely, suppose $M^{-1}(0) \le 1$. Again, because $M(\cdot)$ is increasing, $0 = M[M^{-1}(0)] \le M(1)$.

Finally, we show that (iii) and (iv) are equivalent. By (iii), $[M^{-1}(0), 1]$ is non-empty.
Thus, we can take any $m$ with $M^{-1}(0) \le m \le 1$, such that (1) $0 \le M(m)$, since $M(\cdot)$ is increasing, and (2) $M(m) \le 1$, since $M(m) \le m$ by Assumption 1. Therefore $M(m) \in [0,1]$. Set $m' \equiv M(m) = \theta_H^{-1}[\theta_L(m)]$. Rewriting gives $\theta_H(m') = \theta_L(m)$, thus L-types are uninformed under $m$ by Definition 1. By contrast, if $m < M^{-1}(0)$, then we must have $M(m) < 0$. Thus L-types are informed. Conversely, (iv) implies that $M^{-1}(0) \le m$. ∎

Equilibrium Concept and Refinement

Definition A (Perfect Bayesian Equilibrium): Any triple $\{s_i^*, r_i^*, \mu(s_i^*)\}_{i=H,L}$ is a Perfect Bayesian Equilibrium (PBE) if it satisfies both of the conditions below.

(A.1) Sequential Rationality:
(i) $r_i^*(s_i, \mu(s_i)) = 1$ if and only if $E[\alpha \mid \mu(s_i)] \ge \min\{\bar{p}, p_i\}$.
(ii) $S^* = (s_L^*, s_H^*)$ maximizes the firm's profit: $\Pi(S) = \lambda \min\{\bar{p}, p_H\}\, r_H^* + (1-\lambda) \min\{\bar{p}, p_L\}\, r_L^*$.

(A.2) Consistency (Bayes' Rule): If an uninformed consumer $i$ observes $s_i = (p_i, \bar{p})$, then her updated belief $\mu(s_i)$ follows Bayes' rule whenever possible.

As noted in the main text, we apply the D1 Criterion to eliminate unreasonable equilibria arising in the model with the uninformed firm. In this model D1 is equivalent to D2 (Cho and Kreps, 1987; Banks and Sobel, 1987; Cho and Sobel, 1990), since there are only two possible market states, $m_L$ or $m_H$, one of which is the true market state. One can interpret this criterion as a two-step check. The first step is to check whether there exists a market state under which the firm deviates less often under arbitrary consumer beliefs. For instance, if the firm has a strong incentive to deviate in market state $m_L$ whenever it has a weak incentive to deviate in $m_H$, then we say that the firm deviates less often under $m_H$ and more often under $m_L$. The second step is to check whether the consumer's updated belief assigns zero weight to the market state under which the firm deviates less often.
For instance, in the prev ious case if the uninformed consumer assigns positive probability that she is an H-type, then the equilibrium fails D1. Definition B (D1 Criterion): A perfect Bayesian equilibrium {𝑠 𝑖 ∗ ,𝑟 𝑖 ∗ ,𝜇 (𝑠 𝑖 ∗ )} 𝑖 =𝐻 ,𝐿 survives D1, if and only if for any out-of-equilibrium pricing strategy 𝑠 𝑖 ′ ≠𝑠 𝑖 ∗ , 𝑖 ∈ {𝐻 ,𝐿 }, such that whenever the following condition holds for 𝑗 ≠𝑘 ∈{𝐻 ,𝐿 }: ⋃ {𝑟 𝑖 |Π ∗ (𝑚 𝑗 )≤Π ̂ 𝑖 (𝑚 𝑗 ,𝑠 𝑖 ′ ,𝑟 𝑖 )} 𝜇 ⊊⋃ {𝑟 𝑖 |Π ∗ (𝑚 𝑘 )<Π ̂ 𝑖 (𝑚 𝑘 ,𝑠 𝑖 ′ ,𝑟 𝑖 )} 𝜇 , 81 where 𝑟 𝑖 is a best response to 𝑠 𝑖 ′ given an arbitrary belief 𝜇 ∈[0,1] , i.e., 𝑟 𝑖 = 𝐵𝑅 (𝑠 𝑖 ′ ,𝜇 ∈[0,1]) , consumer 𝑖 updates her belief as 𝜇 (𝑠 𝑖 ′ )={ 0, 𝑗 =𝐻 1, 𝑗 =𝐿 . Proof of Proposition 1(i)-(iii) When 𝑚 ∈𝓜 𝟎 , H-types are informed and, therefore, purchase at any price at or below 𝛼 𝐻 . Since 𝑀 (1)<𝑀 −1 (0) , 𝓜 𝟎 =(𝑀 (1),𝑀 −1 (0))∪[𝑀 −1 (0),1] . If 𝑀 ∈ (𝑀 (1),𝑀 −1 (0)) , t hen L-types are also informed, thus they are willing to pay at most 𝛼 𝐿 . Therefore, the firm’s optimal pricing strategy is 𝑝 𝐻 ∗ =𝛼 𝐻 , and 𝑝 𝐿 ∗ =𝛼 𝐿 . Otherwise if 𝑀 ∈[𝑀 −1 (0),1] , then 𝐿 -types are uninformed. To show 𝑠 𝐿 ∗ is optimal for the firm, consider an arbitrary deviation 𝑠 𝐿 ′ =(𝑝 𝐿 ′ ,𝑝 ̅ ′ )≠𝑠 𝐿 ∗ . A deviation 𝑠 𝐿 ′ is profitable if and only it convinces L-types to pay more, which requires a belief 𝜇 (𝑠 𝐿 ′ )>0. Therefore by (iii), Π ̂ 𝐿 (𝑚 𝐿 ,𝑠 𝐿 ′ ,𝑟 𝐿 (𝑠 𝐿 ′ ,𝑚 𝐻 ))≤Π ∗ (𝑚 𝐿 ) , wh ere 𝑟 𝐿 (𝑠 𝐿 ′ ,𝑚 𝐻 )=1 if 𝑝 𝐿 ′ ≤𝛼 𝐻 . Since the belief 𝑚 𝐻 is most favorable for the firm, we have Π ̂ 𝐿 (𝑚 𝐿 ,𝑠 𝐿 ′ ,𝑟 𝐿 (𝑠 𝐿 ′ ,𝑚 𝐻 ))≥ Π ̂ 𝐿 (𝑚 𝐿 ,𝑠 𝐿 ′ ,𝐵𝑅 (𝑠 𝐿 ′ ,𝜇 )) for any 𝜇 ≤1, where 𝐵𝑅 represents best response. It then implies that Π ̂ 𝐿 (𝑚 𝐿 ,𝑠 𝐿 ′ ,𝐵𝑅 (𝑠 𝐿 ′ ,𝜇 ))≤Π ∗ (𝑚 𝐿 ) . Thus, no deviation 𝑠 𝐿 ′ is more firm-profitable than 𝑠 𝐿 ∗ . In addition, 𝑟 𝐿 ∗ =1 is a best response, since 𝑝 𝐿 ∗ =𝛼 𝐿 . Therefore (i) and (ii) satisfy the sequential rationality conditions. Finally, since by (i) 𝑠 𝐿 ∗ is offered only when 𝑚 ∈𝓜 𝟎 , the updated belief 𝜇 (𝑠 𝐿 ∗ )=0 satisfies Bayes’ rule. 
Now suppose 𝑚 ∈𝓜 𝟏 =[0,𝑀 (1)] . Since 𝑀 (1)<𝑀 −1 (0) , therefore 𝑚 < 𝑀 −1 (0) , so L-types are informed and the only uninformed consumers are H-types. For H- types, 𝑚 𝐿 =𝑀 −1 (𝑚 )∈[𝑀 −1 (0),1]⊂[𝑀 (1),1]=𝓜 𝟎 , therefore, by (i) the firm ’s equilibrium profit under 𝑚 𝐿 is Π ∗ (𝑚 𝐿 )=𝜆 𝛼 𝐻 +(1−𝜆 )𝛼 𝐿 =𝑝 𝐻 ∗ . Next, we e xamine H- types’ estimate of the firm’s profit for an arbitrary deviation pricing scheme 𝑠 𝐻 ′ = (𝑝 𝐻 ′ ,𝑝 ̅ ′ )≠𝑠 𝐻 ∗ . Clearly, 𝑠 𝐻 ′ is never profitable unless H-types held the belief 𝜇 (𝑠 𝐻 ′ )>0. 82 Then, by (iii), we must have 𝜇 (𝑠 𝐻 ′ )=1 with Π ̂ 𝐻 (𝑚 𝐿 ,𝑠 𝐻 ′ ,𝑟 𝐻 (𝑠 𝐻 ′ ,𝑚 𝐻 ))≤Π ∗ (𝑚 𝐿 ) . When H-types consider that the market state is 𝑚 𝐿 , then they assume the following: (1) they are L-types (i.e. 𝜇 (𝑠 𝐻 ′ )=0 ); (2) the other consumers are informed H-types and pay at min{𝑝 ̅ ′ ,𝛼 𝐻 } . Thus, H-types assess the firm’s profit in 𝑚 𝐿 when they are tricked into overpaying as Π ̂ 𝐻 (𝑚 𝐿 ,𝑠 𝐻 ′ ,𝑟 𝐻 (𝑠 𝐻 ′ ,𝑚 𝐻 ))=𝜆 min{𝑝 ̅ ′ ,𝛼 𝐻 }+(1−𝜆 )𝑝 𝐻 ′ 𝑟 𝐻 . Since 𝑟 𝐻 =1 only if H-types observe 𝑝 𝐻 ′ ≤𝛼 𝐻 , and since it is a dominant strategy to set 𝑝 ̅ ′ ≥𝑝 𝐻 ′ , we have Π ̂ 𝐻 (𝑚 𝐿 ,𝑠 𝐻 ′ ,𝑟 𝐻 (𝑠 𝐻 ′ ,𝑚 𝐻 ))≥𝑝 𝐻 ′ . Because Π ∗ (𝑚 𝐿 )=𝑝 𝐻 ∗ , the inequality Π ̂ 𝐻 (𝑚 𝐿 ,𝑠 𝐻 ′ ,𝑟 𝐻 (𝑠 𝐻 ′ ,𝑚 𝐻 ))≤Π ∗ (𝑚 𝐿 ) implies 𝑝 𝐻 ′ ≤𝑝 𝐻 ∗ . In addition, L-types are informed and pay no more than 𝛼 𝐿 , thus 𝑝 𝐿 ′ ≤𝛼 𝐿 =𝑝 𝐿 ∗ . Therefore, the firm could never be better off with any deviation 𝑠 𝐻 ′ than the equilibrium 𝑠 𝐻 ∗ . Since 𝑟 𝐻 ∗ =1 is a best response, since 𝑝 𝐻 ∗ <𝛼 𝐻 , the equilibrium satisfies the sequential rationally conditions. Finally , since , by (i), 𝑠 𝐻 ∗ (𝑚 ∈𝓜 𝟏 ) is offered to only uninformed H-types, thus 𝜇 (𝑠 𝐻 ∗ )=1 as specified in (iii) satisfy Bayes’ rule. ∎ Proof of Proposition 1(iv) We first establish that the PBE described in (i)-(iii) survives D1. 
Suppose that 𝑗 =𝐻 and 𝑘 =𝐿 , the strict inclusion condition in Definition B implies that for any 𝑠 𝑖 ′ , the set ⋃ {𝑟 𝑖 |Π ∗ (𝑚 𝐿 )<Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 )} 𝜇 ≠∅, which then implies 𝜇 (𝑠 𝑖 ′ )=0 by (iii). Therefore, the equilibrium survives D1 when 𝑗 =𝐻 . Next, we show the case for 𝑗 =𝐿 and 𝑘 =𝐻 . Specially we want to show that 𝜇 (𝑠 𝑖 ′ )=1 for any 𝑠 𝑖 ′ such that ⋃ {𝑟 𝐿 |Π ∗ (𝑚 𝐿 )≤ 𝜇 Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 )}⊊⋃ {𝑟 𝐿 |Π ∗ (𝑚 𝐻 )<Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 𝑖 ′ ,𝑟 𝑖 )} 𝜇 . Suppose 𝜇 (𝑠 𝑖 ′ )≠1 by contradiction, then by (iii), 𝜇 (𝑠 𝑖 ′ )=0 and Π ∗ (𝑚 𝐿 )<Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 ) , ⋃ {𝑟 𝐿 |Π ∗ (𝑚 𝐿 )> 𝜇 Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 )}=∅ , violating the strict inclusion condition. Therefore 𝜇 (𝑠 𝑖 ′ )=1 . The equilibrium survives D1 when 𝑗 =𝐿 . Next, we show the uniqueness claim of (iv). T his is done in two steps. 83 Step 1: Any separating PBE that is not specified in (i)-(iii) fails D1 For convenience of the notation, we denote E[𝛼 ]≡𝜆 𝛼 𝐻 +(1−𝜆 )𝛼 𝐿 and E 2 [𝛼 ]≡𝜆 2 𝛼 𝐻 +(1−𝜆 2 )𝛼 𝐿 . Suppose there exist another PBE, i.e., (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 ≠ (𝑠 𝑖 ∗ ,𝑟 𝑖 ∗ ,𝜇 ) 𝑖 =𝐻 ,𝐿 that survives D1, where 𝑟 ̃ 𝑖 =𝐵𝑅 (𝑠 ̃ 𝑖 ,𝜇̃) . Denote Π ̃ (𝑚 ) as the equilibrium profit under (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 . First we derive the equilibrium profit Π ̃ (𝑚 ) : When 𝑚 ∈𝓜 𝟎 , 𝐻 -types are informed. Since (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 is separating, 𝜇̃(𝑠 ̃ 𝐿 )=0, L-types pay at most 𝛼 𝐿 . Therefore Π ̃ (𝑚 ∈𝓜 𝟎 )=E[𝛼 ] . When 𝑚 ∈𝓜 𝟏 , L-types are informed. Since 𝜇̃(𝑠 ̃ 𝐻 )=1 , Π ̃ (𝑚 ∈ℳ 1 )=𝜆 𝑝̃ 𝐻 +(1−𝜆 )𝛼 𝐿 . Since (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 is a PBE and 𝑠 ̃ 𝐻 ≠ 𝑠 𝐻 ∗ , we must have 𝑝̃ 𝐻 <E[𝛼 ]. To see this, suppose the contrary that either 𝑝̃ 𝐻 >E[𝛼 ], or 𝑝̃>𝑝̃ 𝐻 =E[𝛼 ] , then the firm always deviate to 𝑠 ̃ 𝐻 in 𝑚 ∈𝓜 𝟎 under the belief 𝜇̃(𝑠 ̃ 𝐻 )=1 , because Π ̂ 𝐿 (𝑚 ∈𝓜 𝟎 ,𝑠 ̃ 𝐻 ,𝐵𝑅 (𝑠 ̃ 𝐻 ,𝜇̃(𝑠 ̃ 𝐻 )))=𝜆 𝑝̃+(1−𝜆 )𝑝̃ 𝐻 >E[𝛼 ]= Π ̃ (𝑚 ∈𝓜 𝟎 ) , contradicting with 𝜇̃(𝑠 ̃ 𝐻 )=1 . Therefore, 𝑝̃ 𝐻 <E[𝛼 ] , and consequently Π ̃ (𝑚 ∈𝓜 𝟏 )<E 2 [𝛼 ]. Let 𝑠 𝑖 ′ ={𝑝 𝑖 ′ ,𝑝 ′ } and 𝑝 𝑖 ′ =𝑝 ′ ∈(𝑝̃ 𝐻 ,E[𝛼 ]) . 
We show that 𝜇̃(𝑠 𝑖 ′ )<1 : Suppose the contrary that 𝜇̃(𝑠 𝑖 ′ )=1 , then the firm always prefers deviating 𝑠 𝐻 ′ to 𝑠 ̃ 𝐻 in 𝑚 ∈ 𝓜 𝟏 because 𝑝 𝐻 ′ >𝑝̃ 𝐻 , therefore 𝑠 ̃ 𝐻 cannot be on a PBE. Next we want to show that for an arbitrary 𝜇̃ and 𝑟 𝑖 =𝐵𝑅 (𝑠 𝑖 ′ ,𝜇̃) , Π ̃ (𝑚 𝐻 )<Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 𝑖 ′ ,𝑟 𝑖 ) whenever Π ̃ (𝑚 𝐿 )≤ Π ̂ 𝑖 (𝑚 𝐿 𝑠 𝑖 ′ ,𝑟 𝑖 ) . When consumer 𝑖 is uninformed, 𝑚 =𝑚 𝐿 is equivalent to 𝑚 ∈𝓜 𝟎 , thus Π ̃ (𝑚 𝐿 )=Π ̃ (𝑚 ∈𝓜 𝟎 )=E[𝛼 ] . To substitute in the profit function, consider 𝑟 𝑖 = 𝐵𝑅 (𝑠 𝑖 ′ ,𝜇̃) as that consumer 𝑖 accepts 𝑝 𝑖 ′ in probability 𝜇̃ , thus Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 )=(1− 𝜆 )𝜇̃𝑝 ′ +𝜆 𝑝 ′ . Since 𝑝 ′ <E[𝛼 ] and Π ̃ (𝑚 𝐿 )≤Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 ) imply that 𝜇̃𝑝 ′ ≥ E[𝛼 ]−𝜆 𝑝 ′ 1−𝜆 > E[𝛼 ] , thus Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 𝑖 ′ ,𝑟 𝑖 )=𝜆 𝜇̃𝑝 ′ +(1−𝜆 )𝛼 𝐿 ≥E 2 [𝛼 ] . But from previous discussion, Π ̃ (𝑚 𝐻 )=Π ̃ (𝑚 ∈𝓜 𝟏 )<E 2 [𝛼 ], thus Π ̃ (𝑚 𝐻 )<Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 𝑖 ′ ,𝑟 𝑖 ) . By Definition B, we must have 𝜇̃(𝑠 𝑖 ′ )=1 , which contradicts with the assumption that 𝜇̃(𝑠 𝑖 ′ )<1 . Therefore, the separating equilibrium with 𝑠 𝑖 ′ fails D1. 84 Step 2: Any pooling PBE fails D1 Since (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 is pooling, then 𝜇̃(𝑠 ̃ 𝐿 )=𝜇̃(𝑠 ̃ 𝐻 )>0, and since uninformed consumers cannot update their belief from the pricing scheme, they estimate their value as E[𝛼 |𝑠 ̃ 𝑖 ]=𝑒 , therefore it is optimal for the firm to set 𝑝̃≤𝑒 , and Π ̃ (𝑚 ∈𝓜 𝟎 )≤𝑒 . Comparing to the separating equilibrium in Proposition 1, Π ∗ (𝑚 ∈𝓜 𝟎 )=E[𝛼 ] , thus when 𝑒 ≤E[𝛼 ], there is no pooling because the firm prefers to use 𝑠 𝐿 ∗ to inform L-types their true type. Now consider the case when 𝑒 >E[𝛼 ] . We show that the belief 𝜇̃(𝑠 ̃ 𝑖 )>0 violates D1. Particularly, we show that for any 𝜇̃ and 𝑟 𝑖 =𝐵𝑅 (𝑠 ̃ 𝑖 ,𝜇̃) , Π ∗ (𝑚 ∈𝓜 𝟎 )< Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 ̃ 𝑖 ,𝑟 𝑖 ) , whenever Π ∗ (𝑚 ∈𝓜 𝟏 )≤Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 ̃ 𝑖 ,𝑟 𝑖 ) . Consider the pricing scheme in pooling, 𝑝̃=𝑝̃ 𝑖 =𝑒 . Since Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 ̃ 𝑖 ,𝑟 𝑖 )=𝜆 𝜇̃𝑒 +(1−𝜆 )𝛼 𝐿 , Π ∗ (𝑚 ∈𝓜 𝟏 )≤ Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 ̃ 𝑖 ,𝑟 𝑖 ) implies that 𝜇̃𝑒 ≥E[𝛼 ] . 
Since 𝑒 >E[𝛼 ] , Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 )=(1−𝜆 )𝜇̃𝑒 + 𝜆𝑒 ≥(1−𝜆 )E[𝛼 ]+𝜆𝑒 >E[𝛼 ]=Π ∗ (𝑚 ∈𝓜 𝟎 ) . Therefore by Definition A, 𝜇̃(𝑠 ̃ 𝑖 )=0, which contradicts with 𝜇̃(𝑠 ̃ 𝑖 )>0 in the pooling equilibrium. Thus consumer i should never assign positive probability that she is an H-type when she observes 𝑝̃=𝑝̃ 𝑖 =𝑒 . Therefore, the pooling equilibrium is not optimal. Steps 1 and 2 establish the uniqueness claim in (iv). ∎ Proof of Proposition 2: In Proposition 1, we have shown the case when 𝑁 =1. To show that the claim holds for an arbitrary 𝑁 >1, we use the mathematical induction method: Suppose that the claim holds for 𝑁 =𝑘 , where 𝑘 >1 is arbitrary, we show that the claim is true for 𝑁 =𝑘 + 1. For convenience of notation, denote E 𝑛 [𝛼 ]≡𝜆 𝑛 𝛼 𝐻 +(1−𝜆 𝑛 )𝛼 𝐿 for any n. First, w e establish existence of a PBE satisfying properties (i)-(iii). Since the claim holds for 𝑁 =𝑘 , the equilibr ium in (i)-(iii) is a PBE for any 𝑚 ∈ 𝓜 𝒏 , 𝑛 =1,…,𝑘 . Now suppose 𝑁 =𝑘 +1. When 𝑚 ∈𝓜 𝒌 +𝟏 , L-types are informed. H- types are uninformed and consider two possible states 𝑚 𝐿 or 𝑚 𝐻 . Since 𝑀 −1 (𝑚 )∈𝓜 𝒌 , 85 we have 𝑚 𝐿 ∈𝓜 𝒌 for H-types, thus the equilibrium price 𝑝 𝐻 ∗ (𝑚 𝐿 )=E 𝑘 [𝛼 ] and the firm’s profit Π ∗ (𝑚 𝐿 )=E 𝑘 +1 [𝛼 ] . The estimated profit in deviation to 𝑠 𝑖 ′ ={𝑝 𝑖 ′ ,𝑝 ̅ ′ } is Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝐵𝑅 (𝑠 𝑖 ′ ,𝑚 𝐻 ))=𝜆 𝑝 ̅ ′ +(1−𝜆 )𝑝 𝑖 ′ for any 𝑝 𝑖 ′ ≤𝑝 ̅ ′ ≤𝛼 𝐻 . By the consumer belief (iii), 𝜇 (𝑠 𝑖 ′ )=1 if and only if Π ∗ (𝑚 𝐿 )≥ Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝐵𝑅 (𝑠 𝑖 ′ ,𝑚 𝐻 )). If 𝑝 ̅ ′ >Π ∗ (𝑚 𝐿 ) , then Π ∗ (𝑚 𝐿 )≥Π ̂ 𝐻 (𝑚 𝐿 ,𝑠 𝐻 ′ ,𝐵𝑅 (𝑠 𝐻 ′ ,𝑚 𝐻 )) implies 𝑝 𝐻 ′ <Π ∗ (𝑚 𝐿 ) . Because 𝑠 𝑖 ′ is arbitrary, the firm could do no better than with the equilibrium pricing scheme 𝑠 𝐻 ∗ defined in (i), therefore (i) is firm-optimal given the belief in (iii). Additionally, since by (i) 𝑠 𝐻 ∗ is offered only in 𝑚 ∈𝓜 𝒌 +𝟏 when the uniformed consumers are H-types, thus (iii): 𝜇 (𝑠 𝐻 ∗ )=1 satisfies Bayes’ rule, and (ii): H-types purchase, 𝑟 𝐻 ∗ =1 is a best response, since 𝑝 𝐻 ∗ <𝛼 𝐻 . 
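The induction step above relies on the identity that an equilibrium price of $E_k[\alpha]$ for uninformed H-types yields a firm profit of $E_{k+1}[\alpha]$. The Python snippet below is a minimal sketch of that arithmetic, assuming the profit form $\lambda p_H + (1-\lambda)\alpha_L$ used in the proof; the function names and parameter values are hypothetical.

```python
import math


def e_n(n, lam, alpha_h, alpha_l):
    """E_n[alpha] = lam**n * alpha_H + (1 - lam**n) * alpha_L."""
    return lam ** n * alpha_h + (1.0 - lam ** n) * alpha_l


def profit_given_h_price(p_h, lam, alpha_l):
    """Firm profit when H-types pay p_h and informed L-types pay alpha_L."""
    return lam * p_h + (1.0 - lam) * alpha_l


lam, alpha_h, alpha_l = 0.8, 2.0, 1.0  # illustrative values only
for k in range(0, 6):
    price = e_n(k, lam, alpha_h, alpha_l)
    profit = profit_given_h_price(price, lam, alpha_l)
    # The profit from charging E_k[alpha] should coincide with E_{k+1}[alpha].
    print(k, price, profit, e_n(k + 1, lam, alpha_h, alpha_l))
```

Because $\lambda \in (0,1)$ and $\alpha_H > \alpha_L$, the sequence $E_n[\alpha]$ decreases in $n$ toward $\alpha_L$, which is consistent with prices falling as the market state moves to higher-indexed sets.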
Second, we establish that the PBE described in (i)-(iii) satisfies D1. The argument is the same as in the proof to proposition 1: 𝜇 (𝑠 𝑖 ′ )={0,1} in (iii) implies that the PBE cannot fail D1. Third, we establish the uniqueness claim by using the same notations in the proof to proposition 1. The proof is similar so we only show Step 1: Any separating PBE that is not specified in (i)-(iii) fails D1. Denote an arbitrary separating equilibrium (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 ≠(𝑠 𝑖 ∗ ,𝑟 𝑖 ∗ ,𝜇 ∗ ) 𝑖 =𝐻 ,𝐿 and Π ̃ (𝑚 ) as the equilibrium profit. Suppose that for the uninformed consumer 𝑖 , 𝑚 𝐿 ∈𝓜 𝒌 and 𝑚 𝐻 ∈𝓜 𝒌 +𝟏 , then since (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 is separating, we have 𝑠 ̃ 𝐿 (𝑚 𝐿 )≠𝑠 ̃ 𝐻 (𝑚 𝐻 ) . We derive the equilibrium profit Π ̃ (𝑚 ) : When 𝑚 ∈𝓜 𝒌 , since there is no other separating equilibrium for 𝑁 =𝑘 , we have Π ̃ (𝑚 ∈𝓜 𝒌 )=Π ∗ (𝑚 ∈𝓜 𝒌 )=E 𝑘 +1 [𝛼 ] . When 𝑚 ∈𝓜 𝒌 +𝟏 , L-types are informed. Since 𝜇̃(𝑠 ̃ 𝐻 )=1 , Π ̃ (𝑚 ∈𝓜 𝒌 +𝟏 )=𝜆 𝑝̃ 𝐻 + (1−𝜆 )𝛼 𝐿 . Since (𝑠 ̃ 𝑖 ,𝑟 ̃ 𝑖 ,𝜇̃) 𝑖 =𝐻 ,𝐿 is a PBE and 𝑠 ̃ 𝐻 ≠𝑠 𝐻 ∗ , we must have 𝑝̃ 𝐻 <E 𝑘 +1 [𝛼 ]. To see this, suppose the contrary that either 𝑝̃ 𝐻 >E 𝑘 +1 [𝛼 ], or 𝑝̃>𝑝̃ 𝐻 , then the firm always deviate to 𝑠 ̃ 𝐻 in 𝑚 ∈𝓜 𝒌 under the belief 𝜇̃(𝑠 ̃ 𝐻 )=1, contradicting with 𝜇̃(𝑠 ̃ 𝐻 )=1 is 86 on a PBE. Therefore, we must have Π ̃ (𝑚 ∈𝓜 𝒌 +𝟏 )<E 𝑘 +2 [𝛼 ]. Let 𝑠 𝑖 ′ ={𝑝 𝑖 ′ ,𝑝 ′ } and 𝑝 𝑖 ′ =𝑝 ′ ∈(𝑝̃ 𝐻 ,E 𝑘 +1 [𝛼 ]) . First we show that 𝜇̃(𝑠 𝑖 ′ )<1 , otherwise the firm always prefers deviating 𝑠 𝑖 ′ to 𝑠 ̃ 𝐻 in 𝑚 ∈𝓜 𝒌 , therefore 𝑠 ̃ 𝐻 cannot be on a PBE. Next we want to show that Π ̃ (𝑚 𝐻 )<Π ̂ 𝑖 (𝑚 𝐻 ,𝑠 𝑖 ′ ,𝑟 𝑖 ) for an arbitrary 𝜇̃ and 𝑟 𝑖 =𝐵𝑅 (𝑠 𝑖 ′ ,𝜇̃) , whenever Π ̃ (𝑚 𝐿 )≤Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 ) . When 𝑚 =𝑚 𝐿 ∈𝓜 𝒌 , Π ̃ (𝑚 𝐿 )= Π ̃ (𝑚 ∈𝓜 𝒌 )=E 𝑘 +1 [𝛼 ]. To substitute in the profit function, consider 𝑟 𝑖 =𝐵𝑅 (𝑠 𝑖 ′ ,𝜇̃) as that consumer 𝑖 accepts 𝑝 𝑖 ′ in probability 𝜇̃ , thus Π ̂ 𝑖 (𝑚 𝐿 ,𝑠 𝑖 ′ ,𝑟 𝑖 )=[(1−𝜆 )𝜇̃+𝜆 ]𝑝 ′ . 
Together, $p' < E_{k+1}[\alpha]$ and $\tilde{\Pi}(m_L) \le \hat{\Pi}_i(m_L, s_i', r_i)$ imply that $\tilde{\mu} p' \ge \frac{E_{k+1}[\alpha] - \lambda p'}{1-\lambda} > E_{k+1}[\alpha]$, and thus $\hat{\Pi}_i(m_H, s_i', r_i) = \lambda \tilde{\mu} p' + (1-\lambda)\alpha_L > \lambda E_{k+1}[\alpha] + (1-\lambda)\alpha_L = E_{k+2}[\alpha]$. Because $\tilde{\Pi}(m_H) = \tilde{\Pi}(m \in \mathcal{M}_{k+1}) < E_{k+2}[\alpha]$, we must have $\tilde{\Pi}(m_H) < \hat{\Pi}_i(m_H, s_i', r_i)$. Thus, by Definition B, $\tilde{\mu}(s_i') = 1$, contradicting the assumption that $\tilde{\mu}(s_i') < 1$. Therefore any separating equilibrium with $s_i'$ fails D1. ∎

Lemma A2

There exist two real values of $\lambda$, $\lambda_{HE1} < \lambda_{HE2} \in (0,1)$, such that $\Pi_U(\alpha_H; \lambda) - \Pi_U(e; \lambda) = 0$, if and only if $(1 - 2\rho_H) \ge \sqrt{\alpha_L/\alpha_H}$.

Proof of Lemma A2

We start by writing

$$\Pi_U(\alpha_H; \lambda) - \Pi_U(e; \lambda) = \frac{G_1(\lambda)}{\rho_L(1-\lambda) + \rho_H \lambda},$$

where

$$G_1(\lambda) = \left\{\rho_L^2 \alpha_L - \rho_H^2 \alpha_H - \rho_L\left[(\alpha_H - \alpha_L) - 2(\rho_H \alpha_H - \rho_L \alpha_L)\right]\right\}\lambda^2 + \rho_L\left[(\alpha_H - \alpha_L) - 2(\rho_H \alpha_H - \rho_L \alpha_L)\right]\lambda - \rho_L^2 \alpha_L.$$

$G_1(\lambda)$ is a quadratic function with the following properties: $G_1(0) = -\rho_L^2 \alpha_L < 0$ and $G_1(1) = -\rho_H^2 \alpha_H < 0$. Therefore, $\Pi_U(\alpha_H; \lambda) \ge \Pi_U(e; \lambda)$ for some $\lambda \in (0,1)$ only if there exist two roots $\lambda_{HE1}, \lambda_{HE2} \in (0,1)$ of $G_1(\lambda)$. Because $G_1(\lambda)$ is quadratic, it suffices to examine three conditions to check whether such roots exist:

(1) The discriminant of $G_1(\lambda) = 0$ is positive;
(2) $\left.\frac{\partial G_1}{\partial \lambda}\right|_{\lambda=0} > 0$;
(3) $\left.\frac{\partial G_1}{\partial \lambda}\right|_{\lambda=1} < 0$.

Below we show that the intersection of conditions (1) through (3) is equivalent to $(1 - 2\rho_H) \ge \sqrt{\alpha_L/\alpha_H}$. Note that condition (1) is equivalent to $(1 - 2\rho_H)^2 > \frac{\alpha_L}{\alpha_H}$. Condition (2) is equivalent to $(1 - 2\rho_H) > (1 - 2\rho_L)\frac{\alpha_L}{\alpha_H}$, which then implies $\rho_H \le \tfrac{1}{2}$; otherwise, if $\rho_H > \tfrac{1}{2}$, then $\rho_L > \tfrac{1}{2}$ and $\rho_H + \rho_L > 1$, contradicting the restriction that $M_H^L(m) < M_L^H(-m)$. Condition (3) is equivalent to $\frac{2\rho_H^2}{\rho_L} - 2\rho_H + 1 \ge \frac{\alpha_L}{\alpha_H}$. If $\rho_H \le \tfrac{1}{2}$, then $\frac{2\rho_H^2}{\rho_L} - 2\rho_H + 1 > (1 - 2\rho_H)^2$, thus conditions (1) and (2) imply condition (3). Further, condition (1) and $\rho_H \le \tfrac{1}{2}$ are equivalent to $(1 - 2\rho_H) \ge \sqrt{\alpha_L/\alpha_H}$.
It remains to show that $\sqrt{\alpha_L/\alpha_H} > (1 - 2\rho_L)\frac{\alpha_L}{\alpha_H}$, so that the previous inequality is also sufficient for condition (2). Since $\rho_L, \frac{\alpha_L}{\alpha_H} \in (0,1)$, we have $|1 - 2\rho_L| < 1$ and $\sqrt{\alpha_L/\alpha_H} > \frac{\alpha_L}{\alpha_H}$, thus $\sqrt{\alpha_L/\alpha_H} > \frac{\alpha_L}{\alpha_H} > |1 - 2\rho_L|\frac{\alpha_L}{\alpha_H} \ge (1 - 2\rho_L)\frac{\alpha_L}{\alpha_H}$. ∎

Proof of Proposition 3

(i) Suppose $(1 - 2\rho_H) \ge \sqrt{\alpha_L/\alpha_H}$. Then, from Lemma A2, $\Pi_U(\alpha_H; \lambda) > \Pi_U(e; \lambda)$ if and only if $\lambda_{HE1} \le \lambda \le \lambda_{HE2}$. And, by the definition of $\lambda_{HL}$, $\Pi_U(\alpha_H; \lambda) > \Pi_U(\alpha_L; \lambda)$ if and only if $\lambda > \lambda_{HL}$. Since $\lambda \in (0,1)$, this last condition requires $\lambda_{HL} < 1$, which we establish next. The condition $1 - 2\rho_H \ge \sqrt{\alpha_L/\alpha_H} > 0$ implies $\rho_H < \tfrac{1}{2}$, which further implies $1 - \rho_H > (1 - 2\rho_H)^2$. Thus, $(1 - 2\rho_H)^2 \ge \alpha_L/\alpha_H$ implies $(1 - \rho_H) > \alpha_L/\alpha_H$, which is equivalent to $\lambda_{HL} < 1$. Therefore, $\Pi_U(\alpha_H; \lambda) > \max\{\Pi_U(e; \lambda), \Pi_U(\alpha_L; \lambda)\}$ if and only if $\max\{\lambda_{HE1}, \lambda_{HL}\} \le \lambda \le \lambda_{HE2}$ and $(1 - 2\rho_H) \ge \sqrt{\alpha_L/\alpha_H}$.

(ii) Suppose the condition of (i) does not hold. Then the uniform price of $\alpha_H$ is not optimal, and we need only compare the profits $\Pi_U(\alpha_L; \lambda)$ and $\Pi_U(e; \lambda)$. We show that $\Pi_U(e; \lambda) > \Pi_U(\alpha_L; \lambda) \iff \lambda > \lambda_{LE}$, where $\lambda_{LE} \in (0,1)$ is unique. This ordering of profits is equivalent to $G_2(\lambda) < 0$, where

$$G_2(\lambda) \equiv -\left[\rho_H \alpha_H - \rho_L \rho_H \alpha_H - \rho_L \alpha_L + \rho_L^2 \alpha_L\right]\lambda^2 - \left[\rho_L \rho_H \alpha_H + 2\rho_L \alpha_L - 2\rho_L^2 \alpha_L - \rho_H \alpha_L\right]\lambda + \rho_L(1 - \rho_L)\alpha_L.$$

Since $G_2(\cdot)$ is continuous, $G_2(0) = \rho_L(1 - \rho_L)\alpha_L > 0$, and $G_2(1) = -\rho_H(\alpha_H - \alpha_L) < 0$, by the intermediate value theorem there must exist at least one $\lambda_{LE} \in (0,1)$ such that $G_2(\lambda_{LE}) = 0$. Suppose that there exists another $\lambda_{LE}' \in (0,1)$ such that $\lambda_{LE}' \ne \lambda_{LE}$ and $G_2(\lambda_{LE}') = 0$. Then, because $G_2(\lambda_{LE}) = G_2(\lambda_{LE}') = 0$, $G_2$ is quadratic, and $G_2(0) > 0$, we must have $G_2(1) > 0$, which contradicts $G_2(1) < 0$. Therefore there exists a unique $\lambda_{LE} \in (0,1)$. ∎

Figure A1 illustrates a numerical example for Proposition 2.
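The uniqueness argument for $\lambda_{LE}$ in part (ii) can be illustrated numerically. The Python snippet below is a hedged sketch, not the computation behind Figure A1: it takes the quadratic $G_2$ as written in the proof, solves it in closed form, and keeps the roots in $(0,1)$. The function names and the parameter values are illustrative assumptions, chosen so that the condition of part (i) fails.

```python
import math


def g2_coefficients(rho_h, rho_l, alpha_h, alpha_l):
    """Coefficients (a, b, c) of G2(lam) = a*lam**2 + b*lam + c from the proof."""
    a = -(rho_h * alpha_h - rho_l * rho_h * alpha_h - rho_l * alpha_l + rho_l ** 2 * alpha_l)
    b = -(rho_l * rho_h * alpha_h + 2 * rho_l * alpha_l - 2 * rho_l ** 2 * alpha_l - rho_h * alpha_l)
    c = rho_l * (1 - rho_l) * alpha_l
    return a, b, c


def g2(lam, rho_h, rho_l, alpha_h, alpha_l):
    a, b, c = g2_coefficients(rho_h, rho_l, alpha_h, alpha_l)
    return a * lam ** 2 + b * lam + c


def roots_in_unit_interval(a, b, c):
    """Real roots of a*x**2 + b*x + c that lie strictly inside (0, 1)."""
    if abs(a) < 1e-12:  # degenerate (linear) case
        candidates = [] if abs(b) < 1e-12 else [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            candidates = []
        else:
            root = math.sqrt(disc)
            candidates = [(-b - root) / (2 * a), (-b + root) / (2 * a)]
    return sorted(x for x in candidates if 0.0 < x < 1.0)


params = dict(rho_h=0.59, rho_l=0.30, alpha_h=2.0, alpha_l=1.0)  # illustrative only
roots = roots_in_unit_interval(*g2_coefficients(**params))
print(roots, g2(0.0, **params), g2(1.0, **params))
```

For these inputs $G_2$ is positive at $\lambda = 0$ and negative at $\lambda = 1$, and exactly one root falls in $(0,1)$, in line with the intermediate-value and uniqueness steps of the proof.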
Figure A1: Firm's Profit with Uniform Prices

Proof of Proposition 4

Since

$$\frac{\partial\left[\Pi_I - \Pi_U(e)\right]}{\partial \rho_H} = \frac{-(1-\lambda)}{w(1-\lambda) + \lambda}\left[(\alpha_H - \alpha_L)\lambda(w + \lambda) + w(w + \lambda)\alpha_H(1-\lambda) + w\lambda^2 \alpha_L\right] < 0,$$

the ordering of $\Pi_I$ and $\Pi_U(e)$ holds as claimed. The relative profitability of price discrimination is determined by the ordering of $\rho_H$ and $\rho_1$. We only need one numerical example to show that a feasible parameter space exists for $\rho_H > \rho_1$. Suppose $\alpha_H = 2$, $\alpha_L = 1$, $\beta_H = 1$, $\beta_L = 2$, $\lambda = 0.8$, and $m = 0.55$; then $\rho_H \approx 0.59$ and $\rho_1 \approx 0.57$. Note that $\rho_L \approx 0.30$ and $\rho_H, \rho_L, \rho_H + \rho_L \in (0,1)$, satisfying all restrictions on the parameter space. Since $\rho_H > \rho_1$ at this point, $\Pi_I < \Pi_U(e)$ is feasible. Moreover, given the continuity of $\Pi_I - \Pi_U(e)$ in all parameter values, there is a feasible range of parameters for which $\rho_H > \rho_1$. ∎

Proof of Proposition 5

Since $\rho_3 > \rho_2$, the existence of feasible parameters such that $\rho_1 > \rho_3$ implies that $\rho_1 > \rho_3$ is a feasible condition. Moreover, since $\rho_H > \rho_2$ is equivalent to $CS_I > CS_U(e)$, it remains to show that this condition is feasible in the parameter space. Since Assumption 3 implies that $\rho_H < \frac{1}{1+w}$, feasibility requires $\frac{1}{1+w} > \frac{w}{\lambda + w(2-\lambda)}$. Because $w > 0$, this necessary condition is equivalent to $w < 1$, which is a feasible condition. Therefore $\rho_H > \rho_3$ is also feasible. ∎

Proof of Proposition 6

When $p_U^* = \alpha_H$, all consumers obtain zero surplus from the uninformed firm. But when the firm uses personalized pricing, L-types pay $e > \alpha_L$ whenever they are uninformed, and thus they obtain a negative expected surplus from a discriminating firm. However, naïve H-types obtain a positive surplus by paying $e < \alpha_H$ to a discriminating firm when they are uninformed. ∎

Appendix B: Proofs to Chapter Two

Proof of Lemma 1

Solve the optimization problem in Equation (1). Note that if $q_3 > 0$, then the constraints are equivalent to $\alpha_H q_1 - p_1 = \alpha_H q_2 - p_2$, $e q_2 - p_2 = e q_3 - p_3$, and $e q_3 = p_3$.
Substitute the price conditional functions to the objective function and derive the first order conditions on qualities to obtain that 𝑞̂ 1 =𝛼 𝐻 , 𝑞̂ 2 =𝛼 𝐿 +𝜆 2 (𝛼 𝐻 −𝛼 𝐿 ) , and 𝑞̂ 3 =𝛼 𝐿 − 𝜆 (1+𝜆 ) 1−𝜆 (𝛼 𝐻 −𝛼 𝐿 ) . Clearly 𝑞̂ 1 >0 and 𝑞̂ 2 >0. It remains to derive the firm’s product line strategy in the parameter space where 𝑞̂ 3 ≤0, or equivalently, 𝛼 ≤ 𝜆 +𝜆 2 1+𝜆 2 . If 𝛼 ≤ 𝜆 +𝜆 2 1+𝜆 2 , then since the profit function can be rewritten as a quadratic function of 𝑞̂ 3 given the other parameters satisfying the restrictions, the corner solution must be that 𝑞̂ 3 =0, therefore the firm targets only informed H-types and the uninformed consumers. The optimization problem becomes Max {𝑝 1 ,𝑝 2 ,𝑞 1 ,𝑞 2 } Π Out = 𝜆 2 (𝑝 1 − 𝑞 1 2 2 )+ 1 2 (𝑝 2 − 𝑞 2 2 2 ), s.t. (1): 𝛼 𝐻 𝑞 1 −𝑝 1 ≥Max{𝛼 𝐻 𝑞 2 −𝑝 2 ,0}; (2): 𝑒 𝑞 2 −𝑝 2 ≥Max{𝑒 𝑞 1 −𝑝 1 ,0}; (3): 𝑝 1 ,𝑝 2 ,𝑞 1 ,𝑞 2 ≥0 Solve the maximand to obtain 𝑞̂ 1 =𝛼 𝐻 ; 𝑞̂ 2 =𝛼 𝐿 +𝜆 2 (𝛼 𝐻 −𝛼 𝐿 ) ; 𝑝 ̂ 1 =𝛼 𝐻 (𝑞̂ 1 −𝑞̂ 2 )+ 𝑝 ̂ 2 ;𝑝 ̂ 2 =𝑒 𝑞̂ 2 . ∎ Proof of Lemma 2 First show that the given equilibrium in (i) satisfies sequential rationality and Bayesian rules, when 𝑚 =𝑔 (Step 1). Then show the same for 𝑚 =𝑏 (Step 2). Next show that the equilibrium survives D1 (Step 3) and is unique (Step 4). Step 1: When 𝑚 =𝑔 , only L-types are uninformed (𝑖 =𝐿 ) . Consider an arbitrary deviation Ψ ′ ≠Ψ ∗ . Since Ψ ∗ is firm-optimal when L-types learn their type, Ψ ′ is 91 profitable only if L-types overestimate their type, i.e., 𝜇 𝑖 (Ψ ′ )>0. By (iii), 𝜇 𝑖 (Ψ ′ )>0 only if Π ̂ (𝑔 ,𝑟 𝑖 ,Ψ ′ )≤Π 𝐹 ∗ where 𝑟 𝑖 ∈BR(Ψ′,𝜇 𝑖 =1) . Since 𝜇 𝑖 <1 indicates that the consumer attributes positive probability that 𝑚 =𝑔 , or equivalently, that she is an L-type, 𝜇 𝑖 =1 is the most favorable belief for the firm, i.e., Π ̂ (𝑔 ,𝑟 𝑖 ,Ψ ′ )≥Π ̂ (𝑔 ,𝑟 𝑖 ′ ,Ψ ′ ) for any 𝑟 𝑖 ′ ∈BR(Ψ′,𝜇 𝑖 ≤1) . Therefore Π ̂ (𝑔 ,𝑟 𝑖 ′ ,Ψ ′ )≤Π 𝐹 ∗ : no deviation Ψ ′ is more profitable than Ψ ∗ for 𝜇 𝑖 (Ψ ′ )>0 . 
In addition, $r_L^*=1$ is a best response for L-types, since $p^*(g,L)=\alpha_L q_L$, whenever it is available. Therefore, the equilibrium satisfies the sequential rationality conditions. Finally, since $\Psi^*$ is offered only when $m=g$, the updated belief $\mu_i=0$ satisfies Bayes’ rule.

Step 2: When $m=b$, only H-types are uninformed ($i=H$). Since H-types are inferring the market state that the firm has observed, they need to estimate $\hat{\Pi}(g,r_i,\Psi')$. When H-types consider that $m=g$, they assume the other consumers are informed H-types and will choose the better product with $q_H'$ and $p_H'\le q_H'\alpha_H$, as long as the product line satisfies the incentive compatibility constraint. Thus, H-types assess the firm’s deviation profit $\hat{\Pi}(g,r_i,\Psi')$ when they are tricked into overpaying, i.e., $r_i\in BR(\Psi',\mu_i=1)$, as
$$\hat{\Pi}(g,r_i,\Psi')=p_H'-\frac{(q_H')^2}{2},$$
since both types of consumers will purchase the better product. By (iii), $\mu_i>0$ if and only if $p_H'-\frac{(q_H')^2}{2}\le\Pi_F^*$. This adds a signaling credibility constraint, on top of the participation and incentive compatibility constraints of the standard product line design problem, that must hold to induce uninformed H-types to purchase.

Step 2.1: Suppose that $\lambda<\alpha$. Then the firm’s profit in the full-information standard model is
$$\Pi_F^*=\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}.$$
By (iii), the firm’s maximization problem is
$$\max_{\{q_H,q_L,p_H,p_L\}}\ \Pi=\lambda\left(p_H-\frac{q_H^2}{2}\right)+(1-\lambda)\left(p_L-\frac{q_L^2}{2}\right),$$
s.t. (1): $\alpha_H q_H-p_H\ge\max\{\alpha_H q_L-p_L,\,0\}$; (2): $\alpha_L q_L-p_L\ge\max\{\alpha_L q_H-p_H,\,0\}$; (3): $p_H-\frac{q_H^2}{2}\le\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}$.

Notice from (3) that the nonlinear constrained optimization problem needs convex programming. First, we show that constraint (3) needs to be binding.
Prove by contradiction: suppose that (3) is not binding at any parameter point, i.e.,
$$p^*(b,H)-\frac{(q^*(b,H))^2}{2}<\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}.$$
Then the optimization problem is equivalent to the full-information case, and the optimal product line is
$$q^*(b,H)=\alpha_H;\qquad p^*(b,H)=\frac{(\alpha_H-\alpha_L)^2}{1-\lambda}+\alpha_H\alpha_L.$$
However,
$$p^*(b,H)-\frac{(q^*(b,H))^2}{2}-\left(\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}\right)=\frac{(\alpha_H-\alpha_L)^2}{2(1-\lambda)}>0,$$
which contradicts the assumption that (3) is not binding at any parameter point.

Second, the optimization problem can be simplified using $p_L=\alpha_L q_L$ and
$$p_H\le\min\left\{\alpha_L q_L+\alpha_H(q_H-q_L),\ \frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}+\frac{q_H^2}{2}\right\}.$$
If constraint (1) is not binding, we must have
$$\alpha_L q^*(b,L)+\alpha_H\big(q^*(b,H)-q^*(b,L)\big)>\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}+\frac{(q_H^*)^2}{2}$$
and
$$p^*(b,H)=\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}+\frac{(q_H^*)^2}{2},$$
where $q^*(b,L)$ solves the following maximization problem:
$$\max_{\{q_L\}}\ \Pi=\lambda\,\Pi^*(g)+(1-\lambda)\left(\alpha_L q_L-\frac{q_L^2}{2}\right).$$
From the first-order conditions, $q^*(b,L)=\alpha_L$. Thus the strict inequality above is equivalent to
$$1-2\lambda>\frac{(1-\lambda)(q_H^*-\alpha_H)^2}{(\alpha_H-\alpha_L)^2}.$$
It is Pareto efficient to set $q^*(b,H)=\alpha_H$. If $\lambda<\frac12$, then constraint (1) need not bind, because $1-2\lambda>0$ implies $\alpha_H q^*(b,H)-p^*(b,H)>\alpha_H q^*(b,L)-p^*(b,L)$ for the equilibrium product line
$$q^*(b,L)=\alpha_L,\quad p^*(b,L)=\alpha_L^2;\qquad q^*(b,H)=\alpha_H,\quad p^*(b,H)=\frac{1}{2(1-\lambda)}\left(\alpha_H^2-2\lambda\alpha_H\alpha_L+\alpha_L^2\right).$$
If $\lambda>\frac12$, then since $1-2\lambda<0<\frac{(1-\lambda)(q^*(b,H)-\alpha_H)^2}{(\alpha_H-\alpha_L)^2}$ and
$$\alpha_L q^*(b,L)+\alpha_H\big(q^*(b,H)-q^*(b,L)\big)<\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}+\frac{\alpha_L^2}{2}+\frac{(q_H^*)^2}{2},$$
both constraints (1) and (3) must be binding. The optimization problem is equivalently
$$\max_{\{q_H,q_L\}}\ \Pi=\lambda\left(\alpha_L q_L+\alpha_H(q_H-q_L)-\frac{q_H^2}{2}\right)+(1-\lambda)\left(\alpha_L q_L-\frac{q_L^2}{2}\right),$$
s.t.
$$\alpha_L q_L+\alpha_H(q_H-q_L)-\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}-\frac{\alpha_L^2}{2}-\frac{q_H^2}{2}=0.$$
Using the Lagrangian method, we define
$$L=\Pi+\beta\left(\alpha_L q_L+\alpha_H(q_H-q_L)-\frac{\lambda(\alpha_H-\alpha_L)^2}{2(1-\lambda)}-\frac{\alpha_L^2}{2}-\frac{q_H^2}{2}\right)$$
and solve the first-order conditions $\frac{\partial L}{\partial q_H}=\frac{\partial L}{\partial q_L}=\frac{\partial L}{\partial\beta}=0$ for $\beta\ge0$, to obtain $q^*(b,H)=\alpha_H$ and $q^*(b,L)=\frac{\alpha_H(1-2\lambda)+\alpha_L}{2(1-\lambda)}$. Since $\lambda<\alpha$ implies $(1-2\lambda)+\alpha>0$, we have $q_L^*>0$. If $\lambda=\frac12$, then $q^*(b,L)=\frac{\alpha_H(1-2\lambda)+\alpha_L}{2(1-\lambda)}=q_L^*=\alpha_L$.

Step 2.2: Suppose that $\lambda\ge\alpha$. Then $\Pi_F^*=\frac{\lambda\alpha_H^2}{2}$, and the firm’s optimization problem is
$$\max_{\{q_H,q_L,p_H,p_L\}}\ \Pi=\lambda\left(p_H-\frac{q_H^2}{2}\right)+(1-\lambda)\left(p_L-\frac{q_L^2}{2}\right),$$
s.t. (1): $\alpha_H q_H-p_H\ge\max\{\alpha_H q_L-p_L,\,0\}$; (2): $\alpha_L q_L-p_L\ge\max\{\alpha_L q_H-p_H,\,0\}$; (3): $p_H-\frac{q_H^2}{2}\le\frac{\lambda\alpha_H^2}{2}$.

As in Step 2.1, constraint (3) needs to be binding. Otherwise we have $q^*(b,H)=\alpha_H$, $p^*(b,H)=\alpha_H^2$, and thus $p^*(b,H)-\frac{(q^*(b,H))^2}{2}>\frac{\lambda\alpha_H^2}{2}$, contradicting constraint (3). Moreover, if $\lambda<\frac12$, constraint (1) is not binding, because $1-2\lambda>0$ implies $\alpha_H q^*(b,H)-p^*(b,H)>\alpha_H q^*(b,L)-p^*(b,L)$ for the equilibrium product line
$$q^*(b,L)=\alpha_L;\quad q^*(b,H)=\alpha_H;\quad p^*(b,L)=\alpha_L^2;\quad p^*(b,H)=\frac{1+\lambda}{2}\alpha_H^2.$$
If $\lambda\ge\frac12$, then all constraints must be binding. Therefore, we can simplify the optimization problem as follows:
$$\max_{\{q_H,q_L\}}\ \Pi=\lambda\left(\frac{\lambda\alpha_H^2}{2}\right)+(1-\lambda)\left(\alpha_L q_L-\frac{q_L^2}{2}\right),$$
s.t.
$$\frac{\lambda\alpha_H^2}{2}+\frac{q_H^2}{2}-\alpha_L q_L-\alpha_H(q_H-q_L)=0.$$
Similarly, use the Lagrangian
$$L=\Pi+\beta\left(\alpha_L q_L+\alpha_H(q_H-q_L)-\frac{\lambda\alpha_H^2}{2}-\frac{q_H^2}{2}\right)$$
and solve the first-order conditions $\frac{\partial L}{\partial q_H}=\frac{\partial L}{\partial q_L}=\frac{\partial L}{\partial\beta}=0$ for $\beta\ge0$ to obtain
$$q^*(b,L)=\frac{(1-\lambda)\alpha_H^2}{2(\alpha_H-\alpha_L)};\quad q^*(b,H)=\alpha_H;\quad p^*(b,L)=\frac{(1-\lambda)\alpha_H^2\alpha_L}{2(\alpha_H-\alpha_L)};\quad p^*(b,H)=\frac{1+\lambda}{2}\alpha_H^2.$$

Step 2.3: Since the equilibrium product line in (i) satisfies the participation constraint and the incentive compatibility constraint, $r_H^*$ defined in (ii) is a best response.
Therefore, the equilibrium satisfies the sequential rationality conditions. Further, by (i), the product line is separating on the market state; therefore the updated belief $\mu_i$ is either one or zero, which satisfies Bayes’ rule.

Step 3: We show that the equilibrium survives D1. By the definition of D1 (Banks and Sobel 1987; Cho and Kreps 1987; Cho and Sobel 1990), it is equivalent to show that, for any out-of-equilibrium pricing strategy $\Psi'\ne\Psi^*$, whenever the following condition holds for any $i\in\{H,L\}$ and $j\ne k\in\{g,b\}$:
$$\bigcup_{\mu}\{r_i : \Pi^*(j)\le\hat{\Pi}_i(j,\Psi',r_i)\}\subsetneq\bigcup_{\mu}\{r_i : \Pi^*(k)<\hat{\Pi}_i(k,\Psi',r_i)\},$$
where $r_i$ is a best response to $\Psi'$ given an arbitrary belief $\mu\in[0,1]$, i.e., $r_i=BR(\Psi',\mu\in[0,1])$, consumer $i$ updates her belief as
$$\mu_i(\Psi')=\begin{cases}1, & j=g\\ 0, & j=b.\end{cases}$$
Suppose that $j=g$ and $k=b$. Then for any $\Psi'$, whenever the above condition holds, the set $\bigcup_{\mu}\{r_i : \Pi^*(b)<\hat{\Pi}_i(b,\Psi',r_i)\}\ne\emptyset$, which implies $\mu(\Psi')=1$ by the belief rule. Next, suppose $j=b$ and $k=g$. We want to show that $\mu(\Psi')=0$ whenever
$$\bigcup_{\mu}\{r_i : \Pi^*(b)\le\hat{\Pi}_i(b,\Psi',r_i)\}\subsetneq\bigcup_{\mu}\{r_i : \Pi^*(g)<\hat{\Pi}_i(g,\Psi',r_i)\}.$$
Prove by contradiction: if $\mu(\Psi')\ne0$, then by (iii) of Lemma 2 we must have $\mu(\Psi')=1$, and thus $\Pi^*(b)\le\hat{\Pi}_i(b,\Psi',r_i)$. Because the complementary set is empty, i.e., $\bigcup_{\mu}\{r_i : \Pi^*(b)>\hat{\Pi}_i(b,\Psi',r_i)\}=\emptyset$, this violates the strict inclusion condition. Therefore $\mu(\Psi')=0$ whenever the condition holds for $j=b$. Thus, the equilibrium survives D1.

Step 4: Clearly, the equilibrium is the unique separating equilibrium that survives D1. It remains to show that any pooling equilibrium fails D1. Denote by $(\tilde{\Psi},\tilde{r}_i,\tilde{\mu})_{i=H,L}$ a pooling equilibrium such that $\tilde{\Psi}(g)=\tilde{\Psi}(b)$ and $\tilde{\mu}_H(\tilde{\Psi})=\tilde{\mu}_L(\tilde{\Psi})>0$. Without loss of generality, assume that $\tilde{\Psi}=\{\tilde{q}_H,\tilde{p}_H,\tilde{q}_M,\tilde{p}_M,\tilde{q}_L,\tilde{p}_L\}$, where $\tilde{q}_H\ge\tilde{q}_M\ge\tilde{q}_L$ holds when the firm offers fewer than three products.
Suppose that $\tilde{\Psi}$ is designed so that the informed H-types purchase $\tilde{q}_H$, uninformed consumers purchase $\tilde{q}_M$, and the informed L-types purchase $\tilde{q}_L$. Then the equilibrium profit in each market state is
$$\tilde{\Pi}(g)=\lambda\left(\tilde{p}_H-\frac{\tilde{q}_H^2}{2}\right)+(1-\lambda)\left(\tilde{p}_M-\frac{\tilde{q}_M^2}{2}\right),\qquad \tilde{\Pi}(b)=\lambda\left(\tilde{p}_M-\frac{\tilde{q}_M^2}{2}\right)+(1-\lambda)\left(\tilde{p}_L-\frac{\tilde{q}_L^2}{2}\right).$$
Denote by $\Psi'$ the out-of-equilibrium product line $\Psi'=\{\tilde{q}_H,\tilde{p}_H,q',\tilde{p}_M,\tilde{q}_L,\tilde{p}_L\}$, where $q'>\tilde{q}_M$, with corresponding belief $\mu'(\Psi')$. First suppose that $m=g$, so the uninformed consumers are L-types. For $\tilde{\Pi}(g)<\Pi'(g,\Psi',r_i)$, we only need to ensure that the uninformed consumers purchase $q'$, which by the participation constraint is equivalent to $[\alpha_L+\mu'(\alpha_H-\alpha_L)]q'-\tilde{p}_M>0$. Second suppose that $m=b$, so the uninformed consumers are H-types. For $\tilde{\Pi}(b)\le\Pi'(b,\Psi',r_i)$, by contrast, we need to ensure that both the participation constraint and the incentive compatibility constraint are satisfied, since the uninformed consumers may purchase the product $\tilde{q}_L$ instead. Thus we have
$$[\alpha_L+\mu'(\alpha_H-\alpha_L)]q'-\tilde{p}_M\ge[\alpha_L+\mu'(\alpha_H-\alpha_L)]\tilde{q}_L-\tilde{p}_L>0.$$
Therefore, whenever $\tilde{\Pi}(b)\le\Pi'(b,\Psi',r_i)$ we must have $\tilde{\Pi}(g)<\Pi'(g,\Psi',r_i)$, and by D1, rational consumers should update $\tilde{\mu}_i(\Psi')=0$. Because $\tilde{\Psi}$ is arbitrary, this belief update contradicts the assumption that $\tilde{\mu}_i>0$. Therefore the pooling equilibrium fails D1. ∎

Proof of Proposition 1

Since $q^*(g,L)$ is the same as in the full-information equilibrium, to show that consumers obtain higher-quality products (on average) when the firm collects data, it suffices to check whether $q^*(b,L)>q^*(g,L)$.

(1) If $\lambda\le\frac12$, then $q^*(b,L)=\alpha_L>q^*(g,L)$.

(2) If $\alpha>\lambda$ and $\lambda>\frac12$, then $q^*(b,L)=\frac{\alpha_H(1-2\lambda)+\alpha_L}{2(1-\lambda)}=\frac{\alpha_H-\alpha_L}{2(1-\lambda)}+q^*(g,L)>q^*(g,L)$.

(3) If $\alpha\le\lambda$ and $\lambda>\frac12$, then $q^*(b,L)=\frac{(1-\lambda)\alpha_H^2}{2(\alpha_H-\alpha_L)}=\frac{\alpha_H^2[(1-\alpha)^2+(\alpha-\lambda)^2]}{2(1-\lambda)(\alpha_H-\alpha_L)}+q^*(g,L)>q^*(g,L)$.
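It remains to show that $q^*(b,L)\le\alpha_L$. First,
$$\frac{\alpha_H(1-2\lambda)+\alpha_L}{2(1-\lambda)}-\alpha_L=\frac{\alpha_H-\alpha_L}{2(1-\lambda)}(1-2\lambda).$$
Since $q^*(b,L)=\frac{\alpha_H(1-2\lambda)+\alpha_L}{2(1-\lambda)}$ is feasible when $\lambda>\frac12$, we have $1-2\lambda<0$, thus $q^*(b,L)<\alpha_L$. Second,
$$\frac{(1-\lambda)\alpha_H^2}{2(\alpha_H-\alpha_L)}-\alpha_L=\frac{\alpha_H^2}{2(\alpha_H-\alpha_L)}\left[(1-2\lambda)(1-\lambda)+(\alpha-\lambda)\right].$$
Since $q^*(b,L)=\frac{(1-\lambda)\alpha_H^2}{2(\alpha_H-\alpha_L)}$ is feasible when $\alpha\le\lambda$ and $\lambda>\frac12$, we have $q^*(b,L)<\alpha_L$. ∎

Proof of Proposition 2

To derive the conditions under which data collection benefits consumers, check the following conditions:

(1) If $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$, then $\frac{\alpha_H+3\alpha_L-4\lambda\alpha_H}{2(1-\lambda)}\ge2\alpha_L-\frac{\lambda(1+\lambda+3\lambda^2-\lambda^3)(\alpha_H-\alpha_L)}{1-\lambda}$;

(2) If $\frac{\lambda+\lambda^2}{1+\lambda^2}\ge\alpha>\lambda$, then $\frac{\alpha_H+3\alpha_L-4\lambda\alpha_H}{2(1-\lambda)}\ge(1-\lambda)\left(\alpha_L+\lambda^2(\alpha_H-\alpha_L)\right)$;

(3) If $\alpha\le\lambda$, then $\frac{\alpha_H^2}{2}\ge(\alpha_H-\alpha_L)\left(\alpha_L+\lambda^2(\alpha_H-\alpha_L)\right)$.

Note that (1) is equivalent to $(1-\lambda)^2+\lambda^2(1-\lambda)(1+2\lambda)+5\lambda^3\ge0$; thus for any $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$, $CS_{In}^*\ge CS_{Out}^*$. Condition (2) is equivalent to $2(1-\lambda^2)+(\alpha-1)[3-2(1-\lambda^2)(1-\lambda)^2]\ge0$. Since $\alpha>\lambda$ and $3-2(1-\lambda^2)(1-\lambda)^2\ge3-2>0$, we have
$$2(1-\lambda^2)+(\alpha-1)[3-2(1-\lambda^2)(1-\lambda)^2]>2(1-\lambda^2)+(\lambda-1)[3-2(1-\lambda^2)(1-\lambda)^2]=(1-\lambda)\left[(1-\lambda^2)(1-\lambda)^2+\lambda^3(2-\lambda)\right]>0.$$
Thus for any $\alpha>\lambda$, $CS_{In}^*\ge CS_{Out}^*$. Finally, (3) holds if and only if either of two conditions holds: (i) $\lambda<\frac{\sqrt2}{2}$, or (ii) $\lambda\ge\frac{\sqrt2}{2}$ and $\alpha\ge\frac{\sqrt{2\lambda^2-1}-(2\lambda^2-1)}{2(1-\lambda^2)}$. Since $\alpha\le\lambda$, it remains to show that $\frac{\sqrt{2\lambda^2-1}-(2\lambda^2-1)}{2(1-\lambda^2)}\le\lambda$, which is equivalent to showing that if $\alpha=\lambda$, then $\frac{\alpha_H^2}{2}\ge(\alpha_H-\alpha_L)\left(\alpha_L+\lambda^2(\alpha_H-\alpha_L)\right)$. Substitute $\alpha_L=\lambda\alpha_H$ into the inequality to obtain $(1-\lambda^2)(1-\lambda)^2+\lambda^3(2-\lambda)\ge0$. Therefore $\frac{\sqrt{2\lambda^2-1}-(2\lambda^2-1)}{2(1-\lambda^2)}\le\lambda$.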
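The last step above can be spot-checked numerically. The following is only an illustrative sketch, not part of the original proof: it assumes the normalization $\alpha_H=1$, $\alpha_L=\alpha$, and all function names are hypothetical. Writing $x=2\lambda^2-1$, the threshold in (ii) equals $(\sqrt{x}-x)/(1-x)=\sqrt{x}/(1+\sqrt{x})$, which never exceeds $1/2$ and is therefore below any $\lambda\ge\sqrt2/2$.

```python
import math


def alpha_threshold(lam: float) -> float:
    """Threshold from (ii): (sqrt(x) - x) / (2 * (1 - lam**2)) with x = 2*lam**2 - 1."""
    x = max(2.0 * lam * lam - 1.0, 0.0)  # clamp rounding noise at lam = sqrt(2)/2
    return (math.sqrt(x) - x) / (2.0 * (1.0 - lam * lam))


def alpha_threshold_simplified(lam: float) -> float:
    """Same threshold rewritten as sqrt(x) / (1 + sqrt(x)); assumption-free algebra."""
    s = math.sqrt(max(2.0 * lam * lam - 1.0, 0.0))
    return s / (1.0 + s)


def condition_3_holds(alpha: float, lam: float) -> bool:
    """Condition (3) under the assumed normalization alpha_H = 1, alpha_L = alpha."""
    return 0.5 >= (1.0 - alpha) * (alpha + lam * lam * (1.0 - alpha))


def max_gap_on_grid(n: int = 2000) -> float:
    """Largest value of threshold(lam) - lam over a grid of lam in [sqrt(2)/2, 1)."""
    lo = math.sqrt(2.0) / 2.0
    worst = -math.inf
    for k in range(n):  # k < n keeps lam strictly below 1
        lam = lo + (1.0 - lo) * k / n
        worst = max(worst, alpha_threshold(lam) - lam)
    return worst


if __name__ == "__main__":
    print("max of threshold - lambda on grid:", max_gap_on_grid())
    print("threshold at lambda = 0.8:", alpha_threshold(0.8))
```

A negative value from `max_gap_on_grid()` is consistent with the claim that the threshold lies below $\lambda$ throughout the region where (ii) applies.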
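In summary, we have $CS_{In}^{**}\ge CS_{Out}^{**}$ if and only if either of two conditions holds: (i) $\lambda<\frac{\sqrt2}{2}$, or (ii) $\lambda\ge\frac{\sqrt2}{2}$ and $\alpha\ge\frac{\sqrt{2\lambda^2-1}-(2\lambda^2-1)}{2(1-\lambda^2)}$.

For the incentives of data collection, it suffices to check $(\Pi_{In}^{**}-\Pi_{Out}^{**})$ in the following six regions:

(1) If $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$ and $\lambda\le\frac12$, then $\Pi_{In}^*-\Pi_{Out}^*=\frac{(\alpha_H-\alpha_L)^2\lambda^2}{4(1-\lambda)}A$, where $A\equiv1-2\lambda-2\lambda^2+\lambda^3$;

(2) If $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$ and $\lambda>\frac12$, then $\Pi_{In}^*-\Pi_{Out}^*=\frac{(\alpha_H-\alpha_L)^2}{16(1-\lambda)}\left[4\lambda^2A-(2\lambda-1)^2\right]$;

(3) If $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$ and $\lambda\le\frac12$, then $\Pi_{In}^*-\Pi_{Out}^*=\frac{\alpha_H^2}{4(1-\lambda)}\left[\lambda^2A(1-\alpha)^2+\left((\lambda^2+1)\alpha-(\lambda^2+\lambda)\right)^2\right]$;

(4) If $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$ and $\lambda>\frac12$, then $\Pi_{In}^*-\Pi_{Out}^*=\frac{\alpha_H^2}{16(1-\lambda)B}\left[\left((\alpha-1)B+4(1-\lambda)(1+\lambda^2)\right)^2+4(1-\lambda)^2\left(4\lambda^2A-(2\lambda-1)^2\right)\right]$, where $B\equiv3+4\lambda+8\lambda^2-8\lambda^3-4\lambda^4+4\lambda^5>0$;

(5) If $\alpha\le\lambda\le\frac12$, then $\Pi_{In}^*-\Pi_{Out}^*=\frac{\alpha_H^2}{4}\lambda(1-\lambda)\left[\lambda(1+\lambda)(1-\alpha)^2-\alpha^2\right]$;

(6) If $\alpha\le\lambda$ and $\lambda>\frac12$, then $\Pi_{In}^*-\Pi_{Out}^*=\frac{\alpha_H^2}{16(1-\lambda)}\left[-4\alpha^4(-1+\lambda^2)^2+8\alpha^3(1-3\lambda^2+2\lambda^4)-8\alpha^2(1-\lambda-3\lambda^2+3\lambda^4)+4\alpha(1-2\lambda-3\lambda^2+4\lambda^4)-1+3\lambda+\lambda^2+\lambda^3-4\lambda^4\right]$.

Since $\lambda\in(0,1)$, $A\equiv1-2\lambda-2\lambda^2+\lambda^3>0$ if and only if $\lambda<\frac{3-\sqrt5}{2}\approx0.38$; thus in (1), $\Pi_{In}^*>\Pi_{Out}^*$ is equivalent to $\lambda<\frac{3-\sqrt5}{2}$. In (2), since $\lambda>\frac12$, we must have $A<0$, thus $4\lambda^2A-(2\lambda-1)^2<0$ and $\Pi_{In}^{**}<\Pi_{Out}^{**}$. In (3) to (6), $\Pi_{In}^{**}>\Pi_{Out}^{**}$ is equivalent to a condition that $\alpha<\alpha(\lambda)\in(0,1)$. Because data collection exists in equilibrium if and only if it incentivizes consumers to opt in and is profitable for the firm, the above condition implies Proposition 3. ∎

Proof of Proposition 3

Derive H-types’ surplus in the following Table A1.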
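From Table A1, H-types obtain higher expected surplus by opting in if and only if one of the following conditions holds:

(1) $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{\alpha_H^2}{2(\alpha_H-\alpha_L)}$, if $\alpha\in(0,\lambda]$;

(2) $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{\alpha_H+3\alpha_L-4\lambda\alpha_H}{2(1-\lambda)^2}$, if $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$;

(3) $\alpha_L(1-\lambda+3\lambda^3-\lambda^4)-\lambda^3(3-\lambda)\alpha_H<\frac{\alpha_H+3\alpha_L-4\lambda\alpha_H}{2}$, if $\alpha\in\left(\frac{\lambda+\lambda^2}{1+\lambda^2},1\right)$.

Table A1: H-types’ Ex Post Surplus

| | $\alpha\in(0,\lambda]$ | $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$ | $\alpha\in\left(\frac{\lambda+\lambda^2}{1+\lambda^2},1\right)$ |
|---|---|---|---|
| Opting out, $m=g,b$ | $(\alpha_H-e)[\alpha_L+\lambda^2(\alpha_H-\alpha_L)]$ | $(\alpha_H-e)[\alpha_L+\lambda^2(\alpha_H-\alpha_L)]$ | $(\alpha_H-\alpha_L)\left[\alpha_L-\frac{\lambda^3(3-\lambda)}{1-\lambda}(\alpha_H-\alpha_L)\right]$ |
| Opting in, $m=g$ | $0$ | $\frac{(\alpha_L-\lambda\alpha_H)(\alpha_H-\alpha_L)}{1-\lambda}$ | $\frac{(\alpha_L-\lambda\alpha_H)(\alpha_H-\alpha_L)}{1-\lambda}$ |
| Opting in, $m=b$ | $\frac{\alpha_H^2(1-\lambda)}{2}$ | $\frac{(\alpha_H+\alpha_L-2\lambda\alpha_H)(\alpha_H-\alpha_L)}{2(1-\lambda)}$ | $\frac{(\alpha_H+\alpha_L-2\lambda\alpha_H)(\alpha_H-\alpha_L)}{2(1-\lambda)}$ |

First, $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{\alpha_H^2}{2(\alpha_H-\alpha_L)}$ is equivalent to $(1-\alpha)^2(1-2\lambda^2)+\alpha^2>0$. Therefore (1) holds if and only if either $1-2\lambda^2\ge0$, or $1-2\lambda^2<0$ and $\alpha>\frac{1-2\lambda^2+\sqrt{2\lambda^2-1}}{2-2\lambda^2}$.

Second, $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{\alpha_H+3\alpha_L-4\lambda\alpha_H}{2(1-\lambda)^2}$ is equivalent to $\alpha>\frac{2(1-\lambda)^2\lambda^2+4\lambda-1}{3-2(1-\lambda)^2(1-\lambda^2)}$, which is implied by $\alpha>\lambda$. To see this, we show that $\lambda>\frac{2(1-\lambda)^2\lambda^2+4\lambda-1}{3-2(1-\lambda)^2(1-\lambda^2)}$: since $3-2(1-\lambda)^2(1-\lambda^2)>0$, reorganize the inequality to obtain $[(2-3\lambda)^2+3\lambda^2][1+\lambda(1-\lambda)]+\lambda^2(1-\lambda)^2>0$, which is apparent because $\lambda\in(0,1)$. Thus (2) holds.

Third, $\alpha_L(1-\lambda+3\lambda^3-\lambda^4)-\lambda^3(3-\lambda)\alpha_H<\frac{\alpha_H+3\alpha_L-4\lambda\alpha_H}{2}$ is equivalent to $2\lambda^4-6\lambda^3+2\lambda+1-\frac{2(1-\lambda)}{1-\alpha}<0$. If $\alpha>\lambda$, then $-\frac{2(1-\lambda)}{1-\alpha}<-2$, thus
$$2\lambda^4-6\lambda^3+2\lambda+1-\frac{2(1-\lambda)}{1-\alpha}<2\lambda^4-6\lambda^3+2\lambda-1=-2\lambda(1-\lambda)^3-(1-2\lambda)^2-2\lambda^2<0.$$
Therefore (3) holds.

So H-types obtain higher expected surplus by opting in if and only if either of the following conditions holds: $\lambda<\frac{\sqrt2}{2}\approx0.707$, or $\alpha>\frac{1-2\lambda^2+\sqrt{2\lambda^2-1}}{2-2\lambda^2}$. Since data collection strictly improves L-types’ expected surplus, it suffices to check the conditions under which it improves the firm’s profit: (i) $\lambda<\frac{3-\sqrt5}{2}\approx0.382$, or (ii) $\alpha\in\left(\frac{1-2\lambda^2+\sqrt{2\lambda^2-1}}{2-2\lambda^2},\alpha(\lambda)\right)$, where $\alpha(\lambda)$ is the unique implicit root of $\Pi_{In}^*=\Pi_{Out}^*$.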
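The two numerical cutoffs quoted above, $\frac{3-\sqrt5}{2}\approx0.382$ and $\frac{\sqrt2}{2}\approx0.707$, can be reproduced with a short sketch. It is only a sanity check and not part of the original argument; the function names are hypothetical. The polynomial $A(\lambda)=1-2\lambda-2\lambda^2+\lambda^3$ from the proof of Proposition 2 factors as $(\lambda+1)(\lambda^2-3\lambda+1)$, so its only root in $(0,1)$ is $\frac{3-\sqrt5}{2}$.

```python
import math


def A(lam: float) -> float:
    """A(lam) = 1 - 2*lam - 2*lam**2 + lam**3, as defined in the proof of Proposition 2."""
    return 1.0 - 2.0 * lam - 2.0 * lam ** 2 + lam ** 3


def A_factored(lam: float) -> float:
    """Equivalent factorization (lam + 1) * (lam**2 - 3*lam + 1)."""
    return (lam + 1.0) * (lam * lam - 3.0 * lam + 1.0)


def region2_bracket(lam: float) -> float:
    """Bracketed term 4*lam**2*A - (2*lam - 1)**2 from region (2) of Proposition 2."""
    return 4.0 * lam ** 2 * A(lam) - (2.0 * lam - 1.0) ** 2


# Cutoffs quoted in the text.
LAMBDA_PROFIT = (3.0 - math.sqrt(5.0)) / 2.0  # firm-profit cutoff
LAMBDA_SURPLUS = math.sqrt(2.0) / 2.0         # H-type surplus cutoff

if __name__ == "__main__":
    print("profit cutoff:", LAMBDA_PROFIT, "A at cutoff:", A(LAMBDA_PROFIT))
    print("surplus cutoff:", LAMBDA_SURPLUS)
    print("region (2) bracket at lambda = 0.75:", region2_bracket(0.75))
```

The sign of `A` flips from positive to negative at `LAMBDA_PROFIT`, and `region2_bracket` stays negative for $\lambda>\frac12$, in line with regions (1) and (2) of the proof of Proposition 2.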
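∎

Proof of Lemma 3

The equilibrium is derived in the main text. It suffices to show that $\Pi_{In}^{**}>\Pi_{Out}^{**}$. Since
$$\Pi_{Out}^{**}=\begin{cases}\frac14(1+\lambda)\left[\lambda(\alpha_H-e)^2+e^2\right]+\frac14(1-\lambda)\left[\alpha_L-\frac{\lambda(1+\lambda)}{1-\lambda}(\alpha_H-\alpha_L)\right]^2 & \text{if }\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}\\[4pt] \frac14(1+\lambda)\left[\lambda(\alpha_H-e)^2+e^2\right] & \text{if }\alpha\le\frac{\lambda+\lambda^2}{1+\lambda^2}\end{cases}$$
and
$$\Pi_{In}^{**}=\Pr[m=b]\cdot\frac{e^2}{2}+\Pr[m=g]\cdot\begin{cases}\frac{\lambda(\alpha_H-\alpha_L)^2}{4(1-\lambda)}+\frac{\alpha_L^2-e^2}{4} & \alpha\ge\lambda\\[4pt] \frac{\lambda\alpha_H^2}{4}-\frac{e^2}{4} & \alpha<\lambda,\end{cases}$$
it suffices to establish the following three inequalities:

(1) $\frac{\lambda\alpha_H^2}{4}+\frac{e^2}{4}>\frac{1+\lambda}{4}\left\{\lambda(\alpha_H-e)^2+e^2\right\}$ for $\alpha\in(0,\lambda)$;

(2) $\frac{\lambda(\alpha_H-\alpha_L)^2}{4(1-\lambda)}+\frac{\alpha_L^2+e^2}{4}>\frac{1+\lambda}{4}\left[\lambda(\alpha_H-e)^2+e^2\right]$ for $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$; and

(3) $\frac{\lambda(\alpha_H-\alpha_L)^2}{4(1-\lambda)}+\frac{\alpha_L^2+e^2}{4}>\frac{1+\lambda}{4}\left[\lambda(\alpha_H-e)^2+e^2\right]+\frac{1-\lambda}{4}\left[\alpha_L-\frac{\lambda(1+\lambda)}{1-\lambda}(\alpha_H-\alpha_L)\right]^2$ for $\alpha\in\left(\frac{\lambda+\lambda^2}{1+\lambda^2},1\right)$.

First, $\frac{\lambda\alpha_H^2}{4}+\frac{e^2}{4}>\frac{1+\lambda}{4}\left\{\lambda(\alpha_H-e)^2+e^2\right\}$ is equivalent to $(\alpha_H+e)>(1+\lambda)(\alpha_H-e)$. Since $(1+\lambda)(\alpha_H-e)=(1-\lambda^2)(\alpha_H-\alpha_L)<\alpha_H$ and $\alpha_H+e>\alpha_H$, we must have $(\alpha_H+e)>(1+\lambda)(\alpha_H-e)$.

Second, since $\frac{\lambda(\alpha_H-\alpha_L)^2}{4(1-\lambda)}+\frac{\alpha_L^2+e^2}{4}>\frac{\lambda\alpha_H^2}{4}+\frac{e^2}{4}$ when $\alpha>\lambda$, and since $\frac{\lambda\alpha_H^2}{4}+\frac{e^2}{4}>\frac{1+\lambda}{4}\left\{\lambda(\alpha_H-e)^2+e^2\right\}$, we must have $\frac{\lambda(\alpha_H-\alpha_L)^2}{4(1-\lambda)}+\frac{\alpha_L^2+e^2}{4}>\frac{1+\lambda}{4}\left\{\lambda(\alpha_H-e)^2+e^2\right\}$ for any $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$.

Finally, (3) is equivalent to $G(\alpha)\equiv\lambda(1-\alpha)(1-3\lambda-2\lambda^2+\lambda^3)+2\alpha(1-\lambda)>0$. The inequality holds if $(1-3\lambda-2\lambda^2+\lambda^3)\ge0$. If $(1-3\lambda-2\lambda^2+\lambda^3)<0$, then $\frac{\partial G(\alpha)}{\partial\alpha}>0$. Since $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$, we have
$$G(\alpha)>G\left(\frac{\lambda+\lambda^2}{1+\lambda^2}\right)=\frac{\lambda(1-\lambda)}{1+\lambda^2}\left[3-2\lambda+\lambda(1-\lambda)^2\right]>0.\ ∎$$

Proof of Lemma 4

The result that opting in reduces L-types’ surplus is obvious.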
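We derive H-types’ surplus in the following Table A2:

Table A2: H-types’ Surplus

| | $\alpha\in(0,\lambda]$ | $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$ | $\alpha\in\left(\frac{\lambda+\lambda^2}{1+\lambda^2},1\right)$ |
|---|---|---|---|
| Opting out, $m=g,b$ | $(\alpha_H-e)[\alpha_L+\lambda^2(\alpha_H-\alpha_L)]$ | $(\alpha_H-e)[\alpha_L+\lambda^2(\alpha_H-\alpha_L)]$ | $(\alpha_H-\alpha_L)\left[\alpha_L-\frac{\lambda^3(3-\lambda)}{1-\lambda}(\alpha_H-\alpha_L)\right]$ |
| Opting in, $m=g$ | $0$ | $\frac{(\alpha_L-\lambda\alpha_H)(\alpha_H-\alpha_L)}{1-\lambda}$ | $\frac{(\alpha_L-\lambda\alpha_H)(\alpha_H-\alpha_L)}{1-\lambda}$ |
| Opting in, $m=b$ | $(\alpha_H-e)e$ | $(\alpha_H-e)e$ | $(\alpha_H-e)e$ |

From Table A2, H-types obtain higher expected surplus by opting in if and only if one of the following conditions holds:

(1) $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{e}{2}$, if $\alpha\in(0,\lambda]$;

(2) $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{e}{2}+\frac{\alpha_L-\lambda\alpha_H}{2(1-\lambda)^2}$, if $\alpha\in\left(\lambda,\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$;

(3) $\alpha_L-\frac{\lambda^3(3-\lambda)}{1-\lambda}(\alpha_H-\alpha_L)<\frac{e(1-\lambda)}{2}+\frac{\alpha_L-\lambda\alpha_H}{2(1-\lambda)}$, if $\alpha\in\left(\frac{\lambda+\lambda^2}{1+\lambda^2},1\right)$.

First, $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{e}{2}$ is equivalent to $\frac{\lambda(1-2\lambda)}{(1-\lambda)(1+2\lambda)}>\alpha$. Since $\alpha>0$, this requires $\lambda<0.5$. Note that $\frac{\lambda(1-2\lambda)}{(1-\lambda)(1+2\lambda)}<\lambda$. Therefore (1) holds if $\alpha<\frac{\lambda(1-2\lambda)}{(1-\lambda)(1+2\lambda)}$ and $\lambda<0.5$.

Second, $\alpha_L+\lambda^2(\alpha_H-\alpha_L)<\frac{e}{2}+\frac{\alpha_L-\lambda\alpha_H}{2(1-\lambda)^2}$ is equivalent to $\alpha>\frac{\lambda(4-5\lambda+2\lambda^2)}{1+(1-\lambda)(3-2\lambda)\lambda}$. This threshold equals $\lambda+\frac{\lambda(1-\lambda)^2(3-2\lambda)}{1+(1-\lambda)(3-2\lambda)\lambda}>\lambda$, and equals $\frac{\lambda+\lambda^2}{1+\lambda^2}+\frac{\lambda(1-\lambda)(3-6\lambda+2\lambda^2)}{(1+\lambda^2)(1+(1-\lambda)(3-2\lambda)\lambda)}<\frac{\lambda+\lambda^2}{1+\lambda^2}$ if and only if $3-6\lambda+2\lambda^2<0$. Therefore the above condition is equivalent to $\frac{\lambda(4-5\lambda+2\lambda^2)}{1+(1-\lambda)(3-2\lambda)\lambda}<\alpha$ and $3-6\lambda+2\lambda^2<0$.

Third, $\alpha_L-\frac{\lambda^3(3-\lambda)}{1-\lambda}(\alpha_H-\alpha_L)<\frac{e(1-\lambda)}{2}+\frac{\alpha_L-\lambda\alpha_H}{2(1-\lambda)}$ if and only if $\frac{\lambda(7\lambda-2\lambda^2-2)}{(1-\lambda)^3+(4-\lambda)\lambda^2}>\alpha$. Further, $\frac{\lambda(7\lambda-2\lambda^2-2)}{(1-\lambda)^3+(4-\lambda)\lambda^2}>\frac{\lambda+\lambda^2}{1+\lambda^2}$ if and only if $3-6\lambda+2\lambda^2<0$. Finally, $3-6\lambda+2\lambda^2<0$ if and only if $\lambda>\frac{3-\sqrt3}{2}\approx0.634$. ∎

Proof of Proposition 4

Derive the expected consumer surplus for opting in and opting out:
$$CS_{Out}^{**}=\begin{cases}2\alpha_L-\frac{\lambda(1+\lambda+3\lambda^2-\lambda^3)(\alpha_H-\alpha_L)}{1-\lambda} & \text{if }\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}\\[4pt] (1-\lambda)\left[\alpha_L+\lambda^2(\alpha_H-\alpha_L)\right] & \text{if }\alpha\le\frac{\lambda+\lambda^2}{1+\lambda^2},\end{cases}\qquad CS_{In}^{**}=\frac{\lambda(\alpha_H-\alpha_L)(\alpha_L-\lambda\alpha_H)}{2(1-\lambda)}.$$
First, if $\alpha\le\frac{\lambda+\lambda^2}{1+\lambda^2}$, then $CS_{In}^{**}>CS_{Out}^{**}$ is equivalent to $\frac{1+\lambda(1-\lambda)^2}{2-2\lambda^2+\lambda^3}<\alpha$.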
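Since $\frac{1+\lambda(1-\lambda)^2}{2-2\lambda^2+\lambda^3}<\frac{\lambda+\lambda^2}{1+\lambda^2}$ is equivalent to $(3-\lambda)\lambda^2>1$, the conditions $CS_{In}^*>CS_{Out}^*$ and $\alpha\le\frac{\lambda+\lambda^2}{1+\lambda^2}$ are together equivalent to $\alpha\in\left(\frac{1+\lambda(1-\lambda)^2}{2-2\lambda^2+\lambda^3},\frac{\lambda+\lambda^2}{1+\lambda^2}\right]$ and $(3-\lambda)\lambda^2>1$.

Second, if $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$, then $CS_{In}^{**}>CS_{Out}^{**}$ is equivalent to $\alpha<\frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}$. Clearly $\frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}<1$, and $\frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}>\frac{\lambda+\lambda^2}{1+\lambda^2}$ if and only if $(3-\lambda)\lambda^2-1>0$. Thus $CS_{In}^{**}>CS_{Out}^{**}$ and $\alpha>\frac{\lambda+\lambda^2}{1+\lambda^2}$ are together equivalent to $\alpha<\frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}$ and $(3-\lambda)\lambda^2>1$.

In summary, combining the two regions, opting in is optimal if and only if $\alpha\in\left(\frac{1+\lambda(1-\lambda)^2}{2-2\lambda^2+\lambda^3},\frac{\lambda^2+3\lambda^3-\lambda^4}{1-\lambda+\lambda^2+3\lambda^3-\lambda^4}\right)$ and $(3-\lambda)\lambda^2>1$. Note that since $\lambda\in(0,1)$, $(3-\lambda)\lambda^2>1$ is equivalent to $\lambda>1-2\cos\left(\frac49\pi\right)\approx0.653$. ∎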
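The closed-form cutoff $1-2\cos\left(\frac49\pi\right)$ in the last step can be cross-checked against a direct root search. This is a minimal sketch rather than part of the proof, with hypothetical helper names. Because the derivative of $(3-\lambda)\lambda^2$ is $3\lambda(2-\lambda)>0$ on $(0,1)$, the function is increasing there, so bisection finds the unique root of $(3-\lambda)\lambda^2=1$ in that interval.

```python
import math


def opt_in_gap(lam: float) -> float:
    """(3 - lam) * lam**2 - 1; positive exactly when the condition on lam holds."""
    return (3.0 - lam) * lam ** 2 - 1.0


# Closed-form cutoff quoted at the end of the proof.
LAMBDA_BAR = 1.0 - 2.0 * math.cos(4.0 * math.pi / 9.0)


def bisect_root(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("closed form:", LAMBDA_BAR)
    print("bisection:  ", bisect_root(opt_in_gap, 0.0, 1.0))
```

Agreement between the two printed values supports the trigonometric expression for the cutoff on $\lambda$.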
Abstract
There is considerable economic interest in the welfare implications when marketers have the statistical power to learn superior information about consumers’ preferences, especially now that modern data technologies enable marketers to observe vast information about consumers’ usage records and behavior patterns. Whereas consumers observe only their own usage experiences, marketers can observe many consumers, aggregate their data to filter out environmental noise, and make better inferences about consumers’ intrinsic preferences. This situation reverses the standard economic models, which assume that consumers have private information on their preferences.

I explore the consequences of this reversal in preference information between consumers and an informed marketer (a seller, a manufacturer, or an intermediary curator). A countervailing force exists whenever the incentives of the consumers and the marketer are not perfectly aligned. Marketers argue that collecting superior information helps them serve consumers better by offering more relevant products. Consumer protection advocates, by contrast, worry that these methods subject consumers to exploitation through better price discrimination. Consumers are also concerned that the marketer may trick them into purchasing over-priced products or accepting irrelevant recommendations.

For these reasons, we desire a better understanding of the interactions between marketers with superior information and sophisticated consumers with rational suspicion. This dissertation provides theoretical insight into the nature of this interaction and into how it affects personalized pricing schemes, product line design, and recommendation matching.

Chapter one (coauthored with Anthony Dukes) examines the implications of superior consumer information in first-degree price discrimination. Superior information occurs when consumer data aggregation enables the firm to learn more than consumers themselves know about their willingness to pay. Since consumers may suspect that they are being overcharged by firms with superior information, effective price discrimination requires the use of a list price to convince consumers of their value. Because list pricing incurs a signaling cost, the firm, even with price discrimination, is unable to appropriate all consumer surplus. Sometimes the firm may be worse off with superior consumer information. However, there also exist conditions under which price discrimination with superior information is a strict Pareto improvement for the firm and every consumer.

Chapter two (coauthored with Anthony Dukes) examines two additional factors: product design and consumer choices under second-degree price discrimination. In this chapter, superior information is defined as consumers’ marginal utility for product quality. There are two primary research questions in this chapter: Do consumers receive better-fitting products or simply have more surplus extracted? Is learning superior information ever unprofitable? Our analysis suggests that data aggregation creates superior information on consumers’ preferences beyond consumers’ prior knowledge. Consumers’ rational suspicion, however, may confound the firm’s ability to price discriminate using the superior information. Consistent with the results in Chapter one, product line design with superior information may lead to a strict Pareto-optimal outcome for both the firm and every consumer. Another contribution of this chapter is to show that, to communicate effectively with the uninformed consumers, the firm may lower the price of the high-quality product and raise the quality of the low-quality product. This tends to reverse the classic quality distortion and thus restore efficiency in product line design.

In Chapters one and two, consumer preferences are vertically differentiated to capture the potential conflict of interest with a monopoly seller. In Chapter three, by contrast, I examine a situation in which consumer preferences are horizontally differentiated. Whereas the first two chapters focus on pricing strategies, Chapter three examines the recommendation strategy of an intermediary that obtains superior information on consumer preferences. The intermediary can then match the uninformed consumer with competing sellers and share the sellers’ profits. Therefore, the superior information may affect the matching strategy, the sellers’ competition, and the sophisticated uninformed consumers’ choices. I illustrate two potential applications: an auction-based product listing platform and a commission-based marketplace. In either application, I show that superior information on consumer preferences can lead to more efficient economic outcomes and improve every player’s payoff, but it requires a careful design of the matching strategy so that it convinces consumers that the intermediary is not trying to trick them by exploiting their uninformed preferences.

All three chapters make use of a common modeling framework, which is the fundamental innovation of this dissertation. Rather than knowing their preferences ex ante, consumers observe only private signals of their preferences, with correlated errors due to a random market state. This setting enables the marketer who aggregates the consumers’ signals to partition out the errors and obtain superior information. It also implies that a marketing strategy based on the superior information may credibly communicate with the uninformed consumers, convincing them to update their beliefs. When a marketer learns superior information about consumer preferences, the informational advantage may both facilitate its ability to extract consumer surplus and confound that ability because of the signaling cost.
Asset Metadata
Creator
Xu, Zibin (author)
Core Title
Marketing strategies with superior information on consumer preferences
School
Marshall School of Business
Degree
Doctor of Philosophy
Degree Program
Business Administration
Publication Date
06/27/2019
Defense Date
04/12/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
intermediary recommendations,OAI-PMH Harvest,price discrimination,product line design,superior information,uninformed consumer preferences
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Dukes, Anthony (committee chair), Mayzlin, Dina (committee member), Selove, Mathew (committee member), Tan, Guofu (committee member)
Creator Email
zibin.xu.2016@marshall.usc.edu,zibin.xu@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-391030
Unique identifier
UC11265703
Identifier
etd-XuZibin-5459.pdf (filename),usctheses-c40-391030 (legacy record id)
Legacy Identifier
etd-XuZibin-5459.pdf
Dmrecord
391030
Document Type
Dissertation
Rights
Xu, Zibin
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA